
Regular Expressions are a powerful syntax for finding string patterns within a file. Many different flavors of regular expressions exist, and 010 Editor uses a syntax similar to Ruby/Perl. To search for a regular expression, click the Options button in the Find Bar and enable the Search with Regular Expressions toggle (see the image below). Regular expressions can be used when performing a Find, Replace, Find In Files, or Replace In Files operation. Note that the letter 'R' will appear beside the word Options when regular expressions are enabled. The full syntax of regular expressions is beyond the scope of this document, but the following sections introduce the major features. Warning: Some regular expressions can be very complex, and expressions containing many repetition operators can cause searches to run very slowly.


Matching Characters

Regular expressions look just like regular Find strings. For example, to find a string such as 'Green' just use the regular expression:

    Green

Regular expressions use a number of special control characters to control how searches are done. The special characters are: ".[]^$()/\*{}?+|". To search for any of these control characters literally, include an extra '\' character before the control character. For example, to search for the string "5+6" use the regular expression:

    5\+6
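
The escaping rule can be tried outside the editor. The sketch below uses Python's `re` module as a stand-in for 010 Editor's engine (an assumption: the two flavors differ in places, but a backslash before these characters behaves the same way). The helper `escape_literal` is a hypothetical name introduced here for illustration, not part of 010 Editor.

```python
import re

# The control characters listed above; each needs a leading backslash
# to be matched literally.
CONTROL_CHARS = ".[]^$()/\\*{}?+|"

def escape_literal(text: str) -> str:
    """Prefix every control character in text with a backslash."""
    return "".join("\\" + ch if ch in CONTROL_CHARS else ch for ch in text)

pattern = escape_literal("5+6")
print(pattern)                                # 5\+6
print(bool(re.search(pattern, "x = 5+6;")))   # True
# Unescaped, '+' means "one or more 5s", so the literal text is not found.
print(bool(re.search("5+6", "x = 5+6;")))     # False
```

Without the backslash, "5+6" would instead match strings such as "56" or "556".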

A number of special codes can be used to match characters:

  • . - any character (except linefeeds)
  • \w - a word character, including letters, numbers, '_' and unicode characters
  • \W - a non word character
  • \s - a whitespace character, including tabs and spaces
  • \S - a non whitespace character
  • \d - a decimal digit character [0-9]
  • \D - a non decimal digit character
  • \h - a hexadecimal character [0-9a-fA-F]
  • \H - a non hexadecimal character
  • \t - horizontal tab (0x09)
  • \n - newline (0x0A)
  • \r - return (0x0D)
  • \a - bell (0x07)
  • \e - escape (0x1B)
  • \f - form feed (0x0C)
  • \v - vertical tab (0x0B)
  • \nnn - octal character
  • \xHH - hexadecimal character

For example, to search for all phone numbers in the form 555-5555 use the regular expression:

    \d\d\d-\d\d\d\d
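
As a quick check of the character codes, the same expression can be run in Python's `re` module. This is only an approximation of 010 Editor's engine; codes such as \h are not available in Python, and the sample text is made up for illustration.

```python
import re

text = "Call 555-1234 or 555-9876; extension 42 is not a match."

# Three digits, a literal '-', then four digits.
phone = re.compile(r"\d\d\d-\d\d\d\d")
print(phone.findall(text))                    # ['555-1234', '555-9876']

# \w+ collects runs of word characters; '_' counts as a word character.
words = re.findall(r"\w+", "foo_bar baz")
print(words)                                  # ['foo_bar', 'baz']
```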

Note that the case-sensitivity of regular expressions is controlled by the Match Case toggle in the Find Bar Options.


Character Classes

A Character Class or Character Set provides a way to give a number of different options that a single character can match. Character Classes are denoted with '[' and ']' brackets where each character inside the brackets can match. For example, the regular expression:

    defen[cs]e

will match both the words 'defence' and 'defense'. Inside of a character class, only the characters "]\-^" are considered control characters. The '-' character can be used to indicate a range of characters. For example the character class:

    [0-9a-fA-F]

will match any of the hexadecimal characters. Using the '^' character at the beginning of a character class indicates a negated character class, meaning the regular expression will match any characters that are not in the character class. For example, the character class:

    [^abc]

will match any characters that are not a, b, or c.
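
The three character classes above can be exercised with a short Python sketch. Python's `re` module is assumed here as a stand-in for the editor's engine, and the sample strings are invented for illustration.

```python
import re

# A class matches exactly one character out of the listed options.
spellings = re.findall(r"defen[cs]e", "defence, defense, defenze")
print(spellings)      # ['defence', 'defense']

# Ranges: one or more hexadecimal characters in a row.
hex_runs = re.findall(r"[0-9a-fA-F]+", "3F-4D-zz-0f")
print(hex_runs)       # ['3F', '4D', '0f']

# Negated class: every character that is not a, b, or c.
others = re.findall(r"[^abc]", "abcde")
print(others)         # ['d', 'e']
```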


Anchors

All matching so far has worked by matching a particular character. Regular expressions also support anchors which work by matching a position within a file. The following anchors are supported:

  • ^ - beginning of the line
  • $ - end of the line
  • \b - word boundary
  • \B - non word boundary

For example, the regular expression:

    ^\d\d:\d\d:\d\d

will match a timestamp only if it exists at the beginning of a line. The '\b' anchor can be used to ensure a regular expression matches a whole word. For example the regular expression 'Al' would match both the words 'Al' and 'Alpha' but the regular expression:

    \bAl\b

would match 'Al' but not 'Alpha'. The Match Whole Word toggle in the Find Bar Options can be enabled as another way to limit regular expressions to matching whole words only.
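
A small Python sketch of line and word anchors follows. It assumes Python's `re` module as a stand-in for the editor's engine. In Python, '^' only matches at the start of each line when the `re.MULTILINE` flag is given, which is a detail of Python rather than of 010 Editor.

```python
import re

log = "12:30:45 started\nretry at 12:31:00\n12:32:10 done"

# Only timestamps at the beginning of a line are reported.
stamps = re.findall(r"^\d\d:\d\d:\d\d", log, flags=re.MULTILINE)
print(stamps)         # ['12:30:45', '12:32:10']

words = "Al met Alpha and Al again"

# Without anchors, 'Al' is also found inside 'Alpha'.
anywhere = [m.start() for m in re.finditer(r"Al", words)]
print(anywhere)       # [0, 7, 17]

# A word boundary on both sides restricts matches to the whole word.
whole = [m.start() for m in re.finditer(r"\bAl\b", words)]
print(whole)          # [0, 17]
```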


Repetition

To match multiple characters in a row, a number of different operators can be used. An operator is either greedy, meaning it matches as many characters as it can, or lazy, meaning it matches as few characters as it can. The following operators are supported and are greedy by default:

  • ? - 1 or 0 times
  • * - 0 or more times
  • + - 1 or more times
  • {n,m} - at least n but not more than m times
  • {n,} - at least n times
  • {,n} - at least 0 but not more than n times
  • {n} - exactly n times

To convert a greedy operator to a lazy operator, include an additional '?' after the operator (for example, '??', '*?', or '{n,m}?'). In our phone number example from above, for a number such as 555-5555 we could now use:

    \d{3}-\d{4}

To match both the strings 'color' and 'colour' use the regular expression:

    colou?r

For another example, to match a simple XML tag use:

    <[A-Za-z0-9_/]+>

This regular expression matches one or more letters, digits, '_' or '/' characters inside '<' and '>' brackets. Repetition operators can also be used with the '(' and ')' brackets to indicate what is repeating. For example:

    reg(ular)? ex(pression)?

matches both the strings 'regular expression' and 'reg ex'. Warning: Using certain combinations of repetition operators can cause searches to be performed very slowly.
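
The repetition examples, and the difference between greedy and lazy matching, can be sketched in Python. The `re` module is assumed as a stand-in for the editor's engine, and the input strings are made up for illustration.

```python
import re

# Counted repetition: exactly three digits, '-', exactly four digits.
phones = re.findall(r"\d{3}-\d{4}", "555-5555 and 12-3456")
print(phones)         # ['555-5555']

# '?' makes the preceding 'u' optional.
colors = re.findall(r"colou?r", "color colour colouur")
print(colors)         # ['color', 'colour']

# Greedy '+' runs to the last '>', lazy '+?' stops at the first one.
html = "<b>bold</b>"
greedy = re.findall(r"<.+>", html)
lazy = re.findall(r"<.+?>", html)
print(greedy)         # ['<b>bold</b>']
print(lazy)           # ['<b>', '</b>']

# Brackets group the characters that the '?' applies to.
names = [m.group(0) for m in
         re.finditer(r"reg(ular)? ex(pression)?",
                     "a reg ex or a regular expression")]
print(names)          # ['reg ex', 'regular expression']
```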


Alternation

The alternation operator '|' allows matching one out of several possible regular expressions. For example to search for the colors red, green or blue, use:

    red|green|blue

Alternation can be combined with the '(' and ')' brackets to make more complex statements. For example to search for 'const int' or 'const char' use:

    const (int|char)
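
The effect of the brackets on alternation can be checked with a short Python sketch. The `re` module is assumed as a stand-in for the editor's engine, and the sample strings are invented.

```python
import re

colors = re.findall(r"red|green|blue",
                    "red sky, green grass, blue sea, yellow sun")
print(colors)         # ['red', 'green', 'blue']

code = "const int a; const char b; const float c;"

# With brackets, 'const ' is required before either alternative.
grouped = [m.group(0) for m in re.finditer(r"const (int|char)", code)]
print(grouped)        # ['const int', 'const char']

# Without brackets, the alternatives are 'const int' and a bare 'char'.
ungrouped = re.findall(r"const int|char", code)
print(ungrouped)      # ['const int', 'char']
```

The last result shows why the brackets matter: '|' separates everything to its left from everything to its right unless brackets limit its scope.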


Matching Hex Bytes

When searching for hex bytes use the syntax '\xHH' to denote a hex byte where HH is the byte to find. This syntax must be used for regular expressions even when the Find type is set to Hex Bytes in the Find Bar. For example, to search for the bytes '3F 4D ?? 0F' use the regular expression:

    \x3F\x4D.\x0F

Hex bytes can also be used in character classes. For example to search for the first non-zero byte use:

    [^\x00]
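
Both hex byte expressions can be tried on a byte buffer in Python. The `re` module is assumed as a stand-in for the editor's engine, and the buffer contents are invented. The `re.DOTALL` flag is a Python-specific detail that lets '.' also match the byte 0x0A.

```python
import re

data = bytes([0x00, 0x00, 0x3F, 0x4D, 0xAA, 0x0F, 0x10])

# '.' plays the role of the '??' wildcard byte.
m = re.search(rb"\x3F\x4D.\x0F", data, flags=re.DOTALL)
print(m.start(), m.group(0).hex(" "))   # 2 3f 4d aa 0f

# A negated class containing \x00 finds the first non-zero byte.
first_nonzero = re.search(rb"[^\x00]", data)
print(first_nonzero.start())            # 2
```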

When regular expressions are enabled, the Find type is set to Hex Bytes, and no regular expression is being edited in the Find Bar, pressing Ctrl+F on the keyboard will copy the currently selected hex bytes to the Find Bar using the \x notation.


Functions

Regular expressions can be used in scripts using the FindAll, FindFirst, FindInFiles or ReplaceAll functions and the 'method=FINDMETHOD_REGEX' parameter. Regular expressions can also be used to search within strings using the RegExMatch or RegExSearch functions.


Backreferences

Backreferences are currently not supported when performing replacement operations.


This is the manual for 010 Editor, a professional hex editor and binary file editor. Use 010 Editor to edit the individual bytes of any binary file, hard drive, or process on your machine. 010 Editor contains a whole host of powerful analysis and editing tools, plus Binary Templates technology that allows any binary format to be understood.