aov-rx README ============= aov-rx is a malloc-free [1], wide-char based regular expression matching library. This piece of code is designed to be part of MPDM [2] in the future, but can also be used separately. Please see the TODO file to know the completion status of this project. Released under the public domain. Angel Ortega [1] Free as in 'without', not as in 'free()' [2] http://triptico.com/software/mpdm.html (Minimum Profit Data Manager) Goals ----- - Not using dynamic memory, - using wide chars (wchar_t) natively, - returns the matching substring as offset + size, - source code being readable, and - being completely reentrant and thread-safe. Features -------- - Start (^) and end of string ($) anchors - . (match any character) - Full quantifier support: - Curly bracket with optional minimum and maximum: - Exactly n matches ({n}) - At least n matches ({n,}) - At most m matches ({,m}) - At least n and at most m matches ({n,m}) - ? (match 0 or 1 occurrences, same as {0,1}) - * (match 0 or many occurrences, same as {0,}) - + (match 1 or many occurrences, same as {1,}) - Alternate sets (str1|str2|str3) - Sub-regexes (between parentheses, also optionally quantifiable) - Bracket-wrapped character sets, positive and negative ([0-9], [^0-9]) Features that will (probably) be, but not soon ---------------------------------------------- - Optional multiline matching (i.e. ^ and $ matching lines inside the string instead of the start and the end of the string) Features that (probably) will never be -------------------------------------- - Backreferences Usage ----- aov-rx consists only of one function: wchar_t *aov_match(wchar_t *rx, wchar_t *tx, size_t *size); And its usage is pretty straightforward: #include #include #include "aov-rx.h" int main(int argc, char *argv[]) { size_t s; wchar_t *res; res = aov_match(L"[0-9]+", L"The number is 123, indeed", &s); if (s) { /* res points to 123, and s contains 3 */ ... } ... --- Angel Ortega http://triptico.com