Large regular expressions
Hi,

I'm trying to use a very large regular expression. Basically, I have a list of items I want to find in text; it's kind of a conjunction of two regular expressions and a big list. Not pretty. However, every time I try to run my code I get this exception:

OverflowError: regular expression code size limit exceeded

I understand that there is a Python-imposed limit on the size of the regular expression. And although it's not nice, I have a machine with 12 GB of RAM just waiting to be used. Is there any way I can alter Python to allow big regular expressions?

Could anyone suggest other methods for this kind of string matching in Python? I'm trying to see if my SWIG-wrapped alphabet trie is faster than what's possible in Python!

Many thanks,
Nathan
--
http://mail.python.org/mailman/listinfo/python-list
Re: Large regular expressions
Nathan Harmston, 15.03.2010 13:21:
> So I'm trying to use a very large regular expression. Basically, I have a list of items I want to find in text; it's kind of a conjunction of two regular expressions and a big list. Not pretty. However, every time I try to run my code I get this exception:
>
> OverflowError: regular expression code size limit exceeded
>
> I understand that there is a Python-imposed limit on the size of the regular expression. And although it's not nice, I have a machine with 12 GB of RAM just waiting to be used. Is there any way I can alter Python to allow big regular expressions?
>
> Could anyone suggest other methods for this kind of string matching in Python?

If what you are trying to match is in fact a set of strings instead of a set of regular expressions, you might find this useful:

http://pypi.python.org/pypi/acora

Stefan
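
If the items really are fixed strings, a stdlib-only workaround (not acora, and not something proposed in the thread) is to escape each string and split the list into several smaller alternations, so that no single compiled pattern gets too large. The following is a minimal sketch under that assumption; `compile_chunks`, `find_all`, the `chunk_size` value and the sample words are all made up for illustration, and a chunk size that stays under the limit would have to be found by experiment.

```python
import re


def compile_chunks(words, chunk_size=1000):
    """Compile a big list of fixed strings as several smaller alternations."""
    # Longest first, so that within one chunk "foobar" is tried before "foo".
    ordered = sorted(set(words), key=len, reverse=True)
    patterns = []
    for start in range(0, len(ordered), chunk_size):
        chunk = ordered[start:start + chunk_size]
        # re.escape() keeps characters such as "." or "+" literal.
        patterns.append(re.compile("|".join(re.escape(w) for w in chunk)))
    return patterns


def find_all(patterns, text):
    """Return sorted (start, matched_string) pairs collected from every chunk."""
    hits = []
    for pattern in patterns:
        for m in pattern.finditer(text):
            hits.append((m.start(), m.group()))
    return sorted(hits)


# Hypothetical word list; chunk_size=2 only to force more than one pattern.
patterns = compile_chunks(["gene", "protein", "p53", "BRCA1"], chunk_size=2)
print(find_all(patterns, "The p53 protein and the BRCA1 gene"))
```

Note that the text is scanned once per chunk, and that strings from different chunks can be reported at overlapping positions, so this trades speed and some bookkeeping for staying inside the `re` module.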
Re: Large regular expressions
Nathan Harmston iwanttobeabad...@googlemail.com writes:
> [...] Could anyone suggest other methods for this kind of string matching in Python? I'm trying to see if my SWIG-wrapped alphabet trie is faster than what's possible in Python!

Since you mention using a trie, I guess it's just a big alternation of fixed strings. You may want to try using the Aho-Corasick variant. It looks like there are several implementations (Google finds at least two). I would be surprised if any pure Python solution were faster than tries implemented in C.

Don't forget to tell us your findings.

-- Alain.
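
For reference, the idea behind Aho-Corasick can be sketched in pure Python: build a trie of the strings, add a failure link from each node to the node for its longest proper suffix that is also in the trie, then walk the text once. This is only an illustration of the technique, not one of the implementations mentioned above, and it will be much slower than a C version; the function names and the sample words are assumptions.

```python
from collections import deque


def build_automaton(words):
    """Build Aho-Corasick tables: goto edges, failure links, outputs."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state of the longest proper suffix in the trie
    out = [[]]    # out[state] -> words that end when this state is reached
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Breadth-first pass: a node's failure link is derived from its parent's.
    queue = deque(goto[0].values())  # depth-1 nodes keep fail == 0
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the words that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Yield (start, word) for every occurrence, overlapping ones included."""
    goto, fail, out = automaton
    state = 0
    for pos, ch in enumerate(text):
        # Follow failure links until some state accepts this character.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            yield pos - len(word) + 1, word


automaton = build_automaton(["he", "she", "his", "hers"])
print(list(search("ushers", automaton)))
```

The text is read once regardless of how many strings are in the list, which is the property that makes this attractive compared with one huge alternation.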
Re: Large regular expressions
Nathan Harmston wrote:
> [...] However, every time I try to run my code I get this exception:
>
> OverflowError: regular expression code size limit exceeded
>
> I understand that there is a Python-imposed limit on the size of the regular expression. And although it's not nice, I have a machine with 12 GB of RAM just waiting to be used. Is there any way I can alter Python to allow big regular expressions?
>
> Could anyone suggest other methods for this kind of string matching in Python? I'm trying to see if my SWIG-wrapped alphabet trie is faster than what's possible in Python!

There's the regex module at http://pypi.python.org/pypi/regex. It'll even release the GIL while matching on strings! :-)
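
Trying that module should need little more than a changed import, since it is designed to be used through the same `compile` / `finditer` style of calls as `re`. A small sketch, assuming `regex` may or may not be installed (it falls back to the standard `re` so the snippet still runs); the word list is made up, and whether a given large pattern compiles without the error is something to test rather than assume.

```python
import re as stdlib_re

try:
    import regex as re_engine  # third-party package from PyPI
except ImportError:
    re_engine = stdlib_re      # fall back so the sketch still runs

# Hypothetical list of fixed strings, longest first so longer items win.
words = ["gene", "protein", "BRCA1"]
pattern = re_engine.compile(
    "|".join(re_engine.escape(w) for w in sorted(words, key=len, reverse=True))
)

matches = [(m.start(), m.group()) for m in pattern.finditer("BRCA1 is a gene")]
print(matches)
```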