On Mon, 22 Oct 2007 22:29:38 +0000, patrick.waldo wrote:

> I'm trying to learn regular expressions, but I am having trouble with
> this.  I want to search a document that has mixed data; however, the
> last line of every entry has something like C5H4N4O3 or CH5N3.ClH.
> All of the letters are upper case and there will always be numbers and
> possibly one .
>
> However below only gave me none.
>
> […]
>
> test = re.compile('\u+\d+\.')

There is no '\u'.  'u' doesn't have a special meaning, so the '\' is pointless.  Your expression matches one or more lower-case 'u's followed by one or more digits followed by a period.  Examples are 'u1.', 'uuuuuuuu42.', etc.

An expression that matches your first example would be r'([A-Z]|\d|\.)+'.  That's a non-empty sequence of upper-case letters, digits and periods.  To limit this to just one optional period the expression gets a little longer:

r'([A-Z]|\d)+\.?([A-Z]|\d)+'

Neither expression matches the whole of your second example, because there is a lower-case letter in it.

Ciao,
	Marc 'BlackJack' Rintsch
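
To check the expressions above against the two examples, here is a minimal sketch.  It assumes Python 3 and uses re.fullmatch so that only whole-string matches count; neither of those is in the original post.  The names U_PATTERN, ANY_PERIODS, ONE_PERIOD, WITH_LOWER and matches_whole are made up for illustration, and WITH_LOWER is only one possible way to admit lower-case letters, not something the post proposes.

```python
import re

# The original pattern as the reply reads it: literal 'u's, digits, a period.
# Written without the pointless backslash.
U_PATTERN = re.compile(r'u+\d+\.')

# The two expressions from the reply, as raw strings.
ANY_PERIODS = re.compile(r'([A-Z]|\d|\.)+')
ONE_PERIOD = re.compile(r'([A-Z]|\d)+\.?([A-Z]|\d)+')

# Hypothetical variant (an assumption, not from the post): also allow
# lower-case letters, with at most one period between two non-empty parts.
WITH_LOWER = re.compile(r'[A-Za-z\d]+(\.[A-Za-z\d]+)?')


def matches_whole(pattern, text):
    """Return True if the pattern matches the entire string."""
    return pattern.fullmatch(text) is not None


if __name__ == '__main__':
    for text in ('u1.', 'uuuuuuuu42.', 'C5H4N4O3', 'CH5N3.ClH', 'A.B.C'):
        print(text,
              matches_whole(U_PATTERN, text),
              matches_whole(ANY_PERIODS, text),
              matches_whole(ONE_PERIOD, text),
              matches_whole(WITH_LOWER, text))
```

Because fullmatch is used, 'CH5N3.ClH' is rejected by both expressions from the reply.  re.search would instead report a partial match such as 'CH5N3.C', which is probably not what is wanted when picking out the last line of an entry.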