> I have the following part of program that finds ItemID numbers.
> Here, for example, are two
> 146759 and 146700 .
> This program works well under windows but on Linux it does not
> find any number. Can you please help?
> Thanks.
> Ladislav
>
> ####################
> import re
> Text="""<tr BGCOLOR="#FFFFFF">
> <td valign="top" align="left"><a
> href="lead.asp?ItemID=146759">[CN] Oak, Foiled & Antique
> Furniture</a></td>
> <td valign="top" align="center">18/12/2001</td>
> </tr><tr BGCOLOR="#FFFFFF">
> <td valign="top" align="left"><a
> href="lead.asp?ItemID=146700">[CN] Oak, Foiled & Antique
> Furniture</a></td>
> <td valign="top" align="center">18/12/2001</td>
> </tr>"""
>
> IDs=re.compile('.*<a href="lead.asp\?ItemID=(\d{5,10}).*')
> Results=re.findall(IDs,Text)
> print Results

The interesting thing is that it works for you at all. It definitely
doesn't on my WinXP machine.

1.) Here

> IDs=re.compile('.*<a href="lead.asp\?ItemID=(\d{5,10}).*')
> Results=re.findall(IDs,Text)

you call re.findall with a compiled regular expression object as the
first parameter, where the module-level function normally takes a
pattern string. Since you have already compiled the pattern, what you
want to do is

    Results = IDs.findall(Text)

i.e. call the appropriate method on the re object you created.

2.) There are two whitespace characters (a space and a newline; the
latter may have been inserted by your mail agent or mine) between "<a"
and "href...". Your pattern allows exactly one space there, so it never
matches. You should replace your re with something like this:

    IDs = re.compile(r'<a\s*href="lead\.asp\?ItemID=(\d{5,10})')

\s matches any whitespace, including newlines, so no extra flag is
needed (re.MULTILINE only changes how ^ and $ behave). Since you are
using findall, the enclosing ".*" expressions are superfluous.

BTW: Be careful regarding backslashes in REs, since the string gets
interpreted twice: once by Python as a string literal and once by the
re module. Writing the pattern as a raw string (r'...') avoids
surprises.

Regards
mks
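
For reference, here is a minimal sketch of the corrected script. It is written for Python 3 (the original post uses the Python 2 print statement), and the names TEXT, ITEM_ID and find_item_ids are only illustrative, not from the original program. It assumes the HTML looks like the sample in the question, with a line break between "<a" and "href".

```python
import re

# Sample HTML from the question; note the newline between "<a" and "href".
TEXT = """<tr BGCOLOR="#FFFFFF">
<td valign="top" align="left"><a
href="lead.asp?ItemID=146759">[CN] Oak, Foiled & Antique
Furniture</a></td>
<td valign="top" align="center">18/12/2001</td>
</tr><tr BGCOLOR="#FFFFFF">
<td valign="top" align="left"><a
href="lead.asp?ItemID=146700">[CN] Oak, Foiled & Antique
Furniture</a></td>
<td valign="top" align="center">18/12/2001</td>
</tr>"""

# Raw string: backslashes reach the regex engine unchanged.
# \s+ matches one or more whitespace characters, newlines included.
# (Hypothetical name; the original post calls this object IDs.)
ITEM_ID = re.compile(r'<a\s+href="lead\.asp\?ItemID=(\d{5,10})')


def find_item_ids(text):
    """Return every ItemID captured from lead.asp links in text."""
    # Call findall on the compiled pattern object itself.
    return ITEM_ID.findall(text)


print(find_item_ids(TEXT))  # prints ['146759', '146700']
```

The sketch uses \s+ rather than \s* so that at least one whitespace character is required after "<a"; either form matches the sample text. For anything beyond a quick extraction like this, an HTML parser is usually more robust than a regular expression.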