Hi all,

I'm using Beautiful Soup with regexes. I've noticed that all the examples use regexes like so -

anchors = parseTree.fetch("a", {"href": re.compile("pattern")})

instead of precompiling the pattern.
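To make the two forms concrete, here is a minimal sketch using only the standard re module, with no Beautiful Soup involved. The hrefs list is made up for illustration, and the sketch says nothing about what Beautiful Soup does with the pattern object internally; it only shows that, as far as re itself is concerned, the inline and precompiled forms should select the same strings.

```python
import re

# Hypothetical hrefs, invented purely for illustration.
hrefs = ["/forum/thread/28606/index.html", "/forum/profile/12/", "/Thread/7/"]

# Inline form: compile the pattern at the point of use, as the examples do.
inline_hits = [
    h for h in hrefs
    if re.compile(".*?thread/[0-9]*?/.*", re.IGNORECASE).match(h)
]

# Precompiled form: compile once, then reuse the same pattern object.
pattern = re.compile(".*?thread/[0-9]*?/.*", re.IGNORECASE)
precompiled_hits = [h for h in hrefs if pattern.match(h)]

# Both forms pick out the same hrefs.
print(inline_hits == precompiled_hits)
```

So with plain re, reusing one compiled pattern across several calls is fine; any difference I see would have to come from how the pattern is used further up.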
Myself, I have the following code -

>>> z = []
>>> x = q.findNext("a", {"href": re.compile(".*?thread/[0-9]*?/.*", re.IGNORECASE)})
>>> while x:
...     num = x.findNext("td", "tableColA")
...     h = (x.contents[0], x.attrMap["href"], num.contents[0])
...     z.append(h)
...     x = x.findNext("a", {"href": re.compile(".*?thread/[0-9]*?/.*", re.IGNORECASE)})
...

This gives me a correct set of results. However, using the following -

>>> z = []
>>> pattern = re.compile(".*?thread/[0-9]*?/.*", re.IGNORECASE)
>>> x = q.findNext("a", {"href": pattern})
>>> while x:
...     num = x.findNext("td", "tableColA")
...     h = (x.contents[0], x.attrMap["href"], num.contents[0])
...     z.append(h)
...     x = x.findNext("a", {"href": pattern})
...

will only return the first found tag. Is the regex only evaluated once or similar?

(Also, any pointers on how to get negative lookahead matching working would be great. The regex (/thread/[0-9]*)(?!\/) still matches "/thread/28606/" and I'd assumed it wouldn't.)

Regards,

Liam Clarke
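P.S. A minimal sketch that reproduces the lookahead behaviour with the standard re module alone. The sample strings are the one from above plus a variant without the trailing slash. What seems to happen is that [0-9]* is allowed to backtrack: it gives back the final digit, so the lookahead is tested in front of "6" rather than "/" and succeeds. The stricter pattern is only one possible variant, not necessarily the right fix for my case: it also forbids a digit after the match, so the digit run cannot be shortened.

```python
import re

# The pattern from the question: any number of digits, not followed by "/".
lookahead = re.compile(r"(/thread/[0-9]*)(?!/)")

m = lookahead.search("/thread/28606/")
# The digit group backs off by one character so the lookahead can succeed.
print(m.group(1))

# Assumed variant: require at least one digit, and reject a following
# digit as well as a following "/", which blocks the backtracking above.
stricter = re.compile(r"(/thread/[0-9]+)(?![0-9/])")

print(stricter.search("/thread/28606/"))          # trailing slash present
print(stricter.search("/thread/28606").group(1))  # no trailing slash
```

With the original pattern the match is "/thread/2860", which is why it looks as though the lookahead is being ignored.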