"Sebastien Noel" <[EMAIL PROTECTED]> wrote

> comments = soup.findAll(text=" ")
> [comment.extract() for comment in comments]

Umm, why comments here and not langcanada? Just curious...

> # Add some class attributes
> for h1s in range(len(soup.findAll("h1"))):
>     le_h1 = soup.findAll("h1")[h1s]
>     le_h1["class"] = "heading1_main"
>
> for h2s in range(len(soup.findAll("h2"))):
>     le_h2 = soup.findAll("h2")[h2s]
>     le_h2["class"] = "heading2_main"

You could abstract this into a function with a few parameters
and call it in a loop, and thus save a load of typing!

OK, too much code to go through in detail. Can you post a simple
example where you try to remove some tags and it doesn't work?

Also, did you look at the replaceWith method? That may help you
if you substitute something like a SPAN or DIV tag...

I didn't see you writing anything back in that code, but then I was
just scanning it and may have missed it... You extract them from the
parse tree, but do you ever write the modified tree out?

Alan G

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
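
A rough sketch of the three suggestions in the reply (one helper
called in a loop, replacing a tag instead of just extracting it, and
writing the modified tree out). It is only an illustration under some
assumptions: it uses the newer bs4 package, where findAll and
replaceWith are spelled find_all and replace_with; the sample HTML,
the add_class helper, and the FONT-to-SPAN swap are made up for the
example and are not taken from the original script.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical sample document, not the poster's real page.
HTML = """<html><body>
<!-- navigation -->
<h1>Title</h1>
<h2>First</h2>
<h2>Second</h2>
<p>Some <font color="red">red</font> text.</p>
</body></html>"""


def add_class(soup, tag_name, class_name):
    """Set the class attribute on every tag_name element in soup."""
    for tag in soup.find_all(tag_name):
        tag["class"] = class_name


soup = BeautifulSoup(HTML, "html.parser")

# One loop instead of one copy-pasted block per heading level.
for tag_name, class_name in [("h1", "heading1_main"), ("h2", "heading2_main")]:
    add_class(soup, tag_name, class_name)

# Remove comments by matching on the node type.
for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
    comment.extract()

# Swap each <font> for a plain <span>, keeping its text.
for font in soup.find_all("font"):
    span = soup.new_tag("span")
    span.string = font.get_text()
    font.replace_with(span)

# The changes so far exist only in memory; serialise the tree to keep them.
result = str(soup)
print(result)
```

To save the result, write the string to a file, for example with
open("out.html", "w") followed by a write(result) call; the file name
here is just a placeholder.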