"Sebastien Noel" <[EMAIL PROTECTED]> wrote

> comments = soup.findAll(text="&nbsp;")
> [comment.extract() for comment in comments]

Umm, why comments here and not langcanada?
Just curious...

> # Add some class attributes
> for h1s in range(len(soup.findAll("h1"))):
>    le_h1 = soup.findAll("h1")[h1s]
>    le_h1["class"] = "heading1_main"
>
> for h2s in range(len(soup.findAll("h2"))):
>    le_h2 = soup.findAll("h2")[h2s]
>    le_h2["class"] = "heading2_main"

You could abstract this into a function with a few parameters
and put it into a loop, and thus save a load of typing!
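
Something along these lines, as a rough sketch rather than a drop-in
replacement. It assumes the newer bs4 package (BeautifulSoup 4) and its
find_all spelling, whereas your code uses the older findAll; the
add_class helper and the sample HTML are made up for illustration.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed


def add_class(soup, tag_name, class_name):
    """Set the class attribute on every tag_name element in soup."""
    # find_all is called once per tag name, not once per element
    for tag in soup.find_all(tag_name):
        tag["class"] = class_name


# Made-up sample document, just to exercise the helper
html = "<h1>Title</h1><h2>Sub</h2><h1>Another</h1>"
soup = BeautifulSoup(html, "html.parser")

# One (tag, class) pair per heading level; extend the list as needed
for tag_name, class_name in [("h1", "heading1_main"),
                             ("h2", "heading2_main")]:
    add_class(soup, tag_name, class_name)

print(soup)
```

Iterating over the result of a single search also avoids re-running
findAll on every pass through the loop, which the range(len(...))
version does.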

OK, too much code to go through in detail. Can you do a simple example
where you try to remove some tags and it doesn't work? Also, did you
look at the replaceWith method? That may help you if you use
something like a SPAN or DIV tag...
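
To show the kind of thing I mean, here is a minimal sketch of swapping
one tag for another. It assumes BeautifulSoup 4 (the bs4 package),
where the method is spelled replace_with; the FONT-to-SPAN swap and
the sample HTML are invented for the example and are not taken from
your code.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Made-up input containing a tag we want to get rid of
html = "<p>Hello <font color='red'>world</font></p>"
soup = BeautifulSoup(html, "html.parser")

for font in soup.find_all("font"):
    # Build a fresh SPAN holding the same text, then swap it in
    span = soup.new_tag("span")
    span.string = font.get_text()
    font.replace_with(span)

result = str(soup)
print(result)
```

The old tag is detached from the tree and the new one takes its place,
so the surrounding text stays where it was.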

I didn't see you writing anything back in that code, but then I was
just scanning it and may have missed it... You extract them from the
parse tree, but do you ever write the modified tree out?
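
In case that is the missing step, here is a small sketch of extracting
something and then saving the result. Again it assumes BeautifulSoup 4;
the "junk" class, the sample HTML and the cleaned.html file name are
all hypothetical.

```python
import os
import tempfile

from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

# Made-up input; the "junk" class is only for illustration
html = "<div><p>keep</p><p class='junk'>drop</p></div>"
soup = BeautifulSoup(html, "html.parser")

# extract() changes the in-memory tree only
for junk in soup.find_all("p", class_="junk"):
    junk.extract()

# Nothing on disk changes until the tree is serialised and written
out_path = os.path.join(tempfile.mkdtemp(), "cleaned.html")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(str(soup))
```

If the original file is never rewritten like this, it will look as
though the extract calls did nothing.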

Alan G


_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
