Hi... I'm running a quick test with libxml2dom:
===============
import libxml2dom

# foo holds the raw HTML text shown below
aa = libxml2dom.parseString(foo)
# serialize the parsed document back to a string
ff = libxml2dom.toString(aa)
print ff
===============

----------------------------------
When I start, foo is:
<html>
<body>
</body>
</html>
<html>
<body>
. . .
</body>
</html>
-------------------------------
When I print ff, it's:
<html>
<body>
</body>
</html>
-------------------------------

So it's as if parseString only reads the initial "html" tree. I've reviewed as much as I can find regarding libxml2dom to try to figure out how I can get it to read/parse/handle both html trees/nodes.

I know the html is malformed/screwed-up, but I can't seem to find any app (tidy/beautifulsoup) that can "know" which one of the html trees to throw out/remove! Technically, both html trees are valid on their own; it's just that they shouldn't both be in the file!

Thoughts/comments appreciated.

Thanks

--
http://mail.python.org/mailman/listinfo/python-list
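
One possible workaround, as a rough sketch only: since the parser appears to stop after the first "html" tree, the input string could be cut into one piece per <html>...</html> block before parsing, and each piece handed to libxml2dom.parseString separately. The helper name split_html_documents and the "first"/"second" body text are made up for illustration and are not part of libxml2dom. The sketch uses only the standard re module and assumes no literal "</html>" appears inside a comment or script.

```python
import re

# Matches one complete <html>...</html> block, non-greedily, so that two
# documents stuck together in one string come back as two separate matches.
HTML_BLOCK = re.compile(r'<html\b.*?</html\s*>', re.IGNORECASE | re.DOTALL)

def split_html_documents(text):
    """Hypothetical helper: return one string per <html>...</html> block."""
    return HTML_BLOCK.findall(text)

# Placeholder input in the same shape as foo above (body text is invented).
foo = "<html> <body> first </body> </html> <html> <body> second </body> </html>"
docs = split_html_documents(foo)
for doc in docs:
    # Each piece could then be passed to libxml2dom.parseString(doc) on its own.
    print(doc)
```

With the pieces separated, deciding which tree to keep becomes an ordinary choice in code (for example, keeping whichever piece has the larger body) rather than something the parser has to guess.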