As Andy mentioned, parsing HTML, especially non-well-formed HTML, is
not something Xerces was designed to do.  XML has certain restrictions
on it precisely to make it easier to process.

However, you could look into using Tidy, found at
http://www.w3.org/People/Raggett/tidy/, to parse your document, and
then write a program that takes Tidy's output and creates a
well-formed XHTML document, which you could then process with Xerces.
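
To illustrate the well-formedness point, here is a rough sketch (not
tested against your documents) that checks whether a string of markup
parses as XML through the standard JAXP DocumentBuilder interface,
which Xerces implements. The class name WellFormedCheck and the method
isWellFormed are made up for this example; only the JAXP and SAX
calls are real API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration; not part of Xerces or Tidy.
class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false if the
    // parser reports a fatal (well-formedness) error.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler ignores warnings and recoverable errors and
            // rethrows fatal errors, so nothing is printed to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Typical HTML habits such as an unclosed <br> end up here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            // Not a markup problem: the parser could not be set up or read.
            throw new IllegalStateException(e);
        }
    }
}
```

With this, isWellFormed("<p>one<br/>two</p>") should return true, while
the HTML-style isWellFormed("<p>one<br>two") should return false. Input
of the second kind is what would need to go through Tidy first.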

- Shane 

---- you atta ur-rehman wrote: ----
> What I'm trying to do is very simple. I have an HTML document,
> HTML not XHTML, which may or may not be well formed and I need

=====
<eof aka="mailto:[EMAIL PROTECTED]";
 quote="A mirror is like a window on the other side of behind you."/>

