Hi,

I'd like to use Xerces to parse HTML. Since HTML is not XML, I need to tweak Xerces so that it can transform HTML into valid XML. I found information about NekoHTML, which is just what I need, but it's written in Java. Do you know of anything like NekoHTML written in C/C++? If you know of a better tool for this job, then please let me know.

Thank you in advance for your time and help.

P.S. I was very surprised by how little information I found on the Internet about parsing HTML with C++. I was even more surprised by how little information on this topic I found on this list. Is there any reason for this? How is this possible when so many browsers are written in C++?

--
Piotr Dobrogost
*** curlpp.org - c++ wrapper for libcurl ***
