I know I missed the Friday deadline, but...

Does anyone have any recommendations for parsing HTML? I use Lucene, and its
example ships with its own HTML parser, but I was wondering whether anyone has
used an existing project, or whether there is some built-in functionality in
an Apache library, to convert

<p>Hello <i>World</i></p>

to

Hello World
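
For reference, here is a rough sketch of the kind of conversion I mean. It uses the HTML parser that ships with the JDK (javax.swing.text.html), not an Apache library, so treat it as an illustration only. The class name HtmlToText and the method strip are made up for this example, and I have not checked how the parser handles whitespace or entities in every case.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text nodes of an HTML fragment
// and discards the tags. Names here are illustrative only.
public class HtmlToText {

    /** Returns the text content of the given HTML, with tags dropped. */
    public static String strip(String html) throws IOException {
        final StringBuilder text = new StringBuilder();

        // The callback is invoked once per run of character data;
        // tag events are ignored because we do not override them.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                text.append(data);
            }
        };

        try (Reader reader = new StringReader(html)) {
            // 'true' tells the parser to ignore any charset declaration,
            // since the input is already a decoded String.
            new ParserDelegator().parse(reader, callback, true);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(strip("<p>Hello <i>World</i></p>"));
    }
}
```

Something along these lines would do for my purposes, but I would rather use a maintained library than roll my own if one exists.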

 

Your thoughts are appreciated.
