Let me know if that didn't fix it. Thank you, again, David!

On Wed, Mar 26, 2025 at 6:44 AM Tim Allison <[email protected]> wrote:
> Y. Will fix shortly. Thank you!
>
> On Tue, Mar 25, 2025 at 4:50 PM David Pilato <[email protected]> wrote:
>
>> Hey team
>>
>> The page
>> https://tika.apache.org/3.1.0/formats.html#HyperText_Markup_Language
>> mentions:
>>
>> The output from the HtmlParser class is guaranteed to be well-formed and
>> valid XHTML, and various heuristics are used to prevent things like inline
>> scripts from cluttering the extracted text content.
>>
>> But HtmlParser links to a non existing class:
>> https://tika.apache.org/3.1.0/api/org/apache/tika/parser/html/HtmlParser.html
>> Should it be
>> https://tika.apache.org/3.1.0/api/org/apache/tika/parser/html/JSoupParser.html
>> instead?
>>
>> David Pilato
>> [email protected]
>> 06 13 03 08 41
