TagSoup needs to be on your classpath (which is already the case if
BaseX was downloaded from our homepage). If you have installed BaseX
via the Debian package manager, you'll have to add tagsoup.jar to the
classpath manually in the BaseX start scripts.
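
As a rough sketch of what such a change to a start script could look
like: the jar locations below are assumptions (typical paths for Debian
Java libraries, not verified for your system), and the helper name
basex_classpath is made up for illustration. Check the real paths with
`dpkg -L libtagsoup-java` before using them.

```shell
#!/bin/sh
# Sketch only: both default paths are ASSUMPTIONS, adjust to your system.
# basex_classpath is a hypothetical helper, not part of BaseX.
basex_classpath() {
  # Join the BaseX jar and the TagSoup jar into one Java classpath string.
  printf '%s:%s\n' \
    "${BASEX_JAR:-/usr/share/java/basex.jar}" \
    "${TAGSOUP_JAR:-/usr/share/java/tagsoup.jar}"
}

# Show the classpath the start script would use.
echo "Classpath: $(basex_classpath)"

# In the real start script, the launch line would then look roughly like:
#   exec java -cp "$(basex_classpath)" org.basex.BaseX "$@"
```

Overriding BASEX_JAR or TAGSOUP_JAR in the environment lets you point
the script at other locations without editing it.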

Hope this helps,
Christian

> Well all I know is that
> http://docs.basex.org/wiki/Parsers
> should mention what to do to read HTML, and on my machine there is
> $ apt-cache search tagsoup-java
> libtagsoup-java - SAX-compliant parser for real-life HTML
> libtagsoup-java-doc - API Documentation for TagSoup
>
> Mainly it is tags like <img ...> without /> that throw basex off track.
> _______________________________________________
> BaseX-Talk mailing list
> [email protected]
> https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk