Hi all, I'm interested in using Tika/Boilerpipe to extract images and other content from web pages. The pages could be of any kind (news, blog posts, etc.). I've had good experience with the ImageExtractor in Boilerpipe's trunk, and I'm wondering what the general workflow is for using and extending Boilerpipe (say I also want to extract video/embed tags).
Thanks! -- Dotan, @jondot <http://twitter.com/jondot>
