Hi All,
Note: I am reposting my question since it looks like the earlier one was posted
on the wrong thread.
We are using Nutch 1.13 and Solr 6. I am trying to use one of the extractors
that come with Tika's Boilerpipe support. I get the best results with
CanolaExtractor on pages that contain only outlinks, such as this one:
https://support.automationdirect.com/faq/dl205.php
But