Hello,
I would like to subscribe to the mailing list.
Best,
Manali
Hello,
I am currently trying to crawl the web using the Nutch 1.11 trunk version from
https://github.com/apache/nutch
I am trying to use a particular property from the nutch-default.xml named:
http.agent.rotate
false
If true, instead of http.agent.name, alternating agent names are
chosen from a list p
Hello,
I am trying to crawl a website using Nutch trunk along with the latest Tika.
It gives me an error:
Can't retrieve Tika parser for mime-type text/aspdotnet
But when I try to parse the same URL with tika-app-1.10.jar using the
command
$ java -jar tika-app-1.10.jar -m url
It prints the me