Hi:
I downloaded and compiled the Nutch trunk, but when I run
parsechecker I get the error: Can't retrieve Tika parser for mime-type
image/jpeg
My log file contains:
2015-11-02 10:50:57,421 INFO parse.ParserChecker - fetching:
http://www.cubadebate.cu/wp-content/uploads/20
[
https://issues.apache.org/jira/browse/NUTCH-1911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985436#comment-14985436
]
Michael Joyce commented on NUTCH-1911:
--
Hrm odd, I want to throw some commons-cli at
[
https://issues.apache.org/jira/browse/NUTCH-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985431#comment-14985431
]
Michael Joyce commented on NUTCH-2155:
--
+1 sounds good to me [~sebastien0], I will up
[
https://issues.apache.org/jira/browse/NUTCH-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985427#comment-14985427
]
Michael Joyce commented on NUTCH-2150:
--
Yes, will address in a patch shortly.
> Add
[
https://issues.apache.org/jira/browse/NUTCH-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14985280#comment-14985280
]
Sebastian Nagel commented on NUTCH-2155:
Yes, call it as
{noformat}
% nutch crawlc