Hey guys -
I'm considering using Nutch to do some indexing & searching over websites
and unfortunately I'm running into some roadblocks in "moving" from my
current system to Nutch.
I've reviewed the parsing code and honestly I'm a bit confused by it, so I
was hoping I could solicit some exp
[
https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609415#action_12609415
]
Andrzej Bialecki commented on NUTCH-634:
-
I ran a test crawl using Hadoop 0.17.1 re
Hi Doug et al.
I was using the 0.8 tutorial, and was planning on doing a link
analysis for Wikipedia, but found out from the release notes that it was
gone in 0.9. Is there any public discussion (or even a reference to
a paper) of the motivation for this? I'm partly curious as I hadn't
seen papers
Lincoln Ritter wrote:
Just to clarify: Andrzej, the resolution you speak of in 0.19 - is
that resolution independent of Michael's patch?
Yes, this is something that will be submitted in a separate Hadoop JIRA
issue.
I think any solution with less code is preferable, so a configuration
chan
Just to clarify: Andrzej, the resolution you speak of in 0.19 - is
that resolution independent of Michael's patch?
I think any solution with less code is preferable, so a configuration
change seems like a great way to go. (I didn't realize one could
change hadoop parameters from the nutch config!
[
https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609308#action_12609308
]
Michael Gottesman commented on NUTCH-634:
-
There is actually a special thing in hado
[
https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609295#action_12609295
]
Andrzej Bialecki commented on NUTCH-634:
-
This issue will likely be fixed in Hadoop