indexing hash

2008-06-30 Thread Chris Harris
Hey guys - I'm considering using Nutch to do some indexing & searching over websites, and unfortunately I'm running into some roadblocks in "moving" from my current system to Nutch. I've reviewed the parsing code and honestly I'm a bit confused by it, so I was hoping I could solicit some exp

[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0

2008-06-30 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609415#action_12609415 ] Andrzej Bialecki commented on NUTCH-634: I ran a test crawl using Hadoop 0.17.1 re

What replaced Link Analysis?

2008-06-30 Thread Winton Davies
Hi Doug et al. I was using the 0.8 tutorial, and was planning on doing a link analysis for Wikipedia, but found out from the release notes that it was gone in 0.9. Is there any public discussion (or even a reference to a paper) of the motivation for this? I'm partly curious as I hadn't seen papers

Re: [jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0

2008-06-30 Thread Andrzej Bialecki
Lincoln Ritter wrote: Just to clarify: Andrzej, the resolution you speak of in 0.19 - is that resolution independent of Michael's patch? Yes, this is something that will be submitted in a separate Hadoop JIRA issue. I think any solution with less code is preferable, so a configuration chan

Re: [jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0

2008-06-30 Thread Lincoln Ritter
Just to clarify: Andrzej, the resolution you speak of in 0.19 - is that resolution independent of Michael's patch? I think any solution with less code is preferable, so a configuration change seems like a great way to go. (I didn't realize one could change hadoop parameters from the nutch config!

[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0

2008-06-30 Thread Michael Gottesman (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609308#action_12609308 ] Michael Gottesman commented on NUTCH-634: There is actually a special thing in hado

[jira] Commented: (NUTCH-634) Patch - Nutch - Hadoop 0.17.0

2008-06-30 Thread Andrzej Bialecki (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609295#action_12609295 ] Andrzej Bialecki commented on NUTCH-634: This issue will likely be fixed in Hadoop