This was a normal Nutch parse. I'm still not sure what was causing
the bug, but it stopped last week.
On 10/7/07, Dennis Kubes <[EMAIL PROTECTED]> wrote:
> This happens when two reduce tasks try to write to the same output
> folder, usually on the dfs. Was this a Nutch Parse job or a custom Map
>
This happens when two reduce tasks try to write to the same output
folder, usually on the dfs. Was this a Nutch Parse job or a custom Map
Reduce job?
Dennis Kubes
Ned Rockson wrote:
This is the second time I've run this large parse of ~64m documents.
In the reduce phase, both times through t
These classes come from plugins, so each one lives in its own plugin's
src directory. For example, the regex URL normalizer is found at:
NutchTrunk\src\plugin\urlnormalizer-regex\src\java\org\apache\nutch\net\urlnormalizer\regex\RegexURLNormalizer.java
Dennis Kubes
Sagar Vibhute wrote:
Hello,
[ https://issues.apache.org/jira/browse/NUTCH-562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532972 ]
chrismattmann edited comment on NUTCH-562 at 10/7/07 8:34 AM:
--
Initial patch for comm
[
https://issues.apache.org/jira/browse/NUTCH-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris A. Mattmann updated NUTCH-562:
Attachment: tika-0.1-dev.jar
Tika 0.1 unreleased jar file -- drop this in $NUTCH_SRC_HOME/lib
[
https://issues.apache.org/jira/browse/NUTCH-562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris A. Mattmann updated NUTCH-562:
Attachment: NUTCH-562.Mattmann.patch.txt
Initial patch for comments:
1. This patch removes
Hello,
Does the default nutch 0.9 package come with certain Java packages
missing?
I was able to compile the source with ant (I downloaded the tarball rather
than checking out from svn), but when I start crawling it throws a
ClassNotFoundException, like:
java.lang.ClassNotFoundException:
org.apache.nutch.net.urlno
I started a crawl after adding the plugin described on the wiki (
http://wiki.apache.org/nutch/WritingPluginExample-0%2e9).
When I crawled, it stopped after throwing an exception. Here is what the
hadoop.log file says:
-