[ https://issues.apache.org/jira/browse/NUTCH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473064 ]
Armel Nene commented on NUTCH-437:
----------------------------------
I was wondering whether this patch could fix my problem, which is very similar
to this one, if not the same. I am using Nutch 0.8.2-dev, which I checked out
from SVN a while ago and have not updated since. Previously I was able to crawl
10,000 XML files with no errors whatsoever. These are the errors I now get when
fetching:
INFO parser.custom: Custom-parse: Parsing content file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf
07/02/12 22:09:16 INFO fetcher.Fetcher: fetch of file:/C:/TeamBinder/AddressBook/9100/(65)E110_ST A0 (1).pdf failed with: java.lang.NullPointerException
07/02/12 22:09:17 INFO mapred.LocalJobRunner: 0 pages, 0 errors, 0.0 pages/s, 0 kb/s,
07/02/12 22:09:17 FATAL fetcher.Fetcher: java.lang.NullPointerException
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:198)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:189)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.hadoop.mapred.MapTask$2.collect(MapTask.java:91)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:314)
07/02/12 22:09:17 FATAL fetcher.Fetcher: at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:232)
07/02/12 22:09:17 FATAL fetcher.Fetcher: fetcher caught:java.lang.NullPointerException
One of the problems is that my Hadoop version is reported as
hadoop-0.4.0-patched. I don't know whether that means I am running version
0.4.0, and I find it a little confusing. Once you can clarify that for me, I
will be able to apply the patch to my version.
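To make the version question concrete, here is a minimal sketch (not code from
Nutch or Hadoop) that reads a version number out of a name such as
hadoop-0.4.0-patched and compares it with 0.10, the release line named in this
issue. The class and method names are hypothetical, and it assumes the name
follows a hadoop-<major>.<minor>.<patch>[-suffix] pattern; what the "-patched"
suffix actually means is not something this check can tell.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
public class HadoopJarVersion {
    // Assumed naming pattern: hadoop-<major>.<minor>.<patch>[-suffix][.jar]
    private static final Pattern NAME =
        Pattern.compile("hadoop-(\\d+)\\.(\\d+)\\.(\\d+)(?:-[\\w.]+)?(?:\\.jar)?");

    /** Returns {major, minor, patch}, or null if the name does not match. */
    public static int[] parse(String jarName) {
        Matcher m = NAME.matcher(jarName);
        if (!m.matches()) {
            return null;
        }
        return new int[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3))
        };
    }

    /** True if the version is 0.10 or later, the release line named in NUTCH-437. */
    public static boolean isAtLeast010(int[] v) {
        return v[0] > 0 || v[1] >= 10;
    }

    public static void main(String[] args) {
        int[] v = parse("hadoop-0.4.0-patched");
        System.out.println(v[0] + "." + v[1] + "." + v[2]
            + " >= 0.10? " + isAtLeast010(v));
    }
}
```

Run as written, this prints "0.4.0 >= 0.10? false", which would suggest that a
jar with this name predates the MapFile.Writer change the patch targets.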
Best Regards,
Armel
> MapFile in Hadoop Trunk has changed, must update references
> -----------------------------------------------------------
>
> Key: NUTCH-437
> URL: https://issues.apache.org/jira/browse/NUTCH-437
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 0.8.2, 0.9.0
> Environment: windows xp and java
> Reporter: Dennis Kubes
> Assigned To: Andrzej Bialecki
> Fix For: 0.8.2, 0.9.0
>
> Attachments: nutch-hadoop-0.10.2-mapfile.patch
>
>
> The MapFile.Writer signature has changed in Hadoop trunk (version 0.10.x+) to
> include a Configuration object. Objects in the Nutch codebase that reference
> MapFile.Writer will need to be updated.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers