[jira] Commented: (NUTCH-719) fetchQueues.totalSize incorrect in Fetcher2

Steven Denny (JIRA) Mon, 13 Jul 2009 01:31:46 -0700

    [ https://issues.apache.org/jira/browse/NUTCH-719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730253#action_12730253 ]


Steven Denny commented on NUTCH-719:
------------------------------------

Perhaps I spoke too soon:

10 threads, 15520 pages, 723 errors, 3.7 pages/s, 2972 kb/s, 
-activeThreads=10, spinWaiting=10, fetchQueues.totalSize=0, fetchQueues.count=0
Aborting with 10 hung threads.
Unable to resolve: www.countryenergy.com.au, skipping.
Exception in thread "QueueFeeder" java.lang.NullPointerException
        at org.apache.hadoop.fs.BufferedFSInputStream.getPos(BufferedFSInputStream.java:48)
        at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:41)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.readChunk(ChecksumFileSystem.java:206)
        at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:238)
        at org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:177)
        at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:111)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.hadoop.io.SequenceFile$Reader.readRecordLength(SequenceFile.java:1895)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1925)
        at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2062)
        at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:76)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
        at org.apache.nutch.fetcher.Fetcher$QueueFeeder.run(Fetcher.java:418)


It appears that the feeder hung, but I'm not sure whether the exception raised 
is the cause or the effect (I suspect it's the effect of the thread aborting).

I'm also not sure whether any of these issues are VM-related. Hopefully our 
real hardware will turn up soon.

> fetchQueues.totalSize incorrect in Fetcher2
> -------------------------------------------
>
>                 Key: NUTCH-719
>                 URL: https://issues.apache.org/jira/browse/NUTCH-719
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 1.0.0
>            Reporter: Julien Nioche
>
> I had a look at the logs generated by Fetcher2 and found cases where there 
> were no active fetchQueues but fetchQueues.totalSize was != 0
> fetcher.Fetcher2 - -activeThreads=200, spinWaiting=200, 
> fetchQueues.totalSize=1, fetchQueues=0
> since the code relies on fetchQueues.totalSize to determine whether the work 
> is finished or not the task is blocked until the abortion mechanism kicks in
> 2009-03-12 09:27:38,977 WARN  fetcher.Fetcher2 - Aborting with 200 hung 
> threads.
> could that be a synchronisation issue? any ideas?
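
For illustration only, here is a minimal sketch of one way a state like totalSize=1 with zero queues could arise, assuming the total is kept in a counter that is updated separately from the queues themselves. The class QueueSizeSketch and all of its methods are hypothetical and are not taken from the Nutch Fetcher source; whether the real bug has this shape is not established by this thread.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch, not Nutch code: a total counter kept beside per-host queues. */
class QueueSizeSketch {
    // Per-host queues of pending URLs.
    private final Map<String, Queue<String>> queues = new HashMap<>();
    // Separately maintained total, loosely analogous to a "totalSize" field.
    private final AtomicInteger totalSize = new AtomicInteger();

    synchronized void add(String host, String url) {
        queues.computeIfAbsent(host, h -> new ArrayDeque<>()).add(url);
        totalSize.incrementAndGet();
    }

    /** Drops a whole queue but forgets the counter, so the counter drifts. */
    synchronized void dropQueueBuggy(String host) {
        queues.remove(host);
    }

    /** Drops a whole queue and subtracts the items that were discarded. */
    synchronized void dropQueue(String host) {
        Queue<String> removed = queues.remove(host);
        if (removed != null) {
            totalSize.addAndGet(-removed.size());
        }
    }

    synchronized int queueCount() {
        return queues.size();
    }

    int counterValue() {
        return totalSize.get();
    }

    /** Recomputes the total from the queues, so it cannot drift. */
    synchronized int actualSize() {
        int sum = 0;
        for (Queue<String> q : queues.values()) {
            sum += q.size();
        }
        return sum;
    }

    public static void main(String[] args) {
        QueueSizeSketch s = new QueueSizeSketch();
        s.add("example.org", "http://example.org/a");
        s.dropQueueBuggy("example.org");
        // The counter still reports one item although no queue remains.
        System.out.println("counter=" + s.counterValue()
                + ", queues=" + s.queueCount()
                + ", actual=" + s.actualSize());
    }
}
```

A termination check based on counterValue() would wait indefinitely in this sketch, while one based on actualSize() would not. Deriving the total from the queues costs a scan per check; keeping the counter is cheaper but requires every removal path to update it.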

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
