On 5/30/07, Manoharam Reddy <[EMAIL PROTECTED]> wrote:
> Time and again I get this error and as a result the segment remains
> incomplete. This wastes one iteration of the for() loop in which I am
> doing generate, fetch and update.
>
> Can someone please tell me what measures I can take to avoid this
> error? And isn't it possible to change the code so that the whole
> fetch doesn't stop suddenly when this error occurs? Can't the fetch
> simply continue, as it does for a SocketException, where the fetch
> while(1) loop carries on?
>
> If that is not possible, please tell me how I can prevent this error
> from happening.

Are you also parsing during fetch? If you are, I would suggest running
Fetcher in non-parsing mode.
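
As a rough sketch of what that could look like in a generate/fetch/update
loop, assuming a Nutch 0.8/0.9-style command line where the fetcher accepts
-noParsing and parsing is run afterwards as a separate "parse" step. The
NUTCH, CRAWLDB and SEGMENTS variables and the crawl_round function are
illustrative names, not taken from the original script; adjust the paths to
your own layout.

```shell
#!/bin/sh
# Sketch only: one crawl round with parsing split out of the fetch step.
# NUTCH, CRAWLDB, SEGMENTS and crawl_round are hypothetical names.
NUTCH="${NUTCH:-bin/nutch}"
CRAWLDB="${CRAWLDB:-crawl/crawldb}"
SEGMENTS="${SEGMENTS:-crawl/segments}"

crawl_round() {
    # Generate a new fetch list (creates a new timestamped segment).
    $NUTCH generate "$CRAWLDB" "$SEGMENTS" || return 1

    # Segment directories are named by timestamp, so the last one is the newest.
    segment=$(ls -d "$SEGMENTS"/* | tail -1)

    # Fetch without parsing, so parser memory use does not hit the fetcher threads.
    $NUTCH fetch "$segment" -noParsing || return 1

    # Parse the fetched content as a separate job.
    $NUTCH parse "$segment" || return 1

    # Fold the results back into the crawl db.
    $NUTCH updatedb "$CRAWLDB" "$segment"
}
```

Calling crawl_round once per iteration replaces the combined fetch-and-parse
step. If the parse job then runs out of heap, the fetched content is still in
the segment and only the parse has to be repeated. Depending on the version,
setting fetcher.parse to false in nutch-site.xml should have the same effect
as the -noParsing flag.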

>
> ----- ERROR -----
>
> fetch of http://telephony/register.asp failed with:
> java.lang.OutOfMemoryError: Java heap space
> java.lang.NullPointerException
> at org.apache.hadoop.fs.FSDataInputStream$Buffer.getPos(FSDataInputStream.java:87)
> at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:125)
> ......
> at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:115)
> fetcher caught:java.lang.NullPointerException
> java.lang.NullPointerException
> at org.apache.hadoop.fs.FSDataInputStream$Buffer.getPos(FSDataInputStream.java:87)
> at org.apache.hadoop.fs.FSDataInputStream.getPos(FSDataInputStream.java:125)
> .......
> at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:115)
> fetcher caught:java.lang.NullPointerException
> Fetcher: java.io.IOException: Job failed!
>   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:604)
>   at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:470)
>   at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:505)
>   at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
>   at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:477)
>


-- 
Doğacan Güney
_______________________________________________
Nutch-general mailing list
Nutch-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-general