Hi, I was trying to fetch the DMOZ open directory using the exact example from the Nutch tutorial website. I ran the following steps:

  mkdir db
  mkdir segments
  bin/nutch admin db -create
  bin/nutch inject db -dmozfile ../nutch-0.7.1/content.rdf.u8 -subset 3000
  bin/nutch generate db segments
  s1=`ls -d segments/2* | tail -1`
  echo $s1
  bin/nutch fetch -showThreadID -noParsing -threads 50 $s1
  bin/nutch updatedb db $s1

It starts fetching the pages, but after a couple of hundred pages it starts throwing this exception:

  "java.net.SocketException: No buffer space available"

Do you have any idea why this might happen? I know it is running out of available buffer space for new sockets, but why are the old sockets not closed? Even if a fetch fails, its socket should be closed and its buffer freed! I tried both 0.7 and 0.7.1.

One example of the exception looks like this:

  051018 153727 28 fetching http://perso.wanadoo.es/largo/
  java.net.SocketException: No buffer space available
          at java.net.PlainSocketImpl.socketConnect(Native Method)
          at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
          at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
          at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
          at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:364)
          at java.net.Socket.connect(Socket.java:507)
          at java.net.Socket.connect(Socket.java:457)
          at java.net.Socket.<init>(Socket.java:365)
          at java.net.Socket.<init>(Socket.java:238)
          at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:79)
          at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$1.doit(ControllerThreadSocketFactory.java:90)
          at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$SocketTask.run(ControllerThreadSocketFactory.java:157)
          at java.lang.Thread.run(Thread.java:595)

Nima
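
To see whether sockets really are piling up while the fetcher runs, one rough check is to count TCP connections by state in a second terminal. The helper below is only a sketch and not part of Nutch: `count_socket_states` is a hypothetical name, and it assumes that `netstat -an` is available and prints the connection state as the last column of each TCP line, which may not hold on every platform.

```shell
# Hypothetical helper: reads netstat-style output on stdin and prints
# one "STATE COUNT" line per TCP connection state, sorted by state name.
# Assumption: the state is the last whitespace-separated field.
count_socket_states() {
  awk 'toupper($1) ~ /^TCP/ && NF >= 4 { count[$NF]++ }
       END { for (s in count) print s, count[s] }' | sort
}

# Example usage while the fetch is running:
#   netstat -an | count_socket_states
```

If the count for a single state keeps growing during the fetch, that would suggest connections are not being released as fast as they are opened. It does not show which part of the code is holding them.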
