Hi Ratnesh,

I am crawling the internet. All the pages get crawled, but this error still appears in my error log, and I don't know what it means. I have used two filters, regex and crawl, for my crawling. Does it have something to do with that?
How should I eliminate the above-mentioned error? Does something need to be set or modified in nutch-site.xml?

Cheers,
cha

Ratnesh,V2Solutions India wrote:
>
> This socket exception normally occurs when the fetcher is not able to get
> the page to crawl.
> I mean there is some problem with the server connection.
> If you are crawling locally stored pages, then check whether the server is
> started or not.
>
> I have tested the same for my local crawl, but for an internet-specific
> crawl I don't have enough idea.
>
>
> Ratnesh V2Solutions India
>
>
> cha wrote:
>>
>> Hi people,
>>
>> When I crawl my website, it gives me the following error, though
>> the crawl itself works fine.
>>
>> Can anyone tell me what the error is about? Do I have to set anything in
>> nutch-site.xml?
>>
>> Following are the error logs:
>>
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? java.net.SocketTimeoutException: Read timed out
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.SocketInputStream.socketRead0(Native Method)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.SocketInputStream.read(Unknown Source)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.BufferedInputStream.read1(Unknown Source)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.BufferedInputStream.read(Unknown Source)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.FilterInputStream.read(Unknown Source)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.PushbackInputStream.read(Unknown Source)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.FilterInputStream.read(Unknown Source)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.HttpResponse.readPlainContent(HttpResponse.java:214)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:146)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.Http.getResponse(Http.java:63)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:208)
>> [2007-04-04 16:23:21,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:144)
>> [2007-04-04 16:23:22,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? java.net.SocketTimeoutException: Read timed out
>> [2007-04-04 16:23:22,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.SocketInputStream.socketRead0(Native Method)
>> [2007-04-04 16:23:22,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.SocketInputStream.read(Unknown Source)
>> [2007-04-04 16:23:22,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.BufferedInputStream.read1(Unknown Source)
>> [2007-04-04 16:23:22,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.BufferedInputStream.read(Unknown Source)
>> [2007-04-04 16:23:22,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.FilterInputStream.read(Unknown Source)
>> [2007-04-04 16:23:22,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.PushbackInputStream.read(Unknown Source)
>> [2007-04-04 16:23:22,062] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.io.FilterInputStream.read(Unknown Source)
>> [2007-04-04 16:23:22,062] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.HttpResponse.readPlainContent(HttpResponse.java:214)
>> [2007-04-04 16:23:22,062] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:146)
>> [2007-04-04 16:23:22,062] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.Http.getResponse(Http.java:63)
>> [2007-04-04 16:23:22,062] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:208)
>> [2007-04-04 16:23:22,062] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:144)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? java.net.SocketTimeoutException: connect timed out
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.PlainSocketImpl.socketConnect(Native Method)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.PlainSocketImpl.doConnect(Unknown Source)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.PlainSocketImpl.connect(Unknown Source)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.SocksSocketImpl.connect(Unknown Source)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at java.net.Socket.connect(Unknown Source)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:94)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.Http.getResponse(Http.java:63)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:208)
>> [2007-04-04 16:23:32,218] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:? at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:144)
>> [2007-04-04 16:23:33,046] [FetcherThread] ERROR org.apache.nutch.protocol.http.Http:?
>>
>>
>> Please reply ASAP.
>>
>> Regards,
>> cha
>>
>>
>
>

--
View this message in context: http://www.nabble.com/ERROR-org.apache.nutch.protocol.http.Http%3A-java.net.SocketTimeoutException%3A-Read-timed-out-tf3525172.html#a9851037
Sent from the Nutch - User mailing list archive at Nabble.com.
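
For anyone who wants to see what "Read timed out" means at the socket level, here is a minimal sketch in plain Java. It is not Nutch code: the class `ReadTimeoutDemo` and the method `provokeReadTimeout` are made-up names for illustration. It opens a connection to a local server that accepts but never sends a byte, so the read gives up once the socket timeout expires. A remote web server that is slow to answer should produce the same exception in the fetcher. In Nutch the limit is, as far as I know, the `http.timeout` property (in milliseconds), which can be overridden in nutch-site.xml; please check nutch-default.xml for your version.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {

    /**
     * Connects to a local server that never sends data and returns the
     * message of the resulting SocketTimeoutException, or null if the
     * read unexpectedly completed. (Hypothetical helper, not a Nutch API.)
     */
    static String provokeReadTimeout(int timeoutMs) throws IOException {
        // Port 0 lets the OS pick a free port on the loopback interface.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket()) {
            // The second argument is the connect timeout; exceeding it is
            // what produces a "connect timed out" style failure.
            client.connect(
                new InetSocketAddress(server.getInetAddress(), server.getLocalPort()),
                timeoutMs);
            // The read timeout: how long read() may block waiting for data.
            client.setSoTimeout(timeoutMs);
            try (Socket accepted = server.accept()) {
                InputStream in = client.getInputStream();
                in.read(); // the server side stays silent, so this blocks
                return null;
            } catch (SocketTimeoutException e) {
                return e.getMessage();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Prints the exception message after roughly half a second.
        System.out.println(provokeReadTimeout(500));
    }
}
```

If the crawl otherwise completes, errors like these usually just mean that a few hosts did not respond within the limit. Raising the timeout trades fewer such errors for a slower fetch.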
