[ https://issues.apache.org/jira/browse/NUTCH-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141334#comment-13141334 ]
Zhang JinYan edited comment on NUTCH-1138 at 11/1/11 5:14 PM:
--------------------------------------------------------------

Applied the patch to branch-1.4 and rebuilt with the command "ant clean build".

Configured the following websites to crawl:
{quote}
http://172.16.123.123/bbs/viewthread.php?tid=12345
http://172.16.123.123/bbs/attachment.php?aid=12345
http://www.jettycn.com/
{quote}
The first two sites are not available.

Ran the crawl with this command (platform: Windows):
{quote}
sh.exe ./bin/nutch crawl seedurl -dir crawldev -solr http://localhost:8983/solr/
{quote}
The crawl completed successfully.

A query in the Solr admin returns:
{code:xml}
<result name="response" numFound="320" start="0"></result>
{code}

Searching for the word "ERROR" in "hadoop.log" finds 3 results, caused by:
{code}
java.net.ConnectException: Connection timed out: connect
{code}

Searching for the word "Exception" in "hadoop.log" finds results like this:
{quote}
2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server www.jettycn.com failed to respond
2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - Retrying request
{quote}

So there is no exception related to your patch in "hadoop.log". The patch works fine with "branch-1.4" for me.

was (Author: yearn20m):
Apply the path to branch-1.4, rebuild with cmd: "ant clean build".

Config to crawl websites:
{quote}
http://172.16.123.123/bbs/viewthread.php?tid=12345
http://172.16.123.123/bbs/attachment.php?aid=12345
http://www.jettycn.com/
{quote}
The previous two sites are not available.

Run crawl with cmd(platform windows):
{quote}
sh.exe ./bin/nutch crawl seedurl -dir crawldev -solr http://localhost:8983/solr/
{quote}
Complete the crawl successfully.
Query in solr admin return:
{code:xml}
<result name="response" numFound="320" start="0"></result>
{code}

Search word "ERROR" in "hadoop.log",find 3 results caused by:
{code}
java.net.ConnectException: Connection timed out: connect
{code}

Search word "Exception" in "hadoop.log", find results like this:
{quote}
2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server www.jettycn.com failed to respond
2011-11-02 00:39:01,821 INFO httpclient.HttpMethodDirector - Retrying request
{quote}

So there is no exception related your path in the "hadoop.log". The path work fine with "branch-1.4" for me.

> remove LogUtil from trunk and nutch gora
> ----------------------------------------
>
>                 Key: NUTCH-1138
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1138
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.4, nutchgora
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: nutchgora, 1.5
>
>         Attachments: Document1.txt, NUTCH-1138-trunk-20111023.patch
>
>
> This should move towards the removal of the LogUtil class from both codebases
> as per comments in NUTCH-1078.
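
The log check described in the comment (searching "hadoop.log" for the words "ERROR" and "Exception") could be repeated with a small script. This is only a minimal sketch: the helper names `find_lines` and `scan_log` are hypothetical, and the log path passed to `scan_log` is an assumption that depends on where your Nutch run writes "hadoop.log".

```python
from typing import Dict, Iterable, List, Sequence


def find_lines(lines: Iterable[str], keyword: str) -> List[str]:
    """Return every line that contains the keyword (case-sensitive match)."""
    return [line.rstrip("\n") for line in lines if keyword in line]


def scan_log(path: str,
             keywords: Sequence[str] = ("ERROR", "Exception")) -> Dict[str, List[str]]:
    """Map each keyword to the lines of the log file that contain it.

    `path` is supplied by the caller; this sketch does not assume a
    particular Nutch directory layout.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        lines = handle.readlines()
    return {keyword: find_lines(lines, keyword) for keyword in keywords}


if __name__ == "__main__":
    import sys

    # Usage (path is an assumption): python scan_log.py logs/hadoop.log
    for keyword, hits in scan_log(sys.argv[1]).items():
        print(f"{keyword}: {len(hits)} matching line(s)")
        for hit in hits:
            print("  " + hit)
```

The match is a plain case-sensitive substring test, so "Exception" also matches class names such as NoHttpResponseException, which mirrors a manual text search of the log.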