Hi guys,

Yesterday I tried to crawl a Chinese website using seed links like this one:

http://www.ccgp.gov.cn/cggg/dfbx/gkzb/default_4.shtml

The crawl failed with the following error:

fetching http://www.ccgp.gov.cn/cggg/dfbx/gkzb/default_4.shtml (queue crawl delay=5000ms)
fetch of http://www.ccgp.gov.cn/cggg/dfbx/gkzb/default_4.shtml failed with: java.io.IOException: unzipBestEffort returned null

I first hit this with nutch-1.5.1, then switched to nutch-1.7 and got the same failure. I have no idea how to handle this problem.

I would really appreciate any feedback!
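In case it helps with diagnosis: my assumption (not verified against the Nutch source) is that this message appears when a response is declared as gzip-encoded but the body cannot be decompressed, for example because it is not really gzip or was cut off. Below is a minimal standalone sketch of how such a body could be checked. The class `GzipBodyCheck` and its methods are hypothetical helpers I made up for illustration, not part of Nutch; only the `java.util.zip` classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical diagnostic helper; not Nutch code.
public class GzipBodyCheck {

    // A gzip stream starts with the two magic bytes 0x1f 0x8b.
    static boolean looksLikeGzip(byte[] body) {
        return body != null && body.length >= 2
                && (body[0] & 0xff) == 0x1f
                && (body[1] & 0xff) == 0x8b;
    }

    // Decompress as much as possible; return null when no bytes could be decoded.
    static byte[] gunzipOrNull(byte[] body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            // Not gzip, or truncated: keep whatever was decoded before the failure.
        }
        return out.size() == 0 ? null : out.toByteArray();
    }

    // Compress a byte array with gzip (used here only to build sample input).
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] plain = "<html>sample page</html>".getBytes(StandardCharsets.UTF_8);
        byte[] zipped = gzip(plain);

        System.out.println("zipped looks like gzip: " + looksLikeGzip(zipped));
        System.out.println("plain looks like gzip:  " + looksLikeGzip(plain));
        System.out.println("plain decodes to null:  " + (gunzipOrNull(plain) == null));
    }
}
```

To apply this to the failing page, the raw response bytes would have to be saved first with whatever HTTP client is at hand, without automatic decompression, and then passed to `looksLikeGzip` and `gunzipOrNull`. If the body does not start with the gzip magic bytes even though the server declares gzip encoding, that would point at the server rather than at the crawler.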
-Yan Wang