Nutch 1.7 content encoding problem

2014-08-20 Thread adu
Hi all, I want to crawl a JSON file from a URL. When I fetch the URL with wget, the Chinese characters in the result file are wrongly encoded. If I then run iconv -f gbk -t utf-8 file.json, I get the correct result. Next I tried Nutch and used the readseg dump to inspect the result. The ParseText
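
For reference, the iconv step described in the message above can be approximated in Python as shown below. This is only a minimal sketch under the assumption that the server really sends GBK-encoded bytes. The helper name `gbk_to_utf8` and the sample string are hypothetical, and nothing here reflects how Nutch itself detects or converts encodings.

```python
def gbk_to_utf8(raw: bytes) -> bytes:
    """Re-encode GBK bytes as UTF-8, like `iconv -f gbk -t utf-8`."""
    # Decode with the encoding the bytes were actually written in,
    # then encode the resulting text as UTF-8.
    return raw.decode("gbk").encode("utf-8")


# Hypothetical sample: two Chinese characters (U+4E2D U+6587) as GBK bytes,
# standing in for the body that wget saved to file.json.
sample_text = "\u4e2d\u6587"
gbk_bytes = sample_text.encode("gbk")

# Reading GBK bytes as if they were UTF-8 produces replacement characters,
# which is the kind of garbling described in the message.
garbled = gbk_bytes.decode("utf-8", errors="replace")

converted = gbk_to_utf8(gbk_bytes)
print(garbled == sample_text)                      # False: wrong charset assumed
print(converted.decode("utf-8") == sample_text)    # True: round trip is intact
```

If the crawled text looks like `garbled` here, the bytes were most likely decoded with the wrong charset somewhere between fetching and parsing.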

Re: How to use a proxy list while nutch is crawling?

2014-08-01 Thread adu
Any suggestions? On 2014-07-31 15:01, [Email Address Not Verified]-dujinh...@hzduozhun.com wrote: Hi all, I have a list of proxies and want Nutch to use them while crawling. How can I do this? Thanks.