Re: re-Crawl re-fetch all pages each time

2012-11-19 Thread vetus

Hello, can you help me? I cannot solve it! Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/re-Crawl-re-fetch-all-pages-each-time-tp4020464p4020998.html Sent from the Nutch - User mailing list archive at Nabble.com.

Re: re-Crawl re-fetch all pages each time

2012-11-16 Thread vetus

No, I'm using the default Nutch code, downloaded from the web. I only set the Gora properties to use the MySQL driver, and I modified the seed and url-filter files. I also modified the agent properties (name, etc.) in nutch-site. Thanks a lot -- View this message in context: http://lucene

Re: re-Crawl re-fetch all pages each time

2012-11-16 Thread vetus

No, I'm new to Nutch, but I don't think I'm using any backend -- View this message in context: http://lucene.472066.n3.nabble.com/re-Crawl-re-fetch-all-pages-each-time-tp4020464p4020668.html Sent from the Nutch - User mailing list archive at Nabble.com.

Re: re-Crawl re-fetch all pages each time

2012-11-15 Thread Lewis John Mcgibbney

Hi, are you using the gora-cassandra backend with Nutch 2.1?

On Thu, Nov 15, 2012 at 5:49 PM, vetus wrote:
> Thank you for your response, but it also re-fetches all web pages...
>
> This is the code that I'm using...
>

RE: re-Crawl re-fetch all pages each time

2012-11-15 Thread vetus

Thank you for your response, but it also re-fetches all web pages...

This is the code that I'm using...

    status.put(Nutch.STAT_PHASE, "generate " + i);
    jobRes = runTool(GeneratorJob.class, args);
    if (jobRes != null) {
      subTools.put("generate " + i, jobRes);
    }
    sta

RE: re-Crawl re-fetch all pages each time

2012-11-15 Thread Markus Jelsma

Hi - this should not happen. The only thing I can imagine is that the update step doesn't succeed, but that would mean nothing is going to be indexed either. You can inspect a URL using the readdb tool; check it before and after.

-Original message-
> From:vetus
> Sent: Thu 15-Nov-2012 15:
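
To illustrate the reasoning in the reply above (a failed update step would leave every page due for fetching again), here is a minimal toy model of a generate/fetch/update cycle. It is only a sketch and not Nutch code: the class `RefetchSketch`, its methods, the in-memory map and the 30-day `INTERVAL_MS` are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a generate/fetch/update cycle. Hypothetical; NOT Nutch code. */
public class RefetchSketch {
    // Assumed re-fetch interval for this sketch (30 days, in milliseconds).
    static final long INTERVAL_MS = 30L * 24 * 60 * 60 * 1000;

    // url -> time at which the page is next due for fetching
    final Map<String, Long> nextFetchTime = new LinkedHashMap<>();

    /** Inject: a new URL is due immediately. */
    void inject(String url, long now) {
        nextFetchTime.put(url, now);
    }

    /** Generate: select every URL whose next fetch time has been reached. */
    List<String> generate(long now) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Long> e : nextFetchTime.entrySet()) {
            if (e.getValue() <= now) {
                due.add(e.getKey());
            }
        }
        return due;
    }

    /** Update: push the next fetch time of each fetched URL into the future. */
    void update(List<String> fetched, long now) {
        for (String url : fetched) {
            nextFetchTime.put(url, now + INTERVAL_MS);
        }
    }

    public static void main(String[] args) {
        long now = 0L;
        RefetchSketch db = new RefetchSketch();
        db.inject("http://example.com/a", now);
        db.inject("http://example.com/b", now);

        List<String> firstRound = db.generate(now);

        // If the update step never runs, the same URLs are still due.
        System.out.println("due without update: " + db.generate(now + 1000).size());

        // Once the update step succeeds, nothing is due until the interval passes.
        db.update(firstRound, now);
        System.out.println("due after update: " + db.generate(now + 1000).size());
    }
}
```

In this model a second generate selects the same URLs for as long as update has not recorded the fetch, which is why comparing one URL's stored record before and after a crawl cycle is a useful check.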
