Hello,
Can you help me?
I cannot solve it!
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/re-Crawl-re-fetch-all-pages-each-time-tp4020464p4020998.html
Sent from the Nutch - User mailing list archive at Nabble.com.
No,
I'm using the default Nutch code, downloaded from the web. I only set the Gora
properties to use the MySQL driver, and I modified the seed and url-filter
files. I also modified the agent properties (name, etc.) in nutch-site.
Thanks a lot
No,
I'm new to Nutch, but I don't think I'm using any backend
--
View this message in context:
http://lucene.472066.n3.nabble.com/re-Crawl-re-fetch-all-pages-each-time-tp4020464p4020668.html
Sent from the Nutch - User mailing list archive at Nabble.com.
Hi,
Are you using the gora-cassandra backend with Nutch 2.1?
On Thu, Nov 15, 2012 at 5:49 PM, vetus wrote:
> Thank you for your response, but it also re-fetches all web pages...
>
> This is the code that I'm using...
>
Thank you for your response, but it also re-fetches all web pages...
This is the code that I'm using...
    status.put(Nutch.STAT_PHASE, "generate " + i);
    jobRes = runTool(GeneratorJob.class, args);
    if (jobRes != null) {
      subTools.put("generate " + i, jobRes);
    }
sta
Hi - this should not happen. The only thing I can imagine is that the update
step doesn't succeed, but that would mean nothing is going to be indexed either.
You can inspect a URL using the readdb tool: check its record before and after
the update step.
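
To make the reasoning about the update step concrete, here is a minimal sketch in plain Java. It is not Nutch code: the class RefetchSketch, its methods and the fixed re-fetch interval are all hypothetical, and it only assumes that the generate step selects URLs whose next fetch time has passed and that the update step moves that time forward. Under that assumption, a cycle whose update never runs selects every page again.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a generate/update cycle; NOT the Nutch API.
class RefetchSketch {
    // URL -> earliest time (ms) at which the page is due to be fetched again.
    private final Map<String, Long> nextFetchTime = new HashMap<>();
    private final long intervalMs; // assumed fixed re-fetch interval

    RefetchSketch(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    // New URLs are due immediately; already-known URLs keep their schedule.
    void inject(String url) {
        nextFetchTime.putIfAbsent(url, 0L);
    }

    // "generate": select only the URLs that are due at time 'now'.
    List<String> generate(long now) {
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Long> e : nextFetchTime.entrySet()) {
            if (e.getValue() <= now) {
                due.add(e.getKey());
            }
        }
        return due;
    }

    // "update": push the next fetch time forward for the fetched URLs.
    void update(List<String> fetched, long now) {
        for (String url : fetched) {
            nextFetchTime.put(url, now + intervalMs);
        }
    }

    public static void main(String[] args) {
        RefetchSketch db = new RefetchSketch(1000L);
        db.inject("http://example.com/a");
        db.inject("http://example.com/b");

        List<String> first = db.generate(10L);
        System.out.println("first cycle selects: " + first.size());
        // If update is skipped or fails, the same pages are selected again.
        System.out.println("without update: " + db.generate(20L).size());
        db.update(first, 20L);
        System.out.println("after update: " + db.generate(30L).size());
    }
}
```

If the real crawl behaves like the "without update" case, comparing a page's stored fetch time before and after the update step, as suggested above, should show whether the update is being written to the store at all.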
-Original message-
> From:vetus
> Sent: Thu 15-Nov-2012 15: