Sami Siren wrote:
Patch works for me.
OK. I just committed it.
Thanks!
Doug
Doug Cutting wrote:
Jérôme Charron wrote:
> In my environment, the crawl command terminates with the following
> error: 2006-07-06 17:41:49,735 ERROR mapred.JobClient
> (JobClient.java:submitJob(273)) - Input directory
> /localpath/crawl/crawldb/current in local is invalid. Exception in
> threa
Gal Nitzan wrote:
To get the same behavior, just try to inject into a new crawldb that doesn't
exist yet.
The reason many people don't hit it is that the crawldb already exists in
their environment.
True, I was injecting into an existing crawldb.
--
Sami Siren
Jérôme Charron wrote:
In my environment, the crawl command terminates with the following error:
2006-07-06 17:41:49,735 ERROR mapred.JobClient
(JobClient.java:submitJob(273))
- Input directory /localpath/crawl/crawldb/current in local is invalid.
Exception in thread "main" java.io.IOException: I
nutch-dev@lucene.apache.org
Subject: Re: Error with Hadoop-0.4.0
Jérôme Charron wrote:
> Hi,
>
> I encountered some problems with the Nutch trunk version.
> In fact it seems to stem from changes related to Hadoop-0.4.0 and JDK 1.5
> (more precisely since HADOOP-129 and File
Stefan Groschupf wrote:
We tried your suggested fix in Injector:
mergeJob.setInputPath(tempDir) (instead of mergeJob.addInputPath(tempDir))
I suspect that this is not the right solution - have you actually tested
that the resulting db contains all entries from the input dirs?
--
Best regar
Jérôme Charron wrote:
What I suggest is simply to remove line 75 in the createJob method of
CrawlDb:
setInputPath(new Path(crawlDb, CrawlDatum.DB_DIR_NAME));
In fact, this method is only used by Injector.inject() and CrawlDb.update(),
and the inputPath set in createJob is not needed neit
We tried your suggested fix in Injector:
mergeJob.setInputPath(tempDir) (instead of mergeJob.addInputPath(tempDir))
and this worked without any problem.
Thanks for catching that, this saved us a lot of time.
Stefan
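
To make the two options discussed in this thread concrete, here is a minimal
sketch. It assumes that setInputPath replaces the configured input list while
addInputPath appends to it, and that CrawlDatum.DB_DIR_NAME resolves to
"current" (inferred from the path in the error message). InputPathSketch,
missingInputs and the directory names are hypothetical stand-ins for
illustration, not the real Hadoop JobConf or Nutch classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy stand-in for a job configuration. This is NOT the real Hadoop JobConf. */
public class InputPathSketch {
    private final List<String> inputPaths = new ArrayList<>();

    /** Assumed semantics: drop any earlier inputs and keep only this one. */
    public void setInputPath(String path) {
        inputPaths.clear();
        inputPaths.add(path);
    }

    /** Assumed semantics: append to the inputs already configured. */
    public void addInputPath(String path) {
        inputPaths.add(path);
    }

    public List<String> getInputPaths() {
        return new ArrayList<>(inputPaths);
    }

    /** Hypothetical validation: returns the configured inputs that do not exist. */
    public List<String> missingInputs(Set<String> existingDirs) {
        List<String> missing = new ArrayList<>();
        for (String p : inputPaths) {
            if (!existingDirs.contains(p)) {
                missing.add(p);
            }
        }
        return missing;
    }

    /** Mimics the createJob behavior quoted in the thread: preset <crawlDb>/current. */
    public static InputPathSketch createJob(String crawlDb) {
        InputPathSketch job = new InputPathSketch();
        job.setInputPath(crawlDb + "/current"); // "current" is an assumption
        return job;
    }

    public static void main(String[] args) {
        // A fresh crawl: only the injector's temp dir exists, no crawldb yet.
        Set<String> freshCrawl = new HashSet<>(Arrays.asList("tmp/inject"));

        InputPathSketch added = createJob("crawl/crawldb");
        added.addInputPath("tmp/inject");
        System.out.println("add: inputs=" + added.getInputPaths()
                + " missing=" + added.missingInputs(freshCrawl));

        InputPathSketch replaced = createJob("crawl/crawldb");
        replaced.setInputPath("tmp/inject");
        System.out.println("set: inputs=" + replaced.getInputPaths()
                + " missing=" + replaced.missingInputs(freshCrawl));
    }
}
```

Under these assumptions, the add variant keeps the not-yet-existing
crawldb/current directory as an input, which matches the "Input directory ...
is invalid" failure on a new crawldb. The set variant avoids that, but it also
drops crawldb/current when the crawldb does exist, which is why it is worth
checking that the resulting db still contains the old entries.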
On 07.07.2006, at 16:08, Jérôme Charron wrote:
I have the same problem in a distributed environment! :-(
So I think I can confirm this is a bug.
Thanks for this feedback Stefan.
We should fix that.
What I suggest is simply to remove line 75 in the createJob method of
CrawlDb:
setInputPath(new Path(crawlDb, CrawlDatum.DB_DIR_NAME));
In
Hi Jérôme,
I have the same problem in a distributed environment! :-(
So I think I can confirm this is a bug.
We should fix that.
Stefan
On 06.07.2006, at 08:54, Jérôme Charron wrote:
Hi,
I encountered some problems with the Nutch trunk version.
In fact it seems to be related to changes related to H
> I encountered some problems with the Nutch trunk version.
> In fact it seems to stem from changes related to Hadoop-0.4.0 and JDK 1.5
> (more precisely since HADOOP-129 and the replacement of File by Path).
> Does somebody have the same error?
I am not seeing this (just run inject on a single machin
Jérôme Charron wrote:
Hi,
I encountered some problems with the Nutch trunk version.
In fact it seems to stem from changes related to Hadoop-0.4.0 and JDK 1.5
(more precisely since HADOOP-129 and the replacement of File by Path).
Does somebody have the same error?
I am not seeing this (just run injec