[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361041 ]
Jerome Charron commented on NUTCH-139:
--
OK, Chris and I will implement MetadataNames in this way.
Just a few comments:
I plan to move the MetadataNames to a class rath
Lukas,
the input folders are normally set by the tools, so you cannot
change that.
However, if you use a Unix box, check that the user that runs
Nutch has read and write access to all the folders defined in
nutch-site/default.xml.
(I guess that can be the problem; Nutch uses e.g. /tmp
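A minimal sketch of such an access check, in case it helps to rule permissions out. The class and method names are hypothetical and not part of Nutch; the directories to check have to be taken by hand from your own configuration files and passed as arguments.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Nutch: reports directories that the
// current user cannot both read and write.
final class DirAccessCheck {

    // Returns the subset of the given paths that are missing, are not
    // directories, or are not readable and writable by the current user.
    static List<Path> inaccessible(List<Path> dirs) {
        List<Path> bad = new ArrayList<>();
        for (Path dir : dirs) {
            boolean ok = Files.isDirectory(dir)
                    && Files.isReadable(dir)
                    && Files.isWritable(dir);
            if (!ok) {
                bad.add(dir);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Pass the directories named in your configuration as arguments;
        // with no arguments, only the JVM temp directory is checked.
        List<Path> dirs = new ArrayList<>();
        for (String arg : args) {
            dirs.add(Paths.get(arg));
        }
        if (dirs.isEmpty()) {
            dirs.add(Paths.get(System.getProperty("java.io.tmpdir")));
        }
        for (Path p : inaccessible(dirs)) {
            System.out.println("No read/write access: " + p);
        }
    }
}
```

Run it as the same user that runs Nutch; any directory it prints is a candidate for the failure.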
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361043 ]
Andrzej Bialecki commented on NUTCH-139:
-
Regarding the move to a class with public static fields: I don't have any
problem with that.
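For illustration, a class with public static fields could look roughly like the sketch below. The class name, field names, and values are assumptions made for this example, not the actual MetadataNames definitions.

```java
// Hypothetical sketch: the field names and values below are illustrative
// assumptions, not the actual Nutch MetadataNames definitions.
final class MetadataNamesSketch {

    public static final String CONTENT_TYPE = "Content-Type";
    public static final String CONTENT_LENGTH = "Content-Length";
    public static final String LAST_MODIFIED = "Last-Modified";

    // Constants holder only; no instances are needed.
    private MetadataNamesSketch() {
    }
}
```

Callers would then refer to MetadataNamesSketch.CONTENT_TYPE directly instead of implementing an interface just to inherit the constants.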
Regarding the Levenshtein dista
try to check out the latest sources from the Subversion server.
There will be no new nightly builds until the new Western year.
Stefan
On 20.12.2005 at 21:35, tigger . wrote:
Hi All
The nightly build is not working:
bin/nutch admin db -create
Exception in thread "main" java.lang.NoClassDe
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12361045 ]
Jerome Charron commented on NUTCH-139:
--
Andrzej,
Do you read my mind?
Yes of course, that's the way I want to do it: First checks for the most common
cases (lower case
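A rough sketch of that ordering, under stated assumptions: try the cheap checks first (exact match, then a lower-case comparison) and fall back to the Levenshtein distance only when they fail. The class name, the list of known names, and the distance threshold of 2 are all hypothetical and chosen only for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not the Nutch implementation.
final class NameMatcherSketch {

    // Assumed canonical names, for illustration only.
    static final List<String> KNOWN =
            Arrays.asList("Content-Type", "Content-Length", "Last-Modified");

    // Assumed threshold: larger edit distances are treated as "no match".
    static final int MAX_DISTANCE = 2;

    // Returns the canonical name for the input, or the input unchanged
    // when nothing is close enough.
    static String normalize(String name) {
        // 1. Cheapest case: exact match.
        for (String known : KNOWN) {
            if (known.equals(name)) {
                return known;
            }
        }
        // 2. Common case: same name, different letter case.
        String lower = name.toLowerCase(Locale.ROOT);
        for (String known : KNOWN) {
            if (known.toLowerCase(Locale.ROOT).equals(lower)) {
                return known;
            }
        }
        // 3. Fallback: closest known name by edit distance.
        String best = null;
        int bestDistance = MAX_DISTANCE + 1;
        for (String known : KNOWN) {
            int d = levenshtein(lower, known.toLowerCase(Locale.ROOT));
            if (d < bestDistance) {
                bestDistance = d;
                best = known;
            }
        }
        return best != null ? best : name;
    }

    // Classic two-row dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

The point of the ordering is cost: the first two steps are simple string comparisons, so the quadratic distance computation only runs for names that are actually misspelled.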
Stefan,
Nutch created folders in /tmp so I think it should be able to create
files there as well. I also tried to change all /tmp* in conf file to
my home folder with the same result (i.e.: folders were created and
several files were dumped there but it yielded the same exception).
Are you able t
Yes, I'm able to run it, no problem but I'm using the step by step
commands not the crawl (allinOne) command.
Can you try an "ant test"? Do all tests pass?
On 21.12.2005 at 12:52, Lukas Vlcek wrote:
Stefan,
Nutch created folders in /tmp so I think it should be able to create
files there as w
Hi,
I'm happy to report that further tests performed on a larger index seem
to show that the overall impact of the IndexSorter is definitely
positive: performance improvements are significant, and the overall
quality of results seems at least comparable, if not actually better.
The reason wh
Hi,
I'm rather new to nutch, but is there something wrong with the idea of
creating an index with nutch (using the intranet search from the nutch
tutorial) and searching this index with Lucene? I.e. doing something
like this:
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene
On Wednesday 21 December 2005 17:13, Oliver Hummel wrote:
> java.lang.ArrayIndexOutOfBoundsException: -1
That's the error you get when you open a Lucene 1.9 index with Lucene 1.4.
But I don't know if that's also the case here.
Regards
Daniel
--
http://www.danielnaber.de
Yep, that's it. Nutch has Lucene 1.9 in its lib.
Many thanks!
Oliver
Daniel Naber wrote:
> On Wednesday 21 December 2005 17:13, Oliver Hummel wrote:
>
>
>>java.lang.ArrayIndexOutOfBoundsException: -1
>
>
> That's the error you get when you open a Lucene 1.9 index with Lucene 1.4.
> But I
I'm late, but better late than never: +1 (I thought Stefan was already a
committer, actually).
Stefan: will you be putting some of those media-style Nutch tutorials in
Nutch's own Wiki?
Otis
----- Original Message -----
From: Andrzej Bialecki <[EMAIL PROTECTED]>
To: nutch-dev@lucene.apache.org
You can ignore mapred.input.subdir; I find it is an unneeded option.
Now that the mapred branch is merged to be the trunk, there is a need
to clarify the documentation since a change was made to have the
input be specified as a directory and then all files in that directory
are considered inp
Hi Andrzej,
wow, that is really great news!
Using the optimized index, I reported previously that some of the
top-scoring results were missing. As it happens, the missing
results were typically the "junk" pages with high tf/idf but low
"boost". Since we collect up to N hits, going from higher to
I've got a 400 million db I can run this against over the
next few days.
-byron
--- Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> Hi Andrzej,
>
> wow, that is really great news!
> > Using the optimized index, I reported previously
> that some of the
> > top-scoring results were missing. As it happens,
>
Andrzej,
well, I'm not done digging into the problem yet, but I want to ask some
more questions.
BTW I counted 195 places that use NutchConf.get(), so this will be a
bigger patch. :)
As I mentioned, I would love to go the inversion-of-control way, so
not using NutchConf in the constructor but
Andrzej Bialecki wrote:
Hi,
I'm happy to report that further tests performed on a larger index
seem to show that the overall impact of the IndexSorter is definitely
positive: performance improvements are significant, and the overall
quality of results seems at least comparable, if not actual
nutch map reduce does not work in windows map reduce runs in a loop
---
Key: NUTCH-147
URL: http://issues.apache.org/jira/browse/NUTCH-147
Project: Nutch
Type: Bug
Components: indexer
Versions: 0
American Jeff Bowden wrote:
Andrzej Bialecki wrote:
Hi,
I'm happy to report that further tests performed on a larger index
seem to show that the overall impact of the IndexSorter is definitely
positive: performance improvements are significant, and the overall
quality of results seems at l