A record version mismatch occured Exception in "UpdateSegmentsFromDb"

2006-02-24 Thread George L
Hello All, I am facing this exception while running "UpdateSegmentsFromDb". This is in Nutch-0.7.1. Can anyone give me an idea about the role of the different versions (2, 4, 5) used in the Nutch code, and the best place to look when fixing this problem? A record version mismatch occured. Expecting

Any way to specify how many results to retrieve in the Hits Collection

2006-02-24 Thread codejunky codejunky
Using the Lucene API, is there any way, when executing a search via IndexSearcher, to specify how many results are returned per search? Say there are 1000 documents that match the query and I only want to return the first 100. Does Lucene support this in some way? So th

(AW) Re: url: search fail

2006-02-24 Thread Martin Gutbrod
Doug, Hmm, I did a re-crawl and I can find data as well, which cannot be in the old index... A search for [url:test] runs fine. Only files with 'test' in the url-string were found. A search with a dot in the url, e.g. [url:test.com], fails. Also [url:"test com"] doesn't work. The search [ur

Re: exception during fetch using hadoop

2006-02-24 Thread Mike Smith
Hi Doug, Unfortunately my core limit was 0 at that time, and I have been running another configuration with a smaller number of threads since last night and it has not crashed yet. As soon as it fails I will send you more detailed log information. But this is what usually happens: I've been using three mach

Re: url: search fail

2006-02-24 Thread Doug Cutting
0.7 and 0.8 are not compatible. You need to re-crawl. Sorry! Once we have a 1.0 release then we'll make sure things are back-compatible. Doug Martin Gutbrod wrote: I changed from 0.7.1 to one of the latest nightly builds (0.8) and now searches on url: fields fail. E.g. [ url:my.doman.com ]

Re: exception during fetch using hadoop

2006-02-24 Thread Doug Cutting
It looks like the child JVM is silently exiting. The "error reading child output" just shows that the child's standard output has been closed, and the "child error" says the JVM exited with non-zero. Perhaps you can get a core dump by setting 'ulimit -c' to something big. JVM core dumps ca

Re: Getting started with standalone MapReduce

2006-02-24 Thread Jérôme Charron
> I have been looking for a Java implementation of Google's MapReduce design > and was very glad to find Nutch. However, I don't want to use it for web > crawling: I want to experiment with Nutch's MapReduce as a method for > (distributed) searching through some existing, very large datasets that

Getting started with standalone MapReduce

2006-02-24 Thread Jon Blower
Dear all, I have been looking for a Java implementation of Google's MapReduce design and was very glad to find Nutch. However, I don't want to use it for web crawling: I want to experiment with Nutch's MapReduce as a method for (distributed) searching through some existing, very large datasets th

Incremental search of a single domain

2006-02-24 Thread Steven Yelton
This has all probably been hashed out ad nauseam, but I haven't seen an end-to-end howto on what I am trying to do. If I can get all the kinks worked out (and understand all the pieces), I'll be glad to write one. I have a domain that has several hundred thousand documents. I would like to:

Re: recommended plugin example

2006-02-24 Thread Nutch Newbie
I am using 0.8-dev; that probably answers the question. I will do some double checking tomorrow and see how to solve it. Thanks again for your reply. On 2/24/06, Vanderdray, Jacob <[EMAIL PROTECTED]> wrote: > What version of nutch are you working with? I wrote the example > based on th

Re: Nutch on Windows

2006-02-24 Thread Top100Forever
Don't worry, my English is worse than yours... :) My browser is Firefox, version 1.5.0.1. Using Ethereal, I saw that the header is correct; in more detail, it appears as follows (only the relevant line): Accept-Language: it-it,it;q=0.8,en-us;q=0.5,en;q=0.3 I also tried with microsoft internet e

RE: recommended plugin example

2006-02-24 Thread Vanderdray, Jacob
What version of nutch are you working with? I wrote the example based on the 0.7.1 base. The second error seems to indicate that you don't have a filter method in your indexer plugin. Check to make sure there isn't a typo in the name of the method. Good luck, Jake. -Origina

recommended plugin example

2006-02-24 Thread Nutch Newbie
Hi Jacob: I have been trying to compile the recommended plugin example but am having no luck. I am hitting the following error. I did "ant tar" and I added deploy and clean in the plugins/build.xml. But I keep getting the following error. As I am just getting started any hint will be greatly appreciat

AW: question to stefan

2006-02-24 Thread Guenter, Matthias
Hi, Wouldn't it be a good idea to add URLs for all the companies on the Support page? And also the countries in which they are based? It is clear that the eMails are encoded, but I think some info about the background of the companies would help a lot. http://wiki.apache.org/nutch/Support Kind reg

Re: meta in search query string

2006-02-24 Thread TDLN
If you follow Jacob's advice you don't have to change anything in the search.jsp to query the two fields. Just add category:your_category and language:your_language to your query in the search box and the fields will be queried. I am afraid I cannot share my code at this stage, as it was developed

Re: question to stefan

2006-02-24 Thread Stefan Groschupf
I had noticed that you work for a German company. Is it possible to get some Nutch support from you or your company? Sure. Please note that here you will find a list of all people providing support: http://wiki.apache.org/nutch/Support I have some problems getting Nutch running the way I want. If n

url: search fail

2006-02-24 Thread Martin Gutbrod
I changed from 0.7.1 to one of the latest nightly builds (0.8) and now searches on url: fields fail. E.g. [ url:my.doman.com ] Has anybody had similar experiences? Should I switch back to 0.7.1? Log file shows: 2006-02-24 11:17:11 StandardWrapperValve[jsp]: Servlet.service() for servlet jsp threw

question to stefan

2006-02-24 Thread Poettgen
Hi Stefan, I had noticed that you work for a German company. Is it possible to get some Nutch support from you or your company? I have some problems getting Nutch running the way I want. If necessary we will pay for it! That's not the problem. Could you contact me poettgen at acocon.de