RE: Lucene 2.9.0 / BooleanQuery problem

2009-10-29 Thread Uwe Schindler
The BooleanQuery does not work because the Sector field is analyzed and you are searching with a simple TermQuery, which is not analyzed. So "Computing" is not lowercased and will not hit any terms (try Luke and look at the terms you have indexed). Fields like the "sector" one should be made
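The mismatch described above can be sketched without Lucene at all. This is a minimal illustration, assuming an analyzer that only splits on whitespace and lowercases; the class and method names (`AnalyzedTermLookup`, `analyze`, `rawTermMatches`) are hypothetical and are not part of the Lucene API.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class AnalyzedTermLookup {
    // Hypothetical stand-in for an analyzer: split on whitespace and lowercase.
    public static Set<String> analyze(String text) {
        Set<String> terms = new HashSet<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Mimics a raw term lookup: the query term is compared exactly as given,
    // with no analysis applied to it.
    public static boolean rawTermMatches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        Set<String> indexed = analyze("Computing");                // stored as "computing"
        System.out.println(rawTermMatches(indexed, "Computing")); // false
        System.out.println(rawTermMatches(indexed, "computing")); // true
    }
}
```

The point of the sketch is only that the query term has to go through the same normalization as the indexed text, or the field has to be indexed without analysis.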

search problem

2009-10-29 Thread m.harig
Hello all, I have a question about search. I have a word in my index, welcomelucene (without spaces). When I search for welcome lucene (with a space), I am not able to get any hits. It should pick the document welcomelucene. Is there any way to do it? I've used the wildcard option too, but no results, ple

Re: search problem

2009-10-29 Thread Erick Erickson
Why would you expect to get a hit on your document? There are three distinct tokens here: welcomelucene, welcome, and lucene. Lucene searches for *matching* tokens, so searching for the tokens 'welcome' and 'lucene' essentially asks "are there two tokens in the document that exactly match these?" and the
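The token-matching behaviour described above can be sketched in plain Java. This is a toy model, assuming whitespace tokenization with lowercasing and a query that requires every query token to be present; `ExactTokenMatch`, `tokenize`, and `allTokensPresent` are hypothetical names, not Lucene classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class ExactTokenMatch {
    // Assumed tokenizer: lowercase, then split on runs of whitespace.
    public static List<String> tokenize(String text) {
        return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // True only if every query token equals some document token exactly.
    public static boolean allTokensPresent(String documentText, String queryText) {
        Set<String> documentTokens = new HashSet<>(tokenize(documentText));
        return documentTokens.containsAll(tokenize(queryText));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("welcomelucene"));   // [welcomelucene]
        System.out.println(tokenize("welcome lucene"));  // [welcome, lucene]
        System.out.println(allTokensPresent("welcomelucene", "welcome lucene")); // false
        System.out.println(allTokensPresent("welcomelucene", "welcomelucene"));  // true
    }
}
```

Neither 'welcome' nor 'lucene' is equal to the single indexed token 'welcomelucene', so the two-token query finds nothing in this model.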

Re: IO exception during merge/optimize

2009-10-29 Thread Peter Keegan
A handful of the source documents did contain the U+ character. The patch from LUCENE-2016 fixed the problem. Thanks Mike! Peter On Wed, Oct 28, 2009 at 1:29 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > Hmm, only a few aff

Re: IO exception during merge/optimize

2009-10-29 Thread Peter Keegan
Btw, this 2.9 indexer is fast! I indexed 4Gb (1.07 million docs) with optimization in just under 30 min. I used setRAMBufferSizeMB=1.9G Peter On Thu, Oct 29, 2009 at 3:46 PM, Peter Keegan wrote: > A handful of the source documents did contain the U+ character. The > patch from *LUCENE-2016

Re: IO exception during merge/optimize

2009-10-29 Thread Michael McCandless
I'm glad we finally got to the bottom of this :) This fix will be in 2.9.1. This is a nice fast indexing result, too... Mike On Thu, Oct 29, 2009 at 3:55 PM, Peter Keegan wrote: > Btw, this 2.9 indexer is fast! I indexed 4Gb (1.07 million docs) with > optimization in just under 30 min. > I use

Re: IO exception during merge/optimize

2009-10-29 Thread Mark Miller
Any chance I could get you to try that again with a buffer of like 800MB to a gig and do a comparison? I've been investigating the returns you get with a larger buffer size. It appears to be pretty diminishing returns over 100MB or so - at higher than that, I've gotten both slower speeds for some

Re: search problem

2009-10-29 Thread Karl Wettin
On 29 Oct 2009 at 12:12, m.harig wrote: i've a doubt in search , i've a word in my index welcomelucene (without spaces) , when i search for welcome lucene(with a space) , am not able to get the hits. It should pick the document welcomelucene.. is there anyway to do it ? i've used wildcar

Re: IO exception during merge/optimize

2009-10-29 Thread Peter Keegan
Mark, With 1.9G, I had to increase the JVM heap significantly (to 8G) to avoid paging and GC hits. Here is a table comparing indexing times, optimizing times and peak memory usage as a function of the RAMBufferSize. This was run on a 64-bit server with 32GB RAM: RamSize Index(min) Opt

Re: IO exception during merge/optimize

2009-10-29 Thread Mark Miller
Thanks a lot Peter! Really appreciate it. Peter Keegan wrote: > Mark, > > With 1.9G, I had to increase the JVM heap significantly (to 8G) to avoid > paging and GC hits. Here is a table comparing indexing times, optimizing > times and peak memory usage as a function of the RAMBufferSize. This was

Re: IO exception during merge/optimize

2009-10-29 Thread Peter Keegan
A couple more data points:

RamSize  Index(min)  Optimize(min)  Peak mem
1.9G     24          5              5G
800M     24          5              4G
400M     25          5              3.5G
100M     25          5              3G
50M      26          4              3G

Peter On Thu, Oct 29, 2009 at 8:4

Re: search problem

2009-10-29 Thread m.harig
Thanks Erick, I understand the issue, but my question is about searching for a keyword that is originally a single word. For example, metacity is really a single keyword. When I search for meta city, I am not able to get the results. That is my question. If you go to Google and search for m

Re: What is multiple indexing and how does it work in Lucene [Java]

2009-10-29 Thread DHIVYA M
The question was indeed wrong. Sorry for the inconvenience. I should have asked it this way: I am trying out the demo of Lucene 1.4.3. When I run it on a file for the first time, the index is created properly. When I run the indexing a second time with a different file, the fi

clucene user

2009-10-29 Thread Vithya Arumugasami
Hi, I am working with CLucene. Kindly tell me which forum I should use to find a solution. Thanks, Vithya

Re: What is multiple indexing and how does it work in Lucene [Java]

2009-10-29 Thread Anshum
If you mean that in subsequent runs the previous state of the index is lost, it's because the IndexWriter gets opened with the 'create new' flag set to true. In other words, the index is newly created, overwriting any existing index at the directory location. The solution to th
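The overwrite behaviour described above can be modelled with a toy in-memory store. This is only a sketch: `CreateFlagDemo` and its `open` method are hypothetical, the "index" is just a list of file names, and the assumption that opening with the flag set to false appends to the existing index is mine, since the rest of the message is cut off. The toy also silently creates a missing index when the flag is false, which a real IndexWriter may not do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CreateFlagDemo {
    // Hypothetical in-memory "directories": path -> documents stored there.
    private final Map<String, List<String>> directories = new HashMap<>();

    // create == true: start from an empty index, discarding whatever was there.
    // create == false: reuse the existing index (toy behaviour: create if missing).
    public List<String> open(String path, boolean create) {
        if (create) {
            directories.put(path, new ArrayList<>());
        } else {
            directories.putIfAbsent(path, new ArrayList<>());
        }
        return directories.get(path);
    }

    public static void main(String[] args) {
        CreateFlagDemo demo = new CreateFlagDemo();
        demo.open("index", true).add("first.txt");
        demo.open("index", true).add("second.txt");    // first.txt is lost here
        System.out.println(demo.open("index", false)); // [second.txt]
        demo.open("index", false).add("third.txt");    // appended, nothing lost
        System.out.println(demo.open("index", false)); // [second.txt, third.txt]
    }
}
```

Running the indexer a second time with the flag still true is what wipes the first run's documents in this model.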

soln found for overwritten problem

2009-10-29 Thread DHIVYA M
Let me try out and get back to you sir   Thanks M.Dhivya --- On Fri, 30/10/09, Anshum wrote: From: Anshum Subject: Re: What is multiple indexing and how does it work in Lucene [Java] To: java-user@lucene.apache.org Date: Friday, 30 October, 2009, 6:23 AM In case you are trying to say that in

Re: soln found for index overwritting problem

2009-10-29 Thread DHIVYA M
Thanks a lot, sir. It's working out well. But I have one more question. Is it possible to check whether the same documents are indexed again and again? Because the indexes are appended, when I search a query, the result is displayed as many times as the index was created for that document.