Re: SweetSpotSimilarity

2011-07-21 Thread Ian Lea
Have you tried query time boosting of title queries? title:lucene^4 content:lucene. Might be easier than fiddling with sweetspot arguments, although I see from the javadocs that "A per field min/max can be specified if different fields have different sweet spots". Not sure if that is relevant to
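To illustrate the query-time boosting suggested above, here is a minimal sketch that only assembles a query string in the `field:term^boost` form used in the message. The class `BoostedQuerySketch` and the method `boostedQuery` are hypothetical helpers, not part of Lucene, and the sketch assumes the term needs no query-syntax escaping.

```java
import java.util.Map;
import java.util.StringJoiner;

class BoostedQuerySketch {
    /**
     * Hypothetical helper (not a Lucene API): builds "field:term^boost" clauses,
     * one per field, in the iteration order of the given map.
     * Assumes the term contains no characters that need query-syntax escaping.
     */
    static String boostedQuery(String term, Map<String, Integer> fieldBoosts) {
        StringJoiner clauses = new StringJoiner(" ");
        for (Map.Entry<String, Integer> entry : fieldBoosts.entrySet()) {
            String clause = entry.getKey() + ":" + term;
            if (entry.getValue() != 1) {
                // A boost of 1 is the default, so the "^1" suffix is left out.
                clause += "^" + entry.getValue();
            }
            clauses.add(clause);
        }
        return clauses.toString();
    }
}
```

With a `LinkedHashMap` holding `title` mapped to 4 and `content` mapped to 1, the helper yields the same string as in the message. The resulting string would still have to be handed to a query parser to take effect.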

Re: HighFreqTerms for results set

2011-07-21 Thread Mihai Caraman
It's only available in Solr and it's based on UnInvertedField. Lucene 3.4.0 should have it implemented too. I ran a small index in Solr and it does the job by showing

Re: Short circuiting Collector

2011-07-21 Thread Chris Bamford
Hi Simon, scorer.advance(Scorer.NO_MORE_DOCS); Hmm... doesn't seem to work :-( I tried to call it in collect() and setNextReader() - still loops to the end of the matched doc set. What have I missed? Thanks - Chris -Original Message- From: Simon Willnauer To: java-use

RE: Short circuiting Collector

2011-07-21 Thread Uwe Schindler
Hi, The reason is that some scorers passed into setScorer are "fake" scorers that only implement the "score()" method (this depends on the type of BooleanQuery scorer used for result collection and on whether the scorer actively collects the results, which happens for top-level queries). In general to early exit coll
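The message above is cut off, so the following is only a sketch of one commonly used early-exit pattern, and it may or may not be what the reply goes on to describe: the collector throws an unchecked exception as a stop signal, and the caller catches it around the search call. Everything here is a hypothetical stand-in written without Lucene: `DocCollector` mimics only the `collect(int)` callback, and `searchAll` mimics a search loop that visits every matching document.

```java
import java.util.ArrayList;
import java.util.List;

class EarlyExitSketch {
    /** Hypothetical stand-in for a collector callback; not the Lucene Collector class. */
    interface DocCollector {
        void collect(int doc);
    }

    /** Unchecked exception used purely as a control-flow signal to stop collecting. */
    static final class StopCollecting extends RuntimeException {
        StopCollecting() {
            // No message, no cause, no suppression, no stack trace: it is a signal, not an error.
            super(null, null, false, false);
        }
    }

    /** Hypothetical driver that feeds every matching doc id to the collector. */
    static void searchAll(int[] matchingDocs, DocCollector collector) {
        for (int doc : matchingDocs) {
            collector.collect(doc);
        }
    }

    /** Collects at most {@code limit} docs, then aborts the driver loop by throwing. */
    static List<Integer> firstN(int[] matchingDocs, int limit) {
        List<Integer> hits = new ArrayList<>();
        try {
            searchAll(matchingDocs, doc -> {
                if (hits.size() >= limit) {
                    throw new StopCollecting();
                }
                hits.add(doc);
            });
        } catch (StopCollecting expected) {
            // Early exit requested by the collector; the hits gathered so far are kept.
        }
        return hits;
    }
}
```

The point of the design is that the driver loop needs no cooperation from the scorer: the exception unwinds out of the loop regardless of which scorer implementation is feeding the collector.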

Re: Many File Descriptors Which Showing As Deleted Related To Lucene Indexing, But Not Emptied

2011-07-21 Thread Michael McCandless
This is expected, when you have a reader still open on a point-in-time snapshot of the index, yet the writer is still indexing/merging. The writer will delete old files, but the reader still has them open, so you see those "(deleted)" entries in the lsof output. Mike McCandless http://blog.mikem

Re: Different Index Reader creation method affecting result

2011-07-21 Thread Michael McCandless
One small correction here: NRT reopen is much cheaper than IW.commit / IR.reopen, for getting indexed changes visible to your reader, since it doesn't invoke fsync and doesn't have to write deleted docs to the directory only to then load them again in the reader. But, it's still fairly costly, so

Re: Search within a sentence (revisited)

2011-07-21 Thread Peter Keegan
Hi Mark, Here is a unit test using a version of 'SpanWithinQuery' modified for 3.2 ('getTerms' removed) . The last test fails (search for "1" and "3"). package org.apache.lucene.search.spans; import java.io.Reader; import org.apache.lucene.analysis.Analyzer; import org.apache.lucene.analysis.To

Re: optimize with num segments > 1 index keeps growing

2011-07-21 Thread v . sevel
Hi, here is a concrete example. I am starting with an index that has 19017236 docs, which takes 58989 MB on disk: 21.07.2011 15:21 20 segments.gen 21.07.2011 15:21 2'974 segments_2acy4 21.07.2011 13:58 0 write.lock 16.07.2011 02:21 33'445'798'8

Re: optimize with num segments > 1 index keeps growing

2011-07-21 Thread Ian Lea
A write.lock file with timestamp of 13:58 is in all the listings. The first thing I'd try is to add some IndexWriter.close() calls. -- Ian. On Thu, Jul 21, 2011 at 4:05 PM, wrote: > Hi, > > here is a concrete example. > > I am starting with an index that has 19017236 docs, which takes 58989

Re: optimize with num segments > 1 index keeps growing

2011-07-21 Thread v . sevel
hi, closing after the 2 segments optimize does not change it. also I am running with lucene 3.1.0. cheers, vince Ian Lea 21.07.2011 17:30 Please respond to java-user@lucene.apache.org To java-user@lucene.apache.org cc Subject Re: optimize with num segments > 1 index keeps growin

Re: optimize with num segments > 1 index keeps growing

2011-07-21 Thread Simon Willnauer
So the problem here is that you have one really big segment, _52aho.*, and several smaller ones: _7e0wz.*, _7e0xu.*, _7e1x5.*. If you optimize to 2 segments, all the smaller segments are merged into one but the large segment remains untouched. This means that all deleted documents in the large
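The effect described above can be shown with a toy model. This is a simplified sketch, not Lucene's merge logic: it assumes that optimizing down to two segments keeps the largest segment as it is and merges all the others into one, and that merging drops deleted documents. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PartialOptimizeSketch {
    /** Toy segment: total doc slots and how many of them are marked deleted. */
    static final class Segment {
        final long maxDoc;
        final long deleted;

        Segment(long maxDoc, long deleted) {
            this.maxDoc = maxDoc;
            this.deleted = deleted;
        }
    }

    /**
     * Simplified model (an assumption, not the real merge policy): the largest
     * segment is kept untouched, all other segments are merged into one new
     * segment that contains only their live documents.
     */
    static List<Segment> optimizeToTwo(List<Segment> segments) {
        if (segments.size() <= 2) {
            return new ArrayList<>(segments);
        }
        List<Segment> sorted = new ArrayList<>(segments);
        sorted.sort(Comparator.comparingLong((Segment s) -> s.maxDoc).reversed());
        long liveInSmallSegments = 0;
        for (Segment small : sorted.subList(1, sorted.size())) {
            liveInSmallSegments += small.maxDoc - small.deleted;
        }
        List<Segment> result = new ArrayList<>();
        result.add(sorted.get(0));                        // large segment, deletes still inside
        result.add(new Segment(liveInSmallSegments, 0));  // merged segment, deletes reclaimed
        return result;
    }

    /** Sums the deleted docs that still occupy space across all segments. */
    static long totalDeleted(List<Segment> segments) {
        long total = 0;
        for (Segment s : segments) {
            total += s.deleted;
        }
        return total;
    }
}
```

In this model, the deletes in the large segment are carried over unchanged on every partial optimize, which is consistent with an index that does not shrink even though documents have been deleted.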

please help

2011-07-21 Thread Vahideh Reshadat
Hello, I am a beginner with Java and have not used it at all. Now I need to implement a program which can retrieve TREC docs and.. I studied "Lucene in Action" and understood the topics, but when I tried to implement the examples I couldn't. It's just because in fact I don't know in which enviro

Re: please help

2011-07-21 Thread Donna L Gresh
This is really not the forum for questions like this (which are not related to Lucene but rather to Java) but for a very simple checklist of what you need, try this: http://download.oracle.com/javase/tutorial/getStarted/cupojava/win32.html But I ask that any further questions which are purely j

Re: please help

2011-07-21 Thread Mihai Caraman
Before you get into Java, you should know that it's possible to find a Lucene implementation for your language. Lucene is available in Python, C# .NET, etc... Search for that first. If you choose Java, you'll need to take baby steps. Install yourself an IDE (Eclipse, NetBeans...) to get rid of all t

Re: optimize with num segments > 1 index keeps growing

2011-07-21 Thread v . sevel
Hi, thanks for this explanation. So what is the best solution: merge the large segment (how can I do that?) or work with many segments (10?) so that I avoid this "large segment" issue? thanks, vince Vincent Sevel Lombard Odier Darier Hentsch & Cie 11, rue de la Corraterie - 1204 Genèv

Re: Search within a sentence (revisited)

2011-07-21 Thread Mark Miller
Hey Peter, Getting sucked back into Spans... That test should pass now - I uploaded a new patch to https://issues.apache.org/jira/browse/LUCENE-777 Further tests may be needed though. - Mark On Jul 21, 2011, at 9:28 AM, Peter Keegan wrote: > Hi Mark, > > Here is a unit test using a versio

Re: Search within a sentence (revisited)

2011-07-21 Thread Peter Keegan
Does this patch require the trunk version? I'm using 3.2 and 'AtomicReaderContext' isn't there. Peter On Thu, Jul 21, 2011 at 3:07 PM, Mark Miller wrote: > Hey Peter, > > Getting sucked back into Spans... > > That test should pass now - I uploaded a new patch to > https://issues.apache.org/jira

Re: Search within a sentence (revisited)

2011-07-21 Thread Mark Miller
Yeah, it's off trunk - I'll submit a 3X patch in a bit - just have to change that to an IndexReader I believe. - Mark On Jul 21, 2011, at 4:01 PM, Peter Keegan wrote: > Does this patch require the trunk version? I'm using 3.2 and > 'AtomicReaderContext' isn't there. > > Peter > > On Thu, Jul

RE: optimize with num segments > 1 index keeps growing

2011-07-21 Thread Uwe Schindler
There is also expungeDeletes()... - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: v.se...@lombardodier.com [mailto:v.se...@lombardodier.com] > Sent: Thursday, July 21, 2011 8:39 PM > To: java-user@lucene

Re: Search within a sentence (revisited)

2011-07-21 Thread Mark Miller
I just uploaded a patch for 3X that will work for 3.2. On Jul 21, 2011, at 4:25 PM, Mark Miller wrote: > Yeah, it's off trunk - I'll submit a 3X patch in a bit - just have to change > that to an IndexReader I believe. > > - Mark > > On Jul 21, 2011, at 4:01 PM, Peter Keegan wrote: > >> Does

Re: Search within a sentence (revisited)

2011-07-21 Thread Peter Keegan
The 3X patch works great, Mark! (how do you get your head around spans so quickly after 2.5 years? :) ) Thanks, Peter On Thu, Jul 21, 2011 at 5:23 PM, Mark Miller wrote: > > I just uploaded a patch for 3X that will work for 3.2. > > On Jul 21, 2011, at 4:25 PM, Mark Miller wrote: > > > Yeah, it