Re: Understanding Lucene Slop

2006-07-20 Thread Erick Erickson
Have you looked at SpanNearQuery? From what you describe, it looks to be what you want. The constructor takes slop as well as a boolean whether order is relevant. The array of SpanQuerys would probably consist of a bunch of SpanTermQuerys. Best Erick
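To make the slop and in-order ideas concrete, here is a toy model in plain Java. It is not Lucene code and not how SpanNearQuery is implemented. It assumes that, for an ordered match, slop caps the total number of non-matching tokens allowed between consecutive query terms. The class name OrderedSlopSketch, the method matchesInOrder, and the sample text "fate of the ships" are all made up for illustration.

```java
import java.util.List;

// Toy illustration only: models "terms in order, with at most `slop`
// intervening tokens in total". This is an assumed reading of slop for
// ordered matching, not Lucene's actual matching code.
final class OrderedSlopSketch {

    // True if every term occurs in `tokens` in the given order and the
    // total number of tokens skipped between consecutive matched terms
    // does not exceed `slop`.
    static boolean matchesInOrder(List<String> tokens, List<String> terms, int slop) {
        if (terms.isEmpty() || slop < 0) {
            return false;
        }
        // Try every occurrence of the first term as a starting point.
        for (int start = 0; start < tokens.size(); start++) {
            if (tokens.get(start).equals(terms.get(0))
                    && matchFrom(tokens, terms, 1, start, slop)) {
                return true;
            }
        }
        return false;
    }

    // Tries to place terms[termIdx..] after position prevPos, spending at
    // most `remaining` skipped tokens across all remaining gaps.
    private static boolean matchFrom(List<String> tokens, List<String> terms,
                                     int termIdx, int prevPos, int remaining) {
        if (termIdx == terms.size()) {
            return true; // every term has been placed
        }
        for (int gap = 0; gap <= remaining; gap++) {
            int pos = prevPos + 1 + gap;
            if (pos >= tokens.size()) {
                break; // ran off the end of the document
            }
            if (tokens.get(pos).equals(terms.get(termIdx))
                    && matchFrom(tokens, terms, termIdx + 1, pos, remaining - gap)) {
                return true;
            }
        }
        return false;
    }
}
```

Under this model, the terms "fate", "ships" match the tokens "fate", "of", "the", "ships" with a slop of 2 but not with a slop of 1, and they never match "ships of fate" because order is enforced. With real Lucene you would build the equivalent from SpanTermQuerys wrapped in a SpanNearQuery, as described above.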

Understanding Lucene Slop

2006-07-20 Thread Walt Stoneburner
Hello, I'm trying to understand Lucene's slop value a little better, as what I'm able to Google about it seems a little ambiguous. My main goal is to search for a linear sequence of keywords in a specific order over a given range. For instance, I'd like a query of "fate ships" to find

Re: Performance question

2006-07-20 Thread Doron Cohen
> Does it matter what order I add the sub-queries to the BooleanQuery Q. > That is, is the execution speed for the search faster (slower) if I do: > Q.add(Q1, BooleanClause.Occur.MUST); > Q.add(Q2, BooleanClause.Occur.MUST); > Q.add(Q3, BooleanClause.Occur.MUST);

Re: Lock obtain time out (&OT: Mailing list settings)

2006-07-20 Thread Paul Borgermans
Hello I suppose that you are using gmail? It is just a property of gmail, take a look at the archives after a few hours, you will find it back ;-) for example: http://mail-archives.apache.org/mod_mbox/lucene-java-user/ hth --paul On 7/19/06, Pasquale Imbemba <[EMAIL PROTECTED]> wrote: Mi

Performance question

2006-07-20 Thread Scott Smith
I was reading a book on SQL query tuning. The gist of it was that the way to get the best performance (fastest execution) out of a SQL select statement was to "create" execution plans where the most selective term in the "where" clause is used first, the next most selective term is used next, etc.

RE: NFS/iSCSI SAN with Lucene

2006-07-20 Thread Peter Kim
Hi Mike, Thanks for the information! Peter > -Original Message- > From: Michael McCandless [mailto:[EMAIL PROTECTED] > Sent: Wednesday, July 19, 2006 5:49 PM > To: java-user@lucene.apache.org > Subject: Re: NFS/iSCSI SAN with Lucene > > > > I did a search on the Lucene list archives,

RE: Date ranges - getting the approach right

2006-07-20 Thread Rob Staveley (Tom)
Wow. Looking at the implementation of http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html#open(org.apache.lucene.store.Directory) I've now realised that when you create an IndexReader (clue it is abstract), you actually instantiate a MultiReader, with an IndexReader for

RE: Date ranges - getting the approach right

2006-07-20 Thread Mike Streeton
This is how we solve the range query problem using filters. The nice part about it is you can use a range in a query so several ranges can be ORed/ANDed or NOTed together if required, instead of applying a range filter to the whole query. (Assumes dates in MMDD format) Hope this helps Mike. Ext

RE: Date ranges - getting the approach right

2006-07-20 Thread Rob Staveley (Tom)
Sorry for the delayed response. It takes me a while to get my head around Lucene. I've got parallel indexes, which means that chronological ordering by doc ID would need to be a bit more sophisticated. It strikes me that there must be some performance advantage doing it though. I'll see if I can

Re: Empty fields ...

2006-07-20 Thread Erick Erickson
What? You actually want me to put forth some effort? That's crazy talk .. Thanks, I think I've got it now. Best Erick

Re: Query does not work past 26 characters?!

2006-07-20 Thread Michael Prichard
ARRRGH!!! That's it. Darn, I was half asleep last night when I was experimenting. I totally feel like a dope. It works. Thanks! -Michael On Thursday, July 20, 2006, at 00:36AM, Doron Cohen <[EMAIL PROTECTED]> wrote: >> doc.add(new Field("to", >> "[EMAIL PROTECTED]", >> ... >> PrefixQu

Re: PDF documents with "MoreLikeThis" class

2006-07-20 Thread mark harwood
>>Do I have to extract text from PDF file and then pass an InputStream with the >>text inside? Yes. Although technically you could pass the content unparsed it will contain a lot of unintelligible garbage in the form of markup and images. All Lucene classes deliberately try and avoid the mucky

PDF documents with "MoreLikeThis" class

2006-07-20 Thread Davide
Hi, I'm using MoreLikeThis class to find similar documents... but I'm not sure if it is correct to pass as argument a Pdf file to *MoreLikeThis.like()* method. Trying to be more clear: 1) In my Lucene index I add some PDF files (I use PDFBox to extract text and add fields to index) 2) Now I want

Re: Index-Format difference between 1.4.3 and 2.0

2006-07-20 Thread Miles Barr
Andrzej Bialecki wrote: lude wrote: As Luke was release with a Lucene-1.9 Where did you get this information? From all I know Luke is based on Lucene Version 1.4.3. The latest version of Luke was released with an early snapshot of 1.9. I plan to release a 2.0-based version in a f

Re: Index-Format difference between 1.4.3 and 2.0

2006-07-20 Thread Andrzej Bialecki
lude wrote: As Luke was release with a Lucene-1.9 Where did you get this information? From all I know Luke is based on Lucene Version 1.4.3. The latest version of Luke was released with an early snapshot of 1.9. I plan to release a 2.0-based version in a few days. -- Best regards,

Re: Index-Format difference between 1.4.3 and 2.0

2006-07-20 Thread yueyu lin
I'm using Luke to manage Lucene 1.9's index On 7/20/06, lude <[EMAIL PROTECTED]> wrote: > As Luke was release with a Lucene-1.9 Where did you get this information? From all I know Luke is based on Lucene Version 1.4.3. On 7/19/06, Nicolas Lalevée <[EMAIL PROTECTED]> wrote: > > Le Mercre

Re: Lucene support for OpenDocument?

2006-07-20 Thread Andrzej Bialecki
Daniel Noll wrote: marbux wrote: Hello, The OpenDocument Fellowship attempts to maintain a directory of applications supporting OpenDocument file formats. < http://www.opendocumentfellowship.org/applicationsa>. I have been attempting, without success, to determine whether Lucene supports OpenD

Re: Index-Format difference between 1.4.3 and 2.0

2006-07-20 Thread lude
As Luke was release with a Lucene-1.9 Where did you get this information? From all I know Luke is based on Lucene Version 1.4.3. On 7/19/06, Nicolas Lalevée <[EMAIL PROTECTED]> wrote: Le Mercredi 19 Juillet 2006 12:32, lude a écrit: > Hi Nicolas, > > thanks for answering. > > You wrote:

Re: Empty fields ...

2006-07-20 Thread Chris Hostetter
: Thanks much for that clarification, it helps a lot. The original request was : to find docs that were NOT NULL, so I'm glad I'm not the only one who : But with your RangeFilter comment, that seems unnecessary. You can use a : RangeFilter with null, null as bounds, then just flip the bits in t