Re: FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread bhecht
Thanks again. I wasn't aware of the problematic part with updating posts. Sorry for that, and thanks for the answer. Good day. -- View this message in context: http://www.nabble.com/FuzzyLikeThisQuery-what-does-maxNumTerms-mean-tf3716547.html#a10407540 Sent from the Lucene - Java Users mailing l

Re: Lock obtain timed out while searching

2007-05-09 Thread Otis Gospodnetic
Laxmilal - this could be a left-over lock. Try removing it "manually" and re-running your search app. Otis -- Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Laxmilal Menaria <[EMAIL PROTECTE

add in an existing document

2007-05-09 Thread STEFANOS STEFANOS
Hello, I would like to ask you about the add function of Lucene 2.0. When I try to add a document to an existing index, the index's documents are replaced by the new document. I did it according to the book. cheers, Stefanos -

Re: Keyphrase Extraction (via Lingo)

2007-05-09 Thread Bill Janssen
> Dawid Weiss wrote: > > You could also try splitting the document into paragraphs and use Carrot2's > > Lingo algorithm (www.carrot2.org) on a paragraph-level to extract clusters. > > Labelling routine in Lingo should extract 'key' phrases; this analysis is > > heavily frequency-based, but... y

Re: Sorting on a field that can have null values

2007-05-09 Thread Theodan
Chris Hostetter wrote: > > > : If I remember correctly (you'll have to test this) sorting on a field > : which doesn't exist for every doc does what you would want (docs with > : values are listed before docs without) > > : The actual behavior is different than described above. I modified > :

Re: FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread markharw00d
bhecht wrote: Thanks Mark, I have updated my previous post I guess, before you had a chance to read it. Did you edit your post on Nabble? That edit didn't come through as a message to java-user so I didn't see it. You shouldn't need to call rewrite on your FuzzyLikeThisQuery unless you wan

Re: FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread Chris Hostetter
: I have updated my previous post I guess, before you had a chance to read it. : Can you please re-read my post again? : Sent from the Lucene - Java Users mailing list archive at Nabble.com. please do not "update" posts ... independent of any features that third party companies may provide for a

RE: Locking in Lucene 2.1

2007-05-09 Thread Andreas Guther
I opened an issue: https://issues.apache.org/jira/browse/LUCENE-877 -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Wednesday, May 09, 2007 1:37 PM To: java-user@lucene.apache.org Subject: Re: Locking in Lucene 2.1 On Wednesday 09 May 2007 21:18, Andreas Guther wro

Re: search problem/odd results

2007-05-09 Thread Andrzej Bialecki
John Powers wrote: Lucene 1.4 is what I use for the application and how I compile things But in the help/about it says "null" And I usually use the webstart, ya. But I've tried from a downloaded jar as well. and I'm using java1.6 from sun Ah, all is clear now. Luke will not work with Lucene <

RE: search problem/odd results

2007-05-09 Thread John Powers
Lucene 1.4 is what I use for the application and how I compile things But in the help/about it says "null" And I usually use the webstart, ya. But I've tried from a downloaded jar as well. and I'm using java1.6 from sun -Original Message- From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]

Re: search problem/odd results

2007-05-09 Thread Andrzej Bialecki
John Powers wrote: java.lang.NoSuchFieldException: IMPL at java.lang.Class.getDeclaredField(Unknown Source) That's a strange one ... Which version of Lucene are you using with Luke (you can check this in Help -> About box, look for "Lucene version:"). Are you running this from Java We

Re: FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread bhecht
Thanks Mark, I have updated my previous post I guess, before you had a chance to read it. Can you please re-read my post again? Thanks -- View this message in context: http://www.nabble.com/FuzzyLikeThisQuery-what-does-maxNumTerms-mean-tf3716547.html#a10402974 Sent from the Lucene - Java Users

Re: FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread markharw00d
The shortlisting isn't based on stop words - a score is produced to prioritise term selections. The score uses the IDF (inverse document frequency) of the original term and mixes in the "edit-distance" for each of the fuzzy variations of original terms. Care is taken to ensure that in the query
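The shortlisting idea described above can be illustrated with a small standalone sketch. This is not the FuzzyLikeThisQuery implementation: the class name, the Candidate type, and the scoring formula (IDF of the original term divided by one plus the edit distance) are all assumptions chosen only to show how a score can prioritise term selections down to maxNumTerms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene.
class TermShortlistSketch {

    /** A query term candidate: an original term or one of its fuzzy variants. */
    static final class Candidate {
        final String text;
        final int originalDocFreq; // document frequency of the ORIGINAL term
        final int editDistance;    // 0 for the original term itself

        Candidate(String text, int originalDocFreq, int editDistance) {
            this.text = text;
            this.originalDocFreq = originalDocFreq;
            this.editDistance = editDistance;
        }
    }

    /** Assumed formula: rarer original terms score higher, larger edit distances score lower. */
    static double score(Candidate c, int numDocs) {
        double idf = Math.log((double) numDocs / (c.originalDocFreq + 1)) + 1.0;
        return idf / (1.0 + c.editDistance);
    }

    /** Keeps the maxNumTerms highest-scoring candidates, best first. */
    static List<String> shortlist(List<Candidate> candidates, int numDocs, int maxNumTerms) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Candidate c) -> score(c, numDocs)).reversed());
        List<String> kept = new ArrayList<>();
        for (Candidate c : sorted.subList(0, Math.min(maxNumTerms, sorted.size()))) {
            kept.add(c.text);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = Arrays.asList(
                new Candidate("lucene", 10, 0),
                new Candidate("lucent", 10, 1),  // fuzzy variant of "lucene"
                new Candidate("the", 990, 0),    // very common, so low IDF
                new Candidate("search", 50, 0),
                new Candidate("serch", 50, 1));  // fuzzy variant of "search"
        // prints [lucene, search, lucent]
        System.out.println(shortlist(candidates, 1000, 3));
    }
}
```

In this toy model a common word such as "the" drops out because of its low IDF, not because of a stop-word list, which matches the point made in the message.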

RE: search problem/odd results

2007-05-09 Thread John Powers
java.lang.NoSuchFieldException: IMPL at java.lang.Class.getDeclaredField(Unknown Source) at org.getopt.luke.Luke.openDirectory(Unknown Source) at org.getopt.luke.Luke.openIndex(Unknown Source) at org.getopt.luke.Luke.openOk(Unknown Source) at sun.reflect.Nati

Re: Locking in Lucene 2.1

2007-05-09 Thread Daniel Naber
On Wednesday 09 May 2007 21:18, Andreas Guther wrote: > Do I miss something here or is the documentation not updated? Looks like that part of the documentation isn't up-to-date. The file is called write.lock and it's stored in the index directory. Could you file an issue so the documentation ge
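For anyone who wants to check for the lock file programmatically, here is a minimal sketch using only the JDK. It assumes what is stated above, namely that the file is called write.lock and lives in the index directory; the class and method names are made up for illustration, and the temporary directory merely stands in for a real index path.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of Lucene.
class WriteLockCheckSketch {

    /** Returns true if a file named write.lock exists directly inside indexDir. */
    static boolean hasWriteLock(Path indexDir) {
        return Files.exists(indexDir.resolve("write.lock"));
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for a real index directory.
        Path indexDir = Files.createTempDirectory("index-sketch");
        System.out.println("lock present: " + hasWriteLock(indexDir));

        // Simulate a writer holding (or leaving behind) the lock file.
        Files.createFile(indexDir.resolve("write.lock"));
        System.out.println("lock present: " + hasWriteLock(indexDir));
    }
}
```

The presence of the file alone does not say whether a live writer still holds the lock, so a check like this is only a starting point for diagnosis.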

Re: search problem/odd results

2007-05-09 Thread Daniel Naber
On Wednesday 09 May 2007 16:17, John Powers wrote: > Yes, it doesn't work. it gives an error modal dialog box that says > "IMPL". Is there a more useful error message when you start Luke from the command line and try to open the index? Regards Daniel -- http://www.danielnaber.de ---

Re: FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread bhecht
Thanks Mark for the detailed explanation. So one more question if I may: How is the list shortened to include terms only? In your example you had 2 stop words which of course are not included in the token stream. But what happens if you get more than maxNumTerms terms, how are the maxNumTerms

Locking in Lucene 2.1

2007-05-09 Thread Andreas Guther
I am in the process of migrating from Lucene 2.0 to Lucene 2.1. From reading the Changes document I understand that the write locks are now written into the index folder instead of the java.io.tmpdir. In the "Apache Lucene - Index File Formats" document in section "6.2 Lock File" I read that ther

Re: Automatic analyzer resolving based on Locale

2007-05-09 Thread Chris Hostetter
: - Use an IndexEverythingAnalyzer for writing, : so "werk", "werkte", "gewerkt" and "en" is indexed as-is when they are : encountered. : : - And then use a DutchAnalyzer for reading, : which if I ask "werk" searches for "werk", "werkte" and "gewerkt", : and also ignores stop words like "en" in th

Re: Automatic analyzer resolving based on Locale

2007-05-09 Thread Erick Erickson
Well, I don't see how this can work. In your example, you'd index "werkte". But how are you going to search such that this matches "werk"? No matter what analyzers you use? It looks like you're thinking about either stemming or wildcarding, but I really suspect that stemming is language dependent.

Re: FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread mark harwood
FuzzyLikeThis is effectively the same as MoreLikeThis but adds fuzzy variations for the selected terms. >>Will it only use the first 3 terms or what? It works the same way as MoreLikeThis in that it is selective about which input terms are used for querying - words like "the" will typically get

Re: Proximity searching with subexpressions

2007-05-09 Thread Mark Miller
The ~ syntax can only be applied to a single phrase, e.g. "the great one"~3. This sets the slop allowed for the phrase to be 3. The definition of slop can be found by searching the archive, but it will not easily allow for what you are looking for. The Lucene query language has not yet received Spa

FuzzyLikeThisQuery what does maxNumTerms mean

2007-05-09 Thread bhecht
Hello all, I am new to lucene and want to use the FuzzyLikeThisQuery. I have read the documentation for this class, and read the following for what maxNumTerms means: "maxNumTerms - The total number of terms clauses that will appear once rewritten as a BooleanQuery". In addition in the classes d

Re: Automatic analyzer resolving based on Locale

2007-05-09 Thread Geoffrey De Smet
We'd use a different index for each locale's language that is configured, however this might have an impact on performance. Would this be attainable (maybe some day in lucene)? - Use an IndexEverythingAnalyzer for writing, so "werk", "werkte", "gewerkt" and "en" is indexed as-is when they are

RE: search problem/odd results

2007-05-09 Thread John Powers
Yes, it doesn't work. it gives an error modal dialog box that says "IMPL". -Original Message- From: Daniel Naber [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 08, 2007 4:45 PM To: java-user@lucene.apache.org Subject: Re: search problem/odd results On Tuesday 08 May 2007 23:42, John

Proximity searching with subexpressions

2007-05-09 Thread Walt Stoneburner
According to the documentation for Lucene's Query Parser Syntax, the tilde operator provides a proximity search. For instance, "Harry Hallows"~6 should match the text 'Harry Potter and the Deathly Hallows'. And while this is fine for two token phrases, I was wondering if it worked well for subex
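To make the two-token example concrete, here is a simplified standalone model of a proximity check. It is an assumption-laden sketch, not Lucene's slop computation: it lowercases and splits on whitespace instead of using an analyzer, keeps stop words, only considers the two terms in the given order, and treats the slop as the maximum number of tokens allowed between them. All names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Simplified model of a two-term proximity match; not Lucene's algorithm.
class ProximitySketch {

    /** Naive tokenizer: lowercase, then split on runs of whitespace. */
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /**
     * Returns true if `second` occurs after `first` with at most `slop`
     * other tokens between the two.
     */
    static boolean withinSlop(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            for (int j = i + 1; j < tokens.size(); j++) {
                if (tokens.get(j).equals(second) && (j - i - 1) <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("Harry Potter and the Deathly Hallows");
        // Four tokens sit between "harry" and "hallows", so a slop of 6 is enough.
        System.out.println(withinSlop(tokens, "harry", "hallows", 6)); // prints true
        System.out.println(withinSlop(tokens, "harry", "hallows", 3)); // prints false
    }
}
```

Under this model "Harry Hallows"~6 matches the title because only four tokens separate the two words; an analyzer that drops stop words would shrink that gap further.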

Re: Scoring results?!

2007-05-09 Thread Grant Ingersoll
Hi Eric, On May 9, 2007, at 2:39 AM, supereric wrote: How I can get the tag word score in lucene. suppose that you have searched a tag word and 3 hit documents are now found. 1 -How someone could find number of occurrences in any document so it could sort the results. Span Queries tell