Re: Closing IndexWriter can be very slow on large indexes

2011-07-26 Thread Michael McCandless
Which method (abort or close) do you see taking so much time? It's odd, because IW.abort should quickly stop any running BG merges. Can you get a dump of the thread stacks during this long abort/close and post that back? Can't answer if Lucene 3.x will improve this situation until we find the so

Re: Text Categorization with Lucene (N-Gram technique)

2011-07-26 Thread Grant Ingersoll
Lucene has support for ngrams during indexing and querying. The rest would have to be done by you. Taming Text chapter 7 has some basic implementations using Lucene to do categorization. http://www.manning.com/ingersoll -Grant On Jul 24, 2011, at 12:38 PM, Saurabh Gokhale wrote: > Hi All
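
To make the n-gram idea concrete, here is a minimal sketch in plain Java of splitting text into character n-grams. It does not use Lucene's own n-gram analysis classes; the class name CharNGrams and its method are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds character n-grams by hand.
final class CharNGrams {
    /** Returns every contiguous substring of length n, in order of position. */
    static List<String> of(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> grams = new ArrayList<>();
        // Slide a window of width n across the text, one character at a time.
        for (int i = 0; i + n <= text.length(); i++) {
            grams.add(text.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(of("lucene", 3)); // prints [luc, uce, cen, ene]
    }
}
```

A categorizer would typically index such grams per category and compare a new document's grams against them; that part is left out here because the message does not describe it.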

AW: implicit closing of an IndexWriter

2011-07-26 Thread Clemens Wyss
Ok, I just read the Javadoc ... Is there a possibility to just revert the pending writes of an IR? > -Original Message- > From: Clemens Wyss [mailto:clemens...@mysign.ch] > Sent: Tuesday, 26 July 2011 17:25 > To: java-user@lucene.apache.org > Subject: AW: implicit closing of a

Closing IndexWriter can be very slow on large indexes

2011-07-26 Thread Chris Bamford
Hi, I think I must be doing something wrong, but I'm not sure what. I have some long-running indexing code which sometimes needs to be shut down in a hurry. To achieve this, I set a shutdown flag which causes it to break from the loop and call first abort() and then close(). The problem is that w

AW: implicit closing of an IndexWriter

2011-07-26 Thread Clemens Wyss
> If yes, the IW is closed afterwards ;) Why? Shouldn't this be the "difference" to IW.close(), i.e. NOT closing afterwards? > -Original Message- > From: Uwe Schindler [mailto:u...@thetaphi.de] > Sent: Tuesday, 26 July 2011 17:21 > To: java-user@lucene.apache.org > Subject: R

RE: implicit closing of an IndexWriter

2011-07-26 Thread Uwe Schindler
OK then you are fine! Something else: Are you using IndexWriter.rollback()? If yes, the IW is closed afterwards. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Clemens Wyss [mailto:clemens...@mysign.ch

AW: implicit closing of an IndexWriter

2011-07-26 Thread Clemens Wyss
The patch "says" + * If your application uses either {@link Thread#interrupt()} or + * {@link Future#cancel(boolean)} you should use {@link SimpleFSDirectory} in + * favor of {@link NIOFSDirectory}. and I am using SimpleFSDirectory. What else am I missing/overlooking? > -Original Message

RE: implicit closing of an IndexWriter

2011-07-26 Thread Uwe Schindler
Please read the whole issue. The issue is fixed by "adding documentation", the original JVM problem itself is still there and cannot be fixed (if you interrupt threads). - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Messag

AW: implicit closing of an IndexWriter

2011-07-26 Thread Clemens Wyss
I am using Lucene 3.3 > -Original Message- > From: Mark Miller [mailto:markrmil...@gmail.com] > Sent: Tuesday, 26 July 2011 16:05 > To: java-user@lucene.apache.org > Subject: Re: implicit closing of an IndexWriter > > > On Jul 26, 2011, at 9:52 AM, Clemens Wyss wrote: > > >

Re: implicit closing of an IndexWriter

2011-07-26 Thread Mark Miller
On Jul 26, 2011, at 9:52 AM, Clemens Wyss wrote: > Side note: I am using threads when writing and these threads are (by design) > interrupted (from time to time) Perhaps you are seeing this: https://issues.apache.org/jira/browse/LUCENE-2239 - Mark Miller lucidimagination.com
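
The JDK behavior that appears to be behind this advice can be reproduced without Lucene. The sketch below is an illustration under assumptions, not code from the issue: it sets the current thread's interrupt flag and then reads from a java.nio FileChannel, which the JDK documents as closing the channel and throwing ClosedByInterruptException. The class name InterruptClosesChannel and the temp-file name are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: shows an interrupt closing an NIO channel.
final class InterruptClosesChannel {
    /** Returns true if reading with the interrupt flag set closed the channel. */
    static boolean channelClosedByInterrupt() {
        try {
            Path file = Files.createTempFile("interrupt-demo", ".bin");
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // Simulate an interrupt that arrives just before the I/O call.
                Thread.currentThread().interrupt();
                try {
                    channel.read(ByteBuffer.allocate(16));
                    return false; // not expected: the read should have failed
                } catch (ClosedByInterruptException e) {
                    // The channel is now closed for every thread that shares it.
                    return !channel.isOpen();
                } finally {
                    Thread.interrupted(); // clear the flag before cleanup
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("channel closed by interrupt: " + channelClosedByInterrupt());
    }
}
```

Because the closed channel is shared, any later use of it by another thread fails as well, which would be consistent with an unexpected "already closed" error in code that never called close() itself.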

implicit closing of an IndexWriter

2011-07-26 Thread Clemens Wyss
Under which circumstances is an IndexWriter "implicitly" closed? I have an IndexWriter member in one of my helper classes which is opened in the constructor. I never ever close this member explicitly. Nevertheless I encounter AlreadyClosedExceptions when writing through the IndexWriter ...

Re: Search within a sentence (revisited)

2011-07-26 Thread Mark Miller
As long as you are happy with the results, I'm good. Always nice to have an excuse to dip back into Lucene. Just don't want you to feel overconfident with the code without proper testing of it - I coded to fix the broken tests rather than taking the time to write a bunch more corner case tests

Re: Search within a sentence (revisited)

2011-07-26 Thread Peter Keegan
Thanks Mark! The new patch is working fine with the tests and a few more. If you have particular test cases in mind, I'd be happy to add them. Thanks, Peter On Mon, Jul 25, 2011 at 5:56 PM, Mark Miller wrote: > Sorry Peter - I introduced this problem with some kind of typo type issue - > I some

Re: Strange StopFilter and stop words behaviour

2011-07-26 Thread Dawn Zoë Raison
Are you using QueryAnalyser...? If so remember that NOT is a reserved word. Dawn On 26/07/2011 04:25, SBS wrote: If I enter a query of just the word "not" I get no matches. If I run a query with just the word "included" I get lots of matches. If I run the query "not included" (without surroun

Re: boolean score calculation

2011-07-26 Thread Ian Lea
Have you tried CustomScoreQuery/CustomScoreProvider? Complicated but powerful. -- Ian. On Mon, Jul 25, 2011 at 9:29 AM, Pavel Goncharik wrote: > Hi, > > as far as I can see, boolean scorers always sum up scores of their > sub-scorers. It works, but in case of my application it's required to >

Re: Strange StopFilter and stop words behaviour

2011-07-26 Thread Ian Lea
I think that passing an empty set or null to StandardAnalyzer should do what you want. There are useful tips at http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F. My guess would be that you aren't using a no-stop-words version of StandardAnalyzer at bo

Strange StopFilter and stop words behaviour

2011-07-26 Thread SBS
My goal is to be able to get meaningful results from search queries that include some words that are on the default stop words list, especially "not". I am using the StandardAnalyzer and I have tried passing in null and an empty set for the set of stop words to use in the constructor hoping that n