boolean query or

2008-07-08 Thread Cam Bazz
Hello, Is it possible to make a boolean query where a word matches in fieldA or fieldB? In other words, I would like to search for a word in two fields; if the word matches in fieldA or fieldB, then it is a hit. Best, -C.B.

How to handle frequent updates.

2008-07-08 Thread miztaken
Hi there, I know Lucene is meant for indexing and not for frequent updates and deletes. But I have been using Lucene to store my matrix as a document. Since with my algorithm the values of the matrix can change, I am updating them. But for this I have to close and reopen the IndexReader and in additional t

Re: Readers synchronization

2008-07-08 Thread Michael McCandless
Not that I know of. Mike Eric Diaz wrote: Is there any plan to change this behavior? meaning that by default a reader will see the current index? Thanks in advance --- On Tue, 7/8/08, Michael McCandless <[EMAIL PROTECTED]> wrote: From: Michael McCandless <[EMAIL PROTECTED]> Subject:

Re: Readers synchronization

2008-07-08 Thread Eric Diaz
Is there any plan to change this behavior? meaning that by default a reader will see the current index? Thanks in advance --- On Tue, 7/8/08, Michael McCandless <[EMAIL PROTECTED]> wrote: > From: Michael McCandless <[EMAIL PROTECTED]> > Subject: Re: Readers synchronization > To: java-user@lucen

Re: Readers synchronization

2008-07-08 Thread Michael McCandless
No other techniques that I know of... But there are ongoing discussions/work towards making reopening a reader much less costly. EG repopulating the field cache after reopen is a costly operation now, but this issue: https://issues.apache.org/jira/browse/LUCENE-1231 would make that co

Re: Readers synchronization

2008-07-08 Thread Eric Diaz
Besides the warm-up that the FAQ section suggests (used in Solr), is there another technique or solution to have an IndexReader/Search with an updated view of an index under a concurrent scenario (web app)? Thanks --- On Tue, 7/8/08, Michael McCandless <[EMAIL PROTECTED]> wrote: > From: Michae

Re: Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Michael McCandless
OK I opened: https://issues.apache.org/jira/browse/LUCENE-1331 Mike Paul Taylor wrote: Michael McCandless wrote: Hmmm, you should not close the directory if you are then going to use it to instantiate a searcher. how come it works ? Your code below never closes the searcher? I th

Re: Readers synchronization

2008-07-08 Thread Michael McCandless
No, that's not changed. You must still reopen an IndexReader to see changes to the index. An IndexReader always searches a point-in-time snapshot of the index. LUCENE-1044 does mean that you should call IndexWriter.commit() (or, close the writer) to ensure all changes you've made become
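
The point-in-time behaviour described above can be illustrated with a small toy model. This is a sketch only: `ToyIndex` and `ToyReader` are hypothetical classes written for this illustration and are not Lucene's API. They just mimic the idea that a reader keeps the snapshot it was opened on, that a commit publishes buffered changes, and that a reopen is needed to see them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Toy model of an index with commits. Hypothetical; NOT Lucene's API. */
final class ToyIndex {
    // Last committed state; readers only ever see this immutable list.
    private List<String> committed = Collections.emptyList();
    // Documents buffered by the writer side, not yet visible to readers.
    private final List<String> pending = new ArrayList<>();

    synchronized void addDocument(String doc) {
        pending.add(doc);
    }

    /** Publishes buffered documents; already-open readers are unaffected. */
    synchronized void commit() {
        if (pending.isEmpty()) {
            return; // nothing to publish, keep the current snapshot
        }
        List<String> next = new ArrayList<>(committed);
        next.addAll(pending);
        pending.clear();
        committed = Collections.unmodifiableList(next);
    }

    /** Opens a reader fixed to the current committed state. */
    synchronized ToyReader openReader() {
        return new ToyReader(this, committed);
    }

    synchronized List<String> currentCommit() {
        return committed;
    }
}

/** Toy reader bound to one snapshot. Hypothetical; NOT Lucene's API. */
final class ToyReader {
    private final ToyIndex index;
    private final List<String> snapshot;

    ToyReader(ToyIndex index, List<String> snapshot) {
        this.index = index;
        this.snapshot = snapshot;
    }

    int numDocs() {
        return snapshot.size();
    }

    /** Returns this reader if nothing was committed since, else a new one. */
    ToyReader reopen() {
        List<String> latest = index.currentCommit();
        return latest == snapshot ? this : new ToyReader(index, latest);
    }
}
```

In this model a reader opened before a commit keeps reporting the old document count until `reopen()` hands back a new reader, which is the behaviour the message describes for an IndexReader.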

Re: Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Michael McCandless
It works because Lucene doesn't currently check for it, and, because closing an FSDirectory does not actually make it unusable. In fact it also doesn't catch a double-close call. But it may cause subtle problems, because FSDirectory has this invariant: only a single instance of FSDirecto

Readers synchronization

2008-07-08 Thread Eric Diaz
According to the SVN history, this will be available in the next version: LUCENE-1044: IndexWriter with autoCommit=true now commits (such that a reader can see the changes) far less often than it used to. Previously, every flush was also a commit. You can always force a commit by calling I

Re: How to make documents clustering and topic classification with lucene

2008-07-08 Thread Glen Newton
Use Carrot2: http://project.carrot2.org/ For Lucene + Carrot2: http://project.carrot2.org/faq.html#lucene-integration -glen 2008/7/7 Ariel <[EMAIL PROTECTED]>: > Hi everybody: > Do you have any idea how to do document clustering and topic > classification using Lucene? Is there a

Re: 'deletable' indexing files are not deleted on RHEL5

2008-07-08 Thread Erick Erickson
Assuming your indexing completes, after the whole thing is done and the process terminates, what is the size of your index? Is it possible that your old box had lots more disk space and you just never noticed the (perhaps temporary) disk space usage? Best Erick 2008/7/8 Zhou Lin Dai <[EMAIL PROT

Re: Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Paul Taylor
Michael McCandless wrote: > Hmmm, you should not close the directory if you are then going to use it to instantiate a searcher. How come it works? > Your code below never closes the searcher? I think that is most likely the source of your file descriptor leaks. OK, fixed. Paul --

Re: Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Michael McCandless
Also, if possible, you should share the IndexSearcher across multiple searches (ie, don't open/close a new one per search). Opening an IndexSearcher can be a resource intensive operation, so you'll see better throughput if you share. (Though in your particular situation it may not matte
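
The advice above, sharing one searcher instead of opening a new one per search, can be sketched with stand-in classes. `ToySearcher` and `SharedSearcher` are hypothetical names invented for this illustration and are not Lucene classes; the open counter only exists to make the cost of opening visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for a searcher that is costly to open. NOT Lucene's IndexSearcher. */
final class ToySearcher {
    // Counts how many times a searcher has been opened.
    static final AtomicInteger OPEN_COUNT = new AtomicInteger();

    // Fixed sample documents, assumed purely for illustration.
    private final List<String> docs =
            Arrays.asList("lucene in action", "readers and writers");

    ToySearcher() {
        OPEN_COUNT.incrementAndGet();
    }

    /** Returns the number of documents containing the term. */
    long countHits(String term) {
        return docs.stream().filter(d -> d.contains(term)).count();
    }
}

/** Hands out one shared searcher, opened lazily on first use. */
final class SharedSearcher {
    private static ToySearcher instance;

    private SharedSearcher() {
    }

    static synchronized ToySearcher get() {
        if (instance == null) {
            instance = new ToySearcher();
        }
        return instance;
    }
}
```

Every caller goes through `SharedSearcher.get()`, so the open cost is paid once however many searches run. A real application would also need a way to swap in a fresh searcher after the index changes, which this sketch leaves out.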

Re: Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Michael McCandless
Hmmm, you should not close the directory if you are then going to use it to instantiate a searcher. Your code below never closes the searcher? I think that is most likely the source of your file descriptor leaks. Mike Paul Taylor wrote: Michael McCandless wrote: Technically you shou

Re: Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Paul Taylor
Michael McCandless wrote: Technically you should call directory.close() as well, but missing that will not lead to too many open files. How often is that RuntimeException being thrown? EG if a single document is frequently hitting an exception during analysis, your code doesn't close the I

Re: Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Michael McCandless
Technically you should call directory.close() as well, but missing that will not lead to too many open files. How often is that RuntimeException being thrown? EG if a single document is frequently hitting an exception during analysis, your code doesn't close the IndexWriter in that situa
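
The leak pattern raised above, a writer left open when a document throws part-way through indexing, is usually avoided by closing in a `finally` block. The sketch below uses hypothetical stand-ins (`ToyWriter`, `SafeIndexing`) rather than Lucene's IndexWriter, and it assumes a `null` document as a crude substitute for a document that fails during analysis.

```java
import java.io.Closeable;
import java.util.List;

/** Stand-in for a writer that holds file handles. NOT Lucene's IndexWriter. */
final class ToyWriter implements Closeable {
    private boolean closed;

    void addDocument(String doc) {
        // Assumption for illustration: a null document simulates an analysis failure.
        if (doc == null) {
            throw new RuntimeException("analysis failed");
        }
    }

    @Override
    public void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

final class SafeIndexing {
    private SafeIndexing() {
    }

    /** Adds all documents; the writer is closed even if one of them throws. */
    static void indexAll(ToyWriter writer, List<String> docs) {
        try {
            for (String doc : docs) {
                writer.addDocument(doc);
            }
        } finally {
            writer.close(); // runs on both the normal and the exceptional path
        }
    }
}
```

The exception still propagates to the caller, but the writer is released before it does, so repeated failures do not accumulate open handles.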

Re: 'deletable' indexing files are not deleted on RHEL5

2008-07-08 Thread Michael McCandless
What do you mean by "deletable" indexing files? Moving to RHEL5 should have no effect (vs other platforms) on how much disk space is used. However, Lucene's disk usage can be surprising. While merging segments it will temporarily require free space equal to the size of the resulting mer

'deletable' indexing files are not deleted on RHEL5

2008-07-08 Thread Zhou Lin Dai
Hi, I'm using Lucene on a RHEL5 box. The indexing folder is growing extremely large, more than 20 GB, with a lot of 'deletable' indexing files. The disk runs out of space. I have to clear the entire folder and start indexing from scratch. The code ran fine before I moved it onto RHEL5. Does that matter? C

Move from RAMDirectory to FSDirectory causing problem sometimes

2008-07-08 Thread Paul Taylor
Hi, I have been using a RAMDirectory for indexing without any problem, but I then moved to a file-based directory to reduce memory usage. This has been working fine on Windows and OS X and my version of Linux (Red Hat) but is failing on another version of Linux (Arch Linux) with 'Too many files opened'