Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Shalin Shekhar Mangar
Hi Michael, Thanks for the response. Looking at the general way the filenames are organized: IndexCommit.getFileNames() without optimize (after IW.close()) [segments_4, _0.cfs, _1.cfs, _2.cfs] IndexCommit.getFileNames() after optimize+close [segments_5, _4.cfs] We can compare the latest commit
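
As a rough illustration of the file-name comparison described in this message, the sketch below counts the distinct segment prefixes in a commit's file list, using the two listings quoted above. The class and method names (SegmentCountHeuristic, segmentNames, looksOptimized) are hypothetical and not part of Lucene. It treats every file starting with an underscore as belonging to the segment named by the text before the first dot, which is an assumption about the naming scheme rather than a documented contract. Later in the thread this kind of check is described as not reliable across releases, which is why an isOptimized() method on IndexCommit was proposed (LUCENE-1325).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the Lucene API.
class SegmentCountHeuristic {

    // Collects the distinct segment prefixes (e.g. "_0", "_4") from a commit's file names.
    static Set<String> segmentNames(List<String> fileNames) {
        Set<String> names = new TreeSet<>();
        for (String file : fileNames) {
            if (!file.startsWith("_")) {
                continue; // skips segments_N and any other non-segment file
            }
            int dot = file.indexOf('.');
            names.add(dot < 0 ? file : file.substring(0, dot));
        }
        return names;
    }

    // Heuristic: exactly one segment prefix suggests the commit was written after optimize().
    static boolean looksOptimized(List<String> fileNames) {
        return segmentNames(fileNames).size() == 1;
    }

    public static void main(String[] args) {
        List<String> withoutOptimize = Arrays.asList("segments_4", "_0.cfs", "_1.cfs", "_2.cfs");
        List<String> afterOptimize = Arrays.asList("segments_5", "_4.cfs");
        System.out.println(looksOptimized(withoutOptimize)); // prints false
        System.out.println(looksOptimized(afterOptimize));   // prints true
    }
}
```

In a real IndexDeletionPolicy the list would come from IndexCommit.getFileNames(); the sketch takes a plain list so it can be run without Lucene on the classpath.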

Re: Incorrect Token Offset when using multiple fieldable instance

2008-07-02 Thread Michael McCandless
This would actually be a fairly large change: it's a change to the index format and all APIs that handle offsets during indexing, searching, and retrieving. We could alternatively extend TokenStream so you could query it for the final offset, then fix indexing to use that value instead of the

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Michael McCandless
OK I think that makes sense. I'll take it. I'll add an isOptimized() to IndexCommit. Mike Shalin Shekhar Mangar wrote: Ok, so there is no reliable way which can work across releases. Actually, we are implementing replication feature for Solr (SOLR-561) and we'd like the user to

Re: Incorrect Token Offset when using multiple fieldable instance

2008-07-02 Thread Toph
Michael McCandless-2 wrote: This would actually be a fairly large change: it's a change to the index format and all APIs that handle offsets during indexing, searching, and retrieving. For now I just changed the offset calculation in DocumentWriter as specified here by the OP:

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Shalin Shekhar Mangar
That's great. Thanks! On Wed, Jul 2, 2008 at 6:04 PM, Michael McCandless [EMAIL PROTECTED] wrote: OK I think that makes sense. I'll take it. I'll add an isOptimized() to IndexCommit. Mike Shalin Shekhar Mangar wrote: Ok, so there is no reliable way which can work across releases.

Re: Incorrect Token Offset when using multiple fieldable instance

2008-07-02 Thread Michael McCandless
Toph wrote: Michael McCandless-2 wrote: We could alternatively extend TokenStream so you could query it for the final offset, then fix indexing to use that value instead of the endOffset of the last token that it saw. Querying the tokenstream for the final offset would be good, but then

Re: IndexDeletionPolicy and optimized indices

2008-07-02 Thread Michael McCandless
OK I opened this one: https://issues.apache.org/jira/browse/LUCENE-1325 Mike Shalin Shekhar Mangar wrote: That's great. Thanks! On Wed, Jul 2, 2008 at 6:04 PM, Michael McCandless [EMAIL PROTECTED] wrote: OK I think that makes sense. I'll take it. I'll add an isOptimized() to

Do Lucene Deletes delete the physical file? If yes, is there a way not to?

2008-07-02 Thread David Lee
Is it possible to delete a document from the index, but not the physical file? I'm also wondering which functions alter the physical files being indexed. On a side note: what is the best way to look up information like this so I don't have to bug the java-user mailing list

Re: Do Lucene Deletes delete the physical file? If yes, is there a way not to?

2008-07-02 Thread Karl Wettin
On 2 Jul 2008 at 19:59, David Lee wrote: Is it possible to delete a document from the index, but not the physical file? I'm also wondering which functions alter the physical files being indexed. Documents are not deleted until you optimize the index. Perhaps they are

Match all documents with non empty field

2008-07-02 Thread Patrick
Hi, Can't seem to wrap my head around how to go about it. I want to retrieve all documents where a certain field is not empty. What would be the best way to do it? Should I search with a MatchAllDocQuery and a Filter? Should I go through all terms in the field and create a TermQuery with it?

Re: Match all documents with non empty field

2008-07-02 Thread Erick Erickson
You can certainly use a filter and MatchAllDocs. You can also index a special value for the field in question (nothere) and combine MatchAllDocs with a NOT field:nothere or some such. Best Erick On Wed, Jul 2, 2008 at 5:25 PM, Patrick [EMAIL PROTECTED] wrote: Hi, Can't seem to wrap my head

Re: Match all documents with non empty field

2008-07-02 Thread Daniel Noll
Patrick wrote: Hi, Can't seem to wrap my head around how to go about it. I want to retrieve all documents where a certain field is not empty. What would be the best way to do it? The most trivial way would be to use a PrefixQuery with an empty string. It won't be efficient unless you wrap

Re: Match all documents with non empty field

2008-07-02 Thread Mark Miller
Daniel Noll wrote: Patrick wrote: Hi, Can't seem to wrap my head around how to go about it. I want to retrieve all documents where a certain field is not empty. What would be the best way to do it? The most trivial way would be to use a PrefixQuery with an empty string. It won't be

Lucene Error : java.io.FileNotFoundException

2008-07-02 Thread yugana
Hi, I am indexing content and searching using lucene. It is working fine when I use the simple servlet and jsp mechanism. I am able to search on the indexed content. I tried to implement the same using JBoss Portal. When I try to run the search, I get the below error: Please help me to resolve

Re: Lucene Error : java.io.FileNotFoundException

2008-07-02 Thread saikrishna venkata pendyala
Please check the path set for lucene-index in configuration file. On Thu, Jul 3, 2008 at 10:11 AM, yugana [EMAIL PROTECTED] wrote: Hi, I am indexing content and searching using lucene. It is working fine when I use the simple servlet and jsp mechanism. I am able to search on the indexed

Re: Lucene Error : java.io.FileNotFoundException

2008-07-02 Thread yugana
I haven't set the path in the configuration file. I have hardcoded the locations. //the directory that stores html files private final String dataDir = "d:\\dataDir"; //the directory that is used to store lucene index private final String indexDir = "d:\\indexDir"; saikrishna
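
Since the suggestion in this thread is to check the index path, one way to do that from inside the failing application is sketched below. It is only a diagnostic sketch under assumptions: the class and method names (IndexDirCheck, looksLikeIndex) are hypothetical, and the test for a file whose name starts with "segments" is based on the file listings seen elsewhere in this archive rather than on any Lucene API. The default path is the hardcoded indexDir from the message above.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; it does not use or depend on Lucene.
class IndexDirCheck {

    // True if dir is an existing directory containing at least one entry named segments*.
    static boolean looksLikeIndex(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> matches = Files.newDirectoryStream(dir, "segments*")) {
            return matches.iterator().hasNext();
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the hardcoded location quoted in the message; pass another path to override.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "d:\\indexDir");
        System.out.println(indexDir.toAbsolutePath()
                + " looks like an index: " + looksLikeIndex(indexDir));
    }
}
```

Running the same check from the servlet setup and from the portal setup would show whether both resolve the hardcoded location to the same directory.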