Re: Using lucene as a database... good idea or bad idea?

2008-07-31 Thread Karsten F.
Hi Grant, you made mention of jackrabbit as example of storing data in lucene. I did not find something like that in source-code. I found LocalFileSystem and DatabaseFileSystem. (I found lucene for indexing and searching.) Have I overlooked something? Best regards Karsten Grant

Re: Modifying a document by updating a payloads?

2008-07-31 Thread Michael McCandless
Antony Bowesman wrote: Hi Mike, Unfortunately you will have to delete the old doc, then reindex a new doc, in order to change any payloads in the document's Tokens. This issue: https://issues.apache.org/jira/browse/LUCENE-1231 which is still in progress, could make updating stored (but

Re: Using lucene as a database... good idea or bad idea?

2008-07-31 Thread Grant Ingersoll
Hmmm, I thought it did. Can't say I've studied the code though, so I'll take your word for it. Never mind on the Jackrabbit suggestion :-) Cheers, Grant On Jul 31, 2008, at 4:54 AM, Karsten F. wrote: Hi Grant, you made mention of jackrabbit as example of storing data in lucene. I did

Re: Bug in Sun's 1.6 hotspot compiler that can cause index corruption

2008-07-31 Thread Ian Lea
Excellent. And to confirm - my always-fails index load runs to successful completion on this release. Thanks Mike. -- Ian. On Wed, Jul 30, 2008 at 7:09 PM, Michael McCandless [EMAIL PROTECTED] wrote: FYI -- there is a nasty bug that affects Lucene in Sun's 1.6 hotspot compiler, starting

Re: Bug in Sun's 1.6 hotspot compiler that can cause index corruption

2008-07-31 Thread Michael McCandless
Excellent! Thank you for originally raising this issue; I think you were first to stumble on this unfortunate bug. Because you raised it here, and iterated like crazy to narrow it down, we were able to eventually track it down and resolve it. Lucene progress would not happen without such

Re: Using lucene as a database... good idea or bad idea?

2008-07-31 Thread Ganesh - yahoo
Which one would be the best to use as a storage server: Lucene or Jackrabbit? My requirement is to provide support to 1) Archive the documents 2) Do full text search on the documents. 3) Back up the index store and archive store [on a periodic basis] 4) Remove the documents after certain period

Re: Using lucene as a database... good idea or bad idea?

2008-07-31 Thread Karsten F.
Hi Ganesh, in this thread nobody said that Lucene is a good storage server, only that it could be used as a storage server (Grant: Connect data storage with simple, fast lookup and Lucene..) I don't know about automatic retention. But for the rest in your list of features I suggest to take a deep

The best strategy to How store multiple fields of same document

2008-07-31 Thread Sergey Kabashnyuk
The best strategy. Hello. I want to ask your opinion about how to store multiple fields of the same document. I see two possibilities. 1. Multiple fields in the document 2. One field, for example named PROPERTIES, with multiple instances, and values combined with the name, for example [EMAIL PROTECTED]

Re: Using lucene as a database... good idea or bad idea?

2008-07-31 Thread Andy Liu
If essentially all you need is key-value storage, Berkeley DB for Java works well. Lookup by ID is fast, can iterate through documents, supports secondary keys, updates, etc. Lucene would work relatively well for this, although inserting documents might not be as fast, because segments need to

Re: The best strategy to How store multiple fields of same document

2008-07-31 Thread Erick Erickson
I'd go with option 1 unless and until you could demonstrate performance problems. Speaking of which, you'd get a more informed answer if you provided a bit more data, like how many fields are we talking, how many documents, etc. If you're indexing 10,000 documents, go with the simplest. If you're

Re: The best strategy to How store multiple fields of same document

2008-07-31 Thread Sergey Kabashnyuk
Thank you Erick. I'm talking about more than 10,000 documents, 95% of them with fewer than 10 fields. The maximum number of fields per document is unlimited, but in practice it's no more than 20. I'm interested: does Lucene have any internal optimization which depends on the fields count or fields

RE: too many clause exception when using a filter

2008-07-31 Thread Chris Hostetter
: Or, to avoid the clause limit issue altogether, you could use a : PrefixFilter instead of a QueryWrapperFilter around a : {Wildcard,Prefix}Query: right ... the misconception here seems to be that if you use a Filter, all your Too Many Clauses problems are solved -- but if that Filter is
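
The distinction drawn above can be sketched without Lucene at all. The following is a minimal, JDK-only model, not Lucene's API: the class name, method names, and the sorted map standing in for the term dictionary are all assumptions made for illustration. It shows why expanding a prefix into one clause per matching term can hit a clause limit (1024 is BooleanQuery's default maximum), while a filter that walks the same terms and only sets bits per document has no such limit.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

public class PrefixExpansionSketch {
    // Mirrors the default BooleanQuery clause limit; hypothetical constant name.
    static final int MAX_CLAUSES = 1024;

    // Query-style expansion: one clause per term that starts with the prefix.
    // Throws once the clause list is full, as a rewritten query would.
    public static List<String> expandToClauses(SortedMap<String, int[]> index, String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : index.tailMap(prefix).keySet()) {
            if (!term.startsWith(prefix)) {
                break; // terms are sorted, so no later term can match
            }
            if (clauses.size() >= MAX_CLAUSES) {
                throw new IllegalStateException("too many clauses");
            }
            clauses.add(term);
        }
        return clauses;
    }

    // Filter-style evaluation: walk the same terms, but only mark matching
    // document ids in a bit set. The number of matching terms is unbounded.
    public static BitSet filterBits(SortedMap<String, int[]> index, String prefix, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (Map.Entry<String, int[]> entry : index.tailMap(prefix).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break;
            }
            for (int doc : entry.getValue()) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

In this model, wrapping expandToClauses inside something called a filter would change nothing: the expansion still happens first and still throws. Only the second approach avoids building clauses.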

Re: The best strategy to How store multiple fields of same document

2008-07-31 Thread Erick Erickson
Haven't a clue G. Erick On Thu, Jul 31, 2008 at 11:29 AM, Sergey Kabashnyuk [EMAIL PROTECTED] wrote: Thank you Erick. I'm talking about more than 10,000 documents, 95% of them with fewer than 10 fields. The maximum number of fields per document is unlimited, but in practice it's no more than 20. I'm

FileNotFoundException during indexing

2008-07-31 Thread Wojtek212
Hi, I'm sometimes receiving FileNotFoundExceptions during indexing. java.io.FileNotFoundException: /tmp/content/3615.0-3618.0/_3p.fnm (No such file or directory) at com.test.vcssearch.DefaultServiceIndexer$2.run(DefaultServiceIndexer.java:245) at

Re: FileNotFoundException during indexing

2008-07-31 Thread Michael McCandless
Are you only creating one instance of IndexManager and then sharing that instance across all threads? Can you put some logging/printing where you call IndexReader.unLock, to see how often that's happening? That method is dangerous because if you unlock a still-active IndexWriter it

SpanRegexQuery

2008-07-31 Thread Christopher M Collins
Hello, I'm trying to use SpanRegexQuery as one of the clauses in my SpanQuery. When I give it a regex like: L[a-z]+ing and do a rewrite on the final query I get terms like Labinger and Lackonsingh along with the expected terms Labeling, Lacing, etc. It's as if the regex is treated as a find()
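
The symptom described above is consistent with the difference between partial and whole-string matching in java.util.regex. The following is a minimal sketch using only the JDK, not SpanRegexQuery itself; the class and method names are illustrative assumptions, and which matching mode the query actually uses is not shown in this snippet.

```java
import java.util.regex.Pattern;

public class RegexMatchModes {
    // find(): true if the pattern matches anywhere inside the term.
    public static boolean matchesAnywhere(String regex, String term) {
        return Pattern.compile(regex).matcher(term).find();
    }

    // lookingAt(): true if the pattern matches a prefix of the term.
    public static boolean matchesPrefix(String regex, String term) {
        return Pattern.compile(regex).matcher(term).lookingAt();
    }

    // matches(): true only if the pattern matches the entire term.
    public static boolean matchesWhole(String regex, String term) {
        return Pattern.compile(regex).matcher(term).matches();
    }

    public static void main(String[] args) {
        String regex = "L[a-z]+ing";
        String[] terms = {"Labeling", "Lacing", "Labinger", "Lackonsingh"};
        for (String term : terms) {
            System.out.println(term
                + " find=" + matchesAnywhere(regex, term)
                + " lookingAt=" + matchesPrefix(regex, term)
                + " matches=" + matchesWhole(regex, term));
        }
    }
}
```

With partial matching, Labinger and Lackonsingh are accepted because the pattern matches their leading "Labing" and "Lackonsing". If the regex engine in use does match partially, anchoring the pattern at the end (L[a-z]+ing$) is one common way to force a whole-term match; whether that applies here depends on the regex implementation behind the query.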

Re: SpanRegexQuery

2008-07-31 Thread Daniel Noll
Christopher M Collins wrote: Hello, I'm trying to use SpanRegexQuery as one of the clauses in my SpanQuery. When I give it a regex like: L[a-z]+ing and do a rewrite on the final query I get terms like Labinger and Lackonsingh along with the expected terms Labeling, Lacing, etc. It's as if the

Re: Using lucene as a database... good idea or bad idea?

2008-07-31 Thread Ganesh - yahoo
Thanks Andy and Karsten. - Original Message - From: Andy Liu [EMAIL PROTECTED] To: java-user@lucene.apache.org Sent: Thursday, July 31, 2008 8:16 PM Subject: Re: Using lucene as a database... good idea or bad idea? If essentially all you need is key-value storage, Berkeley DB for