Re: BooleanQuery TooManyClauses in wildcard search

2007-11-30 Thread Ruchi Thakur
Erick/John, thank you so much for the reply. I have gone through the mailing list you redirected me to. I know I need to read more, but I have some quick questions. Please bear with me if they appear too simple. Below is the code snippet of my current search. Also I need to get score inf

IndexReader locking index

2007-11-30 Thread Ruslan Sivak
I am using the MoreLikeThis functionality in my code. This code is running on four separate servers. When I ran tests, it seemed to be fine, but it looks like under heavy use the index file is always locked, and when I reindex all the docs, it doubles the size of the index (my guess is the old file

Re: BooleanQuery TooManyClauses in wildcard search

2007-11-30 Thread Erick Erickson
John's answer is spot-on. There's a wealth of information in the user group archives that you should be able to search on discussing ways of providing the functionality. One thread titled "I just don't get wildcards at all" is one where the folks who know generously helped me out. Once you find ou

Re: BooleanQuery TooManyClauses in wildcard search

2007-11-30 Thread John Byrne
Hi, Your problem is that when you do a wildcard search, Lucene expands the wildcard term into all possible terms. So, searching for "stat*" produces a list of terms like "state", "states", "stating" etc. (It only uses terms that actually occur in your index, however). These terms are all adde
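
The expansion John describes can be illustrated without Lucene itself. What follows is only a rough sketch in plain Java, not Lucene's actual implementation: `WildcardExpansionSketch`, `expandPrefix`, and `TooManyClausesSketch` are hypothetical names invented for illustration, and the sorted set stands in for the index's term dictionary. The value 1024 mirrors the default of `BooleanQuery.getMaxClauseCount()`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class WildcardExpansionSketch {

    /** Hypothetical stand-in for the TooManyClauses error; not a Lucene class. */
    static class TooManyClausesSketch extends RuntimeException {
        TooManyClausesSketch(String message) {
            super(message);
        }
    }

    /**
     * Collects every indexed term that starts with the given prefix, one
     * "clause" per term, and fails once the clause limit would be exceeded.
     */
    static List<String> expandPrefix(SortedSet<String> indexTerms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet begins at the first term >= prefix; matching terms are contiguous.
        for (String term : indexTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term sharing the prefix
            }
            if (clauses.size() == maxClauses) {
                throw new TooManyClausesSketch("more than " + maxClauses + " terms match " + prefix + "*");
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Assumed toy term dictionary; a real index would hold many more terms.
        SortedSet<String> terms =
                new TreeSet<>(Arrays.asList("stab", "state", "states", "stating", "stone"));

        // Only terms that actually occur in the "index" become clauses.
        System.out.println(expandPrefix(terms, "stat", 1024));

        // A short prefix matches more terms than a (deliberately tiny) limit allows.
        try {
            expandPrefix(terms, "st", 3);
        } catch (TooManyClausesSketch e) {
            System.out.println("too many clauses: " + e.getMessage());
        }
    }
}
```

The shorter the prefix, the more indexed terms it matches, which is why a query such as oft* can hit the limit on a large index even though a longer prefix works.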

BooleanQuery TooManyClauses in wildcard search

2007-11-30 Thread Ruchi Thakur
Hi there. I am a new Lucene user and I have been searching the group archives but couldn't solve the problem. I have just joined a project that uses Lucene. We use the StandardAnalyzer for indexing our documents and our query is as follows when we issue a search string oft* for exa

Re: FieldSelector

2007-11-30 Thread Grant Ingersoll
On Nov 30, 2007, at 10:58 AM, Timo Nentwig wrote: On Friday 30 November 2007 12:59:13 Grant Ingersoll wrote: Hmmm, I think you should be able to rely on the fact that Fields are stored in order of indexing and then read back in that same order. Yeah, thought about that for a moment but this i

Re: lucene-core-2.2.0.jar broken? CorruptIndexException?

2007-11-30 Thread Bill Janssen
> Your errors seem to happen around the same area (~20K docs). If you > skip the first say ~18K docs does the error still happen? We need to > somehow narrow this down. I'm trying to boil down the documents to a set which I can deploy on a DVD-ROM, so I can move the same set around from machine

Re: FieldSelector

2007-11-30 Thread Timo Nentwig
On Friday 30 November 2007 12:59:13 Grant Ingersoll wrote: > Hmmm, I think you should be able to rely on the fact that Fields are > stored in order of indexing and then read back in that same order. Yeah, thought about that for a moment but this is just way too fragile. > Otherwise, the reading twi

Re: FSDirectory Again

2007-11-30 Thread Donna L Gresh
In general it is much nicer to say "I did not make myself clear" than "you are not getting me". If you look on the Javadoc page for FSDirectory, it tells you what to do instead of the deprecated method: getDirectory(File file, boolean create) Deprecated. Use IndexWriter's create flag

FSDirectory Again

2007-11-30 Thread Liaqat Ali
No, you are not getting me. I have this original code. What should I use instead of this code to create a directory, because dir = FSDirectory.getDirectory(indexDir, true) is deprecated? import org.apache.lucene.store.Directory; import org.apache.lucene.store.FSDirectory; protected Directo

Re: FSDirectory

2007-11-30 Thread Erick Erickson
A directory that your program can modify. I assume that you are running on a Unix-like system and the directory you've specified is protected from modification by whatever user your process is identified as. But two other things: 1> do you really want to create a new index as the "true" indicates

FSDirectory

2007-11-30 Thread Liaqat Ali
I'm facing a problem with this code: dir = new FSDirectory(); dir.getDirectory(indexDir, true); I get an error that FSDirectory has protected access. So what should I use instead? Liaqat

Re: FieldSelector

2007-11-30 Thread Grant Ingersoll
Hmmm, I think you should be able to rely on the fact that Fields are stored in order of indexing and then read back in that same order. Thus, index your documents making sure that the documentType is the first Field on the Document (and for performance reasons, the other fields you want to

IndexHTML and UTF-8

2007-11-30 Thread Ognyan Kulev
Hi, I'm trying to run org.apache.lucene.demo.IndexHTML on HTML files encoded in UTF-8. The JavaCC FAQ says that a UTF-8 Reader should simply be given to SimpleCharStream: http://www.engr.mun.ca/~theo/JavaCC-FAQ/javacc-faq-moz.htm#tth_sEc3.21 . I made the following change in HTMLDocument.java: //old: File

FieldSelector

2007-11-30 Thread Timo Nentwig
Hi! I have different document types (Books, Magazines, Authors, whatever) in the index, and a FieldSelector is document-type specific (for Books LOAD isbn and title, for Author name, ...). The document type can be determined by a field surprisingly called documentType. How am I going to do this

Re: prefix query search problem if a hyphen exist in the search word

2007-11-30 Thread reeja
Yes, I am using the standard analyzer both at indexing and query time. Erick Erickson wrote: > > What analyzers are you using both at index time and > query time? StandardAnalyzer will, for instance, split > the words at the hyphen. > > I would recommend that you get a copy of Luke (google > lucen