RE: when indexing, java.io.FileNotFoundException

2005-02-03 Thread Will Allen
Increase minMergeDocs and use the compound file format when creating your index. http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#setUseCompoundFile(boolean) -Original Mes

RE: modifying existing index

2004-11-23 Thread Will Allen
To update a document you need to insert the modified document, then delete the old one. Here is some code that I use to get you going in the right direction (it won't compile, but if you follow it closely you will see how I take an array of Lucene documents with new properties and add them, then

RE: Too many open files issue

2004-11-22 Thread Will Allen
If you are on Linux, the number of file handles for a session is much lower than that for the whole machine. "ulimit -n" will tell you. There are instructions on the web for changing this setting; it involves editing /etc/security/limits.conf and setting the values for "nofile". (bulkadm is my use

RE: java.io.FileNotFoundException: ... (No such file or directory)

2004-11-18 Thread Will Allen
I have gotten this a few times. I am also using an NFS mount, but have seen it in cases where a mount wasn't involved. I cannot speak to why this is happening, but I have posted to this forum before a way of repairing your index by modifying the segments file. Search for "wallen". The other t

API request: isOpen on indexwriter and searcher

2004-11-18 Thread Will Allen
Could a developer consider adding an isOpen method to the writer and searcher? I have looked at doing it myself, but not sure what I am doing.

RE: Best Implementation of Next and Prev in Lucene

2004-11-18 Thread Will Allen
See the demo jsp pages. -Original Message- From: Ramon Aseniero [mailto:[EMAIL PROTECTED] Sent: Tuesday, November 16, 2004 9:26 PM To: [EMAIL PROTECTED] Subject: Best Implementation of Next and Prev in Lucene Hi All, What's the best implementation of displaying the Next and Prev sear

RE: QueryParser: "[stopword] AND something" throws Exception

2004-11-12 Thread Will Allen
Holy cow! This does happen! -Original Message- From: Peter Pimley [mailto:[EMAIL PROTECTED] Sent: Friday, November 12, 2004 11:52 AM To: Lucene Users List Subject: QueryParser: "[stopword] AND something" throws Exception [this is using lucene-1.4-final] Hello. I have just encountered

RE: Acedemic Question About Indexing

2004-11-11 Thread Will Allen
11:37 AM To: Lucene Users List Cc: Will Allen Subject: Re: Acedemic Question About Indexing Will, could you give more details about your architecture? -each time update o create new indexes -data stored at each index etc. because it is quite interesting, and I would like to test it. Sodel

RE: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-11 Thread Will Allen
Any wildcard search will automatically expand your query to the number of terms it finds in the index that suit the wildcard. For example, wild* would become wild OR wilderness OR wildman etc. for each of the terms that exist in your index. Because of this, you quickly reach the 1024
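
The expansion described above can be sketched in plain Java without Lucene itself. This is only an illustration under stated assumptions: the class name WildcardExpansionSketch, the method expandPrefix, and the in-memory term set are hypothetical stand-ins for the index's term dictionary, and the limit of 1024 is taken from the post. It is not Lucene's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch: not Lucene code, just the idea of prefix expansion.
class WildcardExpansionSketch {
    // Clause limit as mentioned in the post (assumed default).
    static final int MAX_CLAUSES = 1024;

    // Collects every term that starts with the prefix, one OR clause per term.
    // Throws once the number of matching terms exceeds maxClauses.
    static List<String> expandPrefix(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // In a sorted set, all terms sharing the prefix are contiguous
        // and start at the first term >= prefix.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last matching term
            }
            if (clauses.size() == maxClauses) {
                throw new IllegalStateException("too many clauses: more than " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("tame", "wild", "wilderness", "wildman", "wind"));
        // The query "wild*" becomes one clause per matching term.
        System.out.println(String.join(" OR ", expandPrefix(terms, "wild", MAX_CLAUSES)));
    }
}
```

With a real index the term set can hold millions of entries, so a short prefix can match far more terms than the clause limit allows.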

RE: Acedemic Question About Indexing

2004-11-10 Thread Will Allen
I have an application that I run monthly that indexes 40 million documents into 6 indexes, then uses a MultiSearcher. The advantage for me is that I can have multiple writers, each indexing 1/6 of the total data, reducing the time it takes to index by about 5X. -Original Message- From: Luke
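
A rough sketch of the partitioning idea, in plain Java so it runs without Lucene. Everything here is an assumption made for illustration: the class ShardedIndexingSketch, the round-robin split, and the counting loop that stands in for a per-index writer. The post does not say how the data was actually divided.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one worker per index, each handling its share of the data.
class ShardedIndexingSketch {
    // Splits the documents round-robin into the given number of shards.
    static List<List<String>> partition(List<String> docs, int shards) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            out.add(new ArrayList<>());
        }
        for (int i = 0; i < docs.size(); i++) {
            out.get(i % shards).add(docs.get(i));
        }
        return out;
    }

    // Runs one task per shard in parallel and returns the total documents processed.
    static int indexAll(List<String> docs, int shards) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(shards);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<String> shard : partition(docs, shards)) {
                results.add(pool.submit(() -> {
                    int indexed = 0;
                    for (String doc : shard) {
                        // Placeholder: a real worker would add doc to its own index here.
                        indexed++;
                    }
                    return indexed;
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // waits for that shard to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 13; i++) {
            docs.add("doc-" + i);
        }
        System.out.println("indexed " + indexAll(docs, 6) + " documents across 6 shards");
    }
}
```

Because each shard is written independently, the shards never contend for the same index, and a search layer can query all of them together afterwards.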

RE: Highlighting in Lucene

2004-11-04 Thread Will Allen
There is a highlighting tool in the sandbox (3/4 of the way down): http://jakarta.apache.org/lucene/docs/lucene-sandbox/ -Original Message- From: Ramon Aseniero [mailto:[EMAIL PROTECTED] Sent: Thursday, November 04, 2004 3:40 PM To: 'Lucene Users List' Subject: Highlighting in Lucene Hi

RE: Searching for a phrase that contains quote character

2004-10-28 Thread Will Allen
rching for a phrase that contains quote character On Oct 28, 2004, at 2:02 PM, Will Allen wrote: > I am using a NullAnalyzer for this field. Which means that each field is added exactly as-is as a single term? Then trying the PhraseQuery directly is a good first step - if you can get that t

RE: Searching for a phrase that contains quote character

2004-10-28 Thread Will Allen
you are indexing the text with the quotes (no built-in analyzer besides WhitespaceAnalyzer would do that for you). Erik > ... > > > > On Thu, 28 Oct 2004 12:02:48 -0400, Will Allen > <[EMAIL PROTECTED]> wrote: >> >> I am having this same problem, but ca

Searching for a phrase that contains quote character

2004-10-28 Thread Will Allen
I am having this same problem, but cannot find any help! I have a keyword field that sometimes includes double quotes, but I am unable to search for that field because the escape for a quote doesn't work! I have tried a number of things: myfield:"lucene is \"cool\"" AND myfield:"lucene is \\"

RE: Multi + Parallel

2004-10-15 Thread Will Allen
I am using 6 indexers / indexes to balance the speed of indexing against query performance for 40+ million documents. I came to this number through trial and error, and performance testing on the indexing side with a fast 4 processor machine. The trick is to max out the I/O throughput. -Will

RE: -- TomCat/Lucene, filesystem

2004-09-08 Thread Will Allen
I think you might be referring to the xml files you keep in C:\Program Files\Apache\Tomcat\conf\Catalina\localhost. I have a file with the contents (myapp.xml): -Original Message- From: Rupinder Singh Mazara [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 31, 2004 12:36 PM To: Lucene

RE: Spam:too many open files

2004-09-07 Thread Will Allen
I will deploy and test through the end of the week and report back Friday if the problem persists. Thank you! -Original Message- From: Dmitry Serebrennikov [mailto:[EMAIL PROTECTED] Sent: Tuesday, September 07, 2004 8:40 PM To: Lucene Users List Subject: Re: Spam:too many open files Hi

RE: too many open files

2004-09-07 Thread Will Allen
I suspect it has to do with this change: --- jakarta-lucene/src/java/org/apache/lucene/index/SegmentMerger.java 2004/08/08 13:03:59 1.12 +++ jakarta-lucene/src/java/org/apache/lucene/index/SegmentMerger.java 2004/08/11 17:37:52 1.13 I wouldn't know where to start to reproduce the pro

Query performance on a 315 Million document index (1TB)

2004-05-06 Thread Will Allen
scales? Memory? Cost is not a concern, so what would be the shortcomings of a theoretical machine with 16GB of ram, 4-16 cpus and 1-2 terabytes of space? Would it be better to cluster machines to break apart the query? Thank you for your serious responses, Will Allen