Software License

2002-02-25 Thread Rafael Luque
Hi all, I know Lucene is a free project; however, I think its use falls under the Apache Software License (ASL) terms, so anyone using Lucene should reference the project, use the 'powered by Lucene' logo, ... I have suspicions about a company releasing a commercial search engine based on Lucene and

RE: Googlifying lucene querys

2002-02-25 Thread Howk, Michael
In the Lucene build that we've got (2/21) the question mark does not do a single-character replace. Does anyone know why? We're using the StandardAnalyzer and the default QueryParser. -Original Message- From: Peter Carlson [mailto:[EMAIL PROTECTED]] Sent: Saturday, February 23, 2002 5:23

RE: Googlifying lucene querys

2002-02-25 Thread Doug Cutting
If you put the title in a separate field from the contents, and search both fields, matches in the title will usually be stronger, without explicit boosting. This is because the scores are normalized by the length of the field, and the title tends to be much shorter than the contents. So even
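
The effect described above can be sketched with a toy scorer. This is not Lucene's scoring code: the 1/sqrt(numTerms) norm is an assumption modeled on classic Lucene similarity, and the class and method names are hypothetical. It only shows why the same term match counts for more in a short title field than in a long contents field.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of field-length normalization; NOT Lucene's real scorer. */
final class LengthNormSketch {

    /** Assumed norm: the fewer terms a field has, the larger its weight. */
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    /** Term frequency in the field, scaled by the field's length norm. */
    static double score(String term, List<String> fieldTerms) {
        if (fieldTerms.isEmpty()) {
            return 0.0;
        }
        long termFrequency = fieldTerms.stream().filter(term::equals).count();
        return termFrequency * lengthNorm(fieldTerms.size());
    }

    public static void main(String[] args) {
        // Hypothetical document: a 2-term title and a 9-term body.
        List<String> title = Arrays.asList("lucene", "scoring");
        List<String> contents = Arrays.asList(
            "this", "page", "explains", "how", "lucene",
            "ranks", "documents", "by", "relevance");

        System.out.println("title score:    " + score("lucene", title));
        System.out.println("contents score: " + score("lucene", contents));
    }
}
```

With one match in each field, the title wins purely because it is shorter, which is the behavior the message attributes to normalizing by field length.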

RE: Googlifying lucene querys

2002-02-25 Thread Joshua O'Madadhain
You cannot, in general, structure a Lucene query such that it will yield the same document rankings that Google would for that (query, document set). The reason for this is that Google employs a scoring algorithm that includes information about the topology of the pages (i.e., how the pages are

Build index using RAMDirectory out of memory errors

2002-02-25 Thread Kurt Vaag
I have been using Lucene for 3 weeks and it rules. The indexing process can be slow. So I searched the mailgroup archives and found example code using RAMDirectory to improve indexing speed. The example code I found was indexing 100,000 files at a time to the RAMDirectory before writing to disk.

RE: Googlifying lucene querys

2002-02-25 Thread Doug Cutting
From: Joshua O'Madadhain [mailto:[EMAIL PROTECTED]] You cannot, in general, structure a Lucene query such that it will yield the same document rankings that Google would for that (query, document set). The reason for this is that Google employs a scoring algorithm that includes

Re: Build index using RAMDirectory out of memory errors

2002-02-25 Thread Winton Davies
java -Xmx1000m Sorry if you already tried resizing your heap. Actually, with 1.3.1 you could go above a gig, but really, swapping ain't gonna help much. Winton I have been using Lucene for 3 weeks and it rules. The indexing process can be slow. So I searched the mailgroup archives

is there any way to create and manage a controlled vocabulary in lucene?

2002-02-25 Thread Philipp Chudinov
subj?

Re: Performance Tuning

2002-02-25 Thread Otis Gospodnetic
You could try playing with the merge factor... Otis --- Aruna Raghavan [EMAIL PROTECTED] wrote: Hi, Are there any ways to fine-tune the CPU performance with Lucene? I know of the usage of optimize() calls but I am wondering if there are any other ways to improve the CPU time/Disk space

Re: Build index using RAMDirectory out of memory errors

2002-02-25 Thread Ian Lea
Have you tried different values for IndexWriter.mergeFactor? Setting it to 1000 gave me a 10x speed improvement on a large index some time ago. Not RAMDirectory though. Your mileage may vary. -- Ian. Kurt Vaag wrote: I have been using Lucene for 3 weeks and it rules. The indexing

RE: Build index using RAMDirectory out of memory errors

2002-02-25 Thread Kurt Vaag
Thanks Winton, That's what it was. I just assumed java would take all the 1G that it needed. Didn't realize the default was 64M. Also thanks for not saying RTFM (which I had done but didn't know what TF to do with the -Xmx option). -Kurt -Original Message- From: Winton Davies
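
One way to confirm that an -Xmx setting actually took effect is to ask the JVM for its heap ceiling. The sketch below uses only the standard Runtime.maxMemory() call; the class name HeapCheck is hypothetical and nothing here is Lucene-specific.

```java
/** Prints the maximum heap the running JVM will attempt to use. */
final class HeapCheck {

    /** Heap ceiling in megabytes, as reported by the JVM itself. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Running it as `java -Xmx1000m HeapCheck` and again without the flag shows the difference between the requested and the default limit. The reported figure can be somewhat lower than the value passed to -Xmx, depending on the JVM.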

RE: Index Locked For Write

2002-02-25 Thread Hayes, Mark
I am not a Lucene expert but I would like to understand the threading issues also, and I'm wondering if the following is true when using Lucene in a multithreaded application. I understand there are three modes for using IndexReader and IndexWriter: A- IndexReader for reading only, not deleting

Re: is there any way to create and manage a controlled vocabulary in lucene?

2002-02-25 Thread Peter Carlson
Hi, Are you just trying to have Lucene index only the terms that are in your vocabulary? If so, you can create your own analyzer that returns only words in your vocabulary. Alternatively, you could use the StandardAnalyzer, then create your own Lucene Document and only add words that match your
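
The filtering idea in this reply can be sketched in plain Java, independent of any Lucene API. The class and method names below are hypothetical, and the simple lowercase-and-split step is only a stand-in for whatever analyzer does the real tokenizing. The vocabulary is assumed to be stored in lowercase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Keeps only tokens that belong to a controlled vocabulary. */
final class VocabularyFilterSketch {

    /**
     * Lowercases the text, splits it on runs of non-letter, non-digit
     * characters, and returns the tokens found in the vocabulary,
     * in their original order (duplicates preserved).
     */
    static List<String> keepVocabularyTerms(String text, Set<String> vocabulary) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty() && vocabulary.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical two-word vocabulary.
        Set<String> vocabulary = new HashSet<>(Arrays.asList("lucene", "index"));
        String text = "Lucene builds an inverted index; the Index is fast.";
        System.out.println(keepVocabularyTerms(text, vocabulary));
    }
}
```

The surviving tokens are what would then be added to the document for indexing, so terms outside the vocabulary never reach the index.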