Re: Sorting and tokenization

2004-07-01 Thread Praveen Peddi
The solution you suggested is exactly as I expected and I already thought about implementing it. But the problem is the memory inefficiency. Sometimes titles are huge. And with i18n, a title can be in Japanese, Chinese or any language which takes more memory than English. Ok. How about taking

Re: question on setting boost factor

2004-07-01 Thread Erik Hatcher
On Jun 22, 2004, at 7:30 AM, Anson Lau wrote: Hi guys, Let's say I want to search the term hello world over 3 fields with different boost: ((hello:field1 world:field1)^0.001 (hello:field2 world:field2)^100 (hello:field3 world:field3)^2)) Note I've given field1 a really low boost, a heavy boost

Re: question on setting boost factor

2004-07-01 Thread Steven Rowe
Repaired URL (there was an extra space before Similarity.html): http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#coord(int,%20int) Corresponding Tiny URL: URL:http://tinyurl.com/3bo8y Erik Hatcher wrote: On Jun 22, 2004, at 7:30 AM, Anson Lau wrote: Hi guys, Let's say I

Re: Building query to match a sub-string of a field

2004-07-01 Thread Erik Hatcher
On Jun 29, 2004, at 5:28 PM, Terence Lai wrote: Hi Everyone, I am trying to construct a query which matches a sub-string of a field. As an illustration, I would like to search the following words by using the sub-string test: - test - testing - contest - contestable I realize that Lucene does

Re: Running OutOfMemory while optimizing and searching

2004-07-01 Thread Doug Cutting
What do your queries look like? The memory required for a query can be computed by the following equation: 1 Byte * Number of fields in your query * Number of docs in your index So if your query searches on all 50 fields of your 3.5 Million document index then each search would take
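
The rule of thumb quoted above (1 byte per queried field per document) can be worked through as plain arithmetic. The following is a minimal sketch, not part of Lucene: the class and method names are hypothetical, and the figures are the 50 fields and 3.5 million documents mentioned in the message. For those inputs the product is 175,000,000 bytes.

```java
public class QueryMemoryEstimate {

    /**
     * Applies the rule of thumb from the message:
     * 1 byte * number of fields in the query * number of docs in the index.
     * Hypothetical helper for illustration; not a Lucene API.
     */
    static long estimateBytes(int fieldsInQuery, long docsInIndex) {
        return 1L * fieldsInQuery * docsInIndex;
    }

    public static void main(String[] args) {
        // Figures taken from the message: 50 fields, 3.5 million documents.
        long bytes = estimateBytes(50, 3_500_000L);
        System.out.println("Estimated memory per search: " + bytes + " bytes");
    }
}
```

Under this estimate, cutting the number of fields a query touches reduces the per-search memory proportionally.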

languages lucene can support

2004-07-01 Thread Praveen Peddi
I have read many emails on the Lucene mailing list regarding analyzers. Following is the list of languages Lucene supports out of the box, so they will be supported with no change in our code, just a configuration change: English, German, Russian. Following is the list of languages that are available

Visualization of Lucene search results with a treemap

2004-07-01 Thread David Spencer
Inspired by these guys who put results from Google into a treemap... http://google.hivegroup.com/ I did up my own version running against my index of OSS/javadoc trees. This query for "thread pool" shows it off nicely: http://www.searchmorph.com/kat/tsearch.jsp?s=thread%20pool&side=300&goal=500 This

Re: Visualization of Lucene search results with a treemap

2004-07-01 Thread Stefan Groschupf
Dave, cool stuff, think about contributing that to Nutch.. ;-)! Do you know: http://websom.hut.fi/websom/comp.ai.neural-nets-new/html/root.html ? Cheers, Stefan On 01.07.2004 at 23:28, David Spencer wrote: Inspired by these guys who put results from Google into a treemap...

Re: languages lucene can support

2004-07-01 Thread Ernesto De Santis
Hi Praveen, You can easily develop your own SpanishAnalyzer (or one for another language) with SnowballAnalyzer. I am sending you my SpanishAnalyzer. Bye, Ernesto. - Original Message - From: "Praveen Peddi" [EMAIL PROTECTED] To: "lucenelist" [EMAIL PROTECTED] Sent: Thursday, July 01, 2004 6:13 PM

search multiple indexes

2004-07-01 Thread Toby Tremayne
Possibly a silly question - but how would I go about searching multiple indexes using Lucene? Do I need to basically repeat the code I use to search one index for each one, or is there a better way to do it? Toby ---

Re: search multiple indexes

2004-07-01 Thread Stefan Groschupf
Possibly a silly question - but how would I go about searching multiple indexes using Lucene? Do I need to basically repeat the code I use to search one index for each one, or is there a better way to do it? Take a look at the nutch.org source code. It does what you are looking for. HTH Stefan

RE: search multiple indexes

2004-07-01 Thread Toby Tremayne
Thanks for that - I'll take a look at Nutch as well. I was hoping to find some examples as I'm unfortunately a Java newbie :) cheers, Toby -Original Message- From: David Spencer [mailto:[EMAIL PROTECTED] Sent: Friday, 2 July 2004 9:13 AM To: Lucene Users List Subject: Re: search

Re: search multiple indexes

2004-07-01 Thread Stefan Groschupf
http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/MultiSearcher.html 100% right. Personally, I find code samples more useful than just Javadoc. Hence my hint; here is the code snippet from Nutch: /** Construct given a number of indexed segments. */ public

Re: search multiple indexes

2004-07-01 Thread David Spencer
Stefan Groschupf wrote: http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/MultiSearcher.html 100% right. Personally, I find code samples more useful than just Javadoc. Good point. Hence my hint; here is the code snippet from Nutch: But - warning - in normal use of Lucene

Re: Visualization of Lucene search results with a treemap

2004-07-01 Thread David Spencer
Stefan Groschupf wrote: Dave, cool stuff, think about contributing that to Nutch.. ;-)! Well the code is very generic - basically 1 method that takes a Searcher, a Query, the # of cells to show, and the size of the diagram. Technically I think it would be a Lucene sandbox contribution - but -

Re: search multiple indexes

2004-07-01 Thread Peter M Cipollone
Toby, Check http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/MultiSearcher.html and http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/ParallelMultiSearcher.html If you need sample code, check the test cases in the source distribution. They show usage examples

RE: search multiple indexes

2004-07-01 Thread Toby Tremayne
thank you muchly - I'll poke about with the test cases and see how I go -Original Message- From: Peter M Cipollone [mailto:[EMAIL PROTECTED] Sent: Friday, 2 July 2004 10:35 AM To: Lucene Users List Subject: Re: search multiple indexes Toby, Check