Re: how to estimate how much memory is required to support the large index search

2008-11-17 Thread Chris Lu
So it looks like you are not really doing much sorting? The index divisor affects reader.terms(), but has little effect on sorting. -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com Lucene Da

Re: how to estimate how much memory is required to support the large index search

2008-11-17 Thread Zhibin Mai
It is a cache tuning setting in IndexReader. It can be set via the method setTermInfosIndexDivisor(int). Thanks, Zhibin From: Chris Lu <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Monday, November 17, 2008 7:07:21 PM Subject: Re: how to estimate ho

Re: how to estimate how much memory is required to support the large index search

2008-11-17 Thread Chris Lu
Calculation looks right. But what's the "Index divisor" that you mentioned? -- Chris Lu - Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com

Re: how to estimate how much memory is required to support the large index search

2008-11-17 Thread Zhibin Mai
Aleksander, I figured out that most of the heap was consumed by the term cache. In our case, the index has 233 million terms, and 6.4 million of them were loaded into the cache when we ran the search. I roughly calculated how much memory each term needs; it is about 16 bytes
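
The arithmetic described above can be sketched roughly as follows. This is only an illustration, not the poster's actual calculation: the message preview is cut off after "about 16 bytes", so the per-term cost used here is a placeholder assumption, and the class and method names are hypothetical. The sketch also assumes that an index divisor of N keeps roughly every N-th of the terms that would otherwise be loaded.

```java
// Rough heap estimate for the in-memory term index.
// All names here are hypothetical and exist only for illustration.
final class TermIndexHeapEstimate {

    // Terms kept in memory if only every `divisor`-th loaded term is retained.
    static long loadedTerms(long loadedAtDivisorOne, int divisor) {
        if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be >= 1");
        }
        return (loadedAtDivisorOne + divisor - 1) / divisor; // ceiling division
    }

    // Estimated bytes = retained terms * assumed cost per term.
    static long estimateBytes(long loadedAtDivisorOne, int divisor, long bytesPerTerm) {
        return loadedTerms(loadedAtDivisorOne, divisor) * bytesPerTerm;
    }

    public static void main(String[] args) {
        long loaded = 6_400_000L; // figure quoted in the message
        long bytesPerTerm = 16L;  // ASSUMPTION: the quoted figure is truncated; the real cost may be higher
        for (int divisor : new int[] {1, 2, 4}) {
            System.out.printf("divisor=%d -> ~%d terms, ~%d bytes%n",
                    divisor,
                    loadedTerms(loaded, divisor),
                    estimateBytes(loaded, divisor, bytesPerTerm));
        }
    }
}
```

The per-term cost should be replaced with a measured value (for example from a heap dump), since term text and object overhead vary from index to index.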

Lucene 2.4 Token Stream error

2008-11-17 Thread bhupesh bansal
Hey folks, I saw this error in my code base after upgrading from Lucene 2.3 to Lucene 2.4. Have folks seen this before, and any ideas? Is it related to the fix for https://issues.apache.org/jira/browse/LUCENE-1333 ? java.lang.IllegalArgumentException: length 11 exceeds the size of the termBuffer (10)

Re: Using AND with MultiFieldQueryParser

2008-11-17 Thread Rafael Cunha de Almeida
On Mon, 17 Nov 2008 16:29:29 -0200 Rafael Cunha de Almeida <[EMAIL PROTECTED]> wrote: > On Mon, 17 Nov 2008 13:07:35 -0200 > Rafael Cunha de Almeida <[EMAIL PROTECTED]> wrote: > > > On Thu, 13 Nov 2008 12:12:17 -0500 > > Matthew Hall <[EMAIL PROTECTED]> wrote: > > > > > Which Analyzer have you a

Re: Using AND with MultiFieldQueryParser

2008-11-17 Thread Rafael Cunha de Almeida
On Mon, 17 Nov 2008 13:07:35 -0200 Rafael Cunha de Almeida <[EMAIL PROTECTED]> wrote: > On Thu, 13 Nov 2008 12:12:17 -0500 > Matthew Hall <[EMAIL PROTECTED]> wrote: > > > Which Analyzer have you assigned per field? > > > > The PerFieldAnalyzerWrapper uses a default analyzer (the one you passed

Re: Using AND with MultiFieldQueryParser

2008-11-17 Thread Rafael Cunha de Almeida
On Thu, 13 Nov 2008 12:12:17 -0500 Matthew Hall <[EMAIL PROTECTED]> wrote: > Which Analyzer have you assigned per field? > > The PerFieldAnalyzerWrapper uses a default analyzer (the one you passed > during its construction), and then you assign specific analyzers to each > field that you want t

Re: Scoped Search and Facets generation using Lucene

2008-11-17 Thread Aleksander M. Stensby
Yes, you have a lot of fields, but if you want to do faceting on all possible fields you will probably have to index each node in your document as a separate field. You should look at Solr, which supports facets out of the box. Keep in mind what type of tokenizer(s) you use as this affects

Re: Searching across multiple fields

2008-11-17 Thread prabin meitei
Hi, Try using BooleanQuery. It lets you apply boolean operators effectively. Make separate queries for each field. BooleanQuery is meant for combining a series of queries with boolean operators. For example, let's say you have 3 different queries A, B, C and you want a final query which behav
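
One way to picture the combination described above is as a query string in Lucene's query parser syntax, where `+` marks a required clause and `-` a prohibited one. The sketch below only builds that string in plain Java; it does not call Lucene. The class and method names are hypothetical, and it assumes the terms are plain words that need no escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Builds a query string of the form
//   +(f1:term OR f2:term) +(...) -(f1:other OR f2:other)
// Hypothetical helper for illustration only.
final class MultiFieldBooleanSketch {

    // One group: the term may appear in any of the given fields.
    static String anyField(List<String> fields, String term) {
        List<String> parts = new ArrayList<>();
        for (String field : fields) {
            parts.add(field + ":" + term);
        }
        return "(" + String.join(" OR ", parts) + ")";
    }

    // Required terms get a "+" group, prohibited terms a "-" group.
    static String build(List<String> fields, List<String> required, List<String> prohibited) {
        List<String> clauses = new ArrayList<>();
        for (String term : required) {
            clauses.add("+" + anyField(fields, term));
        }
        for (String term : prohibited) {
            clauses.add("-" + anyField(fields, term));
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("field1", "field2", "field3", "field4");
        System.out.println(build(fields,
                Arrays.asList("basket", "apple"),
                Arrays.asList("orange")));
    }
}
```

The same structure could be built programmatically with nested BooleanQuery objects: one inner query per term that matches any field, combined in an outer query as required or prohibited clauses.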

Searching across multiple fields

2008-11-17 Thread Aditi Goyal
Hi, Let's say I have an index with the following fields: field1, field2, field3 and field4, where all the fields can have the same values. Now I want to search for a document where "basket" and "apple" are part of the whole document but "orange" is not. I have tried using MultiFieldQueryParser but it is n

RE: Scoped Search and Facets generation using Lucene

2008-11-17 Thread Bapat, Mayur
My XML format is similar to the following: 0.00.08254true8438127718443Ver-dir(TOP)0.0standardA001A3Carbon film10just classified0default000.0TRIMMER0101ctrld00.00.00.02008-07-16T08:34:20INWORK84360.02008-08-01T10:44:35True29310.00.0856separable0.0c/i00falsetrue0IBRTYPE|Part~~WCP|22037|10.020

Software Announcement: LuSql: Database to Lucene indexing

2008-11-17 Thread Glen Newton
LuSql is a simple but powerful tool for building Lucene indexes from relational databases. It is a command-line Java application for the construction of a Lucene index from an arbitrary SQL query of a JDBC-accessible SQL database. It allows a user to control a number of parameters, including the SQ

Re: how to estimate how much memory is required to support the large index search

2008-11-17 Thread Aleksander M. Stensby
One major factor that may result in heap space problems is if you are doing any form of sorting when searching. Do you have any form of default sort in your application? Also, the type of field used for sorting is important with regard to memory consumption. This issue has been discussed be

Re: Scoped Search and Facets generation using Lucene

2008-11-17 Thread Aleksander M. Stensby
I think that the closest you get to "scoped" search in your case would be to use filters. (If you index your paths, or if the documents have some standardized format, I assume you could just use one field per element in your document.) Maybe you could say a bit about your document structure?