date:20080111

RE: Prioritize new documents

2008-01-11 Thread Chris Hostetter

: IMHO it would be nice if Lucene's Similarity formula took the : indexed-date of the document into account. Ideally as an optional : setting, where the user can provide a date field as well. It really wouldn't make sense to incorporate this into the Similarity class. : Some of the other searc

How to model hierarchy info to be searched related to a document

2008-01-11 Thread Roger Camargo

I'm trying to index information related to OLAP cubes. I'm trying to model each cube as a document. Each cube has the following information: ID - Unique identifier for the cube; Name - Name of the cube; Description - Description of the cube; (There can be many dimensions per cube) Dimensi

RE: how do I get my own TopDocHitCollector?

2008-01-11 Thread Beard, Brian

Thanks for all this. We're doing warmup searching also, but just for some common date searches. The warmup would be a good place to add some pre-caching capability. I'll plan for this eventually and start with the partial cache for now. Thanks, Brian Beard -----Original Message----- From: Antony

Re: Retrieve the number of deleted documents

2008-01-11 Thread Shai Erera

Thanks I guess I should have looked in the code before asking those silly questions :-) I wonder why there isn't a specific API for that though ... On Jan 11, 2008 7:36 PM, Steven A Rowe <[EMAIL PROTECTED]> wrote: > Hi Shai, > > On 01/11/2008 at 7:42 AM, Shai Erera wrote: > > Will IndexReader.max

Re: Lucene sorting case-sensitive by default?

2008-01-11 Thread Erick Erickson

I've often stored a special sort field that's lower-cased. On Jan 11, 2008 11:40 AM, Alex Wang <[EMAIL PROTECTED]> wrote: > Hi All, > > > > I was searching my index with sorting on a field called "Label" which is > not tokenized, here is what came back: > > > > Extended Sites Catalog Asset Store

RE: Retrieve the number of deleted documents

2008-01-11 Thread Steven A Rowe

Hi Shai, On 01/11/2008 at 7:42 AM, Shai Erera wrote: > Will IndexReader.maxDoc() - IndexReader.numDocs() give the > correct result? or this is just a heuristic? I think your expression gives the correct result - the abstract IndexReader.numDocs() method is implemented in SegmentReader as: pu
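The arithmetic discussed in this reply can be sketched with a toy model. This is only an illustration, not Lucene code: `DeletedDocsDemo` is a hypothetical class, and its `maxDoc()` and `numDocs()` methods merely borrow the names of the IndexReader methods. It assumes that deleted documents keep their ID slot until the index is optimized, so the ID range still counts them while the live count does not.

```java
import java.util.BitSet;

// Hypothetical stand-in for an index reader; NOT Lucene's SegmentReader.
class DeletedDocsDemo {
    private final int maxDoc;                    // one greater than the largest doc ID
    private final BitSet deleted = new BitSet(); // bit i set = doc i is marked deleted

    DeletedDocsDemo(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    // Mark a document as deleted; deleting the same ID twice has no extra effect.
    void delete(int docId) {
        if (docId < 0 || docId >= maxDoc) {
            throw new IllegalArgumentException("docId out of range: " + docId);
        }
        deleted.set(docId);
    }

    int maxDoc() {
        return maxDoc;
    }

    // Live documents: the full ID range minus the ones marked deleted.
    int numDocs() {
        return maxDoc - deleted.cardinality();
    }

    // The expression from the question: maxDoc() - numDocs().
    int numDeletedDocs() {
        return maxDoc() - numDocs();
    }

    public static void main(String[] args) {
        DeletedDocsDemo reader = new DeletedDocsDemo(10);
        reader.delete(2);
        reader.delete(5);
        System.out.println("deleted = " + reader.numDeletedDocs());
    }
}
```

In this model the difference is exact rather than a heuristic, because both counts are derived from the same ID range and the same deletion bits.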

Lucene 2.3 RC2 available for testing

2008-01-11 Thread Michael Busch

Hi Lucene Users, good news: we are planning to release Lucene 2.3 in about ten days from now! Lucene 2.3 will have significant performance improvements and various other new features. (see http://people.apache.org/~buschmi/staging_area/lucene_2_3/CHANGES.txt for a full list of new features and API

Re: Lucene sorting case-sensitive by default?

2008-01-11 Thread Tom Emerson

String fields are sorted using natural (lexicographic) order. For characters in the ASCII range this means uppercase letters will sort before lowercase letters (e.g., 'A' U+0041 sorts before 'a' U+0061). This behaviour is documented in the JavaDocs for org.apache.lucene.search.Sort. -tree On
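The ordering described here, and the lower-cased sort field suggested elsewhere in this thread, can be illustrated with plain Java strings. This is a minimal sketch that does not touch the Lucene API: `SortKeyDemo` and `sortKey` are hypothetical names, and the labels are loosely based on those in the original question. The assumption is that the lower-cased key would be stored in its own untokenized field and used only for sorting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

class SortKeyDemo {
    // Natural String order compares UTF-16 code units, so 'A'..'Z' sort before 'a'..'z'.
    static List<String> naturalOrder(List<String> labels) {
        List<String> copy = new ArrayList<>(labels);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Hypothetical sort key: the label lower-cased with a fixed locale.
    static String sortKey(String label) {
        return label.toLowerCase(Locale.ROOT);
    }

    // Order the original labels by their lower-cased key.
    static List<String> bySortKey(List<String> labels) {
        List<String> copy = new ArrayList<>(labels);
        copy.sort(Comparator.comparing(SortKeyDemo::sortKey));
        return copy;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList(
                "print test 3", "Test Print Catalog",
                "Print catalog test", "SALES Print Catalog 2");
        System.out.println("natural:  " + naturalOrder(labels));
        System.out.println("sort key: " + bySortKey(labels));
    }
}
```

With natural order, "print test 3" lands after every label that starts with an uppercase letter; with the lower-cased key it sorts next to "Print catalog test".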

Lucene sorting case-sensitive by default?

2008-01-11 Thread Alex Wang

Hi All, I was searching my index with sorting on a field called "Label" which is not tokenized, here is what came back: Extended Sites Catalog Asset Store Extended Sites Catalog Asset Store SALES Print Catalog 2 Print catalog test Test Print Catalog Test refresh catalog print test 3

Re: Prioritize new documents

2008-01-11 Thread Tom Emerson

You can utilize the CustomScoreQuery introduced in Lucene 2.2 to provide this type of functionality. This is quite straightforward to do and works really well. Since "recentness" is a function of the time the search was made, we store the appropriate date in an index field and use a CustomScoreQue
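The idea of combining a text score with a query-time recency factor can be sketched as follows. This is not the poster's formula and it does not use CustomScoreQuery itself: the exponential half-life decay, the class `RecencyBoostDemo`, and all method names are assumptions chosen for illustration. The stored date is compared with the time of the search, in line with the point that recentness depends on when the query runs.

```java
class RecencyBoostDemo {
    static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // Assumed decay: 1.0 for a document dated at query time,
    // 0.5 once its age equals halfLifeDays, approaching 0 for very old documents.
    static double recencyBoost(long docTimeMillis, long queryTimeMillis, double halfLifeDays) {
        // Documents dated in the future are treated as age 0.
        double ageDays = Math.max(0L, queryTimeMillis - docTimeMillis) / (double) MILLIS_PER_DAY;
        return Math.pow(0.5, ageDays / halfLifeDays);
    }

    // Combine the text relevance score with the boost by multiplication.
    static double customScore(double textScore, double boost) {
        return textScore * boost;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long thirtyDaysAgo = now - 30 * MILLIS_PER_DAY;
        double fresh = customScore(1.0, recencyBoost(now, now, 30.0));
        double older = customScore(1.0, recencyBoost(thirtyDaysAgo, now, 30.0));
        System.out.println("fresh = " + fresh + ", older = " + older);
    }
}
```

Multiplying keeps the text score dominant for documents of similar age; the half-life controls how quickly older documents lose ground and would need tuning against real queries.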

Re: Question about Search formula

2008-01-11 Thread Grant Ingersoll

Have a look at the Similarity class and also the Scoring section of the website (Documentation -> Scoring on the left-hand side). This is a classic problem of dealing with TF/IDF and length normalization. Lucene makes general assumptions about what is best, but does allow you to tune as wel
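The interaction between term frequency and length normalization can be shown with a small calculation. This sketch assumes the square-root forms used by Lucene's DefaultSimilarity of that era, tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms), and leaves out IDF, boosts, and the lossy encoding of stored norms. `LengthNormDemo` is a hypothetical class and the document sizes are invented.

```java
class LengthNormDemo {
    // Term-frequency factor: grows with the square root of the occurrence count.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Length normalization: shorter fields get a larger factor.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Per-term weight with length normalization applied.
    static double termWeight(int freq, int numTerms) {
        return tf(freq) * lengthNorm(numTerms);
    }

    // Per-term weight if length normalization were neutralized (factor fixed at 1).
    static double termWeightNoLengthNorm(int freq) {
        return tf(freq);
    }

    public static void main(String[] args) {
        // Invented sizes: a 4-term slide mentioning "java" once,
        // and a 10000-term document mentioning it 100 times.
        System.out.println("slide, with norm:    " + termWeight(1, 4));
        System.out.println("document, with norm: " + termWeight(100, 10000));
        System.out.println("slide, no norm:      " + termWeightNoLengthNorm(1));
        System.out.println("document, no norm:   " + termWeightNoLengthNorm(100));
    }
}
```

Under these assumptions the short slide outweighs the long document when length normalization is applied, and the order reverses when the normalization factor is held at 1, which is the kind of tuning the Similarity class allows.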

Re: Design questions

2008-01-11 Thread Erick Erickson

See below On Jan 11, 2008 9:36 AM, <[EMAIL PROTECTED]> wrote: > Hi, > > > > You could even store all of the page offsets in your > > meta-data document > > in a special field if you wanted, then lazy-load that field > > rather than > > dynamically counting. > > How can I lazy load a field? > See

Question about Search formula

2008-01-11 Thread thrgroovyboy

Hi, When I am searching with Lucene, the formula takes into account the total number of words in the document. For example, an indexed one-slide PowerPoint file with the term "JAVA" is more relevant than a 50-page Word document on JAVA. It is a problem for me, the Word document on Java should be most r

RE: Design questions

2008-01-11 Thread spring

Hi, > You could even store all of the page offsets in your > meta-data document > in a special field if you wanted, then lazy-load that field > rather than > dynamically counting. How can I lazy load a field? > You'd have to be careful that your offsets > corresponded to the data *after* it

Retrieve the number of deleted documents

2008-01-11 Thread Shai Erera

Hi I didn't find a proper API on IndexWriter or IndexReader to retrieve the total number of deleted documents. Will IndexReader.maxDoc() - IndexReader.numDocs() give the correct result? or this is just a heuristic? Thanks, Shai

Re: CachingWrapperFilter: why cache per IndexReader?

2008-01-11 Thread Toke Eskildsen

On Tue, 2008-01-01 at 15:06 -0500, Mark Miller wrote: > Perhaps, in some esoteric case, multiple readers is the right idea > (monster, monster, super IO system, static index?? maybe...)...but > unless you have run into this case and have some data to show it, I > would stick with what the commun


16 matches
