Re: Lucene index question

2009-08-22 Thread marquinhocb
Hi Anshum, Thanks so much for the quick response! I think that pretty much covers it. I was worried that having to delete the document and re-add it simply because a date field has been updated would make my indexing quite slow. Seems however that's not something I'll have to worry about. Than

Re: Using HitCollector to Collect First N Hits

2009-08-22 Thread Simon Willnauer
Hey Len, one more thing... while I have no idea about your use case, if you don't care about the score you could expand the terms yourself, just like PrefixQuery does. simon On Sat, Aug 22, 2009 at 6:19 PM, Simon Willnauer wrote: > Hi Len, > > On Sat, Aug 22, 2009 at 5:51 PM, Len Takeuchi wrote:

Re: Using HitCollector to Collect First N Hits

2009-08-22 Thread Simon Willnauer
Hi Len, On Sat, Aug 22, 2009 at 5:51 PM, Len Takeuchi wrote: > Hello, > > I have attached the original thread from where I got my information at the > very bottom in case it is of any help. In regards to whether I want just a > boolean retrieval model, in the usage we are currently discussing, the

Re: Using HitCollector to Collect First N Hits

2009-08-22 Thread Len Takeuchi
Hello, I have attached the original thread from where I got my information at the very bottom in case it is of any help. In regards to whether I want just a boolean retrieval model, in the usage we are currently discussing, the answer is yes (I don't care about the score). However, we also do

Re: How to het the score in percentage

2009-08-22 Thread Shashi Kant
Chris & Erick's arguments are persuasive; however, we do live in an imperfect world. Most of our users want to see the relative importance of a result vis-a-vis the rest. Relative Importance (%) = (d - dmin)/(dmax-dmin) * 100 Where dmax is the highest Lucene score (score of top result) and dm
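The formula quoted above is a min-max rescaling of the scores in one result set. Here is a minimal sketch of it in plain Java, with no Lucene classes involved. The message is cut off before it defines dmin, so the code assumes dmin is the lowest score in the same result set. The class name RelativeImportance, the method toPercent, and the choice to report 100 when all scores are equal are illustrative assumptions, not part of the original post.

```java
import java.util.Arrays;

class RelativeImportance {
    // Min-max rescaling: (d - dmin) / (dmax - dmin) * 100.
    // Assumption: dmin is the lowest score in this result set.
    static double[] toPercent(double[] scores) {
        if (scores.length == 0) {
            return new double[0];
        }
        double dmin = Arrays.stream(scores).min().getAsDouble();
        double dmax = Arrays.stream(scores).max().getAsDouble();
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // Assumption: if every score is equal, report 100 for each
            // result instead of dividing by zero.
            out[i] = (dmax == dmin)
                    ? 100.0
                    : (scores[i] - dmin) / (dmax - dmin) * 100.0;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] scores = {2.0, 1.5, 1.0};
        System.out.println(Arrays.toString(toPercent(scores)));
    }
}
```

Note that the result is relative to the current result set only: the top hit always shows 100 and the lowest hit always shows 0, however well either one matches the query.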

Re: Incremental Indexing in PyLucene

2009-08-22 Thread Simon Willnauer
This is the Java users mailing list - you will get help on the PyLucene user mailing list: http://lucene.apache.org/pylucene/resources/mailing_lists.html simon On Sat, Aug 22, 2009 at 2:56 PM, mayank juneja wrote: > Hi, > > I am building a database of text files using PyLucene. I need to add n

RE: Incremental Indexing in PyLucene

2009-08-22 Thread Uwe Schindler
This mailing list is about Lucene Java! Please ask your question in the PyLucene list. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: mayank juneja [mailto:mayankjune...@gmail.com] > Sent: Saturday, Augu

Re: How to het the score in percentage

2009-08-22 Thread jlman
hossman wrote: > > > : here ie, in our existing system we are showing the search score in > : percenetage but lucene provides the search score in numbers which is > derived > : from some internal logic. Can anybody give some tips for converting the > : lucene score to percentage or is there any

Incremental Indexing in PyLucene

2009-08-22 Thread mayank juneja
Hi, I am building a database of text files using PyLucene. I need to add new text files to the index at regular intervals. Since I am a beginner, I do not know how to build the index incrementally. Can anyone guide me on how to accomplish the task? Any kind of help would be appreciated. Thanks, Maya

Re: Using HitCollector to Collect First N Hits

2009-08-22 Thread Simon Willnauer
Hi Len, what kind of query do you execute when you collect the hits? HitCollector should be called for each document by the time it is scored. Is it possible that you run a query that could be expensive in terms of term expansion like WildcardQuery? simon On Sat, Aug 22, 2009 at 7:09 AM, Len Tak

Re: Query parser fails on Hangul/Korean

2009-08-22 Thread Simon Willnauer
Paul, my first guess would be that your source file encoding is set to something other than UTF-8. Those characters should be supported by Lucene - none of them are > 16bit so I don't see why this should be caused by Lucene. I'm pretty sure that's an encoding issue. Are you running on Windows?! hope th

Re: Using HitCollector to Collect First N Hits

2009-08-22 Thread Rafis
Len Takeuchi-2 wrote: > > I’m using Lucene 2.4.1 and I’m trying to use a custom HitCollector to > collect > only the first N hits (not the best hits) for performance. I saw another > e-mail in this group where they mentioned writing a HitCollector which > throws > an exception after N hits to d
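The approach described in the quoted message, a collector that throws an exception after N hits, can be sketched as below. This is a self-contained illustration that does not use Lucene: DocCallback is a hypothetical stand-in for the callback shape of HitCollector.collect(int doc, float score), and search() is a hypothetical loop that offers matching document ids in index order. LimitReached, FirstN and firstN are also illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class FirstNHits {
    // Hypothetical stand-in for HitCollector's collect(int doc, float score).
    interface DocCallback {
        void collect(int doc, float score);
    }

    // Unchecked exception used purely as a stop signal.
    static class LimitReached extends RuntimeException {
    }

    // Records document ids in arrival order and aborts once the limit is hit.
    static class FirstN implements DocCallback {
        final int limit;
        final List<Integer> docs = new ArrayList<>();

        FirstN(int limit) {
            this.limit = limit;
        }

        @Override
        public void collect(int doc, float score) {
            docs.add(doc); // the score is ignored on purpose
            if (docs.size() >= limit) {
                throw new LimitReached();
            }
        }
    }

    // Hypothetical search loop: offers every matching doc id to the callback.
    static void search(int[] matches, DocCallback callback) {
        for (int doc : matches) {
            callback.collect(doc, 1.0f);
        }
    }

    static List<Integer> firstN(int[] matches, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        FirstN collector = new FirstN(n);
        try {
            search(matches, collector);
        } catch (LimitReached stop) {
            // Expected: the collector asked the search to stop early.
        }
        return collector.docs;
    }

    public static void main(String[] args) {
        System.out.println(firstN(new int[]{3, 7, 9, 12}, 2));
    }
}
```

Stopping early only saves the work of visiting the remaining documents. As noted elsewhere in this thread, a query that is expensive in terms of term expansion, such as WildcardQuery, may pay that cost before the first hit reaches the collector.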

Re: Using HitCollector to Collect First N Hits

2009-08-22 Thread AHMET ARSLAN
> I’m using Lucene 2.4.1 and I’m trying to use a custom > HitCollector to collect only the first N hits (not the best hits) for > performance. You mean that you do not need score calculation and therefore do not want results sorted by relevancy. All you need is a Boolean Retrieval Model, right?

Query parser fails on Hangul/Korean

2009-08-22 Thread Paul Taylor
public class Issue3341Test extends TestCase {
    public void testMatchHangul() throws Exception {
        Analyzer analyzer = new StandardAnalyzer();
        RAMDirectory dir = new RAMDirectory();
        IndexWriter writer = new IndexWriter(dir, analyzer, true, IndexWriter.MaxFieldLength.LIMITED);
        Document doc = new Doc