How Lucene Search

2008-06-26 Thread blazingwolf7
Hi, I am fairly new to Lucene and am currently going over its source code. I have read through the code a few times, mapping it out, but I seem to be facing a problem. I could follow it all the way to the calculation of the score for each result obtained, but strangely I did not manage to locate the

Highlight an Greek

2008-06-26 Thread jim
Hello, I have the following code to highlight a text: public String highlight(String text, String query ) throws IOException { TermQuery query = new TermQuery(new Term("f", query)); QueryScorer scorer = new QueryScorer(query); SimpleHTMLFormatter formatter = new SimpleHTMLForm

Re: case insensitivity

2008-06-26 Thread John Byrne
Chris Hostetter wrote: the enumeration is in lexicographical order, so "Dell" is nowhere near "dell" in the enumeration. Even if we added a boolean property to Terms indicating that it's a case-insensitive Term, the "seeking" along that enumeration would be ... less optimal ... than it can be now.
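As a rough illustration of the ordering point above (this is not Lucene code): assuming terms sort the way plain Java Strings do, which is only a stand-in for the term enumeration order, every upper-case ASCII letter sorts before every lower-case one, so "Dell" and "dell" land far apart. The names TermOrderDemo and sortedTerms are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class TermOrderDemo {
    // Returns the distinct terms in natural String order (compares UTF-16 code units).
    // Assumption: this stands in for the order of a term enumeration.
    static List<String> sortedTerms(List<String> terms) {
        return new ArrayList<>(new TreeSet<>(terms));
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("dell", "Dell", "Zebra", "apple");
        // 'D' (68) and 'Z' (90) sort before 'a' (97) and 'd' (100).
        System.out.println(sortedTerms(terms)); // prints [Dell, Zebra, apple, dell]
    }
}
```

A case-insensitive lookup for "dell" would therefore have to seek to two distant positions in such an enumeration rather than scan one contiguous run.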

Highlight and greek characters

2008-06-26 Thread jim
Hello, I have the following code to highlight a text: public String highlight(String text, String query ) throws IOException { TermQuery query = new TermQuery(new Term("f", query)); QueryScorer scorer = new QueryScorer(query); SimpleHTMLFormatter formatter = new SimpleHTMLForm

Re: IndexDeletionPolicy to delete commits after N minutes

2008-06-26 Thread Michael McCandless
The unit test for DeletionPolicy has an example called "ExpirationTimeDeletionPolicy". You can see its source here: http://svn.apache.org/viewvc/lucene/java/tags/lucene_2_3_2/src/test/org/apache/lucene/index/TestDeletionPolicy.java?revision=653677&view=markup Note that the DeletionPolicy is g
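The idea of expiring commits after N minutes can be sketched without any Lucene classes. This is a simplified model, not the ExpirationTimeDeletionPolicy from the linked test: it assumes commit timestamps arrive oldest first, that the newest commit is always kept, and that a commit expires once it is more than N minutes older than the newest one. ExpirationSketch and expiredCommits are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class ExpirationSketch {
    // commitTimesMillis: commit timestamps, oldest first (assumption).
    // Returns the timestamps of commits that should be deleted.
    static List<Long> expiredCommits(List<Long> commitTimesMillis, long maxAgeMinutes) {
        List<Long> expired = new ArrayList<>();
        if (commitTimesMillis.isEmpty()) {
            return expired;
        }
        long newest = commitTimesMillis.get(commitTimesMillis.size() - 1);
        long maxAgeMillis = maxAgeMinutes * 60_000L;
        // Skip the last entry: the newest commit is never deleted.
        for (int i = 0; i < commitTimesMillis.size() - 1; i++) {
            long t = commitTimesMillis.get(i);
            if (newest - t > maxAgeMillis) {
                expired.add(t);
            }
        }
        return expired;
    }
}
```

In a real policy the same decision would be applied to the commit points handed to the policy's callbacks rather than to bare timestamps.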

Re: Score 0

2008-06-26 Thread Toke Eskildsen
On Wed, 2008-06-25 at 21:47 +0200, Paolo Valleri wrote: > To get the docid of every document in the index, do I need to write a class > that implements IndexReader, or is there another method? Something along the following should work and be quite fast. However, it might be overly complex. // Do this

Re: How Lucene Search

2008-06-26 Thread Grant Ingersoll
The index is opened via the IndexReader, and then a variety of places factor into scoring, such as the Query, Weight and Scorer classes. Probably the easiest place to start is the TermQuery and TermScorer. You might also have a read through http://lucene.apache.org/java/2_3_2/sc

Re: Highlight an Greek

2008-06-26 Thread Grant Ingersoll
Because there are no matches? Have you checked your index, etc? Do you get matches for that query normally in Greek against your index (never mind highlighting)? Are your analyzers the same? Are your English Fields stored and the Greek ones not? Does field "f" contain Greek? It could b

Class for serializing TokenStream to IndexOutput

2008-06-26 Thread Jason Rutherglen
Is there a class to do this?

Re: IndexDeletionPolicy to delete commits after N minutes

2008-06-26 Thread Erick Erickson
Please don't post the same message multiple times to the user list within a few minutes of each other. It's pretty rude. Best Erick On Thu, Jun 26, 2008 at 5:34 AM, Michael McCandless < [EMAIL PROTECTED]> wrote: > The unit test for DeletionPolicy has an example called > "ExpirationTimeDeletionPo

Preventing index corruption

2008-06-26 Thread Eran Sevi
Hi, I'm looking for the correct way to create an index given the following restrictions: 1. The documents are received in batches of variable sizes (no more than 100 docs in a batch). 2. The batch insertion must be transactional - either the whole batch is added to the index (exists physically o

Re: Preventing index corruption

2008-06-26 Thread Erick Erickson
How big is your index? The simpleminded way would be to copy things around as your batches come in and only switch to the *real* one after the additions were verified. You could also just maintain two indexes but only update one at a time. In the 99.99% case where things went well, it would just b
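The "copy, update, then switch" suggestion above can be sketched in memory. This is only a model of the idea and makes several assumptions: a List of Strings stands in for the index, an empty document stands in for a failed addition, and StagedIndex and addBatch are hypothetical names, not Lucene APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class StagedIndex {
    // The "real" index that searches would read from.
    private List<String> live = new ArrayList<>();

    // Applies the whole batch to a copy and switches only if every document was accepted.
    boolean addBatch(List<String> batch) {
        List<String> staging = new ArrayList<>(live);
        for (String doc : batch) {
            if (doc == null || doc.isEmpty()) {
                return false; // verification failed: the live index is left untouched
            }
            staging.add(doc);
        }
        live = staging; // switch to the verified copy
        return true;
    }

    List<String> documents() {
        return Collections.unmodifiableList(live);
    }
}
```

With an on-disk index the copy would be a directory copy, which is why the size of the index matters for this approach.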

Re: Preventing index corruption

2008-06-26 Thread Eran Sevi
Thanks Erick. You might be joking, but one of our clients indeed had all his servers destroyed in a flood. Of course in this rare case, a solution would be to keep the backup on another site. However I'm still confused about normal scenarios: Let's say that in the middle of the batch I got an exc

Re: How Lucene Search

2008-06-26 Thread Alex Cheng
the debugger that came with eclipse is pretty good for this purpose. You can create a small project and then attach Lucene source for the purpose of debugging.

Re: Problem with search an exact word and stemming

2008-06-26 Thread Matthew Hall
You could also add another data field to the index, with an untokenized version of your data, and then use a multifield query to go against both the stemmed and exact match parts of your search at the same time. This is a technique I've used quite often on my project with various different req

possible to read index into memory?

2008-06-26 Thread Darren Govoni
Hi, Is there a lucene index reader that will load a disk-based index into memory and perform searches on it from RAM? Sorry if I missed this in the docs somewhere. Darren

Re: possible to read index into memory?

2008-06-26 Thread Erick Erickson
From the docs... RAMDirectory public RAMDirectory(Directory dir) throws IOException Creates a new RAMDirectory instance from a different Directory implementation. This can be used to load a disk-based index into mem

Re: How Lucene Search

2008-06-26 Thread blazingwolf7
Thanks for the reply. I have already tried starting a new project. As I mentioned, I went through the code from the start of the application to the end, where the scoring is done. But unfortunately, I still failed to locate the part where Lucene opens the index to perform the search. A

Can you create a Field that is a copy of another Field?

2008-06-26 Thread Bill.Chesky
Hello Lucene Gurus, I'm new to Lucene so sorry if this question is basic or naïve. I have a Document to which I want to add a Field named, say, "foo" that is tokenized, indexed and unstored. I am using the "Field(String name, TokenStream tokenStream)" constructor to create it. The TokenStr

Can we know "number-of-documents-that-will-be-flushed"?

2008-06-26 Thread java_is_everything
Hi all. Is there a way to know "number-of-documents-that-will-be-flushed", just before giving a call to flush() method? I am currently using Lucene 2.2.0 API. Looking forward to replies. Ajay Garg