Re: Highlight the searched word when full-text searching performed

2005-11-27 Thread Jerry Stern
Thanks! Yes, you're right. Highlighting all hits at once may cause problems, so a hit-paging method is needed to avoid this. Also, if we read the contents of the original file into a string and pass it to the highlighter at the search stage, this could also cause problems when large orig
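The hit-paging idea mentioned above could look roughly like the following. This is only a minimal sketch in plain Java, not Lucene's Highlighter API: the class name `PagedHighlighter`, its methods, and the `<b>` markup are all hypothetical, and the hits are assumed to be already available as plain strings. Only the hits on the requested page are highlighted, so the cost per request stays bounded by the page size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: highlights only the hits on one page of results.
public class PagedHighlighter {

    /** Returns the hits for one 0-based page; an out-of-range page yields an empty list. */
    public static List<String> page(List<String> hits, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            return new ArrayList<>();
        }
        int from = pageIndex * pageSize;
        if (from >= hits.size()) {
            return new ArrayList<>();
        }
        int to = Math.min(from + pageSize, hits.size());
        return new ArrayList<>(hits.subList(from, to));
    }

    /** Wraps each case-insensitive occurrence of term in <b>...</b> (assumed markup). */
    public static String highlight(String text, String term) {
        if (term.isEmpty()) {
            return text; // nothing to highlight
        }
        Pattern p = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text before the match, then the match itself wrapped in tags.
            out.append(text, last, m.start()).append("<b>").append(m.group()).append("</b>");
            last = m.end();
        }
        return out.append(text.substring(last)).toString();
    }

    /** Highlights only the hits that fall on the requested page. */
    public static List<String> highlightPage(List<String> hits, String term,
                                             int pageIndex, int pageSize) {
        List<String> result = new ArrayList<>();
        for (String hit : page(hits, pageIndex, pageSize)) {
            result.add(highlight(hit, term));
        }
        return result;
    }
}
```

For example, `PagedHighlighter.highlightPage(hits, "lucene", 0, 10)` would process at most the first ten hits. The large-file concern raised in the message is not addressed by this sketch.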

Lucene and DB interactions

2005-11-27 Thread Assad Jarrahian
I have some specific questions (using a pure Java solution with PostgreSQL). Any comments/info would be much appreciated. 1) I build an index. Then, as the db gets more "documents" inserted into it, they need to be added to the index. I was thinking of checking for X documents (will store in a vecto

How to Use Memoryindex for Lots of Queries With Sort?

2005-11-27 Thread Victor Lee
Hi, I am using MemoryIndex as described here: http://dsd.lbl.gov/nux/api/org/apache/lucene/index/memory/MemoryIndex.html . I am using it to match lots (10 thousand) of queries against one document. Then I want to rank them based on score and some other variables. I want to know if there i

Re: MS-Word docs.

2005-11-27 Thread Steven Bell
I did run down the issue. And it's a case of tired coder. I wasn't creating a new document object in the method I was using to handle word documents. Thanks very much for the links guys, I appreciate it! steve. Chris Hostetter wrote: : I dump the doc files into a text file with the same var

Re: Highlight the searched word when full-text searching performed

2005-11-27 Thread Erik Hatcher
On 27 Nov 2005, at 00:24, Jerry Stern wrote: I wonder how to highlight the searched word when a full-text search is performed based on Lucene. At the indexing stage, the contents of an original file are regarded as a FIELD of a Lucene document: private static void indexFile(File file, Ind

Re: MS-Word docs.

2005-11-27 Thread Chris Hostetter
: I dump the doc files into a text file with the same variable I use in : the Lucene doc.add(Field.UnStored("content", textStr)); and they look : fine in the file. However searches return nothing. if i'm reading that sentence correctly, then you are saying that you've tried isolating your MS-Wor

IndexReader locking

2005-11-27 Thread Daniel Noll
While writing a simple stress testing exercise, I came across the strange condition that the IndexReader locks the index even though it's only supposed to be reading. Now, I understand that IndexReader can in fact modify the index (no matter how unintuitive that is) but it seems to me that a l

Re: MS-Word docs.

2005-11-27 Thread Otis Gospodnetic
Hello Steven, There is a small ready-to-use framework in Lucene in Action that you can use to index MS Word, PDF, RTF, XML, and plain-text docs - http://lucenebook.com/ . I suggest you stick with POI libraries, as it looks like Textmining code is no longer maintained. Otis --- Steven Bell <[EMAI