Re: SpanXXQuery Usage

2004-03-22 Thread Terry Steichen
Otis, Can you give me/us a rough idea of what these are supposed to do? It's hard to extrapolate the terse unit test code into much of a general notion. I searched the archives with little success. Regards, Terry - Original Message - From: Otis Gospodnetic [EMAIL PROTECTED] To:

Indexing japanese PDF documents

2004-03-22 Thread Chandan Tamrakar
I am using the latest PDFBox library for parsing. I can parse English documents successfully, but when I parse a document containing both English and Japanese I do not get what I expected. Has anyone tried using the PDFBox library for parsing Japanese documents? Or do I need to use another parser like xPDF

Re: CJK Analyzer indexing japanese word document

2004-03-22 Thread Chandan Tamrakar
Hi Scott, thanks for your advice. I am now using POI to convert Word documents, and I make sure that I convert the text to Unicode before I put it into Lucene for indexing. It is working perfectly fine. Which parser is best for parsing PDF documents? I tried PDFBox but it seems it doesn't work well with Japanese

Re: Final Hits

2004-03-22 Thread Erik Hatcher
How exactly would you take advantage of a subclassable Hits class? On Mar 21, 2004, at 6:01 AM, Terry Steichen wrote: Does anyone know why the Hits class is final (thus preventing it from being subclassed)? Regards, Terry -

Re: Indexing japanese PDF documents

2004-03-22 Thread Otis Gospodnetic
I have not tried these other tools yet. Have you asked Ben Litchfield, the PDFBox author, about handling of Japanese text? Otis --- Chandan Tamrakar [EMAIL PROTECTED] wrote: I am using the latest PDFBox library for parsing. I can parse English documents successfully, but when I parse a document

Re: Indexing japanese PDF documents

2004-03-22 Thread Ben Litchfield
Yes he did, but I was away the past couple days. As this is more of a PDFBox issue I responded in the PDFBox forums, please follow the thread there if you are interested. Ben On Mon, 22 Mar 2004, Otis Gospodnetic wrote: I have not tried these other tools yet. Have you asked Ben

Re: Demoting results

2004-03-22 Thread Boris Goldowsky
On Fri, 2004-03-19 at 11:58, Doug Cutting wrote: Doug Cutting wrote: On Thu, 2004-03-18 at 13:32, Doug Cutting wrote: Have you tried assigning these very small boosts (0 < boost < 1) and assigning other query clauses relatively large boosts (boost > 1)? I don't think you understood my

Re: Specifation of the Key words to be searched

2004-03-22 Thread Otis Gospodnetic
Re-directing to lucene-user list. One way of doing this is by writing a custom Analyzer that throws away words you don't want to index (see an example of custom Analyzer in jGuru FAQ). Another way would be to just re-use the existing Analyzers and add words you don't want indexed to the

Re: Final Hits

2004-03-22 Thread Terry Steichen
Erik, There are a number of different possibilities which I'm still evaluating. But if there is some significant reason for *not* subclassing Hits (performance?), that will have a major bearing on whether the approach I'm evaluating makes sense. So, let me rephrase my question: Is the final

Re: Final Hits

2004-03-22 Thread Erik Hatcher
Terry, I'm still quite curious how you plan to take advantage of a subclassable Hits. Are you going to create your own IndexSearcher that returns your subclass somehow? You could use a HitCollector (which is what is used under the covers of the Hits-returning methods anyway) to emulate

code works with 1.3-rc1 but not with 1.3-final??

2004-03-22 Thread Dan
I have some code that creates a lucene index. It has been working fine with lucene-1.3-rc1.jar but I wanted to upgrade to lucene-1.3-final.jar. I did this and the indexer breaks. I get the following error when running the index with 1.3-final: Optimizing the index IOException:

termPosition does not iterate properly in Lucene 1.3 rc1

2004-03-22 Thread Allen Atamer
Lucene does not iterate through the termPositions on one of my indexed data sources. It used to iterate properly through this data source, but not anymore. I tried on a different indexed data source and it iterates properly. The Lucene index directory does not have any lock files either. My code

Re: code works with 1.3-rc1 but not with 1.3-final??

2004-03-22 Thread Kevin A. Burton
Dan wrote: I have some code that creates a lucene index. It has been working fine with lucene-1.3-rc1.jar but I wanted to upgrade to lucene-1.3-final.jar. I did this and the indexer breaks. I get the following error when running the index with 1.3-final: Optimizing the index IOException:

Re: code works with 1.3-rc1 but not with 1.3-final??

2004-03-22 Thread Matt Quail
Or use IndexWriter.setUseCompoundFile(true) to reduce the number of files created by Lucene. http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/index/IndexWriter.html#setUseCompoundFile(boolean) =Matt Kevin A. Burton wrote: Dan wrote: I have some code that creates a lucene index. It

Lock timeout should show the index it failed on...

2004-03-22 Thread Kevin A. Burton
Just an RFE... if a lock times out, the exception we throw should probably include the name of the FSDirectory (or say that it's a RAMDirectory) ... I'm lazy so this is a reminder for either myself to do this or wait until one of you guys takes care of it :) Kevin -- Please reply using PGP.