question about (problem with?) use of FieldCache$StringIndex

2009-06-25 Thread Ulf Dittmer
Hello- We're investigating memory issues in a fair-sized web app that uses Lucene for search. While looking at heap dumps, we discovered 3 instances of org.apache.lucene.search.FieldCache$StringIndex, each about 110 MB in size (out of a total of 1 GB). Looking

Re: question about (problem with?) use of FieldCache$StringIndex

2009-06-25 Thread Ulf Dittmer
Otis Gospodnetic wrote: > FieldCache class is used for sorting. Are you sorting by a few different fields by any chance? Yes, we're sorting on one or two fields, depending on user settings. Uwe Schindler wrote: > This class is used when you sort your results against a field which contains
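
To make the memory discussion above concrete, here is a simplified, Lucene-free model of what a string sort cache of this kind holds: a sorted array of the distinct field values plus one int per document pointing into it. The class and method names (`StringIndexSketch`, `build`, `compareDocs`) are made up for illustration, and this is a sketch of the idea under that assumption, not Lucene's actual implementation.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class StringIndexSketch {
    /** Sorted distinct field values (hypothetical stand-in for the cached terms). */
    final String[] lookup;
    /** For each document number, the position of its value in lookup. */
    final int[] order;

    StringIndexSketch(String[] lookup, int[] order) {
        this.lookup = lookup;
        this.order = order;
    }

    /** Builds both arrays from one non-null field value per document. */
    static StringIndexSketch build(String[] valuePerDoc) {
        // TreeSet removes duplicates and keeps the values in sorted order.
        String[] lookup = new TreeSet<>(Arrays.asList(valuePerDoc)).toArray(new String[0]);
        int[] order = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            order[doc] = Arrays.binarySearch(lookup, valuePerDoc[doc]);
        }
        return new StringIndexSketch(lookup, order);
    }

    /** Sorting two hits then only needs an int comparison, not a string comparison. */
    static int compareDocs(StringIndexSketch idx, int docA, int docB) {
        return Integer.compare(idx.order[docA], idx.order[docB]);
    }

    public static void main(String[] args) {
        StringIndexSketch idx = build(new String[] {"pear", "apple", "pear", "fig"});
        System.out.println(Arrays.toString(idx.lookup));
        System.out.println(Arrays.toString(idx.order));
    }
}
```

In this model the footprint grows with both the number of documents (the int array) and the number of distinct values (the string array), and one such structure would exist per sorted field, which is at least consistent with seeing several large instances in a heap dump.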

differences in deleting docs using IndexWriter and IndexModifier?

2008-04-18 Thread Ulf Dittmer
Hello all- While adapting some code to use IndexWriter instead of IndexModifier (as indicated by the deprecation warnings), I stumbled upon an issue that I at first thought was a bug, but I'm sure it's only because I don't fully understand how Lucene works. Basically, I'm using the deleteDocument

Re: differences in deleting docs using IndexWriter and IndexModifier?

2008-04-18 Thread Ulf Dittmer
Thanks for the explanation. You're right, IndexReader reports the correct number of documents. That might be a worthwhile addition to the IndexModifier javadocs - that the IndexWriter method of the same name is not a drop-in replacement. Of course, that's moot if docCount gets deprecated anyway.

Re: Searching through a single XML document

2008-04-20 Thread Ulf Dittmer
The search can't return more than one document, because only a single document is ever added to the index. You might want to think about structuring the index differently, e.g. by creating one Document for each SPEECH element. The search for "the" in particular won't find anything, because that's
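
The suggestion above (one Document per SPEECH element) could start from something like the following sketch, which uses only the JDK's DOM parser to pull each SPEECH element out as a separate piece of text. The class name `SpeechSplitter`, the method `speeches`, and the sample XML are assumptions for illustration; turning each entry into a Lucene Document is left out because it depends on the Lucene version in use.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SpeechSplitter {
    /** Returns the trimmed text content of every SPEECH element, one entry per element. */
    static List<String> speeches(String xml) {
        try {
            org.w3c.dom.Document dom = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = dom.getElementsByTagName("SPEECH");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            // Parsing problems are rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<PLAY><SPEECH>To be, or not to be</SPEECH>"
                   + "<SPEECH>The rest is silence</SPEECH></PLAY>";
        for (String speech : speeches(xml)) {
            // In a real indexer, each entry would become its own index document here.
            System.out.println(speech);
        }
    }
}
```

With the text split this way, a search can return as many hits as there are matching speeches instead of at most one.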

Re: Binding lucene instance/threads to a particular processor(or core)

2008-04-21 Thread Ulf Dittmer
This sounds odd. Why would restricting it to a single core improve performance? The point of using multiple cores (and multiple threads) is to improve performance, isn't it? I'd leave thread scheduling decisions to the JVM. Plus, I don't think there is anything in Java to facilitate this (short of u

Re: Really dumb search problem

2008-04-25 Thread Ulf Dittmer
Have you tried double-quoting the postcode instead of using parentheses? postcode:"M11 1LQ" Ulf --- Chris Mannion <[EMAIL PROTECTED]> wrote: > "(postcode:(M11 1LQ) )" > > However, the postcode search never returns any results.

search problem - not finding field values ending in "X"

2008-05-16 Thread Ulf Dittmer
Hello- I'm experiencing a weird issue searching an index. The index has information about books, and one of the fields is the ISBN number. It is stored in the index in untokenized form to enable searches by ISBN. So a query like "isbn:0071490833" would return the Document for that book. But it doe

Re: search problem - not finding field values ending in "X"

2008-05-16 Thread Ulf Dittmer
D'oh! Of course - I'm using StandardAnalyzer. Changing to a PerFieldAnalyzerWrapper with a KeywordAnalyzer for that field fixes the issue. Thanks so much for the fast response. Ulf --- Ian Lea <[EMAIL PROTECTED]> wrote: > Hi > > > I bet you are using an analyzer that is downcasing > isbn:00714
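
The mismatch described above can be reproduced without Lucene. In this sketch a set of exact strings stands in for the untokenized isbn field, and two small functions stand in for a lowercasing query analyzer and a keyword-style analyzer that leaves the term alone. All names here are hypothetical, and the ISBN ending in "X" is an illustrative value, not one taken from the thread.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class IsbnCaseSketch {
    /** Stand-in for an untokenized field: terms are stored exactly as given. */
    static final Set<String> INDEXED = new HashSet<>();
    static {
        INDEXED.add("0071490833");
        INDEXED.add("080442957X"); // illustrative value ending in an uppercase X
    }

    /** Models a query-time analyzer that lowercases terms. */
    static String lowercasing(String term) {
        return term.toLowerCase(Locale.ROOT);
    }

    /** Models a query-time analyzer that keeps the term unchanged. */
    static String keyword(String term) {
        return term;
    }

    /** Exact-match lookup, as for an untokenized term. */
    static boolean found(String analyzedTerm) {
        return INDEXED.contains(analyzedTerm);
    }

    public static void main(String[] args) {
        System.out.println(found(lowercasing("0071490833")));
        System.out.println(found(lowercasing("080442957X")));
        System.out.println(found(keyword("080442957X")));
    }
}
```

An all-digit ISBN survives lowercasing untouched, so only values containing a letter stop matching. That is why leaving that one field's query terms unanalyzed resolves it.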

question about ScoreDocComparator

2007-03-01 Thread Ulf Dittmer
Hello- One of the fields in my index is an ID, which maps to a full text description behind the scenes. Now I want to sort the search results alphabetically according to the description, not the ID. This can be done via SortComparatorSource and a ScoreDocComparator without problems. But t
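
The idea of ordering hits by the description behind an ID, rather than by the ID itself, can be sketched in plain Java as below. This is not the SortComparatorSource / ScoreDocComparator API; the class `DescriptionSortSketch`, the method `sortByDescription`, and the in-memory map are assumptions used only to show the comparison logic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DescriptionSortSketch {
    /**
     * Returns a copy of ids ordered alphabetically (ignoring case) by the
     * description each ID maps to. IDs without a description sort first.
     */
    static List<String> sortByDescription(List<String> ids, Map<String, String> descriptions) {
        List<String> sorted = new ArrayList<>(ids);
        sorted.sort(Comparator.comparing(
                (String id) -> descriptions.getOrDefault(id, ""),
                String.CASE_INSENSITIVE_ORDER));
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, String> descriptions = new HashMap<>();
        descriptions.put("7", "pear");
        descriptions.put("3", "Apple");
        descriptions.put("9", "fig");

        List<String> ids = new ArrayList<>();
        ids.add("7");
        ids.add("3");
        ids.add("9");

        System.out.println(sortByDescription(ids, descriptions));
    }
}
```

A custom comparator inside Lucene would perform the same lookup per hit, so keeping the ID-to-description map in memory matters for sort speed.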

Re: Package org.apache.lucene.search.highlight

2007-03-04 Thread Ulf Dittmer
The contrib/highlighter directory contains the jar file that is needed. Ulf On 04.03.2007, at 10:58, WATHELET Thomas wrote: How can I add the package org.apache.lucene.search.highlight to my projects, because the standard Lucene API 2.1.0 does not contain this package?

Re: question about ScoreDocComparator

2007-03-06 Thread Ulf Dittmer
at search time? You can have a Hits object or TopFieldDocs object returned (the Filter in some of these calls can be null). Best Erick On 3/1/07, Ulf Dittmer <[EMAIL PROTECTED]> wrote: Hello- One of the fields in my index is an ID, which maps to a full text description behind the scen

Re: indexing pdfs

2007-03-08 Thread Ulf Dittmer
For DOC files you can use the Jakarta POI library. Text extraction is outlined here: http://jakarta.apache.org/poi/hwpf/quick-guide.html Ulf On 08.03.2007, at 10:37, ashwin kumar wrote: Hi, can someone help me by giving any sample programs for indexing PDFs and .doc files?