Note that ISOLatin1AccentFilter converts accented characters only from
the ISO-8859-1 character set, which means that if you need to convert
accents of eastern European languages you need to write your own
accent filter.
wojtek
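For what it's worth, a custom filter for that case could be built around a folding routine like the one below. This is only a sketch under a few assumptions: it relies on java.text.Normalizer from the JDK rather than on any Lucene API, the names AccentFolder and fold are made up for illustration, and wiring the routine into a TokenFilter is left out. Letters such as the Polish ł have no canonical decomposition, so they need an explicit mapping; the small table here is certainly not complete.

```java
import java.text.Normalizer;

/** Illustrative helper (not part of Lucene): folds accents beyond ISO-8859-1. */
final class AccentFolder {
    private AccentFolder() {}

    static String fold(String input) {
        // Canonical decomposition splits a letter from its combining accent marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        StringBuilder out = new StringBuilder(decomposed.length());
        for (int i = 0; i < decomposed.length(); i++) {
            char c = decomposed.charAt(i);
            if (Character.getType(c) == Character.NON_SPACING_MARK) {
                continue; // drop the combining accent, keep the base letter
            }
            switch (c) {
                // Letters with a stroke do not decompose, so map them by hand.
                case '\u0142': out.append('l'); break; // l with stroke
                case '\u0141': out.append('L'); break; // L with stroke
                case '\u0111': out.append('d'); break; // d with stroke
                case '\u0110': out.append('D'); break; // D with stroke
                default: out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, AccentFolder.fold("Łódź") gives "Lodz".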
2008/7/16 Petite Abeille [EMAIL PROTECTED]:
On Jul 16, 2008, at 10:58 AM,
Thank you for the answer. So it means that I can, without any problems,
iterate over the index documents using this algorithm (I don't want to use
MatchAllDocsQuery):
- check maxDoc()
- iterate from 0 to maxDoc() - 1 and process each doc if it is not deleted
Am I right?
Best,
wojtek
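The two steps could look roughly like this. It is a sketch only: DocSource is a hypothetical interface that mirrors the maxDoc(), isDeleted() and document() calls mentioned in this thread, so that the loop compiles without Lucene on the classpath. With a real IndexReader the document type and the checked exceptions would differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in exposing only the three reader methods named in the thread. */
interface DocSource {
    int maxDoc();
    boolean isDeleted(int docId);
    String document(int docId);
}

final class LiveDocScan {
    private LiveDocScan() {}

    /** Visits ids 0 .. maxDoc() - 1 and collects every document not flagged as deleted. */
    static List<String> liveDocuments(DocSource reader) {
        List<String> live = new ArrayList<>();
        int max = reader.maxDoc(); // read once; the upper bound is exclusive
        for (int id = 0; id < max; id++) {
            if (reader.isDeleted(id)) {
                continue; // skip deleted slots
            }
            live.add(reader.document(id));
        }
        return live;
    }

    /** Toy in-memory source, only for trying the loop out. */
    static DocSource inMemory(final String[] docs, final boolean[] deleted) {
        return new DocSource() {
            public int maxDoc() { return docs.length; }
            public boolean isDeleted(int docId) { return deleted[docId]; }
            public String document(int docId) { return docs[docId]; }
        };
    }
}
```

The deleted check comes before the document is fetched, so a deleted slot is never loaded.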
2008/4/12, Chris Hostetter
Hi all,
I am wondering whether there can be holes in the set of index document
ids. To be more specific: is it possible that there exists an integer i
with 0 <= i < IndexReader.maxDoc() such that
reader.document(i) == null
and
reader.isDeleted(i) == false
?
Regards,
wojtek
Hi all,
Snowball stemmers are part of Lucene, but only for a few languages. We
have documents in various languages and so need stemmers for many
languages (in particular Polish). One of the ideas is to use ispell
dictionaries. There are ispell dicts for many languages and so this
solution is good
Hi all,
Our problem is to choose the best (fastest) way to iterate over a huge set
of documents (the basic and most important case is iterating over all documents
in the index). Some slow process accesses the documents, and currently this is
done by repeating a query (for instance MatchAllDocsQuery). It
noticeable difference between the first and last request unless you're
doing something like accessing the documents before you get to
the first one you expect to return. And a TopDocs should even
preserve scoring...
Best
Erick
On Wed, Mar 26, 2008 at 5:48 AM, Wojtek H [EMAIL
Hi all,
Suppose my query has a normal part, which I want scored as usual, and
another part that is a big disjunction (OR) query, for which I just want
documents to match and don't care about scoring. Is there a way to
make it fast?
As far as I understand, if the 'no-score' part were the same in many queries