Hello folks,
I have some Lucene indexes in my project. Most of them are created once and
updated infrequently, about once a week or once a month. The index sizes are
about 20 GB, and they grow as more inserts are done, so I'd like to know
what the best index optimization strategy or e
Optimize is rarely useful. It can give some performance gains, but it is quite
an expensive operation. Before Solr 7.5, optimizing had some behaviors that
weren't obvious, see:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
Post 7.5, the behavior has changed.
I
I tested this on Lucene 7.7.2 and got the same result: MAINS cannot find
MAIN, but all other consonant combinations at the end can be found.
I am now confident that this is a bug in Lucene.
Best regards
PS. Lucene 8.1 has drastic changes, such as StandardFilter being removed in
one of the packages and
Hi,
Do you mean there is a backward-compatibility factory in Lucene for
these kinds of cases?
I think it can be fixed like this. In other words, is the following
first line redundant then?
TokenStream filter = new StandardFilter(tokenizer); -> redundant
(tokenizer is actually a StandardT
Corrected a typo below in the new code.
Best regards
On 6/25/19 5:01 PM, baris.ka...@oracle.com wrote:
Hi,
Do you mean there is a backward-compatibility factory in Lucene for
these kinds of cases?
I think it can be fixed like this. In other words, is the following
first line redundant t
Yeah, that code looks right to me.
The factory we use for keeping backwards compatibility is entirely
ours. I think CustomAnalyzer is a similar-looking API to what we have
but we made ours much earlier and it supports analysis stuff all the
way back to Lucene 3 which we migrated all the way to whe
Hi,
I really want to know why the scoring works this way: the search string is
either MAINO or MAINS, and MAIN appears as the 276th entry in the results.
NEW HAMPSHIRE in results: city="NASHUA" municipality="HILLSBOROUGH"
region="NEW HAMPSHIRE" country="UNITED STATES" in the 0th result
NEW HAMPSHI
You can use IndexSearcher#explain to see how scores are computed.
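For what it's worth, here is a minimal sketch of that call, assuming Lucene 8.x on the classpath. The class name ExplainSketch, the helper methods, and the second sample document are made up for illustration; the field names are borrowed from the result quoted in the question. It is not the poster's actual index or query.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Explanation;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class ExplainSketch {

    // Hypothetical helper: adds one document with two analyzed text fields.
    static void addDoc(IndexWriter writer, String city, String region) throws IOException {
        Document doc = new Document();
        doc.add(new TextField("city", city, Field.Store.YES));
        doc.add(new TextField("region", region, Field.Store.YES));
        writer.addDocument(doc);
    }

    // Builds a tiny in-memory index, runs a single-term query, and returns the
    // score explanation for the top hit (or null if nothing matched).
    static Explanation explainTopHit(String field, String term) throws IOException {
        try (Directory dir = new ByteBuffersDirectory();
             StandardAnalyzer analyzer = new StandardAnalyzer()) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer))) {
                addDoc(writer, "NASHUA", "NEW HAMPSHIRE");
                addDoc(writer, "MAIN", "SAMPLE REGION"); // made-up sample document
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                // StandardAnalyzer lowercases tokens at index time, so the
                // raw term passed to TermQuery must be lowercase as well.
                Query query = new TermQuery(new Term(field, term));
                TopDocs top = searcher.search(query, 1);
                if (top.scoreDocs.length == 0) {
                    return null;
                }
                return searcher.explain(query, top.scoreDocs[0].doc);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Prints the nested breakdown of how the score was computed.
        System.out.println(explainTopHit("region", "hampshire"));
    }
}
```

Calling explain on both the 0th hit and the hit that ranks unexpectedly low, then comparing the two breakdowns, usually shows which factor separates them.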
On Wed, Jun 26, 2019 at 12:48 AM wrote:
>
> Hi,
>
> I really want to know why the scoring works this way: the search string is
> either MAINO or MAINS, and MAIN appears as the 276th entry in the results.
>
> NEW HAMPSHIRE in results: ci