[ANNOUNCE] Apache Lucene 5.4.0 released

2015-12-14 Thread Upayavira
14 December 2015, Apache Lucene™ 5.4.0 available. The Lucene PMC is pleased to announce the release of Apache Lucene 5.4.0. Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires ful

different handling of multiterm within a SpanNot Query in 5.3.1 vs 5.4.0?

2015-12-14 Thread Allison, Timothy B.
Great to see 5.4.0 is out. I tried to update my fork of LUCENE-5205 and found that multiterms within a SpanNotQuery don't seem to be processed correctly. The query [fever bieb*]!~2,5 should find "fever", but not if a multiterm hit on bieb* appears within 2 words before or 5 words after. In 5.3.1, this worked
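To make the intended semantics of [fever bieb*]!~2,5 concrete, here is a minimal plain-Java sketch that does not use Lucene at all. It assumes a document is already tokenized into a list of words, treats bieb* as a simple prefix match, and reports each position of "fever" that has no prefix hit within `pre` positions before or `post` positions after. The class and method names (`SpanNotSketch`, `matches`) are hypothetical and chosen only for illustration; this models the behavior described in the message, not Lucene's SpanNotQuery implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
class SpanNotSketch {

    // Returns the positions of `include` that have no token starting with
    // `excludePrefix` in the window [position - pre, position + post].
    static List<Integer> matches(List<String> tokens, String include,
                                 String excludePrefix, int pre, int post) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(include)) {
                continue;
            }
            // Clamp the exclusion window to the bounds of the token list.
            int from = Math.max(0, i - pre);
            int to = Math.min(tokens.size() - 1, i + post);
            boolean excluded = false;
            for (int j = from; j <= to; j++) {
                if (tokens.get(j).startsWith(excludePrefix)) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "bieber" sits 3 words before and 7 words after "fever",
        // so both occurrences fall outside the 2-before / 5-after window.
        List<String> doc = Arrays.asList(
            "bieber", "a", "b", "fever", "c", "d", "e", "f", "g", "h", "bieber");
        System.out.println(matches(doc, "fever", "bieb", 2, 5)); // prints [3]
    }
}
```

Moving either "bieber" token inside the window (for example, directly before "fever") makes the method return an empty list, which is the exclusion behavior the message expects from the query.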

Re: Jensen–Shannon divergence

2015-12-14 Thread Jack Krupansky
Is there any particular reason that you find Lucene's built-in TF/IDF and BM25 similarity models insufficient for your needs? In any case, examination of their source code should get you started if you wish to do your own: https://lucene.apache.org/core/5_3_0/core/org/apache/lucene/search/similarit

RE: Jensen–Shannon divergence

2015-12-14 Thread Uwe Schindler
Hi, Next to BM25 and TF-IDF, Lucene also provides many more similarity implementations: https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMDirichletSimilarity.html https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMJelinekMercerSimila

Re: Jensen–Shannon divergence

2015-12-14 Thread will martin
Cool list. Thanks, Uwe. Opportunities to gain competitive advantage in selected domains. > On Dec 14, 2015, at 6:02 PM, Uwe Schindler wrote: > > Hi, > > Next to BM25 and TF-IDF, Lucene also provides many more similarity > implementations: > > https://lucene.apache.org/core/5_4_0/core/org/apac