Re: AnalyzingInfixSuggestor and PrefixQuery performance difference

2017-03-09 Thread Michael McCandless
AnalyzingInfixSuggester does not use an FST; it uses a Lucene index. It's faster because 1) it indexes leading ngrams (up to 4 characters by default) so that short suggestions map to a TermQuery (longer suggestions still use PrefixQuery), which is much faster than PrefixQuery, but also 2) it uses
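
To illustrate point 1 above, here is a minimal plain-Java sketch of the idea; it is not the Lucene implementation. It assumes a simplified model in which whole suggestions, rather than analyzed tokens, are indexed. The class `LeadingNgramSketch`, its methods, and the constant `MAX_NGRAM` are hypothetical names invented for this example. Prefixes of up to 4 characters are answered by a single exact-key lookup (the analogue of a TermQuery), while longer prefixes fall back to a scan over the sorted terms (the analogue of a PrefixQuery).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only: none of these names come from Lucene.
class LeadingNgramSketch {
    // Mirrors the "up to 4 characters by default" mentioned in the message.
    static final int MAX_NGRAM = 4;

    // Leading ngram -> suggestions that start with it (exact-key lookup).
    private final Map<String, List<String>> ngramIndex = new HashMap<>();
    // All suggestions in sorted order, used for the prefix-scan fallback.
    private final TreeSet<String> terms = new TreeSet<>();

    void add(String suggestion) {
        String s = suggestion.toLowerCase(Locale.ROOT);
        if (!terms.add(s)) {
            return; // already indexed
        }
        // Index every leading ngram of length 1..MAX_NGRAM.
        int limit = Math.min(MAX_NGRAM, s.length());
        for (int len = 1; len <= limit; len++) {
            ngramIndex.computeIfAbsent(s.substring(0, len), k -> new ArrayList<>()).add(s);
        }
    }

    List<String> lookup(String prefix) {
        String p = prefix.toLowerCase(Locale.ROOT);
        if (p.length() <= MAX_NGRAM) {
            // Short prefix: one hash lookup, analogous to a TermQuery.
            return new ArrayList<>(ngramIndex.getOrDefault(p, Collections.emptyList()));
        }
        // Long prefix: walk the sorted terms from the prefix onward,
        // analogous to a PrefixQuery enumerating matching terms.
        List<String> out = new ArrayList<>();
        for (String t : terms.tailSet(p, true)) {
            if (!t.startsWith(p)) {
                break;
            }
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        LeadingNgramSketch idx = new LeadingNgramSketch();
        for (String s : new String[] {"lucene", "lucid", "luke", "solr"}) {
            idx.add(s);
        }
        System.out.println(idx.lookup("lu"));    // short prefix: exact-key path
        System.out.println(idx.lookup("lucen")); // long prefix: scan path
    }
}
```

The trade-off in this sketch is extra index entries per suggestion in exchange for constant-time lookups on the short prefixes, which in a prefix scan would match the most terms.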

Re: A flush exception in lucene 4.10.0

2017-03-09 Thread Michael McCandless
This seems likely to be a Lucene bug, and it seems vaguely familiar. I tried to find the issue / commit that may have fixed it, but so far failed. But 4.10.0 is truly ancient; you should at least try upgrading to 4.10.4? Mike McCandless http://blog.mikemccandless.com On Wed, Mar 8, 2017 at 6:1

Re: A flush exception in lucene 4.10.0

2017-03-09 Thread Steve Rowe
Maybe (though it was committed in Lucene 4.5)? Robert Muir pointed to this issue as fixing , which contains a similar stack trace to yours. -- Steve www.lucidworks.com > On Mar 9, 2017, at 6:

Range queries get misinterpreted when parsed twice via the "Standard" parsers

2017-03-09 Thread Michael Peterson
Hello, At Rocana we have a search system that builds a Lucene query on a front end (web) system and sends the query string to a backend system. The query typed in by the user on the front end first gets parsed (for rewriting and adding additional hidden clauses), turned back into a Lucene query st

Dynamic Numeric Range Faceting

2017-03-09 Thread Chitra
Hey Mike, I will extend our discussion here... http://blog.mikemccandless.com/2013/05/dynamic-faceting-with-lucene.html?showComment=1489073190241#c6096569735718999485 We need some clarifications regarding dynamic range faceting. 1. In earlier versions like lucene-4-10-4 (which we are using) mul

Re: Range queries get misinterpreted when parsed twice via the "Standard" parsers

2017-03-09 Thread Erick Erickson
There has never been a guarantee that going back and forth between a parsed query and its string representation is idempotent, so this isn't supported. Best, Erick On Thu, Mar 9, 2017 at 5:58 AM, Michael Peterson wrote: > Hello, > > At Rocana we have a search system that builds a Lucene query on

codec: accessing term dictionary

2017-03-09 Thread Jürgen Jakobitsch
hi, i'd like to ask users for their experiences with the fastest way to access the term dictionary. what i want to do is to implement some algorithms to find phrases (e.g. mutual rank ratio [1]) (and other statistics on term distribution, generally: corpus related stuff) the idea would be to do

Re: Range queries get misinterpreted when parsed twice via the "Standard" parsers

2017-03-09 Thread Trejkaz
On Fri, 10 Mar 2017 at 01:19, Erick Erickson wrote: > There has never been a guarantee that going back and forth between a > parsed query and its string representation is idempotent. so this > isn't supported. Maybe delete the toQueryString method... There is a fundamental design problem with

Re: Range queries get misinterpreted when parsed twice via the "Standard" parsers

2017-03-09 Thread Michael Peterson
Everyone - thanks for the feedback. Trejkaz, I agree. The [ts:X ts:Y] range syntax seems odd at best and broken at worst. If the field name for the range has to be the same for both the lower and upper bound why put it there twice inside the braces? In addition, a user cannot type that syntax and

Re: A flush exception in lucene 4.10.0

2017-03-09 Thread Yonghui Zhao
My version is 4.10.0, which is later than 4.5, but I didn't find a fix like https://issues.apache.org/jira/secure/attachment/12593180/LUCENE-5116.patch in IndexWriter, nor in 4.10.4. Not sure which version fixed it. public void addIndexes(IndexReader... readers) throws IOException {