In Japanese, compounds are just decompositions of the input string. In
other languages, compounds can manufacture entire tokens from thin
air. In those cases, it's something of a question how to decide on the
offsets. I think that you're right, eventually, insofar as there's
some offset in the
On Sat, Sep 7, 2013 at 7:44 AM, Benson Margulies ben...@basistech.com wrote:
In Japanese, compounds are just decompositions of the input string. In
other languages, compounds can manufacture entire tokens from thin
air. In those cases, it's something of a question how to decide on the
offsets.
On Sat, Sep 7, 2013 at 8:39 AM, Robert Muir rcm...@gmail.com wrote:
On Sat, Sep 7, 2013 at 7:44 AM, Benson Margulies ben...@basistech.com wrote:
In Japanese, compounds are just decompositions of the input string. In
other languages, compounds can manufacture entire tokens from thin
air. In
Hi all,
I am getting strange performance measures on Lucene 4.4.0, maybe someone can
explain this:
The following syntax leads to pretty slow queries on my machine (16 ms for every
execution):
theSearcher.search(theQuery, null, theSearcher.getIndexReader().maxDoc());
but the following syntax
nextToken() calls peekToken(). That seems to prevent my lookahead
processing from seeing that item later. Am I missing something?
On Fri, Sep 6, 2013 at 9:15 PM, Benson Margulies ben...@basistech.com wrote:
I think that the penny just dropped, and I should not be using this class.
If I call
Something is wrong; I'm not sure what offhand, but calling peekToken
10 times should not stack all tokens @ position 0; it should stack the
tokens at the positions where they occurred. Are you sure the posIncr
att is sometimes 1 (i.e., the position is in fact moving forward for
some tokens)?
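To make the point about position increments concrete, here is a minimal sketch of how a consumer turns increments into absolute positions. It does not use Lucene at all: PositionSketch and positions() are hypothetical names invented for illustration, and the only thing borrowed from Lucene is the convention that a first token with increment 1 lands at position 0. If every increment after the first is 0, all tokens end up stacked at position 0, which is the symptom described above.

```java
import java.util.Arrays;

class PositionSketch {
    // Hypothetical helper, not part of Lucene: converts a sequence of
    // position increments into absolute token positions.
    static int[] positions(int[] posIncrs) {
        int[] out = new int[posIncrs.length];
        int pos = -1; // so that a first increment of 1 yields position 0
        for (int i = 0; i < posIncrs.length; i++) {
            pos += posIncrs[i]; // an increment of 0 stacks the token on the previous one
            out[i] = pos;
        }
        return out;
    }

    public static void main(String[] args) {
        // Positions advance whenever the increment is 1.
        System.out.println(Arrays.toString(positions(new int[] {1, 1, 0, 1}))); // [0, 1, 1, 2]
        // Only the first increment is 1: every token is stacked at position 0.
        System.out.println(Arrays.toString(positions(new int[] {1, 0, 0, 0}))); // [0, 0, 0, 0]
    }
}
```

Printing the increments seen by the filter and running them through something like this is one quick way to check whether the position is in fact moving forward for some tokens.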
I think I had better build you a test case for this situation, and
attach it to a JIRA.
On Sat, Sep 7, 2013 at 3:33 PM, Michael McCandless
luc...@mikemccandless.com wrote:
Something is wrong; I'm not sure what offhand, but calling peekToken
10 times should not stack all tokens @ position 0; it
That would be awesome, thanks!
Mike McCandless
http://blog.mikemccandless.com
On Sat, Sep 7, 2013 at 3:40 PM, Benson Margulies ben...@basistech.com wrote:
I think I had better build you a test case for this situation, and
attach it to a JIRA.
On Sat, Sep 7, 2013 at 3:33 PM, Michael
LUCENE-5202. It seems to show the problem of the extra peek. I'm still
struggling to make sense of the 'problem' of not always calling
afterPosition(); that may be entirely my own confusion.
On Sat, Sep 7, 2013 at 4:21 PM, Michael McCandless
luc...@mikemccandless.com wrote:
That would be
Thanks Benson, I'll have a look.
Mike McCandless
http://blog.mikemccandless.com
On Sat, Sep 7, 2013 at 4:33 PM, Benson Margulies ben...@basistech.com wrote:
LUCENE-5202. It seems to show the problem of the extra peek. I'm still
struggling to make sense of the 'problem' of not always calling