Gregory,
that was indeed my problem. Thank you very much for your support.
Martin
This is a reply to
http://mail-archives.apache.org/mod_mbox/lucene-java-user/201404.mbox/%3CCAASL1-8jRbEG%3DLi96eDLY-Pr_zwev6vk4vk4BW_ryKF1Dnb4KA%40mail.gmail.com%3E
On 1 April 2014 23:52, Martin Líška wrote:
>
I agree entirely with Robert about not doubling up on the filter and the
wrapper. To stop unigrams, consider setOutputUnigrams(false).
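
For illustration only, here is a minimal plain-Java sketch of what turning unigram output on or off means for the "This is a dog" example from this thread. It does not use Lucene at all: the class BigramSketch and the helper shingles(...) are hypothetical names, and the whitespace split is an assumption standing in for whatever tokenizer is actually in use. In Lucene itself the switch is the setOutputUnigrams(false) call mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a bigram shingle pass; not part of Lucene.
final class BigramSketch {

    // Returns word bigrams for the sentence. When outputUnigrams is true,
    // each single word is emitted as well, ahead of the bigram starting at it.
    static List<String> shingles(String sentence, boolean outputUnigrams) {
        List<String> out = new ArrayList<>();
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return out; // nothing to shingle
        }
        // Assumption: plain whitespace tokenization, no lowercasing or stop words.
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (outputUnigrams) {
                out.add(words[i]);
            }
            if (i + 1 < words.length) {
                out.add(words[i] + " " + words[i + 1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Unigrams and bigrams interleaved
        System.out.println(shingles("This is a dog", true));
        // Bigrams only
        System.out.println(shingles("This is a dog", false));
    }
}
```

With unigram output off, only the three bigrams remain, which is the result asked for in the original question.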
-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Wednesday, April 02, 2014 2:50 PM
To: java-user
Subject: Re: Strange behavior of ShingleFi
Either remove the ShingleAnalyzerWrapper or the additional filter...
On Wed, Apr 2, 2014 at 2:44 PM, Natalia Connolly
wrote:
> Hi Robert,
>
>No, I did not… I just needed the filter to stop it from outputting
> unigrams; otherwise I was getting "This", "this is", "is", "is a ", and so
> on. Is ther
Did you really mean to shingle twice? ShingleAnalyzerWrapper just
wraps the analyzer with a ShingleFilter, and then the code wraps that
with another ShingleFilter.
On Wed, Apr 2, 2014 at 1:42 PM, Natalia Connolly
wrote:
> Hello,
>
>I am very confused about what ShingleFilter seems to be d
Hi Robert,
No, I did not… I just needed the filter to stop it from outputting
unigrams; otherwise I was getting "This", "this is", "is", "is a", and so
on. Is there another way I could do it?
Thank you,
Natalia
On Wed, Apr 2, 2014 at 2:40 PM, Robert Muir wrote:
> Did you really
Hello,
I am very confused about what ShingleFilter seems to be doing in Lucene
4.6. What I would like to do is extract all possible bigrams from a
sentence. So if the sentence is "This is a dog", I want "This is", "is a",
"a dog".
Here is my code:
StringTokenizer itr = new StringTok
April 2014, Apache Lucene™ 4.7.1 available
The Lucene PMC is pleased to announce the release of Apache Lucene 4.7.1.
Apache Lucene is a high-performance, full-featured text search engine library
written entirely in Java. It is a technology suitable for nearly any
application that requires full-