Re: Is there an EdgeSingleFilter already?

xavier jmlucjav Sat, 16 Mar 2013 14:35:30 -0700

I read your reply too fast, so I thought you meant configuring an existing
LimitTokenPositionFilter. I see you mean I have to write one, OK...
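
To make the idea concrete, here is a minimal sketch in plain Java of what "shingle everything, then stop once the token position exceeds a limit" would produce. It deliberately does not use the Lucene API: the class EdgeShingleSketch, the PositionedToken holder, and the methods shingles, limitPosition, and edgeShingles are all hypothetical names made up for illustration, and the sketch assumes every shingle is assigned the position of its first word. A real filter would extend Lucene's TokenFilter and would need to handle position increments properly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EdgeShingleSketch {

    /** Hypothetical holder: a token's text plus its 1-based position. */
    public static final class PositionedToken {
        public final String text;
        public final int position;

        public PositionedToken(String text, int position) {
            this.text = text;
            this.position = position;
        }
    }

    /** Builds all shingles of 1..maxSize words; each shingle takes the position of its first word. */
    public static List<PositionedToken> shingles(List<String> words, int maxSize) {
        List<PositionedToken> out = new ArrayList<>();
        for (int start = 0; start < words.size(); start++) {
            StringBuilder sb = new StringBuilder();
            for (int size = 1; size <= maxSize && start + size <= words.size(); size++) {
                if (size > 1) {
                    sb.append(' ');
                }
                sb.append(words.get(start + size - 1));
                out.add(new PositionedToken(sb.toString(), start + 1));
            }
        }
        return out;
    }

    /** Keeps tokens until the position exceeds maxPosition (input is assumed to be in position order). */
    public static List<String> limitPosition(List<PositionedToken> tokens, int maxPosition) {
        List<String> kept = new ArrayList<>();
        for (PositionedToken t : tokens) {
            if (t.position > maxPosition) {
                break;
            }
            kept.add(t.text);
        }
        return kept;
    }

    /** Edge shingles: only the shingles that start at the first position. */
    public static List<String> edgeShingles(List<String> words, int maxSize) {
        return limitPosition(shingles(words, maxSize), 1);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("one", "two", "three", "four");
        // prints [one, one two, one two three]
        System.out.println(edgeShingles(words, 3));
    }
}
```

With a position limit of 1, only the shingles anchored at the start of the field survive, which is the behaviour I am after for the suggestions field.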




On Sat, Mar 16, 2013 at 10:33 PM, xavier jmlucjav <jmluc...@gmail.com> wrote:

> Steve,
>
> Yes, I want only "one", "one two", and "one two three", but nothing else.
> If this can be achieved without Java code, even better. I'll check that
> filter.
>
> I need this to build a field used for suggestions; the user
> specifically wants matches only from the edge.
>
> thanks!
>
> On Sat, Mar 16, 2013 at 10:22 PM, Steve Rowe <sar...@gmail.com> wrote:
>
>> Hi xavier,
>>
>> It's not clear to me what you want.  Is the "edge" you're referring to
>> the beginning of a field? E.g. raw text "one two three four" with
>> EdgeShingleFilter configured to produce unigrams, bigrams and trigrams would
>> produce "one", "one two", and "one two three", but nothing else?
>>
>> If so, I suspect writing a LimitTokenPositionFilter (which would stop
>> emitting tokens after the token position exceeds a specified limit) would
>> be better, rather than subclassing ShingleFilter.  You could use
>> LimitTokenCountFilter as a model, especially its "consumeAllTokens" option.
>> I think this would make a nice addition to Lucene.
>>
>> Also, what do you plan to use this for?
>>
>> Steve
>>
>> On Mar 16, 2013, at 5:02 PM, xavier jmlucjav <jmluc...@gmail.com> wrote:
>> > Hi,
>> >
>> > I need to use shingles but only keep the ones that start from the edge.
>> >
>> > I want to confirm there is no way to get this feature without subclassing
>> > ShingleFilter, cause I thought someone would have already encountered
>> > this use case...
>> >
>> > thanks
>> > xavier
>>
>>
>
