[ https://issues.apache.org/jira/browse/LUCENE-7960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468292#comment-16468292 ]
Robert Muir commented on LUCENE-7960:
-------------------------------------
{quote}
I made the min/max parameters required on the factory because the constructor
without any size parameters is deprecated. Is this something you don't like at
all, or something you would only want to see in master?
{quote}
What does "not making that change in the backport to 7x" mean?
As I suggested above: consider making the patch against master fully backwards
compatible. We can review it, and then it can be committed and merged cleanly
and safely back to 7.x. After that, remove the deprecations in master in a
separate, dedicated commit.
It seems like more work, but I think it's less work than trying to take a
shortcut, because you can be confident you don't break anything. "Making
changes during backports" seems like trouble, and a confusing patch makes
code review hard. The current one is confusing because it isn't really
appropriate for either master (it has deprecations) or 7x (it breaks
backwards compatibility).
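
To illustrate what a fully backwards-compatible factory change could look like, here is a minimal sketch. It is not the code from the attached patch: the class name, the "maxGramSize" argument name, and the default values are assumptions made for illustration. The point is only that missing size arguments fall back to the old defaults (flagged as deprecated usage) instead of being rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a filter factory; not a Lucene class.
public class BackCompatFactorySketch {
    // Placeholder defaults, assumed for illustration only.
    static final int DEFAULT_MIN_GRAM = 1;
    static final int DEFAULT_MAX_GRAM = 2;

    final int minGram;
    final int maxGram;
    // True when the caller relied on the (deprecated) implicit sizes.
    final boolean usedDeprecatedDefaults;

    BackCompatFactorySketch(Map<String, String> args) {
        // Backwards compatible: absent sizes fall back to the old defaults
        // rather than throwing, so existing configurations keep working.
        usedDeprecatedDefaults =
            !args.containsKey("minGramSize") || !args.containsKey("maxGramSize");
        minGram = parseOrDefault(args, "minGramSize", DEFAULT_MIN_GRAM);
        maxGram = parseOrDefault(args, "maxGramSize", DEFAULT_MAX_GRAM);
    }

    static int parseOrDefault(Map<String, String> args, String key, int fallback) {
        String value = args.get(key);
        return value == null ? fallback : Integer.parseInt(value);
    }

    public static void main(String[] unused) {
        Map<String, String> oldStyle = new HashMap<>();   // no sizes given
        Map<String, String> newStyle = new HashMap<>();
        newStyle.put("minGramSize", "3");
        newStyle.put("maxGramSize", "5");

        BackCompatFactorySketch a = new BackCompatFactorySketch(oldStyle);
        BackCompatFactorySketch b = new BackCompatFactorySketch(newStyle);
        System.out.println(a.minGram + ".." + a.maxGram + " deprecated=" + a.usedDeprecatedDefaults);
        System.out.println(b.minGram + ".." + b.maxGram + " deprecated=" + b.usedDeprecatedDefaults);
    }
}
```

Under this approach, making the sizes required would be the separate follow-up commit on master that removes the fallback branch.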
> NGram filters -- preserve the original token when it is outside the min/max
> size range
> --------------------------------------------------------------------------------------
>
> Key: LUCENE-7960
> URL: https://issues.apache.org/jira/browse/LUCENE-7960
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Reporter: Shawn Heisey
> Priority: Major
> Attachments: LUCENE-7960.patch, LUCENE-7960.patch, LUCENE-7960.patch
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> When ngram or edgengram filters are used, any terms that are shorter than the
> minGramSize are completely removed from the token stream.
> This is probably 100% what was intended, but I've seen it cause a lot of
> problems for users. I am not suggesting that the default behavior be
> changed. That would be far too disruptive to the existing user base.
> I do think there should be a new boolean option, with a name like
> keepShortTerms, that defaults to false, to allow the short terms to be
> preserved.
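
The behavior described above can be sketched without any Lucene dependency. This is a simplified illustration operating on plain strings, not the actual filter implementation; the class and method names are hypothetical, and keepShortTerms is the option name proposed in the description rather than an existing parameter.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, string-based illustration; not a Lucene TokenFilter.
public class NGramSketch {
    /**
     * Returns the character n-grams of term with lengths in [minGram, maxGram].
     * A term shorter than minGram yields nothing, unless keepShortTerms is
     * true, in which case the term itself is emitted unchanged.
     */
    static List<String> ngrams(String term, int minGram, int maxGram, boolean keepShortTerms) {
        List<String> out = new ArrayList<>();
        if (term.length() < minGram) {
            if (keepShortTerms) {
                out.add(term); // preserve the original short token
            }
            return out;
        }
        for (int start = 0; start < term.length(); start++) {
            for (int len = minGram; len <= maxGram && start + len <= term.length(); len++) {
                out.add(term.substring(start, start + len));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("ab", 3, 4, false));   // short term is dropped
        System.out.println(ngrams("ab", 3, 4, true));    // short term is kept
        System.out.println(ngrams("abcd", 3, 4, false)); // normal n-gram output
    }
}
```

With keepShortTerms defaulting to false, existing users see no change; only those who opt in get the short terms back.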