[
https://issues.apache.org/jira/browse/SOLR-5332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13834285#comment-13834285
]
Robert Muir commented on SOLR-5332:
-----------------------------------
Just because WordDelimiterFilter has an option doesn't mean other filters should
have it; it's hardly a model citizen. That is probably even more reason to think
carefully about what is happening and to question whether it's the right thing to do.
For the use case described in the issue, a separate field suffices and is
likely more flexible and just as efficient.
I admit I don't fully understand what James is doing.
I'm just saying I don't think our filters need options like "preserve" or
"inject", because I generally see no value versus just using another field: it's
typically just users who don't understand that the underlying cost in an
inverted index is the same.
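
To illustrate the cost argument, here is a minimal sketch of a toy inverted index keyed by (field, term). It is not Lucene's actual data structure, and the field names `name`, `name_ngram`, and `name_exact` are made up for the example. It only shows that, in this simplified model, putting the original token in the same field or in a separate field yields the same number of postings.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: (field, term) -> set of doc ids.

    docs maps doc_id -> {field_name: [terms]}. This is a simplified
    model for illustration, not how Lucene stores its index.
    """
    index = defaultdict(set)
    for doc_id, fields in docs.items():
        for field, terms in fields.items():
            for term in terms:
                index[(field, term)].add(doc_id)
    return index


def posting_count(index):
    """Total number of (term, doc) postings across all fields."""
    return sum(len(doc_ids) for doc_ids in index.values())


grams = ["so", "som", "some"]  # stand-in for edge n-grams of the token
original = "someuser"          # the full token

# Option A: a hypothetical "preserve original" puts everything in one field.
one_field = build_index({1: {"name": grams + [original]}})

# Option B: grams in one field, the original token in a separate field.
two_fields = build_index({1: {"name_ngram": grams, "name_exact": [original]}})

print(posting_count(one_field))   # 4
print(posting_count(two_fields))  # 4
```

With the second layout, a query can target the n-gram field, the exact field, or both, which is the flexibility referred to above.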
> Add "preserve original" setting to the EdgeNGramFilterFactory
> -------------------------------------------------------------
>
> Key: SOLR-5332
> URL: https://issues.apache.org/jira/browse/SOLR-5332
> Project: Solr
> Issue Type: Wish
> Affects Versions: 4.4, 4.5, 4.5.1, 4.6
> Reporter: Alexander S.
>
> Hi, as described here:
> http://lucene.472066.n3.nabble.com/Help-to-figure-out-why-query-does-not-match-td4086967.html
> the problem is that if you have these 2 strings to index:
> 1. facebook.com/someuser.1
> 2. facebook.com/someveryandverylongusername
> and the edge ngram filter factory with min and max gram size settings 2 and
> 25, search requests for these urls will fail.
> But search requests for:
> 1. facebook.com/someuser
> 2. facebook.com/someveryandverylonguserna
> will work properly.
> It's because the first URL has "1" at the end, which is shorter than the allowed
> min gram size. In the second URL the user name is longer than the max gram
> size (27 characters).
> It would be good to have a "preserve original" option that adds the
> original string to the index if it does not fit the allowed gram size, so
> that the "1" and "someveryandverylongusername" tokens are also added to the
> index.
> Best,
> Alex
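
The behaviour described in the issue can be reproduced with a small sketch. This is not the Solr or Lucene API: `edge_ngrams` and its `preserve_original` flag are hypothetical names, and the sketch assumes the URL has already been split into tokens such as "1" and "someveryandverylongusername" by an earlier analysis step.

```python
def edge_ngrams(token, min_gram=2, max_gram=25, preserve_original=False):
    """Return front-edge n-grams of token with lengths min_gram..max_gram.

    preserve_original is a hypothetical flag modelling the requested
    option: keep the whole token when its length falls outside the range.
    """
    upper = min(len(token), max_gram)
    grams = [token[:n] for n in range(min_gram, upper + 1)]
    if preserve_original and not (min_gram <= len(token) <= max_gram):
        grams.append(token)
    return grams


short = "1"                                # shorter than min_gram
long_name = "someveryandverylongusername"  # 27 characters, longer than max_gram

print(edge_ngrams(short))                         # []
print(long_name in edge_ngrams(long_name))        # False
print(long_name[:25] in edge_ngrams(long_name))   # True
print(edge_ngrams(short, preserve_original=True)) # ['1']
print(long_name in edge_ngrams(long_name, preserve_original=True))  # True
```

Without the flag, "1" produces no grams at all and the 27-character name is only indexed up to its 25-character prefix, which matches the failing and working searches listed above.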