[ 
https://issues.apache.org/jira/browse/SOLR-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793690#action_12793690
 ] 

Robert Muir commented on SOLR-1670:
-----------------------------------

bq. Duplicated tokens aside (that should have been tested another way), I 
prefer to test the intended semantics rather than the exact behavior

Then it would be better to use actual queries and such to test this filter, 
as (some of) the WordDelimiterFilter tests do with assertQ and friends.

But I think things like the order of tokens in the stream are still something I 
would like to preserve when migrating to the new TokenStream API. Maybe it 
doesn't seem important, but it would still be a change in behavior (it would 
probably only break you if you ran a PositionFilter afterwards, or something 
like that).

I guess in my opinion, overall it's better for the tests to be overly strict; 
if in the future we make a valid change to the implementation that breaks a 
test, we can discuss it during said change, and people can comment on whether 
the old behavior was actually important. For example, the aa/a versus a/aa 
case I would probably say is not a big deal, but the aa versus aa/aa/aa thing 
is, to me, a big deal.

The alternative, being overly lax, opens up the possibility of introducing 
incorrect behavior without it being caught, and I think that is very dangerous.
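To illustrate the strict-versus-lax point with a self-contained sketch (these `laxEquals`/`strictEquals` helpers are hypothetical stand-ins, not the actual assertTokEqual/assertTokenStreamContents code): a comparison that stops at the shorter list happily accepts the buggy "ab ab ab" output, while a length-checking comparison catches it.

```java
import java.util.Arrays;
import java.util.List;

public class TokenCompareDemo {
    // Lax comparison, in the spirit of the flawed assertTokEqual:
    // it only compares up to the shorter list, so trailing extra
    // tokens go unnoticed.
    static boolean laxEquals(List<String> expected, List<String> actual) {
        int n = Math.min(expected.size(), actual.size());
        for (int i = 0; i < n; i++) {
            if (!expected.get(i).equals(actual.get(i))) return false;
        }
        return true;
    }

    // Strict comparison: contents AND lengths must match.
    static boolean strictEquals(List<String> expected, List<String> actual) {
        return expected.equals(actual);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("ab");
        // The repeat bug's real output: the synonym fired once per map entry.
        List<String> actual = Arrays.asList("ab", "ab", "ab");
        System.out.println(laxEquals(expected, actual));    // true  -> bug slips through
        System.out.println(strictEquals(expected, actual)); // false -> bug caught
    }
}
```

This is exactly why converting the test to assertTokenStreamContents surfaced the problem: it checks the full stream, including that it ends where expected.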


> synonymfilter/map repeat bug
> ----------------------------
>
>                 Key: SOLR-1670
>                 URL: https://issues.apache.org/jira/browse/SOLR-1670
>             Project: Solr
>          Issue Type: Bug
>          Components: Schema and Analysis
>    Affects Versions: 1.4
>            Reporter: Robert Muir
>            Assignee: Yonik Seeley
>         Attachments: SOLR-1670.patch, SOLR-1670_test.patch
>
>
> as part of converting tests for SOLR-1657, I ran into a problem with 
> synonymfilter
> the test for 'repeats' has a flaw: it uses the assertTokEqual construct, 
> which does not really validate that two lists of tokens are equal; it just 
> stops at the shorter one.
> {code}
>     // repeats
>     map.add(strings("a b"), tokens("ab"), orig, merge);
>     map.add(strings("a b"), tokens("ab"), orig, merge);
>     assertTokEqual(getTokList(map,"a b",false), tokens("ab"));
>     /* in reality the result from getTokList is ab ab ab!!!!! */
> {code}
> when converted to assertTokenStreamContents this problem surfaced. attached 
> is an additional assertion to the existing testcase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.