[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475055#comment-16475055
 ] 

Dennis Gove commented on SOLR-12355:
------------------------------------

I have a fix for this where instead of calculating the string value's hashCode 
we just use the string value as the key in the hashed set of tuples. I'm 
creating a few test cases to verify this gives us what we want.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> ------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-12355
>                 URL: https://issues.apache.org/jira/browse/SOLR-12355
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrJ
>    Affects Versions: 6.0
>            Reporter: Dennis Gove
>            Assignee: Dennis Gove
>            Priority: Major
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream considering two tuples with fields of these 
> values to be considered the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to