[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16481209#comment-16481209
 ] 

ASF subversion and git services commented on SOLR-12355:


Commit 1e661ed97aed0cc77869b01134d80c761c6b5295 in lucene-solr's branch 
refs/heads/branch_7x from [~dpgove]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1e661ed ]

SOLR-12355: Fixes hash conflict in HashJoinStream and OuterHashJoinStream


> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12355.patch, SOLR-12355.patch
>
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream considering two tuples with fields of these 
> values to be considered the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16481205#comment-16481205
 ] 

ASF subversion and git services commented on SOLR-12355:


Commit f506bc9cb7f1e82295c9c367487d49a80e767731 in lucene-solr's branch 
refs/heads/master from [~dpgove]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=f506bc9 ]

SOLR-12355: Fixes hash conflict in HashJoinStream and OuterHashJoinStream


> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch, SOLR-12355.patch
>
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream considering two tuples with fields of these 
> values to be considered the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-15 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475973#comment-16475973
 ] 

Dennis Gove commented on SOLR-12355:


Initial patch attached. I have not yet run the full suite of tests against this.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
> Attachments: SOLR-12355.patch
>
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream considering two tuples with fields of these 
> values to be considered the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475060#comment-16475060
 ] 

Dennis Gove commented on SOLR-12355:


This also impacts OuterHashJoinStream.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream considering two tuples with fields of these 
> values to be considered the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12355) HashJoinStream's use of String::hashCode results in non-matching tuples being considered matches

2018-05-14 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16475055#comment-16475055
 ] 

Dennis Gove commented on SOLR-12355:


I have a fix for this where instead of calculating the string value's hashCode 
we just use the string value as the key in the hashed set of tuples. I'm 
creating a few test cases to verify this gives us what we want.

> HashJoinStream's use of String::hashCode results in non-matching tuples being 
> considered matches
> 
>
> Key: SOLR-12355
> URL: https://issues.apache.org/jira/browse/SOLR-12355
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0
>Reporter: Dennis Gove
>Assignee: Dennis Gove
>Priority: Major
>
> The following strings have been found to have hashCode conflicts and as such 
> can result in HashJoinStream considering two tuples with fields of these 
> values to be considered the same.
> {code:java}
> "MG!!00TNGP::Mtge::".hashCode() == "MG!!00TNH1::Mtge::".hashCode() {code}
> This means these two tuples are the same if we're comparing on field "foo"
> {code:java}
> {
>   "foo":"MG!!00TNGP::Mtge::"
> }
> {
>   "foo":"MG!!00TNH1::Mtge::"
> }
> {code}
> and these two tuples are the same if we're comparing on fields "foo,bar"
> {code:java}
> {
>   "foo":"MG!!00TNGP"
>   "bar":"Mtge"
> }
> {
>   "foo":"MG!!00TNH1"
>   "bar":"Mtge"
> }{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org