[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @gatorsmile Thanks! I will close it. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr Thank you for your contribution! The PR has been merged using your Github account. Could you close this?

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20568 I think we can close this now.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20568 Submitted the PR https://github.com/apache/spark/pull/20630 to take this over.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20568 To speed up the work here, I will take this over. All credit for the contribution should go to @mrkm4ntr. Thanks for your work! @mrkm4ntr

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread jkbradley
Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/20568 I think this should block RC4 : ( For ML, it's really important that MurmurHash3 behave consistently across platforms. However, for ML, we'll need to maintain the old implementation of

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread sameeragarwal
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell just to make sure, given the dependency on `FeatureHasher`, should this block RC4?

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr this is a legitimate failure. Can you fix the Python tests?

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87509/ Test FAILed.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Merged build finished. Test FAILed.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87509 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87509/testReport)** for PR 20568 at commit

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87509 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87509/testReport)** for PR 20568 at commit

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-16 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20568 retest this please.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Merged build finished. Test FAILed.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87501/ Test FAILed.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87501/testReport)** for PR 20568 at commit

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87501 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87501/testReport)** for PR 20568 at commit

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, retest this please.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Retest this please

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr Do not worry about these failures. We know that some tests are unstable, and the community is working to fix them. For now, we have to keep re-triggering the tests.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 I cannot reproduce this test failure in my environment. It does not seem to be related to this change...

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/20568 Retest this please

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Merged build finished. Test FAILed.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87472/testReport)** for PR 20568 at commit

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87472/ Test FAILed.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20568 **[Test build #87472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87472/testReport)** for PR 20568 at commit

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20568 Jenkins, test this please

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell I added a method and changed the code so that it is called only from FeatureHasher.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell I sent an e-mail to the `[VOTE] Spark 2.3.0 (RC3)` thread.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 I registered on the dev list with the same user name.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr I see your point. Adding a method to Murmur3 would work. The problem is that we are now going to release a `FeatureHasher` in Spark 2.3 that uses the current Murmur3

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-14 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/20568 How about adding a new config to control whether to use the new Murmur3 hash function, turned off by default? We also have to document the change explicitly. WDYT @gatorsmile

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-12 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @hvanhovell The main motivation is to enable online prediction with parameters trained using FeatureHasher in MLlib. If the generated hash value differs from the implementations in another
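
The snippets in this thread are truncated, so the exact cause of the mismatch is not visible here. As an illustration only, the Python sketch below assumes the divergence comes from how trailing bytes are handled when the input length is not a multiple of 4: the standard 32-bit Murmur3 folds the 1 to 3 tail bytes into a single partial block, while the hypothetical variant mixes each tail byte as if it were a full block. The function names `murmur3_standard` and `murmur3_per_byte_tail` are made up for this example and are not Spark APIs.

```python
MASK = 0xFFFFFFFF
C1 = 0xCC9E2D51
C2 = 0x1B873593


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK


def mix_k1(k1):
    k1 = (k1 * C1) & MASK
    k1 = rotl32(k1, 15)
    return (k1 * C2) & MASK


def mix_h1(h1, k1):
    h1 ^= k1
    h1 = rotl32(h1, 13)
    return (h1 * 5 + 0xE6546B64) & MASK


def fmix(h1, length):
    """Final avalanche step of 32-bit Murmur3."""
    h1 ^= length
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK
    h1 ^= h1 >> 16
    return h1


def hash_blocks(data, seed):
    """Mix all complete 4-byte little-endian blocks; return (state, bytes consumed)."""
    h1 = seed & MASK
    aligned = len(data) - len(data) % 4
    for i in range(0, aligned, 4):
        k1 = int.from_bytes(data[i:i + 4], "little")
        h1 = mix_h1(h1, mix_k1(k1))
    return h1, aligned


def murmur3_standard(data, seed=0):
    """Standard tail handling: the 1-3 leftover bytes form one partial block."""
    h1, aligned = hash_blocks(data, seed)
    tail = data[aligned:]
    if tail:
        h1 ^= mix_k1(int.from_bytes(tail, "little"))
    return fmix(h1, len(data))


def murmur3_per_byte_tail(data, seed=0):
    """Hypothetical variant (assumption): each leftover byte is mixed like a full block."""
    h1, aligned = hash_blocks(data, seed)
    for b in data[aligned:]:
        # Assumption: treat the byte as signed and sign-extend it to 32 bits.
        signed = b - 256 if b > 127 else b
        h1 = mix_h1(h1, mix_k1(signed & MASK))
    return fmix(h1, len(data))


if __name__ == "__main__":
    for sample in (b"abcd", b"abcdefgh", b"abc", b"abcde"):
        same = murmur3_standard(sample) == murmur3_per_byte_tail(sample)
        print(sample, "match" if same else "differ")
```

Under this assumption the two functions agree whenever the input length is a multiple of 4 and generally disagree otherwise, which is the kind of mismatch that would matter when hashed features are computed by a different implementation at prediction time.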

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-12 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/20568 @mrkm4ntr The change itself looks pretty reasonable. However, I am very hesitant to merge this because it will probably break bucketing (which uses murmur3 to create the buckets); for example a

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-11 Thread mrkm4ntr
Github user mrkm4ntr commented on the issue: https://github.com/apache/spark/pull/20568 @kiszk Thank you for your review! I fixed it.

[GitHub] spark issue #20568: [SPARK-23381][CORE] Murmur3 hash generates a different v...

2018-02-10 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20568 Can one of the admins verify this patch?
