[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-23 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21772 cc @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-23 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21772 Since you actually modify `LongToUnsafeRowMap`, would it be better to update the PR title and description to reflect that?

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-23 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @viirya This case occurred in our cluster, and it took us a long time to find this bug. Due to human error, the small table's max id had become abnormally large. The LongHashedRelation

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21772 **[Test build #93473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93473/testReport)** for PR 21772 at commit

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21772 @liutang123 Thanks for this work. I'm curious whether this is an actual problem you hit in a real application, or whether you just think it is problematic.

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-22 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @viirya Hi, could you find time to review this PR?

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93314/ Test PASSed.

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Merged build finished. Test PASSed.

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21772 **[Test build #93314 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93314/testReport)** for PR 21772 at commit

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21772 Jenkins, test this please

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 Jenkins test this please

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93267/ Test FAILed.

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Merged build finished. Test FAILed.

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21772 **[Test build #93267 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93267/testReport)** for PR 21772 at commit

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21772 **[Test build #93267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93267/testReport)** for PR 21772 at commit

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-19 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @viirya Yes, absolutely right. :)

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21772 Let me clarify it. So this means that when `LongToUnsafeRowMap` is broadcast to executors and is too big to hold in memory, it will be stored on disk. At that time, because `write` uses

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-17 Thread liutang123
Github user liutang123 commented on the issue: https://github.com/apache/spark/pull/21772 @hvanhovell Thanks for reviewing. Data is lost because the variable **cursor** is 0 on the executor and serialization depends on it. I will add a unit test later.
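
The failure mode described in this comment can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's code: `ToyRowMap`, its `page` and `cursor` fields, and the `restore_cursor` flag are hypothetical names invented for illustration. It shows how a serializer that trusts a cursor emits nothing on a second round trip if deserialization never restored that cursor.

```python
import io
import struct


class ToyRowMap:
    """Hypothetical stand-in for a map that stores rows in one byte page."""

    def __init__(self):
        self.page = bytearray()
        self.cursor = 0  # number of bytes of `page` currently in use

    def append(self, row: bytes) -> None:
        self.page += row
        self.cursor += len(row)

    def write(self, out) -> None:
        # Serialization trusts `cursor` to decide how many bytes to emit.
        used = self.cursor
        out.write(struct.pack(">q", used))
        out.write(bytes(self.page[:used]))

    @classmethod
    def read(cls, inp, restore_cursor: bool) -> "ToyRowMap":
        m = cls()
        (used,) = struct.unpack(">q", inp.read(8))
        m.page = bytearray(inp.read(used))
        if restore_cursor:
            # Assumed fix for this toy: bring the cursor back in sync.
            m.cursor = used
        return m


def round_trip(m: ToyRowMap, restore_cursor: bool) -> ToyRowMap:
    buf = io.BytesIO()
    m.write(buf)
    buf.seek(0)
    return ToyRowMap.read(buf, restore_cursor)


driver_map = ToyRowMap()
driver_map.append(b"row-1")
driver_map.append(b"row-2")

# First hop (driver to executor): the page arrives intact either way.
buggy = round_trip(driver_map, restore_cursor=False)
fixed = round_trip(driver_map, restore_cursor=True)

# Second hop (executor serializes again, e.g. when storing to disk).
buggy_again = round_trip(buggy, restore_cursor=False)
fixed_again = round_trip(fixed, restore_cursor=True)

print(len(buggy_again.page))  # 0: the cursor was 0, so no bytes were written
print(len(fixed_again.page))  # 10: both rows survive
```

In this toy, the first deserialized copy still holds all of its data, which is why a bug of this shape can stay hidden until the object is serialized a second time.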

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93112/ Test PASSed.

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Merged build finished. Test PASSed.

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21772 **[Test build #93112 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93112/testReport)** for PR 21772 at commit

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-16 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/21772 @liutang123 Can you explain why we are losing data when serializing to disk? Also, can you add a unit test?

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21772 **[Test build #93112 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93112/testReport)** for PR 21772 at commit

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-16 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/21772 ok to test

[GitHub] spark issue #21772: [SPARK-24809] [SQL] Serializing LongHashedRelation in ex...

2018-07-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21772 Can one of the admins verify this patch?
