[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 how about `[offset1] [offset2] [not written, use offset3] [offset3]`? Then we are still able to calculate the length by subtracting adjacent offsets. --- If your project is set up for it, you
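A minimal sketch of the layout suggested above, assuming an unwritten slot simply carries the offset of the next written element so that it has zero length. The function names, the `None` marker for an unwritten slot, and the explicit `end_offset` argument are all hypothetical and chosen for illustration; this is not Spark's `UnsafeArrayData` code.

```python
def fill_unwritten(offsets, end_offset):
    """Replace None (unwritten) slots with the offset of the next written slot.

    Scans right to left so each unwritten slot inherits the nearest offset
    to its right, or end_offset when nothing follows it.
    """
    filled = list(offsets)
    next_offset = end_offset
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_offset
        else:
            next_offset = filled[i]
    return filled


def element_lengths(offsets, end_offset):
    """Compute each element's byte length by subtracting adjacent offsets."""
    bounds = list(offsets) + [end_offset]
    return [bounds[i + 1] - bounds[i] for i in range(len(offsets))]


# [offset1] [offset2] [not written, use offset3] [offset3]
offsets = fill_unwritten([16, 24, None, 40], end_offset=48)
lengths = element_lengths(offsets, end_offset=48)
```

With these inputs the unwritten third slot gets offset 40, the same as the fourth, so its computed length is 0 while the other lengths are unaffected.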

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-05 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @cloud-fan Option 1 did not work in some cases: when ```UnsafeArrayWriter.write()``` is not called for every element, i.e. only some of the elements are written. For example, [this

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61731/ Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61731 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61731/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-04 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 Implemented a 4-byte offset instead of an 8-byte length.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61731 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61731/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-03 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 Sorry, this was my misunderstanding. I confused ```UnsafeRow``` with ```UnsafeArrayData```. An ```UnsafeArrayData``` instance holds elements of only one type. ```[integer] [offset] [float]

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-03 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 Option 1 can work for this array: ```UnsafeArrayData: ...[integer] [offset] [offset] [float]```, because the 2 offsets are adjacent. Can option 1 work for this ```UnsafeArrayData:

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 Hmm, it looks like we are not on the same page... How could 2 offsets not be adjacent? We only keep offsets in the `value or offset` region, and store them one after another.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 One more thought about the format: `UnsafeRow` uses 8 bytes to store the offset and length of a variable-length type. This is because `UnsafeRow` is word-aligned, so we can't calculate the element size
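The comment above is cut off, but one plausible reading is that word alignment pads each variable-length value to a multiple of 8 bytes, so the gap between two adjacent offsets gives the padded size rather than the real one. The sketch below illustrates that point under stated assumptions: packing the offset into the upper 32 bits and the length into the lower 32 bits of one 8-byte word is assumed here for illustration, and every function name is hypothetical rather than taken from Spark.

```python
WORD_SIZE = 8  # bytes per word, assuming 8-byte alignment


def pack_offset_and_size(offset, size):
    """Pack a 32-bit offset and a 32-bit size into a single 64-bit word."""
    return (offset << 32) | size


def unpack_offset_and_size(word):
    """Recover (offset, size) from a word built by pack_offset_and_size."""
    return word >> 32, word & 0xFFFFFFFF


def round_up_to_word(num_bytes):
    """Round a byte count up to the next multiple of WORD_SIZE."""
    return (num_bytes + WORD_SIZE - 1) & ~(WORD_SIZE - 1)


# A 5-byte value stored at offset 16 occupies a full padded word,
# so the next value starts at offset 24.
first_offset, first_size = 16, 5
next_offset = first_offset + round_up_to_word(first_size)

gap = next_offset - first_offset                      # padded size
word = pack_offset_and_size(first_offset, first_size)
stored = unpack_offset_and_size(word)                 # exact offset and size
```

Here `gap` is 8 while the stored size is 5, which is why a word-aligned layout would need to record the length explicitly instead of deriving it from neighbouring offsets.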

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61673/ Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61673 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61673/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61673 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61673/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61671 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61671/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61671/ Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61671 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61671/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61660/ Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61660 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61660/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61660 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61660/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61620/ Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61620 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61620/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-07-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61620 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61620/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-30 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 Can we also rerun `UDTSerializationBenchmark`? I think it should be faster now.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-30 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @cloud-fan, could you please review this again? I added benchmark programs and their results, and addressed your review comments.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61379/ Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61379 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61379/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61379 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61379/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61377/ Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61377 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61377/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61377 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61377/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61362/ Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61362/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61362/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-27 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 Looks pretty good. Could you add a benchmark in this PR? An example is https://github.com/apache/spark/pull/12640/files#diff-b118a818177121a108fa92d0354871a6R26

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-26 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @rxin thank you for your comment. As you said, a holistic view is important. This PR is not only for machine learning; it has another use case, improving projection of an array in any

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-26 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @cloud-fan and @hvanhovell thank you for your comments. Based on them, I implemented ```UnsafeArrayData``` as a single implementation that explicitly clears the ```null bits``` by

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61239/ Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61239 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61239/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61239 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61239/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61236/ Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61236 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61236/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61236 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61236/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61235/ Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61235 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61235/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61235 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61235/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61233/ Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61233 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61233/consoleFull)** for PR 13680 at commit

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce implementation with a dense...

2016-06-25 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61233 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61233/consoleFull)** for PR 13680 at commit
