[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61222/ Test FAILed. ---

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61222 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61222/consoleFull)** for PR 13680 at commit [`bf12e72`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61222 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61222/consoleFull)** for PR 13680 at commit [`bf12e72`](https://github.com/apache/spark/commit/b

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61218 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61218/consoleFull)** for PR 13680 at commit [`517aa72`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61218/ Test FAILed. ---

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61218 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61218/consoleFull)** for PR 13680 at commit [`517aa72`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13680 We should do this more holistically, i.e. thinking about what we want to do with primitive arrays for machine learning and how to handle everything end to end. Let's not rush an implementation change j

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/13680 I am kind of in favor of the single implementation for a couple of reasons: - Declaring methods `final` is not a magic bullet. If you are invoking `isNull` or `get*` on the common ancestor it is s

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 I don't have a strong preference here; each choice has its advantages and weaknesses: 1. always have the null bits region: faster element access (read the null bits and then read the element),

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-24 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 I see. I assumed that a virtual call would be devirtualized by declaring the method ```final``` and by optimistically propagating type information in the JIT compiler. Would it be better to add a flag like `

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 Having 2 implementations is also a kind of branch: the virtual function call needs to be dispatched between these 2 implementations, while a single implementation can be marked as final and doe

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-23 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @cloud-fan , for the first issue, we are on the same page. Your proposal is one of the possible solutions I have been thinking about. I will do that. For the second issue, it seems to be a design choice b

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-23 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 @kiszk we should definitely put zero into the corresponding field when setting null. It will be a little harder than `UnsafeRow`, as we need `setNullBoolean`, `setNullInt`, etc., but it's still doable.
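The idea of zeroing the value slot whenever an element is marked null can be illustrated with a small sketch. Everything here is an assumption made for illustration: the class name `NullableIntArraySketch`, the toy layout (a word-aligned null-bits region followed by 4-byte int slots), and the use of `ByteBuffer` rather than Spark's unsafe memory access. Only the method name `setNullInt` comes from the comment above; this is not the code in the PR.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical toy layout, not Spark's actual format:
// [null bits: one 8-byte word per 64 elements][one 4-byte slot per int element]
final class NullableIntArraySketch {
    private final ByteBuffer buf;
    private final int valuesOffset;

    NullableIntArraySketch(int numElements) {
        // Round the null-bits region up to whole 8-byte words.
        this.valuesOffset = ((numElements + 63) / 64) * 8;
        this.buf = ByteBuffer.allocate(valuesOffset + 4 * numElements)
                             .order(ByteOrder.LITTLE_ENDIAN);
    }

    boolean isNullAt(int i) {
        long word = buf.getLong((i >> 6) * 8);
        return (word & (1L << (i & 63))) != 0;
    }

    // Writes a non-null value: clears the null bit, then stores the int.
    void setInt(int i, int value) {
        int wordOffset = (i >> 6) * 8;
        buf.putLong(wordOffset, buf.getLong(wordOffset) & ~(1L << (i & 63)));
        buf.putInt(valuesOffset + 4 * i, value);
    }

    // Marks the element null AND zeroes its slot, so that the bytes of two
    // arrays with the same logical content are identical.
    void setNullInt(int i) {
        int wordOffset = (i >> 6) * 8;
        buf.putLong(wordOffset, buf.getLong(wordOffset) | (1L << (i & 63)));
        buf.putInt(valuesOffset + 4 * i, 0);
    }

    // Reads the raw slot; a null element reads back as 0 in this sketch.
    int getInt(int i) {
        return buf.getInt(valuesOffset + 4 * i);
    }
}
```

A per-type method such as `setNullInt` is needed in this sketch because the array itself does not record the element width, so the caller has to say how many bytes to zero.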

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-23 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 One potential performance issue is that we always have to clear all of the null bits in ```UnsafeArrayWriter.initialize()```. This is because ```holder.buffer``` is reused for each row. If one row has mor
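The concern about a reused buffer can be sketched as follows. This is a simplified illustration, not the actual `UnsafeArrayWriter`: the class `ReusedNullBitsSketch`, its `long[]` backing store, the fixed capacity, and the `clearNullBits` switch are all hypothetical and exist only to show what goes wrong when the clearing step is skipped.

```java
import java.util.Arrays;

// Hypothetical writer whose null-bits storage is reused across rows.
final class ReusedNullBitsSketch {
    // Assumed capacity for the sketch: 4 words = up to 256 elements.
    private final long[] nullWords = new long[4];

    // Starts a new array. When clearNullBits is false, bits left over from
    // the previous row survive, which is the problem described above.
    void initialize(int numElements, boolean clearNullBits) {
        if (clearNullBits) {
            int wordsInUse = (numElements + 63) / 64;
            Arrays.fill(nullWords, 0, wordsInUse, 0L);
        }
    }

    void setNullAt(int i) {
        nullWords[i >> 6] |= 1L << (i & 63);
    }

    boolean isNullAt(int i) {
        return (nullWords[i >> 6] & (1L << (i & 63))) != 0;
    }
}
```

In this sketch, writing a row with a null at index 2 and then calling `initialize(4, false)` for the next row leaves index 2 reported as null even though nothing was written there; `initialize(4, true)` avoids that at the cost of one fill per array.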

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-23 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @cloud-fan , I have one question about null fields. Should we put zero into the corresponding field at the position where ```setNullAt()``` is called, as ```UnsafeRow``` [does](https://github.com/apache/sp

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 Thanks! Feel free to ask any questions!

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 Good to hear. I will implement the single format. If I run into any issues, I will raise them here.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 @kiszk yea, even if the null bit is set, the element still takes space in the `[offset or fixed-length values]` region.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/61095/ Test PASSed. ---

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61095 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61095/consoleFull)** for PR 13680 at commit [`85f862c`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 It is OK to always keep ```[null bits]```. One question: Does this format keep fixed space for ```[values]```? I mean if ```[null bit]``` is true, the corresponding element in ```[value]``` oc

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 null bits won't take a lot of memory (1 bit per element), and having the `all zero in null bits?` flag will slow down element retrieval: we need an extra if branch to check this flag, and make
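To put a number on the "1 bit per element" cost, the following sketch computes the size of a null-bits region. The helper names and the rounding to whole 8-byte words are assumptions for illustration; the rounding choice is not stated in the comment above.

```java
// Hypothetical helpers for estimating the null-bits overhead.
final class NullBitsOverheadSketch {
    // Bytes for the null-bits region: 1 bit per element, rounded up to
    // whole 8-byte words (the word alignment is an assumption of this sketch).
    static int nullBitsBytes(int numElements) {
        return ((numElements + 63) / 64) * 8;
    }

    // Bytes for the fixed-length value region.
    static int valueBytes(int numElements, int elementSize) {
        return numElements * elementSize;
    }
}
```

Under these assumptions, an array of 1000 ints needs 128 bytes of null bits next to 4000 bytes of values, roughly 3% extra.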

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13680 @cloud-fan thank you for your good comment. I also read the [previous proposal](https://github.com/apache/spark/pull/12640#discussion_r61539393). I would love to have only a single format (or implementation).

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #61095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61095/consoleFull)** for PR 13680 at commit [`85f862c`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13680 Thanks for working on it! One of my concerns is: do we really need 2 unsafe array implementations? For the `UnsafeArrayDataDense`, can we follow unsafe row and introduce a null-bits to make it supp

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60568/ Test PASSed. ---

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test PASSed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #60568 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60568/consoleFull)** for PR 13680 at commit [`6c09d72`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #60568 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60568/consoleFull)** for PR 13680 at commit [`6c09d72`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #60566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60566/consoleFull)** for PR 13680 at commit [`d06d200`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60566/ Test FAILed. ---

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #60566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60566/consoleFull)** for PR 13680 at commit [`d06d200`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Merged build finished. Test FAILed.

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13680 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60556/ Test FAILed. ---

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #60556 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60556/consoleFull)** for PR 13680 at commit [`639b32d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13680: [SPARK-15962][SQL] Introduce additonal implementation wi...

2016-06-14 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13680 **[Test build #60556 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60556/consoleFull)** for PR 13680 at commit [`639b32d`](https://github.com/apache/spark/commit/6