[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-06 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 Fixed the tests by making `AttributeSeq` serializable. I'm going to merge this into master and branch-2.0. --- If your project is set up for it, you can reply to this email and have your reply ap

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60029/ Test PASSed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test PASSed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60029 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60029/consoleFull)** for PR 13505 at commit [`5e9c258`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60029 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60029/consoleFull)** for PR 13505 at commit [`5e9c258`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60019/consoleFull)** for PR 13505 at commit [`5504b6c`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60019/ Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60015 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60015/consoleFull)** for PR 13505 at commit [`5504b6c`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60015/ Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60019/consoleFull)** for PR 13505 at commit [`5504b6c`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 Jenkins, retest this please.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60015/consoleFull)** for PR 13505 at commit [`5504b6c`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13505 retest this please

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60011/ Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60011/consoleFull)** for PR 13505 at commit [`5504b6c`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/13505 LGTM

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #60011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60011/consoleFull)** for PR 13505 at commit [`5504b6c`](https://github.com/apache/spark/commit/5

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-05 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/13505 LGTM with minor comments.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59994/ Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59994 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59994/consoleFull)** for PR 13505 at commit [`4efd3ee`](https://github.com/apache/spark/commi

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 LGTM, though I didn't look too closely. It would be great for @ericl to look at this in detail.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #3066 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3066/consoleFull)** for PR 13505 at commit [`4efd3ee`](https://github.com/apache/spark/commi

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59994/consoleFull)** for PR 13505 at commit [`4efd3ee`](https://github.com/apache/spark/commit

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 Alright, updated to address comments.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-04 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 @rxin, I think that it might make sense to use AttributeSeq more widely. Right now there's an implicit conversion so we can gradually and naively migrate APIs to accept AttributeSeq.
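
The migration path mentioned in this comment can be illustrated with a minimal Scala sketch. Everything named here (`Attr`, `AttrSeq`, `Binder`, `ordinalOf`) is a hypothetical stand-in invented for illustration and is not Spark's actual code; the only details taken from the thread are that a wrapper around a sequence of attributes exists, that it is reachable through an implicit conversion, and that it was made serializable.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for a Catalyst attribute; not Spark's real class.
final case class Attr(name: String, exprId: Long)

// Hypothetical wrapper that caches an exprId -> ordinal index.
// Serializable because the thread reports that the real wrapper had to be.
class AttrSeq(val attrs: Seq[Attr]) extends Serializable {
  // Built lazily on first lookup and skipped during serialization.
  @transient private lazy val ordinalById: Map[Long, Int] =
    attrs.iterator.zipWithIndex.map { case (a, i) => a.exprId -> i }.toMap

  // Returns the position of the attribute with this id, or -1 if absent.
  def indexOf(exprId: Long): Int = ordinalById.getOrElse(exprId, -1)
}

object AttrSeq {
  // Implicit conversion: callers holding a plain Seq[Attr] keep compiling.
  implicit def fromSeq(attrs: Seq[Attr]): AttrSeq = new AttrSeq(attrs)
}

object Binder {
  // An API that has been migrated to accept the wrapper type.
  def ordinalOf(ref: Attr, input: AttrSeq): Int = input.indexOf(ref.exprId)
}
```

A caller can still pass a plain sequence, for example `Binder.ordinalOf(Attr("b", 2L), Seq(Attr("a", 1L), Attr("b", 2L)))`, because the conversion in the companion object is found automatically. The trade-off in this sketch is that each implicit conversion builds a fresh wrapper, so the cached index only pays off when the same `AttrSeq` instance is reused across lookups.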

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/13505 Hm, this probably shouldn't happen in this PR, but I'm wondering if it'd make sense to generalize AttributeSeq and use it everywhere, rather than Seq[Attribute].

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test PASSed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59976/ Test PASSed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59976 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59976/consoleFull)** for PR 13505 at commit [`38e8a99`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/13505 Here's a flame graph showing bindReferences dominating the CPU time for a 10k-column query: [profile](https://github.com/apache/spark/files/298644/slow-bind-refs.svg.zip)

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59980/ Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59980 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59980/consoleFull)** for PR 13505 at commit [`0b412b0`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/13505 @rxin, @ericl has some new benchmarks which operate on even wider schemas and which uncovered this bottleneck. Adding the caching of the map here resulted in a huge scalability improvement. Maybe
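
To make the bottleneck described in this comment concrete, here is a minimal Scala sketch of the general technique: replacing a per-reference linear scan with a lookup map built once. The names `Attr`, `BindSketch`, `bindLinear`, and `bindIndexed` are hypothetical and chosen only for illustration; this is not the code from the pull request, and it assumes that expression ids are unique within the input schema.

```scala
// Hypothetical stand-in for a Catalyst attribute; not Spark's real class.
final case class Attr(name: String, exprId: Long)

object BindSketch {
  // Quadratic approach: every reference scans the whole input schema,
  // so binding N references against N columns costs O(N^2) comparisons.
  def bindLinear(refs: Seq[Attr], input: Seq[Attr]): Seq[Int] =
    refs.map(r => input.indexWhere(_.exprId == r.exprId))

  // Indexed approach: build the exprId -> ordinal map once (O(N)),
  // then resolve each reference with a constant-time map lookup.
  // Assumes exprIds are unique within `input`.
  def bindIndexed(refs: Seq[Attr], input: Seq[Attr]): Seq[Int] = {
    val ordinalById: Map[Long, Int] =
      input.iterator.zipWithIndex.map { case (a, i) => a.exprId -> i }.toMap
    refs.map(r => ordinalById.getOrElse(r.exprId, -1))
  }
}
```

Both functions return the ordinal of each reference in the input schema, or -1 when the reference is missing. For a schema with around 10k columns, as in the profile posted to this thread, the linear version performs on the order of 10^8 comparisons in the worst case, while the indexed version performs roughly 10^4 map insertions plus 10^4 lookups.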

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59980 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59980/consoleFull)** for PR 13505 at commit [`0b412b0`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59976 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59976/consoleFull)** for PR 13505 at commit [`38e8a99`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59975 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59975/consoleFull)** for PR 13505 at commit [`6216e94`](https://github.com/apache/spark/commit/

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59975/ Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/13505 Merged build finished. Test FAILed.

[GitHub] spark issue #13505: [SPARK-15764][SQL] Replace N^2 loop in BindReferences

2016-06-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13505 **[Test build #59975 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59975/consoleFull)** for PR 13505 at commit [`6216e94`](https://github.com/apache/spark/commit/6