Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-137569833
@hvanhovell thanks for working on this! To keep the PR queue manageable I
propose we close this issue for now until you have time to bring it up to date
and remove the
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/7379
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enab
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/7379
[SPARK-8682][SQL][WIP] Range Join
*...copied from JIRA (SPARK-8682):*
Currently Spark SQL uses a Broadcast Nested Loop join (or a filtered
Cartesian Join) when it has to execute the foll
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121082100
Can one of the admins verify this patch?
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121082240
Can one of the admins verify this patch?
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121116697
ok to test
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121117361
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121117436
[Test build #37182 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37182/consoleFull)
for PR 7379 at commit
[`d2bd793`](https://gith
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121117367
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121117442
[Test build #37182 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37182/console)
for PR 7379 at commit
[`d2bd793`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121117445
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121130008
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121130017
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121130085
[Test build #37193 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37193/consoleFull)
for PR 7379 at commit
[`6727807`](https://gith
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121137612
[Test build #37193 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37193/console)
for PR 7379 at commit
[`6727807`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121137638
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121248652
Merged build triggered.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121250819
Current test errors are a bit weird. They shouldn't have been caused by
this change, because the functionality is disabled by default.
Rebased to most recent
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121276408
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121276830
[Test build #37229 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37229/consoleFull)
for PR 7379 at commit
[`773c009`](https://gith
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121306024
[Test build #37229 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37229/console)
for PR 7379 at commit
[`773c009`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121306135
Merged build finished. Test PASSed.
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121316725
This looks pretty cool! I can try and do a more thorough review in a bit,
but a few testing suggestions:
It would be great to add a test for the query planner
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121811713
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121811701
Merged build triggered.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121812674
[Test build #37448 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37448/consoleFull)
for PR 7379 at commit
[`b405e45`](https://gith
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121815152
[Test build #37448 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37448/console)
for PR 7379 at commit
[`b405e45`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121815160
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121816918
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121816924
Merged build started.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121817147
Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121819633
Merged build triggered.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121819664
Merged build started.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121820543
[Test build #37456 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37456/consoleFull)
for PR 7379 at commit
[`8204eae`](https://gith
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121832849
[Test build #37456 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/37456/console)
for PR 7379 at commit
[`8204eae`](https://github.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121833017
Merged build finished. Test PASSed.
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34862068
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the
Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-122176186
This is a very interesting optimization, but will it be more general if we
consider that with the SortMergeJoin? As well as the case like:
```
SELECT A.*,
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34862439
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apac
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-122177553
The <= case is quite easy to implement.
This implementation is currently targeted at range joining a rather small
(broadcastable) to an arbitrarily large tab
Github user chenghao-intel commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34863539
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the
Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-122188991
Sorry, I shouldn't use the word `SMJ`.
I mean if we are planning to improve the performance of RangeJoin, probably
we can think of it in a more general way, no
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34891100
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apac
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-12257
No problem.
### Supporting N-Ary Predicates.
In order to make the range join work we need the predicates to define a
single interval for each side of the
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34940713
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache
Github user marmbrus commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34940933
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache