[GitHub] spark pull request: SPARK-4644 blockjoin

2015-10-17 Thread koertkuipers
Github user koertkuipers commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-148917448
  
sure


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: SPARK-4644 blockjoin

2015-10-17 Thread koertkuipers
Github user koertkuipers closed the pull request at:

https://github.com/apache/spark/pull/6883





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-10-16 Thread JoshRosen
Github user JoshRosen commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-148828836
  
Given that we're unlikely to add this in core right now and given that it's 
available in a Spark Package, would you mind closing this PR for now in order 
to clean up the review backlog? Thanks!





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-08-22 Thread koertkuipers
Github user koertkuipers commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-133778896
  
I put this in a Spark package, together with skewjoin, in case anyone wants
to use it.
See here:
http://spark-packages.org/package/tresata/spark-skewjoin
https://github.com/tresata/spark-skewjoin






[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-22 Thread zsxwing
Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-114354847
  
@rxin I remember you said you would like such improvements to be added to
Spark SQL rather than Spark Core. What are your thoughts on this one?





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113670872
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113670853
  
  [Test build #35333 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35333/console) for PR 6883 at commit [`a5dd71c`](https://github.com/apache/spark/commit/a5dd71c4636aa7d4c1a3acb0755736c526d5b0df).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113660265
  
  [Test build #35333 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35333/consoleFull) for PR 6883 at commit [`a5dd71c`](https://github.com/apache/spark/commit/a5dd71c4636aa7d4c1a3acb0755736c526d5b0df).





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113660121
  
Merged build started.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113660077
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113630470
  
  [Test build #35315 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35315/console) for PR 6883 at commit [`adef52e`](https://github.com/apache/spark/commit/adef52ed4c335980e73c61036abb2a2806965de3).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113630477
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113630110
  
  [Test build #35315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35315/consoleFull) for PR 6883 at commit [`adef52e`](https://github.com/apache/spark/commit/adef52ed4c335980e73c61036abb2a2806965de3).





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113629557
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113629566
  
Merged build started.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113284866
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113284862
  
  [Test build #35168 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35168/console) for PR 6883 at commit [`6ac82cb`](https://github.com/apache/spark/commit/6ac82cb644d1c226b3fb4ea01fd122ca7b623a35).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113284473
  
@zsxwing 





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113284346
  
Merged build started.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113284422
  
  [Test build #35168 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/35168/consoleFull) for PR 6883 at commit [`6ac82cb`](https://github.com/apache/spark/commit/6ac82cb644d1c226b3fb4ea01fd122ca7b623a35).





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113284330
  
 Merged build triggered.





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/6883#discussion_r32776196
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala ---
@@ -515,6 +515,76 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
   }
 
   /**
+   * Same as join, but uses a block join, otherwise known as a replicate fragment join.
+   * This is useful in cases where the data has extreme skew.
+   * The input params leftReplication and rightReplication control the replication of the left
+   * (this rdd) and right (other rdd) respectively.
+   */
+  def blockJoin[W](other: JavaPairRDD[K, W], leftReplication: Int, rightReplication: Int,
+partitioner: Partitioner): JavaPairRDD[K, (V, W)] = {
+fromRDD(rdd.blockJoin(other, leftReplication, rightReplication, partitioner))
+  }
+
+  /**
+   * Same as join, but uses a block join, otherwise known as a replicate fragment join.
+   * This is useful in cases where the data has extreme skew.
+   * The input params leftReplication and rightReplication control the replication of the left
+   * (this rdd) and right (other rdd) respectively.
+   */
+  def blockJoin[W](other: JavaPairRDD[K, W], leftReplication: Int, rightReplication: Int)
+  : JavaPairRDD[K, (V, W)] = {
+fromRDD(rdd.blockJoin(other, leftReplication, rightReplication))
+  }
+
+  /**
+   * Same as leftOuterJoin, but uses a block join, otherwise known as a replicate fragment join.
+   * This is useful in cases where the data has extreme skew.
+   * The input param rightReplication controls the replication of the right (other rdd).
+   */
+  def blockLeftOuterJoin[W](other: JavaPairRDD[K, W], rightReplication: Int,
+partitioner: Partitioner): JavaPairRDD[K, (V, Optional[W])] = {
+fromRDD(rdd.blockLeftOuterJoin(other, rightReplication, partitioner).mapValues{ case (v, w) =>
--- End diff --

need space after `.mapValues`





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/6883#discussion_r32776189
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala ---
@@ -515,6 +515,76 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
   }
 
   /**
+   * Same as join, but uses a block join, otherwise known as a replicate fragment join.
+   * This is useful in cases where the data has extreme skew.
+   * The input params leftReplication and rightReplication control the replication of the left
+   * (this rdd) and right (other rdd) respectively.
+   */
+  def blockJoin[W](other: JavaPairRDD[K, W], leftReplication: Int, rightReplication: Int,
+partitioner: Partitioner): JavaPairRDD[K, (V, W)] = {
+fromRDD(rdd.blockJoin(other, leftReplication, rightReplication, partitioner))
+  }
+
+  /**
+   * Same as join, but uses a block join, otherwise known as a replicate fragment join.
+   * This is useful in cases where the data has extreme skew.
+   * The input params leftReplication and rightReplication control the replication of the left
+   * (this rdd) and right (other rdd) respectively.
+   */
+  def blockJoin[W](other: JavaPairRDD[K, W], leftReplication: Int, rightReplication: Int)
+  : JavaPairRDD[K, (V, W)] = {
+fromRDD(rdd.blockJoin(other, leftReplication, rightReplication))
+  }
+
+  /**
+   * Same as leftOuterJoin, but uses a block join, otherwise known as a replicate fragment join.
+   * This is useful in cases where the data has extreme skew.
+   * The input param rightReplication controls the replication of the right (other rdd).
+   */
+  def blockLeftOuterJoin[W](other: JavaPairRDD[K, W], rightReplication: Int,
+partitioner: Partitioner): JavaPairRDD[K, (V, Optional[W])] = {
+fromRDD(rdd.blockLeftOuterJoin(other, rightReplication, partitioner).mapValues{ case (v, w) =>
+  (v, JavaUtils.optionToOptional(w))
+})
--- End diff --

need space after `.mapValues`





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request:

https://github.com/apache/spark/pull/6883#discussion_r32776123
  
--- Diff: core/src/main/scala/org/apache/spark/api/java/JavaPairRDD.scala ---
@@ -515,6 +515,76 @@ class JavaPairRDD[K, V](val rdd: RDD[(K, V)])
   }
 
   /**
+   * Same as join, but uses a block join, otherwise known as a replicate fragment join.
+   * This is useful in cases where the data has extreme skew.
+   * The input params leftReplication and rightReplication control the replication of the left
+   * (this rdd) and right (other rdd) respectively.
+   */
+  def blockJoin[W](other: JavaPairRDD[K, W], leftReplication: Int, rightReplication: Int,
+partitioner: Partitioner): JavaPairRDD[K, (V, W)] = {
--- End diff --

style:
```
def blockJoin[W](
    other: JavaPairRDD[K, W],
    leftReplication: Int,
    ...): JavaPairRDD[...] = {
  ...
}
```





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread andrewor14
Github user andrewor14 commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113283813
  
add to whitelist





[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread koertkuipers
GitHub user koertkuipers opened a pull request:

https://github.com/apache/spark/pull/6883

SPARK-4644 blockjoin

Although the discussion (and design doc) under SPARK-4644 seems focused on
aspects of skew (mostly OOM) other than the one this pull request addresses
(avoiding a single reducer taking a long time), I decided to put this pull
request under SPARK-4644 anyway, to avoid a proliferation of JIRA tickets. If
this is not the right place, let me know and I will move it.

Inspired by block join in scalding.
From the scalding docs:

This is useful in cases where the data has extreme skew. A symptom of this
is that we may see a job stuck for a very long time on a small number of
reducers.
A block join is a way to get around this: we add a random integer field and a
replica field to every tuple in the left and right pipes. We then join on the
original keys and on these new dummy fields. These dummy fields make it less
likely that the skewed keys will be hashed to the same reducer.

The final data size is right * rightReplication + left * leftReplication,
but because of the fragmentation we are guaranteed the same number of hits as
the original join.
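
As a rough illustration of the scheme quoted above, here is a minimal sketch on plain Scala collections rather than RDDs. It is not the code in this pull request: `BlockJoinSketch`, its `rng` parameter and the exact key layout are assumptions made for the example, loosely following the scalding description (one random slot plus a replica index per record).

```scala
import scala.util.Random

// Hypothetical helper for illustration only; not the PR's implementation.
object BlockJoinSketch {
  def blockJoin[K, V, W](
      left: Seq[(K, V)],
      right: Seq[(K, W)],
      leftReplication: Int,
      rightReplication: Int,
      rng: Random = new Random(42)): Seq[(K, (V, W))] = {
    require(leftReplication >= 1 && rightReplication >= 1)

    // Each left record picks one random slot in [0, rightReplication)
    // and is copied once per replica index in [0, leftReplication).
    val leftBlocks: Seq[((K, Int, Int), V)] = left.flatMap { case (k, v) =>
      val slot = rng.nextInt(rightReplication)
      (0 until leftReplication).map(rep => ((k, slot, rep), v))
    }

    // Each right record does the mirror image: one random slot in
    // [0, leftReplication), copied once per index in [0, rightReplication).
    val rightBlocks: Seq[((K, Int, Int), W)] = right.flatMap { case (k, w) =>
      val slot = rng.nextInt(leftReplication)
      (0 until rightReplication).map(rep => ((k, rep, slot), w))
    }

    // Join on (original key, left slot, right slot). A given left/right pair
    // shares exactly one such block, so each match is produced exactly once.
    val rightByBlock: Map[(K, Int, Int), Seq[W]] =
      rightBlocks.groupBy(_._1).map { case (block, rows) => block -> rows.map(_._2) }

    leftBlocks.flatMap { case (block, v) =>
      rightByBlock.getOrElse(block, Seq.empty[W]).map(w => (block._1, (v, w)))
    }
  }
}
```

In a distributed setting the composite block key, not the original key alone, would decide which reducer a record goes to, which is what spreads a hot key over several reducers. The price is the replicated input size given above.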


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tresata/spark feat-blockjoin

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/6883.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #6883


commit 77d8fee6ad7ba5f83eb0c82b7f1625e2206a5446
Author: Koert Kuipers 
Date:   2015-06-17T20:35:18Z

add blockJoin, blockLeftOuterJoin and blockRightOuterJoin to spark core

commit d1fd3e020812c72c44a6461d9c94065e2784cdbb
Author: Koert Kuipers 
Date:   2015-06-17T23:48:43Z

correct scaladocs for block join functions

commit 2114df748f62b53155d7db5524e163504cead228
Author: Koert Kuipers 
Date:   2015-06-18T03:36:21Z

add block joins to java api







[GitHub] spark pull request: SPARK-4644 blockjoin

2015-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/6883#issuecomment-113178842
  
Can one of the admins verify this patch?

