[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-213232245
  
@davies Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/10024





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-213017267
  
Merging this into master, thanks!





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212928123
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56532/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212928119
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212927052
  
**[Test build #56532 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56532/consoleFull)**
 for PR 10024 at commit 
[`792ff5a`](https://github.com/apache/spark/commit/792ff5a441a6f9e6dc44df12def1aff9b6377189).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkSqlParser(conf: SQLConf) extends AbstractSqlParser`
  * `class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder `
  * `   *   [STORED AS file_format | STORED BY storage_handler_class [WITH 
SERDEPROPERTIES (...)]]`
  * `case class AddJar(path: String) extends RunnableCommand `
  * `case class AddFile(path: String) extends RunnableCommand `
  * `case class CreateTableAsSelectLogicalPlan(`
  * `case class CreateViewAsSelectLogicalCommand(`
  * `case class HiveSerDe(`
  * `class HiveSqlParser(conf: SQLConf, hiveconf: HiveConf) extends 
AbstractSqlParser `
  * `class HiveSqlAstBuilder(conf: SQLConf) extends 
SparkSqlAstBuilder(conf) `





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212880975
  
**[Test build #56532 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56532/consoleFull)**
 for PR 10024 at commit 
[`792ff5a`](https://github.com/apache/spark/commit/792ff5a441a6f9e6dc44df12def1aff9b6377189).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212840033
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212840035
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56513/
Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212839853
  
**[Test build #56513 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56513/consoleFull)**
 for PR 10024 at commit 
[`ff3c2b8`](https://github.com/apache/spark/commit/ff3c2b85bc50ac631684e443cb7a19df9359535e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212799855
  
**[Test build #56513 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56513/consoleFull)**
 for PR 10024 at commit 
[`ff3c2b8`](https://github.com/apache/spark/commit/ff3c2b85bc50ac631684e443cb7a19df9359535e).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212798899
  
Jenkins, test this please





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212798080
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212798081
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56496/
Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212797906
  
**[Test build #56496 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56496/consoleFull)**
 for PR 10024 at commit 
[`ff3c2b8`](https://github.com/apache/spark/commit/ff3c2b85bc50ac631684e443cb7a19df9359535e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request:

https://github.com/apache/spark/pull/10024#discussion_r60540340
  
--- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala ---
@@ -100,6 +103,27 @@ private[spark] trait Spillable[C] extends Logging {
   }
 
   /**
+   * Spill some data to disk to release memory, which will be called by TaskMemoryManager
+   * when there is not enough memory for the task.
+   */
+  override def spill(size: Long, trigger: MemoryConsumer): Long = {
+    if (trigger != this && taskMemoryManager.getTungstenMemoryMode == MemoryMode.ON_HEAP) {
+      val isSpilled = forceSpill()
+      if (!isSpilled) {
+        0L
+      } else {
+        _elementsRead = 0
+        val freeMemory = myMemoryThreshold - initialMemoryThreshold
+        _memoryBytesSpilled += freeMemory
+        releaseMemory()
+        freeMemory
--- End diff --

forceSpill() already sets collection = null before releaseMemory() is called.
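
The ordering discussed in this comment can be sketched in isolation. The following is a minimal, self-contained illustration and not Spark's actual `Spillable` trait: the class `SketchSpillable`, its `grow` helper, and the array standing in for the in-memory collection are all assumptions made for the example. It only shows the shape of the flow in the diff, where the data reference is dropped inside `forceSpill()` and the memory reservation is returned afterwards.

```scala
object SpillSketch {
  // Hypothetical stand-in for a spillable collection; field names echo the diff,
  // but every body here is an assumption for illustration only.
  class SketchSpillable(initialMemoryThreshold: Long) {
    // Stands in for the in-memory data that would be written to disk.
    private var collection: Array[Long] = new Array[Long](1024)
    private var myMemoryThreshold: Long = initialMemoryThreshold
    private var memoryBytesSpilled: Long = 0L

    /** Pretends that the memory manager granted additional bytes. */
    def grow(bytes: Long): Unit = myMemoryThreshold += bytes

    /** Drops the data reference (disk write omitted); false if nothing is left to spill. */
    private def forceSpill(): Boolean = {
      if (collection == null) {
        false
      } else {
        collection = null // free the data first
        true
      }
    }

    /** Resets the reservation back to the initial threshold. */
    private def releaseMemory(): Unit = myMemoryThreshold = initialMemoryThreshold

    /** Returns the number of bytes handed back, or 0 when nothing was spilled. */
    def spill(): Long = {
      if (!forceSpill()) {
        0L
      } else {
        val freeMemory = myMemoryThreshold - initialMemoryThreshold
        memoryBytesSpilled += freeMemory
        releaseMemory() // runs only after the data reference was dropped
        freeMemory
      }
    }

    def bytesSpilled: Long = memoryBytesSpilled
    def threshold: Long = myMemoryThreshold
  }
}
```

In this sketch a second call to `spill()` returns 0 because the collection reference is already gone, which is the reason the freed amount is computed before the threshold is reset.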





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212793811
  
@davies I have also run the unit tests with a large N. 
How about adding a config, "spark.shuffle.spill.reservedMemory", that defaults 
to true? When it is true, spilling is not forced. The config could perhaps be 
removed in version 2.1.
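
As a rough illustration of the proposal, the gate could look like the sketch below. The key `spark.shuffle.spill.reservedMemory` is only a suggestion made in this comment, not a confirmed Spark setting, and the object `SpillConfigSketch`, its method names, and the plain `Map` used in place of a real configuration object are assumptions for the example.

```scala
object SpillConfigSketch {
  // Key proposed in the comment above; treated here as hypothetical.
  val ReservedMemoryKey = "spark.shuffle.spill.reservedMemory"

  /** Reads the flag from a plain settings map, defaulting to true as proposed. */
  def reservedMemoryEnabled(settings: Map[String, String]): Boolean =
    settings.get(ReservedMemoryKey).map(_.trim.toBoolean).getOrElse(true)

  /** Forced spilling is attempted only when the flag is explicitly set to false. */
  def shouldForceSpill(settings: Map[String, String]): Boolean =
    !reservedMemoryEnabled(settings)
}
```

With the default left in place, the new forced-spill path stays off, which matches the intent of shipping the change behind a switch that can be dropped in a later release.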





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread zzcclp
Github user zzcclp commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212790206
  
Sorry, I mistakenly deleted my comment.
What a pity; I can only merge this PR manually.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212786217
  
This is a big change, so probably not.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread zzcclp
Github user zzcclp commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212781140
  
@davies, will this PR be backported to branch-1.6?





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212778562
  
@lianhuiwang Have you run any stress tests with the latest change?





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-21 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212777903
  
LGTM





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/10024#discussion_r60533610
  
--- Diff: core/src/main/scala/org/apache/spark/util/collection/Spillable.scala ---
@@ -100,6 +103,27 @@ private[spark] trait Spillable[C] extends Logging {
   }
 
   /**
+   * Spill some data to disk to release memory, which will be called by TaskMemoryManager
+   * when there is not enough memory for the task.
+   */
+  override def spill(size: Long, trigger: MemoryConsumer): Long = {
+    if (trigger != this && taskMemoryManager.getTungstenMemoryMode == MemoryMode.ON_HEAP) {
+      val isSpilled = forceSpill()
+      if (!isSpilled) {
+        0L
+      } else {
+        _elementsRead = 0
+        val freeMemory = myMemoryThreshold - initialMemoryThreshold
+        _memoryBytesSpilled += freeMemory
+        releaseMemory()
+        freeMemory
--- End diff --

We should free memory first, then release memory





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212768557
  
**[Test build #56496 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56496/consoleFull)**
 for PR 10024 at commit 
[`ff3c2b8`](https://github.com/apache/spark/commit/ff3c2b85bc50ac631684e443cb7a19df9359535e).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212754891
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212754896
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56475/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212754203
  
**[Test build #56475 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56475/consoleFull)**
 for PR 10024 at commit 
[`e7a98d5`](https://github.com/apache/spark/commit/e7a98d57a31923406c204e15f72c7a43579653bb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212748052
  
@davies All tests have passed now. Could you take another look? 
Thanks.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212747490
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212747494
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56466/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212747343
  
**[Test build #56466 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56466/consoleFull)**
 for PR 10024 at commit 
[`e7a98d5`](https://github.com/apache/spark/commit/e7a98d57a31923406c204e15f72c7a43579653bb).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212746211
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56462/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212746210
  
Build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212746062
  
**[Test build #56462 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56462/consoleFull)**
 for PR 10024 at commit 
[`7ea7274`](https://github.com/apache/spark/commit/7ea727470735cb2a420bd5411af0202d264d9ec7).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212739055
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56459/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212739052
  
Build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212738764
  
**[Test build #56459 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56459/consoleFull)**
 for PR 10024 at commit 
[`e009d95`](https://github.com/apache/spark/commit/e009d95c715879269253da2b47e669ffc2e13683).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212733142
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56456/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212733139
  
Build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212732804
  
**[Test build #56456 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56456/consoleFull)**
 for PR 10024 at commit 
[`97fd174`](https://github.com/apache/spark/commit/97fd17483fe2efebd64a1a57dfe40aa16a46f625).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212727564
  
**[Test build #56475 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56475/consoleFull)**
 for PR 10024 at commit 
[`e7a98d5`](https://github.com/apache/spark/commit/e7a98d57a31923406c204e15f72c7a43579653bb).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212727414
  
Jenkins, test this please





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212717830
  
test it please.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212713464
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56465/
Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212713463
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212713437
  
**[Test build #56465 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56465/consoleFull)**
 for PR 10024 at commit 
[`d16b5f3`](https://github.com/apache/spark/commit/d16b5f3af28315706f60677314056ffd3b4bb277).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212711564
  
**[Test build #56466 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56466/consoleFull)**
 for PR 10024 at commit 
[`e7a98d5`](https://github.com/apache/spark/commit/e7a98d57a31923406c204e15f72c7a43579653bb).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212709653
  
**[Test build #56465 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56465/consoleFull)**
 for PR 10024 at commit 
[`d16b5f3`](https://github.com/apache/spark/commit/d16b5f3af28315706f60677314056ffd3b4bb277).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212708444
  
**[Test build #56462 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56462/consoleFull)**
 for PR 10024 at commit 
[`7ea7274`](https://github.com/apache/spark/commit/7ea727470735cb2a420bd5411af0202d264d9ec7).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212707902
  
@davies Yes, I have updated it to use object.lock. I will rebase onto master.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212702375
  
Could you also rebase onto master?





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212702206
  
@lianhuiwang I don't think this is enough.

If it's not easy to make it thread safe, one option could be to not do forced 
spilling if spill() is called from a thread other than the one consuming the 
iterator. At least that's better than before.
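
A minimal sketch of that fallback, not taken from the PR: `OwnerOnlySpiller`, `forceSpill`, and `doSpill` are hypothetical names, and the sketch assumes the object is created on the thread that later consumes the iterator.

```scala
// Hypothetical sketch; these names are illustrative and not Spark's API.
final class OwnerOnlySpiller(doSpill: () => Long) {
  // Assumption: the constructing thread is the one that consumes the iterator.
  private val ownerThreadId: Long = Thread.currentThread().getId

  /** Returns the bytes freed, or 0 when called from any other thread. */
  def forceSpill(): Long =
    if (Thread.currentThread().getId == ownerThreadId) doSpill()
    else 0L // decline instead of racing with the consuming thread
}
```

Declining the request leaves the memory held, so the requester may need to obtain memory elsewhere, but it avoids mutating iterator state from two threads.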





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212698434
  
**[Test build #56459 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56459/consoleFull)**
 for PR 10024 at commit 
[`e009d95`](https://github.com/apache/spark/commit/e009d95c715879269253da2b47e669ffc2e13683).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212697127
  
@davies I have now made some variables volatile to address the thread-safety 
issue. Can you take a look? Thanks.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212695755
  
**[Test build #56456 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56456/consoleFull)**
 for PR 10024 at commit 
[`97fd174`](https://github.com/apache/spark/commit/97fd17483fe2efebd64a1a57dfe40aa16a46f625).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/10024#discussion_r60499732
  
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -727,4 +779,64 @@ private[spark] class ExternalSorter[K, V, C](
   (elem._1._2, elem._2)
 }
   }
+
+  private[this] class SpillableIterator(var upstream: Iterator[((Int, K), C)])
--- End diff --

This should be thread safe.
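
One way to meet this requirement is to guard the iterator's state with a lock. The following is a simplified sketch, not the PR's code: `LockedSpillableIterator` and `spillRemaining` are hypothetical names, and `spillRemaining` stands in for writing the unread elements to disk and returning an iterator that reads them back.

```scala
// Simplified, hypothetical sketch; not the ExternalSorter code under review.
final class LockedSpillableIterator[T >: Null <: AnyRef](
    private var upstream: Iterator[T],
    spillRemaining: Iterator[T] => Iterator[T]) extends Iterator[T] {

  private val lock = new Object
  private var pending: Iterator[T] = null // replacement source installed by spill()
  private var spilled = false
  private var cur: T = readNext()         // element that the next call to next() returns

  /** Safe to call from any thread; returns true only for the first spill. */
  def spill(): Boolean = lock.synchronized {
    if (spilled) false
    else {
      pending = spillRemaining(upstream)
      spilled = true
      true
    }
  }

  // Switches to the spilled source, if one is pending, before reading.
  private def readNext(): T = lock.synchronized {
    if (pending != null) { upstream = pending; pending = null }
    if (upstream.hasNext) upstream.next() else null
  }

  override def hasNext: Boolean = cur != null

  override def next(): T = {
    if (cur == null) throw new NoSuchElementException("iterator is exhausted")
    val result = cur
    cur = readNext()
    result
  }
}
```

Because `spill()` and `readNext()` take the same lock, a spill requested by another thread cannot interleave with the consumer advancing `upstream`. The sketch uses `null` as the end marker, so it assumes the elements themselves are never `null`.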





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212632443
  
Build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212632448
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56397/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212631560
  
**[Test build #56397 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56397/consoleFull)**
 for PR 10024 at commit 
[`d1ed4e4`](https://github.com/apache/spark/commit/d1ed4e4c0f5d15a1e0f030c397c3f8c83482315a).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212629142
  
Build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212629144
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56398/
Test PASSed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212628881
  
**[Test build #56398 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56398/consoleFull)**
 for PR 10024 at commit 
[`743ef16`](https://github.com/apache/spark/commit/743ef16b0274f7d2ce7f435aed96d64316c8c77e).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212579832
  
**[Test build #56398 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56398/consoleFull)**
 for PR 10024 at commit 
[`743ef16`](https://github.com/apache/spark/commit/743ef16b0274f7d2ce7f435aed96d64316c8c77e).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212578347
  
**[Test build #56397 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56397/consoleFull)**
 for PR 10024 at commit 
[`d1ed4e4`](https://github.com/apache/spark/commit/d1ed4e4c0f5d15a1e0f030c397c3f8c83482315a).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212496143
  
Build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212496147
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56365/
Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212495728
  
**[Test build #56365 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56365/consoleFull)**
 for PR 10024 at commit 
[`dc632f5`](https://github.com/apache/spark/commit/dc632f5642b6bee690b351efa1855402ef7bc716).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-20 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212447578
  
**[Test build #56365 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56365/consoleFull)**
 for PR 10024 at commit 
[`dc632f5`](https://github.com/apache/spark/commit/dc632f5642b6bee690b351efa1855402ef7bc716).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212059825
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56246/
Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212059821
  
Build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212059742
  
**[Test build #56246 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56246/consoleFull)**
 for PR 10024 at commit 
[`b84ad96`](https://github.com/apache/spark/commit/b84ad967f6d3fc0b62aafb3510acad2d5df695b9).
 * This patch **fails MiMa tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-212054901
  
**[Test build #56246 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56246/consoleFull)**
 for PR 10024 at commit 
[`b84ad96`](https://github.com/apache/spark/commit/b84ad967f6d3fc0b62aafb3510acad2d5df695b9).





[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211953148
  
@squito  Yes, I think your understanding is correct. This PR only supports 
iterating over a Spillable once.  Code such as 'val sort = new Spillable() 
sort.iterator() sort.iterator()' would give wrong results.
For ExternalSorter.iterator, I find that it is only called by 
BlockStoreShuffleReader.read(). Every reader creates a new ExternalSorter, so 
I think it is OK there.
ExternalAppendOnlyMap is used in two places: Aggregator and CoGroupedRDD. 
Both create a new ExternalAppendOnlyMap before calling iterator() once, so I 
think it is also OK there.
Their inputs and outputs are Iterators, and an Iterator cannot be cached, so 
this does not affect cached RDDs.
BTW: this PR has been running for a long time on many of our online Spark jobs.
If I missed anything, please point it out.
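
The single-iteration assumption discussed here could be made explicit with a 
guard. The following is only a minimal sketch and is not code from this PR: 
`SingleUseSource` and `SingleUseDemo` are hypothetical names, and the wrapper 
simply fails fast when an iterator is requested a second time.

```scala
// Hypothetical wrapper (not part of Spark): hands out its iterator once and
// fails fast on a second request, because a destructive or spill-aware
// iterator cannot safely be replayed.
final class SingleUseSource[A](underlying: Iterator[A]) {
  private var requested = false

  def iterator: Iterator[A] = {
    if (requested) {
      throw new IllegalStateException("iterator may only be requested once")
    }
    requested = true
    underlying
  }
}

object SingleUseDemo {
  /** Returns the elements of the first request and whether the second request failed. */
  def run(): (List[Int], Boolean) = {
    val source = new SingleUseSource(Iterator(1, 2, 3))
    val first = source.iterator.toList
    val secondFailed = scala.util.Try(source.iterator).isFailure
    (first, secondFailed)
  }
}
```

An exception is used here instead of a silent no-op so that a violated 
assumption shows up in tests rather than as missing records.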


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211937421
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56229/
Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211937417
  
Merged build finished. Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211937366
  
**[Test build #56229 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56229/consoleFull)**
 for PR 10024 at commit 
[`70bcffa`](https://github.com/apache/spark/commit/70bcffa63da42c84e9a63f6be39ed7330662039e).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211936269
  
@davies Thanks. I have added a SpillableIterator that makes the consumer and 
the spill thread-safe. Please take a look at it.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211934985
  
**[Test build #56229 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56229/consoleFull)**
 for PR 10024 at commit 
[`70bcffa`](https://github.com/apache/spark/commit/70bcffa63da42c84e9a63f6be39ed7330662039e).


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211500937
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56099/
Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211500934
  
Merged build finished. Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211500764
  
**[Test build #56099 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56099/consoleFull)**
 for PR 10024 at commit 
[`7c36ef0`](https://github.com/apache/spark/commit/7c36ef0f54d8506f3c9593fe27824840006c3646).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class SparkSqlAstBuilder extends AstBuilder `


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211488135
  
**[Test build #56099 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56099/consoleFull)**
 for PR 10024 at commit 
[`7c36ef0`](https://github.com/apache/spark/commit/7c36ef0f54d8506f3c9593fe27824840006c3646).


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread davies
Github user davies commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211480633
  
@lianhuiwang Thanks for working on this, I think it's going in the right 
direction. Two things left:

1) Thread safety. For example, there are two threads for PythonRDD (same for 
RRDD): one iterates over rows from the parent RDD, the other iterates over 
rows from PythonRDD/RRDD. The second one could trigger spilling; the spilling 
then happens in the second thread while the first thread could be consuming 
the same iterator. So they must be made thread safe. This is the hardest part; 
you could take the SQL operators as examples.

2) Adding more tests. As @squito suggested, more comments explaining the 
high-level ideas would be good to have.
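
The thread-safety concern in point 1 can be illustrated with a small sketch. 
This is not the PR's `SpillableIterator`; `LockedSpillableIterator` and 
`ThreadSafetyDemo` are hypothetical names, and the "spill" is only simulated 
by copying the unread records. It assumes a single consumer thread and one 
other thread that may request a spill at any time, and it guards the shared 
cursor with one lock.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical sketch: one lock protects the cursor that the consumer thread
// reads and that a second thread may replace while "spilling".
final class LockedSpillableIterator(data: Vector[Int]) extends Iterator[Int] {
  private val lock = new Object
  private var current: Iterator[Int] = data.iterator
  private var hasSpilled = false

  /** Stand-in for spilling: re-homes the unread records. Returns true the first time only. */
  def spill(): Boolean = lock.synchronized {
    if (hasSpilled) {
      false
    } else {
      val remaining = current.toVector // a real version would write these to disk
      current = remaining.iterator     // the consumer continues from the new source
      hasSpilled = true
      true
    }
  }

  override def hasNext: Boolean = lock.synchronized(current.hasNext)
  override def next(): Int = lock.synchronized(current.next())
}

object ThreadSafetyDemo {
  /** Consumes everything on the calling thread while another thread requests a spill. */
  def consumeWhileSpilling(n: Int): (Long, Boolean) = {
    val it = new LockedSpillableIterator(Vector.range(0, n))
    val spilled = new AtomicBoolean(false)
    val spiller = new Thread(new Runnable {
      def run(): Unit = spilled.set(it.spill())
    })
    spiller.start()
    var sum = 0L
    while (it.hasNext) sum += it.next()
    spiller.join()
    (sum, spilled.get())
  }
}
```

A spill that lands between `hasNext` and `next()` is harmless in this sketch 
because the spill preserves exactly the unread records, so the sum is the same 
whenever the second thread runs.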


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/10024#discussion_r60093931
  
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -689,11 +760,18 @@ private[spark] class ExternalSorter[K, V, C](
   }
 
   def stop(): Unit = {
-map = null // So that the memory can be garbage-collected
-buffer = null // So that the memory can be garbage-collected
 spills.foreach(s => s.file.delete())
 spills.clear()
-releaseMemory()
+forceSpillFile.foreach(_.file.delete())
+if (map != null || buffer != null) {
+  map = null // So that the memory can be garbage-collected
+  buffer = null // So that the memory can be garbage-collected
+  releaseMemory()
+}
+  }
+
+  override def toString(): String = {
+this.getClass.getName + "@" + java.lang.Integer.toHexString(this.hashCode())
--- End diff --

Why this?


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/10024#discussion_r60093876
  
--- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
@@ -235,6 +237,52 @@ private[spark] class ExternalSorter[K, V, C](
* @param collection whichever collection we're using (map or buffer)
*/
   override protected[this] def spill(collection: WritablePartitionedPairCollection[K, C]): Unit = {
+val inMemoryIterator = collection.destructiveSortedWritablePartitionedIterator(comparator)
+val spillFile = spillMemoryIteratorToDisk(inMemoryIterator)
+spills.append(spillFile)
+  }
+
+  /**
+   * Force to spilling the current in-memory collection to disk to release memory,
+   * It will be called by TaskMemoryManager when there is not enough memory for the task.
+   */
+  override protected[this] def forceSpill(): Boolean = {
+if (isShuffleSort) {
--- End diff --

This could be triggered by a different thread, so it should be thread safe.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread davies
Github user davies commented on a diff in the pull request:

https://github.com/apache/spark/pull/10024#discussion_r60092037
  
--- Diff: core/src/main/java/org/apache/spark/memory/MemoryConsumer.java ---
@@ -130,4 +130,22 @@ protected void freePage(MemoryBlock page) {
 used -= page.size();
 taskMemoryManager.freePage(page, this);
   }
+
+  /**
+   * Allocates a heap memory of `size`.
+   */
+  public long allocateHeapExecutionMemory(long size) {
--- End diff --

This function does not actually create any object, so I'd like to call it 
`acquireOnHeapMemory`.
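
The distinction behind this naming suggestion, reserving part of a byte budget 
versus allocating an object, can be shown with a toy accounting class. This is 
only a sketch: `ToyMemoryPool` and its methods are hypothetical and are not 
the `MemoryConsumer` or `TaskMemoryManager` API.

```scala
// Hypothetical bookkeeping-only pool: acquire() adjusts a counter and returns
// how many bytes were granted. No buffer or page object is created.
final class ToyMemoryPool(capacity: Long) {
  private var used = 0L

  /** Grants up to `size` bytes, limited by what is still free. */
  def acquire(size: Long): Long = synchronized {
    val granted = math.max(0L, math.min(size, capacity - used))
    used += granted
    granted
  }

  /** Returns up to `size` bytes to the pool. */
  def release(size: Long): Unit = synchronized {
    used -= math.min(size, used)
  }

  def usedBytes: Long = synchronized(used)
}
```

For example, with a pool of 100 bytes, `acquire(60)` grants 60 and a second 
`acquire(60)` grants only the remaining 40; the caller has to check the 
returned amount, since nothing was allocated on its behalf.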


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-04-18 Thread squito
Github user squito commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-211424963
  
Hi @lianhuiwang thanks for submitting this patch.

I just have a really high-level question first.  If I understand how this 
works correctly, the idea is that:
1) as data is added to `Spillable`s, they work exactly the same as before 
at first, so after all records have been inserted, they've still got a bunch of 
data in memory.
2) When an iterator is requested, as before it's still an iterator that 
merges data that has been spilled to disk with data that is still in memory
3) If the `TaskMemoryManager` requests more memory while that iterator is 
in flight, then the `Spillable`s look at the position of the current iterator 
over the in-memory data, and spill only the _remaining_ data to disk
4) The `Spillable`s then free the memory to the `TaskMemoryManager`, and 
have the in-flight iterator switch to using the new spill file.

Is this a correct understanding?  If so, this seems to hinge on one key 
assumption: that the `Spillable`s are never iterated over multiple times.  If 
they were iterated multiple times, then the second iteration would be incorrect 
-- some of the initial data in the in-memory structure would be lost on the 
second iteration.

I think this assumption is sound -- it is implied by 
"destructive"SortedIterator in the internals, though I think the actual 
wrapping `Spillable`s might allow multiple iteration now.  I've been trying to 
think of a case where this assumption would be violated but can't come up with 
anything.  (If the same shuffle data is read multiple times in different tasks, 
the work on the shuffle-read side is simply repeated each time, I'm pretty sure 
there isn't any sharing).  But nonetheless if that is the case, I think this 
deserves a lot of comments explaining how this works, assertions which 
make sure this assumption is not violated, and a number of tests.

FWIW, I started down the path of writing something similar with*out* that 
assumption -- when a spill was requested on an in-flight iterator, then the 
*entire* in-memory structure would get spilled to disk, and the in-flight 
iterator would switch to the spilled data, and advance to the same location in 
the spilled data that it was on the in-memory data.  This was pretty 
convoluted, and as I started writing tests I realized there were corner cases 
that needed work.  So I decided to submit the simpler change instead.  It seems 
much easier to do it your way.  I do have some tests which I think I can add as 
well -- lemme dig those up and send them later today.
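
Steps 3 and 4 above (spill only the _remaining_ data, then switch the 
in-flight iterator to the spill file) can be sketched as follows. This is a 
simplified illustration, not the code in this PR: `SpillOnDemandIterator` is a 
hypothetical name, the records are plain `Int`s, the spill file is a text 
file, and there is no merging with earlier spills and no locking.

```scala
import java.io.{BufferedReader, File, FileReader, PrintWriter}

// Hypothetical sketch: an iterator over in-memory records that can move its
// unread tail to a temporary file on request and keep going from there.
final class SpillOnDemandIterator(data: Array[Int]) extends Iterator[Int] {
  private var inMemory: Array[Int] = data            // set to null after a spill
  private var position = 0                           // index of the next unread record
  private var spilled: Option[Iterator[Int]] = None  // reads the spill file once set

  /** Spills only the records not yet returned; returns how many were written. */
  def forceSpill(): Int = {
    if (inMemory == null) {
      0 // already spilled, nothing left in memory
    } else {
      val remaining = inMemory.length - position
      val file = File.createTempFile("spill-sketch", ".txt")
      file.deleteOnExit()
      val out = new PrintWriter(file)
      try {
        var i = position
        while (i < inMemory.length) { out.println(inMemory(i)); i += 1 }
      } finally out.close()
      val reader = new BufferedReader(new FileReader(file))
      spilled = Some(
        Iterator.continually(reader.readLine())
          .takeWhile { line => if (line == null) reader.close(); line != null }
          .map(_.toInt))
      inMemory = null // drop the reference so the array can be garbage-collected
      remaining
    }
  }

  override def hasNext: Boolean = spilled match {
    case Some(it) => it.hasNext
    case None => position < inMemory.length
  }

  override def next(): Int = spilled match {
    case Some(it) => it.next()
    case None =>
      val value = inMemory(position)
      position += 1
      value
  }
}
```

Because the records consumed before the spill are never written out, a second 
pass over the same structure would miss them, which is exactly the 
single-iteration assumption described above.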


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174274239
  
Build finished. Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174274237
  
**[Test build #49951 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49951/consoleFull)**
 for PR 10024 at commit 
[`16ca87b`](https://github.com/apache/spark/commit/16ca87bfc66e1d8ddfc1067a6bf97b6875343d61).
 * This patch **fails R style tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174274240
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49951/
Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174274054
  
**[Test build #49951 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49951/consoleFull)**
 for PR 10024 at commit 
[`16ca87b`](https://github.com/apache/spark/commit/16ca87bfc66e1d8ddfc1067a6bf97b6875343d61).


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174271872
  
**[Test build #49948 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49948/consoleFull)**
 for PR 10024 at commit 
[`b561641`](https://github.com/apache/spark/commit/b5616414af8fff78f96b320cfbe3bf368d6f756c).
 * This patch **fails Scala style tests**.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174271875
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49948/
Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174271874
  
Build finished. Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2016-01-24 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-174271789
  
**[Test build #49948 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49948/consoleFull)**
 for PR 10024 at commit 
[`b561641`](https://github.com/apache/spark/commit/b5616414af8fff78f96b320cfbe3bf368d6f756c).


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2015-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-160438158
  
Merged build finished. Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2015-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-160438161
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/46852/
Test FAILed.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2015-11-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-160438090
  
**[Test build #46852 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46852/consoleFull)**
 for PR 10024 at commit 
[`34f2441`](https://github.com/apache/spark/commit/34f24410be3f6e566900178baf90f122010846a4).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2015-11-29 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-160420970
  
**[Test build #46852 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46852/consoleFull)**
 for PR 10024 at commit 
[`34f2441`](https://github.com/apache/spark/commit/34f24410be3f6e566900178baf90f122010846a4).


---



[GitHub] spark pull request: [SPARK-4452][Core]Shuffle data structures can ...

2015-11-29 Thread lianhuiwang
Github user lianhuiwang commented on the pull request:

https://github.com/apache/spark/pull/10024#issuecomment-160419872
  
test this please


---


