[GitHub] spark pull request: [SPARK-2534] Avoid pulling in the entire RDD i...

2014-07-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1450#issuecomment-49261722 Eh the binary checker is really failing me. Is there a way to disable binary checker for inner functions? @pwendell --- If your project is set up for it, you can

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15042631 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -216,17 +216,17 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15042857 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -216,17 +216,17 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15042897 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -216,17 +216,17 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15042905 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -571,12 +571,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: [branch-0.9] Fix github links in docs

2014-07-17 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/1456 [branch-0.9] Fix github links in docs We moved example code in v1.0. The links are no longer valid if still pointing to `tree/master`. You can merge this pull request into a Git repository by

[GitHub] spark pull request: [SPARK-2299] Consolidate various stageIdTo* ha...

2014-07-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1262#issuecomment-49263949 I pushed a new version. I'd first merge this and then have a separate PR to index the hash table by stageId + attempt. Now it includes @kayousterhout's change.

[GitHub] spark pull request: Required AM memory is amMem, not args.amMem...

2014-07-17 Thread maji2014
GitHub user maji2014 opened a pull request: https://github.com/apache/spark/pull/1457 Required AM memory is amMem, not args.amMemory `ERROR yarn.Client: Required AM memory (1024) is above the max threshold (1048) of this cluster` appears if this code is not changed. Obviously, 1024

[GitHub] spark pull request: [branch-0.9] Fix github links in docs

2014-07-17 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1456#issuecomment-49263937 This PR only contains changes in docs and the links were verified using linkchecker.

[GitHub] spark pull request: [branch-0.9] Fix github links in docs

2014-07-17 Thread mengxr
Github user mengxr closed the pull request at: https://github.com/apache/spark/pull/1456

[GitHub] spark pull request: [branch-0.9] Fix github links in docs

2014-07-17 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1456#issuecomment-49264105 Merged into branch-0.9.

[GitHub] spark pull request: Required AM memory is amMem, not args.amMem...

2014-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1457#issuecomment-49264180 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2299] Consolidate various stageIdTo* ha...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1262#issuecomment-49264275 QA tests have started for PR 1262. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16768/consoleFull

[GitHub] spark pull request: Required AM memory is amMem, not args.amMem...

2014-07-17 Thread maji2014
Github user maji2014 commented on the pull request: https://github.com/apache/spark/pull/1457#issuecomment-49264699 Please focus on the second issue, as the first issue is an old patch from June.

[GitHub] spark pull request: SPARK-2294: fix locality inversion bug in Task...

2014-07-17 Thread lirui-intel
Github user lirui-intel commented on the pull request: https://github.com/apache/spark/pull/1313#issuecomment-49264857 If a TaskSet only contains no-pref tasks, there won't be delay because the only valid level is ANY, so everything gets scheduled right away. If a TaskSet

[GitHub] spark pull request: [branch-0.9] bump versions for v0.9.2 release ...

2014-07-17 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/1458 [branch-0.9] bump versions for v0.9.2 release candidate Manually update some version numbers. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] spark pull request: [branch-0.9] bump versions for v0.9.2 release ...

2014-07-17 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1458#issuecomment-49265114 Merged into branch-0.9.

[GitHub] spark pull request: [branch-0.9] bump versions for v0.9.2 release ...

2014-07-17 Thread mengxr
Github user mengxr closed the pull request at: https://github.com/apache/spark/pull/1458

[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1439#issuecomment-49265512 QA results for PR 1439: This patch PASSES unit tests. This patch merges cleanly. This patch adds the following public classes (experimental): class

[GitHub] spark pull request: update CHANGES.txt

2014-07-17 Thread mengxr
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/1459 update CHANGES.txt You can merge this pull request into a Git repository by running: $ git pull https://github.com/mengxr/spark v0.9.2-rc Alternatively you can review and apply these changes

[GitHub] spark pull request: [SPARK-2534] Avoid pulling in the entire RDD i...

2014-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1450#discussion_r15043414 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -214,7 +214,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: [branch-0.9] Update CHANGES.txt

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1459#issuecomment-49266193 QA tests have started for PR 1459. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16769/consoleFull

[GitHub] spark pull request: [SPARK-2534] Avoid pulling in the entire RDD i...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1450#issuecomment-49266828 QA tests have started for PR 1450. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16770/consoleFull

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15043849 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -571,12 +571,7 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15043860 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -216,17 +216,17 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15044028 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -712,8 +701,8 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1447#discussion_r15044062 --- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala --- @@ -712,8 +701,8 @@ class PairRDDFunctions[K, V](self: RDD[(K, V)])

[GitHub] spark pull request: [branch-0.9] Update CHANGES.txt

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1459#issuecomment-49267990 QA results for PR 1459: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2534] Avoid pulling in the entire RDD i...

2014-07-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1450#issuecomment-49269144 I created a JIRA to deal with this and did some initial exploration, but I think I'll need to wait for Prashant to actually do it:

[GitHub] spark pull request: [SPARK-2412] CoalescedRDD throws exception wit...

2014-07-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1337#issuecomment-49270261 Okay I merged this.

[GitHub] spark pull request: [SPARK-2412] CoalescedRDD throws exception wit...

2014-07-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1337

[GitHub] spark pull request: SPARK-2526: Simplify options in make-distribut...

2014-07-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1445

[GitHub] spark pull request: [SPARK-2410][SQL][WIP] Cherry picked Hive Thri...

2014-07-17 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1399#discussion_r15045195 --- Diff: sbin/start-thriftserver.sh --- @@ -0,0 +1,24 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-2423] Clean up SparkSubmit for readabil...

2014-07-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1349#issuecomment-49271133 Thanks Andrew, looks good!

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread davies
GitHub user davies opened a pull request: https://github.com/apache/spark/pull/1460 [SPARK-2538] [PySpark] Hash based disk spilling aggregation During aggregation in the Python worker, if memory usage exceeds `spark.executor.memory`, the worker switches to disk-spilling aggregation.
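
The idea named in this pull request title (hash-based aggregation that spills to disk) can be sketched roughly as below. This is only an illustration and not the code from the pull request: the function `spill_aggregate`, its parameters, and the use of a key-count threshold (`max_keys`) in place of a real memory limit are all assumptions made for the example.

```python
import os
import pickle
import shutil
import tempfile
from collections import defaultdict


def spill_aggregate(pairs, combine, max_keys=1000, num_partitions=4):
    """Hypothetical helper: aggregate (key, value) pairs, spilling to disk.

    max_keys stands in for a real memory limit (an assumption of this sketch).
    """
    spill_dir = tempfile.mkdtemp()
    merged = {}
    spilled = False

    def merge_into(target, key, value):
        target[key] = combine(target[key], value) if key in target else value

    def spill():
        # Split the in-memory combiners by key hash and append each bucket
        # to its own partition file, then free the in-memory table.
        buckets = defaultdict(dict)
        for key, value in merged.items():
            buckets[hash(key) % num_partitions][key] = value
        for index, bucket in buckets.items():
            with open(os.path.join(spill_dir, str(index)), "ab") as f:
                pickle.dump(bucket, f)
        merged.clear()

    try:
        for key, value in pairs:
            merge_into(merged, key, value)
            if len(merged) > max_keys:
                spill()
                spilled = True
        if not spilled:
            return dict(merged)
        spill()  # flush the remainder so every key lives on disk
        result = {}
        for index in range(num_partitions):
            path = os.path.join(spill_dir, str(index))
            if not os.path.exists(path):
                continue
            # All partial combiners for a key land in the same partition file,
            # so each partition can be merged independently of the others.
            partial = {}
            with open(path, "rb") as f:
                while True:
                    try:
                        bucket = pickle.load(f)
                    except EOFError:
                        break
                    for key, value in bucket.items():
                        merge_into(partial, key, value)
            result.update(partial)
        return result
    finally:
        shutil.rmtree(spill_dir, ignore_errors=True)
```

Partitioning the spilled data by key hash means only one partition has to be held in memory during the final merge, which is the main reason to prefer it over a single spill file in a sketch like this.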

[GitHub] spark pull request: [SPARK-2423] Clean up SparkSubmit for readabil...

2014-07-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1349

[GitHub] spark pull request: [branch-0.9] Update CHANGES.txt

2014-07-17 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1459#issuecomment-49271772 Merged into branch-0.9.

[GitHub] spark pull request: [branch-0.9] Update CHANGES.txt

2014-07-17 Thread mengxr
Github user mengxr closed the pull request at: https://github.com/apache/spark/pull/1459

[GitHub] spark pull request: [SPARK-2299] Consolidate various stageIdTo* ha...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1262#issuecomment-49271814 QA results for PR 1262: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1460#issuecomment-49271938 QA tests have started for PR 1460. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16772/consoleFull

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-07-17 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1361#issuecomment-49272156 Jenkins, add to whitelist.

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-07-17 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1361#issuecomment-49272169 Jenkins, test this please.

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1447#issuecomment-49272425 QA tests have started for PR 1447. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16773/consoleFull

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1361#issuecomment-49272423 QA tests have started for PR 1361. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16774/consoleFull

[GitHub] spark pull request: SPARK-2553. CoGroupedRDD unnecessarily allocat...

2014-07-17 Thread sryza
GitHub user sryza opened a pull request: https://github.com/apache/spark/pull/1461 SPARK-2553. CoGroupedRDD unnecessarily allocates a Tuple2 per dependency... ... per key My humble opinion is that avoiding allocations in this performance-critical section is worth the extra

[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...

2014-07-17 Thread li-zhihui
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49284642 @tgravescs

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1460#issuecomment-49286439 QA results for PR 1460: This patch FAILED unit tests. This patch merges cleanly. This patch adds the following public classes (experimental): class

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1361#issuecomment-49287245 QA results for PR 1361: This patch FAILED unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1447#issuecomment-49287721 QA results for PR 1447: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2125] Add sort flag and move sort into ...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1210#issuecomment-49293649 QA results for PR 1210: This patch FAILED unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2497 Exclude companion classes, with the...

2014-07-17 Thread ScrapCodes
GitHub user ScrapCodes opened a pull request: https://github.com/apache/spark/pull/1463 SPARK-2497 Exclude companion classes, with their corresponding objects. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ScrapCodes/spark-1

[GitHub] spark pull request: SPARK-2497 Exclude companion classes, with the...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1463#issuecomment-49302217 QA tests have started for PR 1463. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16777/consoleFull

[GitHub] spark pull request: [SPARK-2460] Optimize SparkContext.hadoopFile ...

2014-07-17 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request: https://github.com/apache/spark/pull/1385#discussion_r15055059 --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala --- @@ -128,25 +123,13 @@ class HadoopRDD[K, V]( // Returns a JobConf that

[GitHub] spark pull request: SPARK-2481: The environment variables SPARK_HI...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1341#issuecomment-49303258 QA tests have started for PR 1341. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16778/consoleFull

[GitHub] spark pull request: [WIP]When the executor is thrown OutOfMemoryEr...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49303783 QA tests have started for PR 1387. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16779/consoleFull

[GitHub] spark pull request: SPARK-2497 Exclude companion classes, with the...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1463#issuecomment-49314286 QA results for PR 1463: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: SPARK-2481: The environment variables SPARK_HI...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1341#issuecomment-49316014 QA results for PR 1341: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: The driver perform garbage collection, when th...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1387#issuecomment-49316425 QA results for PR 1387: This patch PASSES unit tests. This patch merges cleanly. This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2479][MLlib] Comparing floating-point n...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1425#issuecomment-49318202 QA tests have started for PR 1425. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16780/consoleFull

[GitHub] spark pull request: [SPARK-2479][MLlib] Comparing floating-point n...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1425#issuecomment-49321741 QA tests have started for PR 1425. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16781/consoleFull

[GitHub] spark pull request: [Spark 2557] fix LOCAL_N_REGEX in createTaskSc...

2014-07-17 Thread advancedxy
GitHub user advancedxy opened a pull request: https://github.com/apache/spark/pull/1464 [Spark 2557] fix LOCAL_N_REGEX in createTaskScheduler and make local-n and local-n-failures consistent [SPARK-2557](https://issues.apache.org/jira/browse/SPARK-2557) You can merge this

[GitHub] spark pull request: [Spark 2557] fix LOCAL_N_REGEX in createTaskSc...

2014-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1464#issuecomment-49323035 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

2014-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/1439#discussion_r15067812 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScan.scala --- @@ -67,95 +61,12 @@ case class HiveTableScan( }

[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

2014-07-17 Thread yhuai
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/1439#discussion_r15068484 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -156,33 +158,43 @@ class HadoopTableReader(@transient _tableDesc:

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1460#issuecomment-49334156 QA tests have started for PR 1460. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16782/consoleFull

[GitHub] spark pull request: SPARK-2083 Add support for spark.local.maxFail...

2014-07-17 Thread kbzod
GitHub user kbzod opened a pull request: https://github.com/apache/spark/pull/1465 SPARK-2083 Add support for spark.local.maxFailures configuration property The logic in `SparkContext` for creating a new task scheduler now looks for a spark.local.maxFailures property to specify the

[GitHub] spark pull request: SPARK-2083 Add support for spark.local.maxFail...

2014-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1465#issuecomment-49335741 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2411] Add a history-not-found page to s...

2014-07-17 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1336#issuecomment-49336964 I have updated the screenshots again. Anything else?

[GitHub] spark pull request: SPARK-1478.2 Fix incorrect NioServerSocketChan...

2014-07-17 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/1466 SPARK-1478.2 Fix incorrect NioServerSocketChannelFactory constructor call The line break inadvertently means this was interpreted as a call to the no-arg constructor. This doesn't exist in older

[GitHub] spark pull request: [SPARK-2340] Resolve event logging and History...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1280#issuecomment-49337830 QA tests have started for PR 1280. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16784/consoleFull

[GitHub] spark pull request: SPARK-1478.2 Fix incorrect NioServerSocketChan...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1466#issuecomment-49337826 QA tests have started for PR 1466. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16783/consoleFull

[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

2014-07-17 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/1439#issuecomment-49338675 I think we are not clear on the boundary between a `TableReader` and a physical `TableScan` operator (e.g. `HiveTableScan`). Seems we just want `TableReader` to create

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15072735 --- Diff: python/pyspark/rdd.py --- @@ -168,6 +170,123 @@ def _replaceRoot(self, value): self._sink(1) +class Merger(object):

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15072768 --- Diff: python/pyspark/rdd.py --- @@ -168,6 +170,123 @@ def _replaceRoot(self, value): self._sink(1) +class Merger(object):

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15072860 --- Diff: python/pyspark/rdd.py --- @@ -168,6 +170,123 @@ def _replaceRoot(self, value): self._sink(1) +class Merger(object):

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15072948 --- Diff: python/pyspark/rdd.py --- @@ -1247,15 +1366,12 @@ def combineLocally(iterator): return combiners.iteritems()

[GitHub] spark pull request: [SPARK-2411] Add a history-not-found page to s...

2014-07-17 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1336#discussion_r15072986 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -667,29 +664,47 @@ private[spark] class Master( */ def

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15073011 --- Diff: python/pyspark/serializers.py --- @@ -297,6 +297,33 @@ class MarshalSerializer(FramedSerializer): loads = marshal.loads

[GitHub] spark pull request: [SPARK-2523] [SQL] [WIP] Hadoop table scan bug...

2014-07-17 Thread yhuai
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/1439#issuecomment-49341477 @chenghao-intel explained the root cause in https://issues.apache.org/jira/browse/SPARK-2523. Basically, we should use partition-specific `ObjectInspectors` to extract

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15073144 --- Diff: python/pyspark/serializers.py --- @@ -297,6 +297,33 @@ class MarshalSerializer(FramedSerializer): loads = marshal.loads

[GitHub] spark pull request: SPARK-2497 Exclude companion classes, with the...

2014-07-17 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1463#issuecomment-49341909 @ScrapCodes could you explain a bit more how this fixes SPARK-2497? If I look at the original false-positive the issue reported was not with a companion class. It was

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15073412 --- Diff: python/pyspark/rdd.py --- @@ -168,6 +170,123 @@ def _replaceRoot(self, value): self._sink(1) +class Merger(object):

[GitHub] spark pull request: SPARK-1215 [MLLIB]: Clustering: Index out of b...

2014-07-17 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/1407#discussion_r15073983 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LocalKMeans.scala --- @@ -59,6 +59,11 @@ private[mllib] object LocalKMeans extends Logging

[GitHub] spark pull request: [SPARK-2542] Exit Code Class should be renamed...

2014-07-17 Thread sarutak
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/1467 [SPARK-2542] Exit Code Class should be renamed and placed package properly You can merge this pull request into a Git repository by running: $ git pull https://github.com/sarutak/spark master

[GitHub] spark pull request: [SPARK-2542] Exit Code Class should be renamed...

2014-07-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1467#issuecomment-49343702 Can one of the admins verify this patch?

[GitHub] spark pull request: Streaming mllib [SPARK-2438][MLLIB]

2014-07-17 Thread freeman-lab
Github user freeman-lab commented on the pull request: https://github.com/apache/spark/pull/1361#issuecomment-49344174 Looks like the basic test for correct final params passes, but not the stricter test for improvement on every update. Both pass locally. My guess is that it's

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1460#issuecomment-49346100 QA results for PR 1460:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class

[GitHub] spark pull request: SPARK-1215 [MLLIB]: Clustering: Index out of b...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1407#issuecomment-49347076 QA tests have started for PR 1407. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16785/consoleFull ---

[GitHub] spark pull request: SPARK-1215 [MLLIB]: Clustering: Index out of b...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1407#issuecomment-49347163 QA results for PR 1407:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test

[GitHub] spark pull request: [SPARK-695] In DAGScheduler's getPreferredLocs...

2014-07-17 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/1362#discussion_r15076386 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1107,7 +1106,6 @@ class DAGScheduler( case shufDep:

[GitHub] spark pull request: Modify default YARN memory_overhead-- from an ...

2014-07-17 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/1391#issuecomment-49348179 Bringing the discussion back online. Thanks for all the input so far. Ran a few experiments yday and today. Number of executors (which was the other main

[GitHub] spark pull request: SPARK-1478.2 Fix incorrect NioServerSocketChan...

2014-07-17 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/1466#discussion_r15076631 --- Diff: external/flume/src/main/scala/org/apache/spark/streaming/flume/FlumeInputDStream.scala --- @@ -153,15 +153,15 @@ class FlumeReceiver(

[GitHub] spark pull request: SPARK-1478.2 Fix incorrect NioServerSocketChan...

2014-07-17 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/1466#discussion_r15076673 --- Diff: external/flume/src/main/scala/org/apache/spark/streaming/flume/FlumeInputDStream.scala --- @@ -153,15 +153,15 @@ class FlumeReceiver(

[GitHub] spark pull request: [SPARK-695] In DAGScheduler's getPreferredLocs...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1362#discussion_r15076921 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -32,8 +32,6 @@ abstract class Dependency[T](val rdd: RDD[T]) extends Serializable

[GitHub] spark pull request: [SPARK-695] In DAGScheduler's getPreferredLocs...

2014-07-17 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1362#discussion_r15076891 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1107,7 +1106,6 @@ class DAGScheduler( case shufDep:

[GitHub] spark pull request: [SPARK-2534] Avoid pulling in the entire RDD i...

2014-07-17 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1450#issuecomment-49350307 Merged in master.

[GitHub] spark pull request: [SPARK-2340] Resolve event logging and History...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1280#issuecomment-49350379 QA results for PR 1280:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test

[GitHub] spark pull request: [SPARK-2534] Avoid pulling in the entire RDD i...

2014-07-17 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1450

[GitHub] spark pull request: SPARK-1478.2 Fix incorrect NioServerSocketChan...

2014-07-17 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1466#issuecomment-49350436 QA results for PR 1466:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test

[GitHub] spark pull request: [SPARK-2538] [PySpark] Hash based disk spillin...

2014-07-17 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/1460#discussion_r15078093 --- Diff: python/pyspark/rdd.py --- @@ -168,6 +170,123 @@ def _replaceRoot(self, value): self._sink(1) +class Merger(object):
