[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1452#issuecomment-49398792 QA tests have started for PR 1452. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16813/consoleFull

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15097975 --- Diff: core/src/test/scala/org/apache/spark/ui/jobs/JobProgressListenerSuite.scala --- @@ -81,8 +81,6 @@ class JobProgressListenerSuite extends FunSuite

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1452#issuecomment-49399238 QA tests have started for PR 1452. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16814/consoleFull

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1367#issuecomment-49399247 QA tests have started for PR 1367. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16815/consoleFull

[GitHub] spark pull request: [SPARK-2570] [SQL] Fix the bug of ClassCastExc...

2014-07-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1475#issuecomment-49399326 Merging. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature

[GitHub] spark pull request: Fixed a typo in the comments in RangePartition...

2014-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1473#discussion_r15098078 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -135,7 +135,7 @@ class RangePartitioner[K : Ordering : ClassTag, V]( val k =

[GitHub] spark pull request: [SPARK-2570] [SQL] Fix the bug of ClassCastExc...

2014-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1475

[GitHub] spark pull request: Fixed a typo in the comments in RangePartition...

2014-07-18 Thread dorx
Github user dorx commented on a diff in the pull request: https://github.com/apache/spark/pull/1473#discussion_r15098158 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -135,7 +135,7 @@ class RangePartitioner[K : Ordering : ClassTag, V]( val k =

[GitHub] spark pull request: Fixed a typo in the comments in RangePartition...

2014-07-18 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1473#discussion_r15098185 --- Diff: core/src/main/scala/org/apache/spark/Partitioner.scala --- @@ -135,7 +135,7 @@ class RangePartitioner[K : Ordering : ClassTag, V]( val k =

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1478#issuecomment-49400124 QA results for PR 1478:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: Fixed a typo in the comments in RangePartition...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1473#issuecomment-49400241 QA tests have started for PR 1473. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16816/consoleFull

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15098382 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15098384 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15098421 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread dorx
Github user dorx commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15098446 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15098472 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread dorx
Github user dorx commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15098523 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15098545 --- Diff: core/src/test/scala/org/apache/spark/ui/jobs/JobProgressListenerSuite.scala --- @@ -81,8 +81,6 @@ class JobProgressListenerSuite extends

[GitHub] spark pull request: [MLlib] SPARK-1536: multiclass classification ...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-49401055 QA results for PR 886:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: SPARK-2553. CoGroupedRDD unnecessarily allocat...

2014-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1461

[GitHub] spark pull request: SPARK-2553. CoGroupedRDD unnecessarily allocat...

2014-07-18 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1461#issuecomment-49401094 Merged this, thanks.

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-18 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1447#issuecomment-49401225 This looks okay to me as is, @rxin what do you think?

[GitHub] spark pull request: SPARK-2553. CoGroupedRDD unnecessarily allocat...

2014-07-18 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1461#issuecomment-49401593 It looks like there was actually a compile error here that I missed last night. Uploading the fix.

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1478#issuecomment-49401911 QA results for PR 1478:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: SPARK-2553. CoGroupedRDD unnecessarily allocat...

2014-07-18 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1461#issuecomment-49402486 Fix: https://github.com/apache/spark/pull/1479

[GitHub] spark pull request: SPARK-2553. Fix compile error

2014-07-18 Thread sryza
GitHub user sryza opened a pull request: https://github.com/apache/spark/pull/1479 SPARK-2553. Fix compile error You can merge this pull request into a Git repository by running: $ git pull https://github.com/sryza/spark sandy-spark-2553 Alternatively you can review and

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1476#issuecomment-49402820 QA results for PR 1476:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15099185 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r15099190 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmansCorrelation.scala --- @@ -0,0 +1,125 @@ +/* + * Licensed to the

[GitHub] spark pull request: SPARK-2553. Fix compile error

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1479#issuecomment-49402876 QA tests have started for PR 1479. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16817/consoleFull

[GitHub] spark pull request: SPARK-2564. ShuffleReadMetrics.totalBlocksRead...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1474#issuecomment-49403246 QA tests have started for PR 1474. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16818/consoleFull

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15099340 --- Diff: core/src/test/scala/org/apache/spark/ui/jobs/JobProgressListenerSuite.scala --- @@ -81,8 +81,6 @@ class JobProgressListenerSuite extends FunSuite

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1478#discussion_r15099390 --- Diff: core/src/main/scala/org/apache/spark/util/random/SamplingUtils.scala --- @@ -17,9 +17,49 @@ package org.apache.spark.util.random

[GitHub] spark pull request: SPARK-2564. ShuffleReadMetrics.totalBlocksRead...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1474#issuecomment-49403450 QA results for PR 1474:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1478#issuecomment-49403909 QA tests have started for PR 1478. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16819/consoleFull

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1478#issuecomment-49404149 QA results for PR 1478:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: SPARK-2553. Fix compile error

2014-07-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1479#issuecomment-49404215 Merging this ...

[GitHub] spark pull request: SPARK-2553. Fix compile error

2014-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1479

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1478#issuecomment-49404318 Jenkins, retest this please.

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1478#issuecomment-49404657 QA tests have started for PR 1478. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16820/consoleFull

[GitHub] spark pull request: Fixed a typo in the comments in RangePartition...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1473#issuecomment-49405021 QA results for PR 1473:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: SPARK-2553. CoGroupedRDD unnecessarily allocat...

2014-07-18 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/1461#issuecomment-49405145 This PR also changed the behavior with respect to mutating CoGroups -- now we mutate combiner1 in place rather than returning a new ArrayBuffer. This is _probably_ not

[GitHub] spark pull request: [SPARK-2572] Delete the local dir on executor ...

2014-07-18 Thread watermen
GitHub user watermen opened a pull request: https://github.com/apache/spark/pull/1480 [SPARK-2572] Delete the local dir on executor automatically when using spark on Mesos. When running Spark over Mesos in “fine-grained” mode or “coarse-grained” mode. After the

[GitHub] spark pull request: [SPARK-2572] Delete the local dir on executor ...

2014-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1480#issuecomment-49405622 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1367#issuecomment-49405645 QA results for PR 1367:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: SPARK-2553. CoGroupedRDD unnecessarily allocat...

2014-07-18 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1461#issuecomment-49406581 I actually did consider this, I should have made a note in the PR. Agreed that we should be careful with the implications of these small changes.

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-18 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/1447#issuecomment-49407083 Upmerged

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1447#issuecomment-49407225 QA tests have started for PR 1447. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16821/consoleFull

[GitHub] spark pull request: SPARK-2566. Update ShuffleWriteMetrics increme...

2014-07-18 Thread sryza
GitHub user sryza opened a pull request: https://github.com/apache/spark/pull/1481 SPARK-2566. Update ShuffleWriteMetrics incrementally I haven't tested this out on a cluster yet, but wanted to make sure the approach (passing ShuffleWriteMetrics down to DiskBlockObjectWriter) was

[GitHub] spark pull request: SPARK-2566. Update ShuffleWriteMetrics increme...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1481#issuecomment-49407558 QA tests have started for PR 1481. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16822/consoleFull

[GitHub] spark pull request: SPARK-2491: Fix When an OOM is thrown,the exec...

2014-07-18 Thread witgo
GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/1482 SPARK-2491: Fix When an OOM is thrown,the executor does not stop properly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/witgo/spark SPARK-2491

[GitHub] spark pull request: SPARK-2491: Fix When an OOM is thrown,the exec...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1482#issuecomment-49407922 QA tests have started for PR 1482. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16823/consoleFull

[GitHub] spark pull request: SPARK-2407: Added Parser of SQL SUBSTR()

2014-07-18 Thread chutium
Github user chutium commented on the pull request: https://github.com/apache/spark/pull/1442#issuecomment-49407883 @egraldlo @willb thanks guys, substring is also added, @marmbrus test also done and passed.

[GitHub] spark pull request: SPARK-2491: Fix When an OOM is thrown,the exec...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1482#issuecomment-49408105 QA results for PR 1482:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: SPARK-2491: Fix When an OOM is thrown,the exec...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1482#issuecomment-49409124 QA tests have started for PR 1482. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16824/consoleFull

[GitHub] spark pull request: SPARK-2553. Fix compile error

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1479#issuecomment-49410414 QA results for PR 1479:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: Reservoir sampling implementation.

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1478#issuecomment-49412218 QA results for PR 1478:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: [SPARK-2484][SQL] By default does not run hive...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1403#issuecomment-49412442 QA tests have started for PR 1403. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16825/consoleFull

[GitHub] spark pull request: SPARK-2519 part 2. Remove pattern matching on ...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1447#issuecomment-49415317 QA results for PR 1447:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: SPARK-2566. Update ShuffleWriteMetrics increme...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1481#issuecomment-49415768 QA results for PR 1481:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: [Spark 2557] fix LOCAL_N_REGEX in createTaskSc...

2014-07-18 Thread advancedxy
Github user advancedxy commented on the pull request: https://github.com/apache/spark/pull/1464#issuecomment-49416016 ping. @aarondav what do you think?

[GitHub] spark pull request: [WIP][SPARK-2491]: Fix When an OOM is thrown,t...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1482#issuecomment-49417140 QA results for PR 1482:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: [YARN] SPARK-2577: File upload to viewfs is br...

2014-07-18 Thread gerashegalov
GitHub user gerashegalov opened a pull request: https://github.com/apache/spark/pull/1483 [YARN] SPARK-2577: File upload to viewfs is broken due to mount point re... Opting for option 2 defined in SPARK-2577, i.e., retrieve and pass the correct file system object to addResource.

[GitHub] spark pull request: [YARN] SPARK-2577: File upload to viewfs is br...

2014-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1483#issuecomment-49417518 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2484][SQL] By default does not run hive...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1403#issuecomment-49420035 QA results for PR 1403:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes
For more information see test

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-07-18 Thread avulanov
GitHub user avulanov opened a pull request: https://github.com/apache/spark/pull/1484 [MLLIB] [WIP] SPARK-1473: Feature selection for high dimensional datasets The following is implemented: 1) generic traits for feature selection and filtering 2) trait for feature selection

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-49421587 QA results for PR 1484:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-49421581 QA tests have started for PR 1484. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16826/consoleFull

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-49421907 QA tests have started for PR 1484. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16827/consoleFull

[GitHub] spark pull request: SPARK-1707. Remove unnecessary 3 second sleep ...

2014-07-18 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/634#issuecomment-49429447 He had increased it to 30 seconds based on some experiments he did. I guess it depends on how well the scheduler recovers from starting with fewer executors. I know

[GitHub] spark pull request: [MLLIB] [WIP] SPARK-1473: Feature selection fo...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1484#issuecomment-49429734 QA results for PR 1484:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class

[GitHub] spark pull request: [SPARK-2555] Support configuration spark.sched...

2014-07-18 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/1462#issuecomment-49434048 Did you test it on a cluster? I unfortunately don't have access to one and am not an expert on mesos. Is there a race condition between when the scheduler

[GitHub] spark pull request: SPARK-2150: Provide direct link to finished ap...

2014-07-18 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/1094#discussion_r15112301 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/HistoryServer.scala --- @@ -114,7 +114,7 @@ class HistoryServer(

[GitHub] spark pull request: [SPARK-2494] [PySpark] make hash of None consi...

2014-07-18 Thread mattf
Github user mattf commented on the pull request: https://github.com/apache/spark/pull/1371#issuecomment-49435527 i've confirmed that this patch addresses the reported issue... ``` ( len(sc.parallelize([((None, 1), 1),] * 100, 100).groupByKey(10).collect()) == 1,

[GitHub] spark pull request: SPARK-2380 [WIP]: Support displaying accumulat...

2014-07-18 Thread lianhuiwang
Github user lianhuiwang commented on a diff in the pull request: https://github.com/apache/spark/pull/1309#discussion_r15113632 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala --- @@ -217,6 +223,7 @@ private[ui] class StagePage(parent: JobProgressTab)

[GitHub] spark pull request: [SPARK-2522] set default broadcast factory to ...

2014-07-18 Thread lianhuiwang
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/1437#issuecomment-49439644 But when broadcast's size 1G, TorrentBroadcast has an "array size exceeds" error. Utils.serialize() will transfer the object to Array[Byte]. When the broadcast object's

[GitHub] spark pull request: Fixed the number of worker thread

2014-07-18 Thread fireflyc
GitHub user fireflyc opened a pull request: https://github.com/apache/spark/pull/1485 Fixed the number of worker thread A large number of input blocks causes too many worker threads to be created, which will load all the data. So the number of worker threads should be controlled. You can merge this

[GitHub] spark pull request: Fixed the number of worker thread

2014-07-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1485#issuecomment-49443851 Can one of the admins verify this patch?

[GitHub] spark pull request: Fixed the number of worker thread

2014-07-18 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1485#issuecomment-49444796 Slightly bigger point: both the 'fixed' and 'cached' executors from `Executors` have some drawbacks: - 'fixed' always keeps the given number of threads active

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15119268 --- Diff: core/src/test/scala/org/apache/spark/ui/jobs/JobProgressListenerSuite.scala --- @@ -81,8 +81,6 @@ class JobProgressListenerSuite extends

[GitHub] spark pull request: [SPARK-2494] [PySpark] make hash of None consi...

2014-07-18 Thread davies
Github user davies commented on the pull request: https://github.com/apache/spark/pull/1371#issuecomment-49451658 @mattf, Thanks!

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15121395 --- Diff: core/src/test/scala/org/apache/spark/ui/jobs/JobProgressListenerSuite.scala --- @@ -81,8 +81,6 @@ class JobProgressListenerSuite extends FunSuite

[GitHub] spark pull request: [SPARK-695] In DAGScheduler's getPreferredLocs...

2014-07-18 Thread staple
Github user staple commented on a diff in the pull request: https://github.com/apache/spark/pull/1362#discussion_r15122197 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1107,7 +1106,6 @@ class DAGScheduler( case shufDep:

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-18 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/1452#discussion_r15122396 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1195,21 +1196,32 @@ abstract class RDD[T: ClassTag]( /** * Return whether

[GitHub] spark pull request: [SPARK-695] In DAGScheduler's getPreferredLocs...

2014-07-18 Thread staple
Github user staple commented on the pull request: https://github.com/apache/spark/pull/1362#issuecomment-49457211 I added some patches to address the above comments and introduce a timed test. My test uses an RDD with no preferred locations in the entire dependency graph.

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-18 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/1452#issuecomment-49457321 This looks good to me. It might be good for @tdas to look over the test.

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15123302 --- Diff: core/src/main/scala/org/apache/spark/util/JsonProtocol.scala --- @@ -527,8 +527,9 @@ private[spark] object JsonProtocol {

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15123564 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -75,7 +75,9 @@ class TaskMetrics extends Serializable { /**

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1476#issuecomment-49458944 Couple minor comments, but the changes look good.

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-18 Thread ash211
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/1452#issuecomment-49459127 I'm interested in this for the JobConf/Configuration threadsafety issue here: https://issues.apache.org/jira/browse/SPARK-2546 but agree that should go in a separate

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15123953 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -75,7 +75,9 @@ class TaskMetrics extends Serializable { /**

[GitHub] spark pull request: [SPARK-2571] Correctly report shuffle read met...

2014-07-18 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1476#discussion_r15124263 --- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala --- @@ -75,7 +75,9 @@ class TaskMetrics extends Serializable { /**

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1165#issuecomment-49465699 QA tests have started for PR 1165. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16828/consoleFull

[GitHub] spark pull request: Fixed a typo in the comments in RangePartition...

2014-07-18 Thread dorx
Github user dorx commented on the pull request: https://github.com/apache/spark/pull/1473#issuecomment-49466619 A superficial look at the failed unit tests seems to suggest some Spark SQL optimizations rely on the fact that 1000 is set as the sequential scan threshold. @rxin

[GitHub] spark pull request: Fixed a typo in the comments in RangePartition...

2014-07-18 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1473#issuecomment-49467466 It appears to me that the range partitioner is not correctly using the provided ordering in the case where it uses a binary search.

[GitHub] spark pull request: SPARK-2564. ShuffleReadMetrics.totalBlocksRead...

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1474#issuecomment-49467773 QA tests have started for PR 1474. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16829/consoleFull

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-18 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1367#issuecomment-49467775 QA tests have started for PR 1367. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16830/consoleFull

[GitHub] spark pull request: [SPARK-2521] Broadcast RDD object (instead of ...

2014-07-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1452#issuecomment-49468287 @ash211 I created https://issues.apache.org/jira/browse/SPARK-2585 to track this.

[GitHub] spark pull request: Added t2 instance types

2014-07-18 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1446#issuecomment-49469831 LGTM - thanks

[GitHub] spark pull request: Added t2 instance types

2014-07-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1446
