[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808169 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -140,14 +144,39 @@ private[spark] class CacheManager(blockManager: BlockManager)

[GitHub] spark pull request: [WIP][SPARK-2054][SQL] Code Generation for Exp...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/993#issuecomment-48697796 QA tests have started for PR 993. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16557/consoleFull

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808250 --- Diff: python/pyspark/join.py --- @@ -1,35 +1,19 @@ - -Copyright (c) 2011, Douban Inc. http://www.douban.com/ -All rights reserved. -

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808269 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -140,14 +144,39 @@ private[spark] class CacheManager(blockManager: BlockManager)

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808344 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -67,10 +67,14 @@ class SparkEnv ( val metricsSystem: MetricsSystem,

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808675 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala --- @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808684 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808671 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala --- @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808679 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala --- @@ -0,0 +1,52 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808688 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808707 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808705 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808714 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: hijack hash to make hash of None consistant cr...

2014-07-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1371#issuecomment-48699285 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808775 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -463,16 +463,16 @@ private[spark] class BlockManager(

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808809 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -87,9 +97,32 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808804 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808829 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808850 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14808860 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808921 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14808944 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: Made rdd.py pep8 complaint by using Autopep8 a...

2014-07-11 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1354#discussion_r14809046 --- Diff: python/pyspark/rdd.py --- @@ -509,7 +522,8 @@ def sortByKey(self, ascending=True, numPartitions=None, keyfunc = lambda x: x): tmp2 =

[GitHub] spark pull request: [SPARK-2119][SQL] Improved Parquet performance...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1370#issuecomment-48700541 QA results for PR 1370: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14809203 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1969][MLlib] Public available online su...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/955#issuecomment-48701090 QA tests have started for PR 955. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16558/consoleFull

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1165#issuecomment-48701135 Did an initial pass with some feedback - not totally done yet but it should be enough to get some work done.

[GitHub] spark pull request: Use the scala-logging wrapper instead of the d...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1369#issuecomment-48701944 QA results for PR 1369: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-1969][MLlib] Public available online su...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/955#issuecomment-48702465 QA tests have started for PR 955. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16561/consoleFull

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14809771 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManager.scala --- @@ -463,16 +463,16 @@ private[spark] class BlockManager(

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14809809 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14809836 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -87,9 +97,32 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14809878 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -140,14 +144,39 @@ private[spark] class CacheManager(blockManager: BlockManager)

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14809944 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14809936 --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala --- @@ -140,14 +144,39 @@ private[spark] class CacheManager(blockManager: BlockManager)

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14810031 --- Diff: core/src/main/scala/org/apache/spark/util/collection/SizeTracker.scala --- @@ -0,0 +1,105 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...

2014-07-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-48703798 Jenkins, test this please.

[GitHub] spark pull request: Use the Executor's ClassLoader in sc.objectFil...

2014-07-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/181#issuecomment-48704142 Cool - thanks for contributing the test! Jenkins, test this please. LGTM pending tests.

[GitHub] spark pull request: Use the Executor's ClassLoader in sc.objectFil...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/181#issuecomment-48704291 QA tests have started for PR 181. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16563/consoleFull

[GitHub] spark pull request: Made rdd.py pep8 complaint by using Autopep8 a...

2014-07-11 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/1354#discussion_r14810364 --- Diff: python/pyspark/rdd.py --- @@ -509,7 +522,8 @@ def sortByKey(self, ascending=True, numPartitions=None, keyfunc = lambda x: x): tmp2

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-11 Thread drexin
Github user drexin commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-48706534 Hi Patrick, the problem is described in [this mailing list

[GitHub] spark pull request: Use the scala-logging wrapper instead of the d...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1369#issuecomment-48706567 QA tests have started for PR 1369. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16564/consoleFull

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-48707013 Thanks Michael, (1) We could make it a user hint, like Hive does. set hive.optimize.skewjoin = true; set hive.skewjoin.key = skew_key_threshold

[GitHub] spark pull request: Use the scala-logging wrapper instead of the d...

2014-07-11 Thread srowen
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/1369#issuecomment-48707227 Wasn't this already decided against in https://github.com/apache/spark/pull/332 and again https://github.com/apache/spark/pull/1208 ? or is this not another PR for

[GitHub] spark pull request: mesos executor ids now consist of the slave id...

2014-07-11 Thread drexin
Github user drexin commented on the pull request: https://github.com/apache/spark/pull/1358#issuecomment-48707209 Created a JIRA issue here: https://issues.apache.org/jira/browse/SPARK-2445

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-48707289 Hi, I also made a left semi join. I don't know whether this join should be an optimization of the left semi join or a separate join algorithm. I think the 1127 PR also has some

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on a diff in the pull request: https://github.com/apache/spark/pull/1134#discussion_r14811559 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins.scala --- @@ -400,3 +401,73 @@ case class BroadcastNestedLoopJoin(

[GitHub] spark pull request: [SPARK-1969][MLlib] Online summarizer APIs for...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/955#issuecomment-48708106 QA results for PR 955: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: Use the scala-logging wrapper instead of the d...

2014-07-11 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/1369#issuecomment-48708268 #332 can't be tested automatically. #1208 got messed up and I do not know how to fix it. :sweat:

[GitHub] spark pull request: replace println to log4j

2014-07-11 Thread fireflyc
GitHub user fireflyc opened a pull request: https://github.com/apache/spark/pull/1372 replace println to log4j Our program needs to receive a large amount of data and run for a long time. We set the log level to WARN but Storing iterator received single as such message

[GitHub] spark pull request: [SPARK-1470] Use the scala-logging wrapper ins...

2014-07-11 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/332#issuecomment-48708675 It can't be tested automatically. I submitted a new PR, #1369.

[GitHub] spark pull request: replace println to log4j

2014-07-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1372#issuecomment-48708875 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-1969][MLlib] Online summarizer APIs for...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/955#issuecomment-48709108 QA results for PR 955: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-1969][MLlib] Online summarizer APIs for...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/955#issuecomment-48709925 QA results for PR 955: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-2446][SQL] Add BinaryType support to Pa...

2014-07-11 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/1373 [SPARK-2446][SQL] Add BinaryType support to Parquet I/O. To support `BinaryType`, the following changes are needed: - Make `StringType` use `OriginalType.UTF8` - Add `BinaryType` using

[GitHub] spark pull request: [SPARK-2446][SQL] Add BinaryType support to Pa...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1373#issuecomment-48710936 QA tests have started for PR 1373. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16565/consoleFull

[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...

2014-07-11 Thread li-zhihui
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/900#discussion_r14813843 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -46,9 +46,19 @@ class

[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...

2014-07-11 Thread li-zhihui
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/900#issuecomment-48714143 Thanks @tgravescs. I will file a new JIRA for handling Mesos and follow up on it after the PR is merged.

[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48714186 QA tests have started for PR 1155. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16566/consoleFull

[GitHub] spark pull request: [SPARK-1470,SPARK-1842] Use the scala-logging ...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1369#issuecomment-48714364 QA results for PR 1369: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...

2014-07-11 Thread avulanov
Github user avulanov commented on a diff in the pull request: https://github.com/apache/spark/pull/1155#discussion_r14814439 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/evaluation/MulticlassMetrics.scala --- @@ -0,0 +1,181 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...

2014-07-11 Thread avulanov
Github user avulanov commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48715008 @mengxr I've addressed your comments, except the one about the import, which I commented on above. I've posted a question about the feature selection interface. Could

[GitHub] spark pull request: [SPARK-2446][SQL] Add BinaryType support to Pa...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1373#issuecomment-48718560 QA results for PR 1373: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [MLLIB] [SPARK-2222] Add multiclass evaluation...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1155#issuecomment-48720849 QA results for PR 1155: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds the following public classes (experimental): class

[GitHub] spark pull request: [SPARK-2437] Rename MAVEN_PROFILES to SBT_MAVE...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1374#issuecomment-48722841 QA results for PR 1374: - This patch FAILED unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: Made rdd.py pep8 complaint by using Autopep8 a...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1354#issuecomment-48725049 QA results for PR 1354: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-2437] Rename MAVEN_PROFILES to SBT_MAVE...

2014-07-11 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/1374#issuecomment-48725337 Jenkins, retest this please.

[GitHub] spark pull request: [SPARK-2437] Rename MAVEN_PROFILES to SBT_MAVE...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1374#issuecomment-48725500 QA tests have started for PR 1374. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16569/consoleFull

[GitHub] spark pull request: [SPARK-2165] spark on yarn: add support for se...

2014-07-11 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/1279#discussion_r14821748 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -124,6 +124,14 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with

[GitHub] spark pull request: [SPARK-2165] spark on yarn: add support for se...

2014-07-11 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/1279#discussion_r14822032 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -167,6 +175,8 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable with

[GitHub] spark pull request: [SPARK-2165] spark on yarn: add support for se...

2014-07-11 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/1279#discussion_r14822125 --- Diff: yarn/alpha/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -108,6 +108,10 @@ class Client(clientArgs: ClientArguments,

[GitHub] spark pull request: [SPARK-2165] spark on yarn: add support for se...

2014-07-11 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/1279#discussion_r14822197 --- Diff: yarn/stable/src/main/scala/org/apache/spark/deploy/yarn/Client.scala --- @@ -81,6 +81,10 @@ class Client(clientArgs: ClientArguments, hadoopConf:

[GitHub] spark pull request: [SPARK-2437] Rename MAVEN_PROFILES to SBT_MAVE...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1374#issuecomment-48736229 QA results for PR 1374: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...

2014-07-11 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/900#discussion_r14823260 --- Diff: docs/configuration.md --- @@ -699,6 +699,22 @@ Apart from these, the following properties are also available, and may be useful (in

[GitHub] spark pull request: replace println to log4j

2014-07-11 Thread fireflyc
Github user fireflyc commented on the pull request: https://github.com/apache/spark/pull/1372#issuecomment-48739929 I have verified it; the log level is set to INFO, right?

[GitHub] spark pull request: Made rdd.py pep8 complaint by using Autopep8 a...

2014-07-11 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/1354#issuecomment-48740596 Found a way actually. On Jul 11, 2014 6:07 PM, Apache Spark QA notificati...@github.com wrote: QA results for PR 1354: - This patch PASSES unit

[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...

2014-07-11 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/900#discussion_r14825405 --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala --- @@ -244,6 +257,17 @@ class

[GitHub] spark pull request: SPARK-1719: spark.*.extraLibraryPath isn't app...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1022#issuecomment-48750411 QA tests have started for PR 1022. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16570/consoleFull

[GitHub] spark pull request: [SPARK-2236][SQL]SparkSQL add SkewJoin

2014-07-11 Thread YanjieGao
Github user YanjieGao commented on the pull request: https://github.com/apache/spark/pull/1134#issuecomment-48752406 Hi, I rewrote the code and resolved some earlier problems.

[GitHub] spark pull request: [SPARK-2441][SQL] Add more efficient distinct ...

2014-07-11 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1366#issuecomment-48759522 @aarondav, you are totally right. However, the `Aggregate` operator that this is replacing made the same assumption and this approach will use strictly less memory.

[GitHub] spark pull request: [SPARK-2446][SQL] Add BinaryType support to Pa...

2014-07-11 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/1373#issuecomment-48760249 Thanks for the patch! One quick question: will this change the behavior when loading in string data that was saved with previous versions of Spark SQL?

[GitHub] spark pull request: SPARK-1719: spark.*.extraLibraryPath isn't app...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1022#issuecomment-48761837 QA results for PR 1022: - This patch PASSES unit tests. - This patch merges cleanly. - This patch adds no public classes. For more information see test

[GitHub] spark pull request: [SPARK-1177] Allow SPARK_JAR to be set program...

2014-07-11 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/987#issuecomment-48762832 #560 is merged. Close this PR.

[GitHub] spark pull request: [SPARK-1177] Allow SPARK_JAR to be set program...

2014-07-11 Thread dbtsai
Github user dbtsai closed the pull request at: https://github.com/apache/spark/pull/987

[GitHub] spark pull request: SPARK-2425 Don't kill a still-running Applicat...

2014-07-11 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1360#issuecomment-48762929 QA tests have started for PR 1360. This patch merges cleanly. View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16571/consoleFull

[GitHub] spark pull request: SPARK-1536: multiclass classification support ...

2014-07-11 Thread etrain
Github user etrain commented on a diff in the pull request: https://github.com/apache/spark/pull/886#discussion_r14836561 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -768,104 +973,157 @@ object DecisionTree extends Serializable with

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread dorx
Github user dorx commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14836617 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread dorx
Github user dorx commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14836715 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1536: multiclass classification support ...

2014-07-11 Thread etrain
Github user etrain commented on the pull request: https://github.com/apache/spark/pull/886#issuecomment-48767401 I've gone through this in some depth, and aside from a couple of minor style nits - the logic looks good to me. Manish - have you compared output vs. scikit-learn for

[GitHub] spark pull request: [SPARK-2359][MLlib] Correlations

2014-07-11 Thread dorx
Github user dorx commented on a diff in the pull request: https://github.com/apache/spark/pull/1367#discussion_r14836922 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/Correlation.scala --- @@ -0,0 +1,121 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: [WIP] SPARK-2450: Add YARN executor log links ...

2014-07-11 Thread kbzod
GitHub user kbzod opened a pull request: https://github.com/apache/spark/pull/1375 [WIP] SPARK-2450: Add YARN executor log links to UI executors page This adds a new column in the Executors page of the Spark UI called Logs, visible only when running under YARN. After getting a

[GitHub] spark pull request: [WIP] SPARK-2450: Add YARN executor log links ...

2014-07-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1375#issuecomment-48768609 Can one of the admins verify this patch?

[GitHub] spark pull request: [SPARK-2437] Rename MAVEN_PROFILES to SBT_MAVE...

2014-07-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1374#issuecomment-48768736 Thanks, looks good. I tested this with: ``` SBT_MAVEN_PROFILES=yarn SBT_MAVEN_PROPERTIES=hadoop.version=2.2.0 sbt/sbt package ```

[GitHub] spark pull request: [SPARK-2411] Add a history-not-found page to s...

2014-07-11 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/1336#issuecomment-48769071 LGTM with one small comment.

[GitHub] spark pull request: [SPARK-2437] Rename MAVEN_PROFILES to SBT_MAVE...

2014-07-11 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1374

[GitHub] spark pull request: [SPARK-2411] Add a history-not-found page to s...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1336#discussion_r14837627 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/ui/HistoryNotFoundPage.scala --- @@ -0,0 +1,40 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14837730 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14837775 --- Diff: core/src/main/scala/org/apache/spark/storage/MemoryStore.scala --- @@ -141,6 +174,86 @@ private class MemoryStore(blockManager: BlockManager,

[GitHub] spark pull request: [SPARK-1777] Prevent OOMs from single partitio...

2014-07-11 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/1165#discussion_r14837855 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -67,10 +67,14 @@ class SparkEnv ( val metricsSystem: MetricsSystem,

[GitHub] spark pull request: SPARK-1576 (Allow JAVA_OPTS to be passed as a ...

2014-07-11 Thread nishkamravi2
Github user nishkamravi2 commented on the pull request: https://github.com/apache/spark/pull/492#issuecomment-48769703 @tgravescs Sorry, missed this one somehow. Yes, spark-submit is now fully tested. We could keep this one open for 0.9.2 potentially.
