[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r142853643 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala --- @@ -224,6 +224,20 @@ private[clustering] trait LDAParams extends Params with

[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r142853109 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala --- @@ -224,6 +224,20 @@ private[clustering] trait LDAParams extends Params with

[GitHub] spark pull request #19337: [SPARK-22114][ML][MLLIB]add epsilon for LDA

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/19337#discussion_r142854372 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -322,6 +326,13 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82469/testReport)** for PR 18732 at commit

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19418 **[Test build #82471 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82471/testReport)** for PR 19418 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82472 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82472/testReport)** for PR 19436 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82469/ Test FAILed.

[GitHub] spark issue #19287: [SPARK-22074][Core] Task killed by other attempt task sh...

2017-10-05 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/19287 lgtm, thanks @xuanyuanking. @jerryshao can you merge this? I will have very intermittent access for a few weeks, so I'd prefer not to merge in case there is any issue that needs an urgent

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19418 Merged build finished. Test FAILed.

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19418 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82471/ Test FAILed.

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-10-05 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/19194 btw I'm going to have really intermittent access for a few weeks, so you don't need to wait for me to proceed with this @tgravescs

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-05 Thread lw-lin
Github user lw-lin commented on the issue: https://github.com/apache/spark/pull/19435 @tdas would you take a look, thanks

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82473/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19436 Let me install R environment to test it locally...

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82473/testReport)** for PR 19436 at commit

[GitHub] spark issue #19369: [SPARK-22147][CORE] Removed redundant allocations from B...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19369 **[Test build #3941 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3941/testReport)** for PR 19369 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82474 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82474/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test FAILed.

[GitHub] spark issue #19369: [SPARK-22147][CORE] Removed redundant allocations from B...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19369 **[Test build #3941 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3941/testReport)** for PR 19369 at commit

[GitHub] spark pull request #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work...

2017-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19436#discussion_r142901810 --- Diff: R/pkg/tests/fulltests/test_sparkSQL.R --- @@ -3075,6 +3075,11 @@ test_that("gapply() and gapplyCollect() on a DataFrame", { df1Collect

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82472 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82472/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19436 retest this please.

[GitHub] spark pull request #19369: [SPARK-22147][CORE] Removed redundant allocations...

2017-10-05 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/19369#discussion_r142896027 --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala --- @@ -67,7 +67,7 @@ private[spark] class DiskStore( var threwException:

[GitHub] spark issue #19374: [SPARK-22145][MESOS] fix supervise with checkpointing on...

2017-10-05 Thread skonto
Github user skonto commented on the issue: https://github.com/apache/spark/pull/19374 @ArtRand @susanxhuynh gentle ping.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test FAILed.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82470 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82470/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test FAILed.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82473/ Test FAILed.

[GitHub] spark pull request #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19436#discussion_r142903183 --- Diff: R/pkg/tests/fulltests/test_sparkSQL.R --- @@ -3075,6 +3075,11 @@ test_that("gapply() and gapplyCollect() on a DataFrame", { df1Collect <-

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82470/ Test FAILed.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82472/ Test FAILed.

[GitHub] spark issue #19429: [SPARK-20055] [Docs] Added documentation for loading csv...

2017-10-05 Thread jomach
Github user jomach commented on the issue: https://github.com/apache/spark/pull/19429 @felixcheung Sorry for that. Should be there now. Can you test? Thanks.

[GitHub] spark pull request #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19436#discussion_r142902434 --- Diff: R/pkg/tests/fulltests/test_sparkSQL.R --- @@ -3075,6 +3075,11 @@ test_that("gapply() and gapplyCollect() on a DataFrame", { df1Collect <-

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19061 The PR is updated. Thank you for the review, @vanzin!

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143077626 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143080675 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -503,21 +533,22 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143080051 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143080481 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143080889 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd =

[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19363#discussion_r143094589 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -18,16 +18,17 @@ package org.apache.spark.sql

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19363 Btw, I think this might need a minor/trivial JIRA ticket.

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82491/ Test PASSed.

[GitHub] spark pull request #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-05 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19442 [SPARK-8515][ML][WIP] Improve ML Attribute API ## What changes were proposed in this pull request? The current ML attribute API has issues such as inefficiency and being hard to use. This work

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143076954 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -434,7 +434,7 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82486/ Test FAILed.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143081782 --- Diff: python/pyspark/sql/group.py --- @@ -194,6 +194,65 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col,

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143085886 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -203,6 +203,10 @@ package object config { private[spark] val

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143091966 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala --- @@ -185,4 +185,22 @@ class WholeStageCodegenSuite

[GitHub] spark issue #19336: [SPARK-21947][SS] Check and report error when monotonica...

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19336 ping @zsxwing Can you take a look? Thanks.

[GitHub] spark issue #19440: [SPARK-21871][SQL] Fix infinite loop when bytecode size ...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19440 **[Test build #82494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82494/testReport)** for PR 19440 at commit

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82486 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82486/testReport)** for PR 19394 at commit

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Merged build finished. Test FAILed.

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143089714 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -392,12 +392,16 @@ case class

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143093875 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82493/testReport)** for PR 18732 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82491/testReport)** for PR 19061 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Merged build finished. Test PASSed.

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143069049 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82489/ Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82489/testReport)** for PR 18732 at commit

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19394 What's the other value?

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143092189 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -392,12 +392,16 @@ case class

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82491/testReport)** for PR 19061 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19061 It's done. Thank you for the review, @jiangxb1987.

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143077148 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #82487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82487/testReport)** for PR 18924 at commit

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-10-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r143078479 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,216 @@ +/* + * Licensed to the

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Merged build finished. Test PASSed.

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82487/ Test PASSed.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143081592 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143091744 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala --- @@ -185,4 +185,22 @@ class WholeStageCodegenSuite extends

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143092704 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -203,6 +203,10 @@ package object config { private[spark]

[GitHub] spark issue #19336: [SPARK-21947][SS] Check and report error when monotonica...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19336 **[Test build #82492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82492/testReport)** for PR 19336 at commit

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19294 Since this is not related to Spark SQL, please do not add the test cases to the Spark SQL side.

[GitHub] spark issue #19419: [SPARK-22188] [CORE] Adding security headers for prevent...

2017-10-05 Thread krishna-pandey
Github user krishna-pandey commented on the issue: https://github.com/apache/spark/pull/19419 @dongjoon-hyun Thanks for the review. Made the changes as suggested.

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-10-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r143078744 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,216 @@ +/* + * Licensed to the

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143081342 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19394 Not sure - maybe print the chi-values of the test and see if they make sense. If they do, we can change the threshold.

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-05 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 I agree with Bryan. I think we might want to rethink the assumption that toPandas result with arrow / without arrow should be 100% the same. For instance, non-Arrow doesn't respect

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19363 Maybe add a simple test?

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143094785 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -392,12 +392,16 @@ case class

[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-10-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17357#discussion_r143096475 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala --- @@ -23,14 +23,15 @@ import org.apache.commons.lang3.StringUtils

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82488/testReport)** for PR 19061 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Merged build finished. Test PASSed.

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-05 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/19294 @gatorsmile have your concerns been addressed? If yes, I will merge this into master and 2.2.1. This patch is clearly better than the existing state for 2.2 and master - for spark core and

[GitHub] spark issue #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19442 **[Test build #82495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82495/testReport)** for PR 19442 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82493/ Test FAILed. ---

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #19369: [SPARK-22147][CORE] Removed redundant allocations from B...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19369 **[Test build #3942 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3942/testReport)** for PR 19369 at commit

[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19437 **[Test build #82475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82475/testReport)** for PR 19437 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test PASSed.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82474/ Test PASSed. ---

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142944123 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,67 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19369: [SPARK-22147][CORE] Removed redundant allocations from B...

2017-10-05 Thread superbobry
Github user superbobry commented on the issue: https://github.com/apache/spark/pull/19369 I've fixed the failing `DiskStoreSuite` and ensured the other two suites also pass fine.

[GitHub] spark issue #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-05 Thread susanxhuynh
Github user susanxhuynh commented on the issue: https://github.com/apache/spark/pull/19437 @ArtRand @skonto Please review. Tests passed.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142948551 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,67 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82477 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82477/testReport)** for PR 18732 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142949179 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -
