[GitHub] spark pull request #19399: [SPARK-22175][WEB-UI] Add status column to histor...

2017-10-05 Thread caneGuy
Github user caneGuy commented on a diff in the pull request: https://github.com/apache/spark/pull/19399#discussion_r143114309 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -850,6 +869,18 @@ private[history] class

[GitHub] spark pull request #19419: [SPARK-22188] [CORE] Adding security headers for ...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19419#discussion_r143113850 --- Diff: conf/spark-defaults.conf.template --- @@ -25,3 +25,10 @@ # spark.serializer org.apache.spark.serializer.KryoSerializer

[GitHub] spark issue #19336: [SPARK-21947][SS] Check and report error when monotonica...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19336 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82492/ Test PASSed. ---

[GitHub] spark issue #19336: [SPARK-21947][SS] Check and report error when monotonica...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19336 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19336: [SPARK-21947][SS] Check and report error when monotonica...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19336 **[Test build #82492 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82492/testReport)** for PR 19336 at commit

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143112965 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-05 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/19294 @gatorsmile Sounds good. @szhem, can we remove the Spark SQL tests you added (at my request)? Once the build passes, I will commit this; it will definitely help Spark core users.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82493/ Test FAILed. ---

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82493 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82493/testReport)** for PR 18732 at commit

[GitHub] spark issue #19419: [SPARK-22188] [CORE] Adding security headers for prevent...

2017-10-05 Thread krishna-pandey
Github user krishna-pandey commented on the issue: https://github.com/apache/spark/pull/19419 @dongjoon-hyun Thanks for the review. Made the changes as suggested.

[GitHub] spark issue #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19442 **[Test build #82495 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82495/testReport)** for PR 19442 at commit

[GitHub] spark pull request #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-05 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19442 [SPARK-8515][ML][WIP] Improve ML Attribute API ## What changes were proposed in this pull request? The current ML attribute API has issues such as inefficiency and being hard to use. This work

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19294 Since this is not related to Spark SQL, please do not add the test cases to the Spark SQL side.

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-05 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/19294 @gatorsmile Have your concerns been addressed? If yes, I will merge this into master and 2.2.1. This patch is clearly better than the existing state for 2.2 and master - for spark core and

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82491/ Test PASSed. ---

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Merged build finished. Test PASSed.

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82491 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82491/testReport)** for PR 19061 at commit

[GitHub] spark issue #19440: [SPARK-21871][SQL] Fix infinite loop when bytecode size ...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19440 **[Test build #82494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82494/testReport)** for PR 19440 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82493 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82493/testReport)** for PR 18732 at commit

[GitHub] spark issue #19336: [SPARK-21947][SS] Check and report error when monotonica...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19336 **[Test build #82492 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82492/testReport)** for PR 19336 at commit

[GitHub] spark issue #19336: [SPARK-21947][SS] Check and report error when monotonica...

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19336 ping @zsxwing Can you take a look? Thanks.

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82488/ Test PASSed. ---

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Merged build finished. Test PASSed.

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82488/testReport)** for PR 19061 at commit

[GitHub] spark pull request #17357: [SPARK-20025][CORE] Ignore SPARK_LOCAL* env, whil...

2017-10-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/17357#discussion_r143096475 --- Diff: core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala --- @@ -23,14 +23,15 @@ import org.apache.commons.lang3.StringUtils

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19363 Btw, I think this might need a minor/trivial JIRA ticket.

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19363 Although this `toString` in scala-shell looks good, when you print out directly, it might look weird because you just see: ```scala [key: [value: string], value: [value: string]] ```

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82490/ Test FAILed. ---

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82490 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82490/testReport)** for PR 18732 at commit

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143094785 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -392,12 +392,16 @@ case class

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19363 Maybe add a simple test?

[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19363#discussion_r143094589 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -18,16 +18,17 @@ package org.apache.spark.sql

[GitHub] spark pull request #19363: [Minor]Override toString of KeyValueGroupedDatase...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19363#discussion_r143094398 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/KeyValueGroupedDataset.scala --- @@ -564,4 +565,30 @@ class KeyValueGroupedDataset[K, V]

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143094078 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143093875 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143093472 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82491 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82491/testReport)** for PR 19061 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19061 It's done. Thank you for the review, @jiangxb1987.

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143092704 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -203,6 +203,10 @@ package object config { private[spark]

[GitHub] spark issue #19440: [SPARK-21871][SQL] Fix infinite loop when bytecode size ...

2017-10-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19440 LGTM except two minor comments.

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143092189 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -392,12 +392,16 @@ case class

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143091966 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala --- @@ -185,4 +185,22 @@ class WholeStageCodegenSuite

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143091744 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/WholeStageCodegenSuite.scala --- @@ -185,4 +185,22 @@ class WholeStageCodegenSuite extends

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19394 What's the other value?

[GitHub] spark issue #19363: [Minor]Override toString of KeyValueGroupedDataset

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19363 cc @viirya Could you review this PR?

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143089714 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -392,12 +392,16 @@ case class

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143086739 --- Diff: python/pyspark/sql/group.py --- @@ -194,6 +194,65 @@ def pivot(self, pivot_col, values=None): jgd =

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-05 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 I agree with Bryan. I think we might want to rethink the assumption that toPandas result with arrow / without arrow should be 100% the same. For instance, non-Arrow doesn't respect

[GitHub] spark issue #19440: [SPARK-21871][SQL] Fix infinite loop when bytecode size ...

2017-10-05 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19440 Thanks for pinging! LGTM except for one comment.

[GitHub] spark pull request #19440: [SPARK-21871][SQL] Fix infinite loop when bytecod...

2017-10-05 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19440#discussion_r143086096 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -392,12 +392,16 @@ case class

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143085886 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -203,6 +203,10 @@ package object config { private[spark] val

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143084875 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -503,21 +533,22 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread rdblue
Github user rdblue commented on the issue: https://github.com/apache/spark/pull/19394 Here's the error message: TestFailedException: 347.5272 was not greater than 1000

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143084656 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143083397 --- Diff: python/pyspark/sql/group.py --- @@ -194,6 +194,65 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col,

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143082816 --- Diff: python/pyspark/sql/group.py --- @@ -194,6 +194,65 @@ def pivot(self, pivot_col, values=None): jgd =

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-05 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18664 @ueshin @HyukjinKwon , I think it would be critical for users to have timestamps working for Arrow. Just to recap, the remaining issue here was that `toPandas()` without Arrow does not have

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143081782 --- Diff: python/pyspark/sql/group.py --- @@ -194,6 +194,65 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col,

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143081592 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82490 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82490/testReport)** for PR 18732 at commit

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143081342 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19394 Not sure - maybe print the chi-value of the test and see if they make sense. If they do, we can change the threshold.

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143080051 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143080481 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143080675 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -503,21 +533,22 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-05 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143080889 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd =

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82489/testReport)** for PR 18732 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82489/ Test FAILed. ---

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82489/testReport)** for PR 18732 at commit

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82486/ Test FAILed. ---

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19394 **[Test build #82486 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82486/testReport)** for PR 19394 at commit

[GitHub] spark issue #19394: [SPARK-22170][SQL] Reduce memory consumption in broadcas...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19394 Merged build finished. Test FAILed.

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-10-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r143078744 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,216 @@ +/* + * Licensed to the

[GitHub] spark pull request #15770: [SPARK-15784][ML]:Add Power Iteration Clustering ...

2017-10-05 Thread wangmiao1981
Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15770#discussion_r143078479 --- Diff: mllib/src/main/scala/org/apache/spark/ml/clustering/PowerIterationClustering.scala --- @@ -0,0 +1,216 @@ +/* + * Licensed to the

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82487/ Test PASSed. ---

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Merged build finished. Test PASSed.

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #82487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82487/testReport)** for PR 18924 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19061 **[Test build #82488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82488/testReport)** for PR 19061 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19061 The PR is updated. Thank you for the review, @vanzin!

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143077626 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143077148 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143076954 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -434,7 +434,7 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143076726 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -598,6 +598,11 @@ object SparkSubmit extends CommandLineUtils with

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143076481 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -434,7 +434,7 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143073410 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -203,6 +203,10 @@ package object config { private[spark] val

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143073506 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -598,6 +598,11 @@ object SparkSubmit extends CommandLineUtils with Logging {

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143073447 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -434,7 +434,7 @@ class SparkContext(config: SparkConf) extends Logging {

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143073586 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -399,6 +399,18 @@ class SparkSubmitSuite mainClass should be

[GitHub] spark pull request #19061: [SPARK-21568][CORE] ConsoleProgressBar should onl...

2017-10-05 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19061#discussion_r143073559 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -399,6 +399,18 @@ class SparkSubmitSuite mainClass should be

[GitHub] spark pull request #19424: [SPARK-22197][SQL] push down operators to data so...

2017-10-05 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19424#discussion_r143072487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownOperatorsToDataSource.scala --- @@ -0,0 +1,99 @@ +/* +

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143069049 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143068229 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #82487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82487/testReport)** for PR 18924 at commit

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143067455 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-05 Thread akopich
Github user akopich commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r143066229 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +463,60 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-05 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143048261 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

2017-10-05 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/17673#discussion_r143063348 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2VecCBOWSolver.scala --- @@ -0,0 +1,344 @@ +/* + * Licensed to the Apache Software
