[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82465 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82465/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82465/ Test FAILed.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread rekhajoshm
Github user rekhajoshm closed the pull request at: https://github.com/apache/spark/pull/19418

[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-10-04 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/13440 @srowen @thunterdb Any more thoughts on this? How about @sethah @yanboliang @jkbradley?

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82466/ Test FAILed.

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19418 I don't think we want to add such a defensive condition and avoid the log trace. Without the trace, we would not know that the problem happened. We should identify the issue and fix it if it is really there,

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82466 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82466/testReport)** for PR 18732 at commit

[GitHub] spark pull request #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19436#discussion_r142850005 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala --- @@ -394,7 +394,11 @@ case class FlatMapGroupsInRExec( override def

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread rekhajoshm
Github user rekhajoshm commented on the issue: https://github.com/apache/spark/pull/19418 @viirya You are correct, I am on the latest master, and I did not get it yet. This PR was to have a defensive condition. As, if this happens only under certain unique data/flow, this is the one

[GitHub] spark pull request #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work...

2017-10-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19436#discussion_r142849580 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala --- @@ -394,7 +394,11 @@ case class FlatMapGroupsInRExec(

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19435 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82462/ Test PASSed.

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19435 Merged build finished. Test PASSed.

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19435 **[Test build #82462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82462/testReport)** for PR 19435 at commit

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19418 From the stacktrace posted in the JIRA, the problematic code is: /* 151 */ comp = agg_bufValue.compare(smj_value3); `agg_bufValue` is a `long` but `smj_value3` is a

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19436 @HyukjinKwon Yeah, give me a few minutes. Thanks.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82463/ Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82463/testReport)** for PR 18732 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19436 LGTM. Mind opening a JIRA?

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread rekhajoshm
Github user rekhajoshm commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142848020 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,12 @@ class

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142847702 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,12 @@ class CodegenContext {

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19418 **[Test build #82471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82471/testReport)** for PR 19418 at commit

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82464/ Test FAILed.

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Merged build finished. Test FAILed.

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82464 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82464/testReport)** for PR 19424 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82470/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19436 Ok. The added test works to verify this is an issue. See the test result of https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82468/testReport.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82469/testReport)** for PR 18732 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test FAILed.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142845456 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,67 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82468/ Test FAILed.

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82468 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82468/testReport)** for PR 19436 at commit

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142844714 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,14 @@ class CodegenContext {

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19418 Merged build finished. Test FAILed.

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19418 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82467/ Test FAILed.

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19418 **[Test build #82467 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82467/testReport)** for PR 19418 at commit

[GitHub] spark pull request #19432: [SPARK-22203][SQL]Add job description for file li...

2017-10-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19432

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19432 Thanks! Merging to master.

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82468 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82468/testReport)** for PR 19436 at commit

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142841674 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,14 @@ class CodegenContext {

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142841543 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,67 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19418 **[Test build #82467 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82467/testReport)** for PR 19418 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142840490 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82466 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82466/testReport)** for PR 18732 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82465 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82465/testReport)** for PR 19436 at commit

[GitHub] spark pull request #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribut...

2017-10-04 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19436 [SQL][WIP] Fix FlatMapGroupsInR's child distribution when grouping attributes are empty ## What changes were proposed in this pull request? Looks like

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82464/testReport)** for PR 19424 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142839010 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -26,6 +26,25 @@ import

[GitHub] spark pull request #19424: [SPARK-22197][SQL] push down operators to data so...

2017-10-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19424#discussion_r142838899 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownOperatorsToDataSource.scala --- @@ -0,0 +1,99 @@ +/* + *

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82463/testReport)** for PR 18732 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18732 I pushed a new commit addressing the comments. Let me scan through the comments again. I think some comments around worker.py have not been addressed yet.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142836611 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -26,6 +26,25 @@ import

[GitHub] spark pull request #19416: [SPARK-22187][SS] Update unsaferow format for sav...

2017-10-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19416

[GitHub] spark pull request #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "ke...

2017-10-04 Thread lw-lin
Github user lw-lin commented on a diff in the pull request: https://github.com/apache/spark/pull/19435#discussion_r142836474 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -291,7 +291,7 @@ class

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19435 **[Test build #82462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82462/testReport)** for PR 19435 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142836297 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark pull request #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "ke...

2017-10-04 Thread lw-lin
GitHub user lw-lin opened a pull request: https://github.com/apache/spark/pull/19435 [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIndexToValue" ## What changes were proposed in this pull request? This PR changes `keyWithIndexToNumValues` to `keyWithIndexToValue`.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142836245 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142835260 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,66 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142833499 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142833374 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/18924 Yes, I think a local test is enough for both correctness and performance. For consistency with the old LDA, some manual local testing would be sufficient. You may well just use the LDA example

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142831316 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19432 Merged build finished. Test PASSed.

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19432 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82460/ Test PASSed.

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19432 **[Test build #82460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82460/testReport)** for PR 19432 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 @jkbradley, thank you! - Correctness: in order to test the equivalence of two versions of `submitMiniBatch` I have to bring both of them into the scope... One solution would be to derive a

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142826379 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142826326 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +462,54 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Merged build finished. Test FAILed.

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82461/testReport)** for PR 19424 at commit

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82461/ Test FAILed.

[GitHub] spark issue #19434: [SPARK-21785][SQL]Support create table from a parquet fi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19434 Can one of the admins verify this patch?

[GitHub] spark pull request #19434: [SPARK-21785][SQL]Support create table from a par...

2017-10-04 Thread CrazyJacky
GitHub user CrazyJacky opened a pull request: https://github.com/apache/spark/pull/19434 [SPARK-21785][SQL]Support create table from a parquet file schema ## Support create table from a parquet file schema As described in jira: ```sql CREATE EXTERNAL TABLE IF NOT EXISTS

[GitHub] spark issue #19412: [SPARK-22142][BUILD][STREAMING] Move Flume support behin...

2017-10-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19412 I think the argument for it, if anything, is that a) it's deprecated, so it should kind of be optional to build, and b) this would simply be consistent with how other external/* modules are handled. For

[GitHub] spark issue #19272: [Spark-21842][Mesos] Support Kerberos ticket renewal and...

2017-10-04 Thread ArtRand
Github user ArtRand commented on the issue: https://github.com/apache/spark/pull/19272 @kalvinnchau I'm running Hadoop 2.6 on a DC/OS cluster with Mesos 1.4.0

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82461/testReport)** for PR 19424 at commit

[GitHub] spark issue #19392: [SPARK-22169][SQL] support byte length literal as identi...

2017-10-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19392 Hmm, it's not a bug fix but a nice-to-have feature. Do we want this in Spark 2.2?

[GitHub] spark issue #19433: [SPARK-3162] [MLlib][WIP] Add local tree training for de...

2017-10-04 Thread smurching
Github user smurching commented on the issue: https://github.com/apache/spark/pull/19433 @WeichenXu123 would you be able to take an initial look at this?

[GitHub] spark pull request #19270: [SPARK-21809] : Change Stage Page to use datatabl...

2017-10-04 Thread ajbozarth
Github user ajbozarth commented on a diff in the pull request: https://github.com/apache/spark/pull/19270#discussion_r142782246 --- Diff: core/src/main/resources/org/apache/spark/ui/static/taskspages.js --- @@ -0,0 +1,474 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #19270: [SPARK-21809] : Change Stage Page to use datatabl...

2017-10-04 Thread ajbozarth
Github user ajbozarth commented on a diff in the pull request: https://github.com/apache/spark/pull/19270#discussion_r142781908 --- Diff: core/src/main/resources/org/apache/spark/ui/static/utils.js --- @@ -46,3 +46,64 @@ function formatBytes(bytes, type) { var i =

[GitHub] spark issue #19433: [SPARK-3162] [MLlib][WIP] Add local tree training for de...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19433 Can one of the admins verify this patch?

[GitHub] spark pull request #19433: [SPARK-3162] [MLlib][WIP] Add local tree training...

2017-10-04 Thread smurching
GitHub user smurching opened a pull request: https://github.com/apache/spark/pull/19433 [SPARK-3162] [MLlib][WIP] Add local tree training for decision tree regressors ## What changes were proposed in this pull request? WIP, DO NOT MERGE ### Overview This PR

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19432 **[Test build #82460 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82460/testReport)** for PR 19432 at commit

[GitHub] spark pull request #19424: [SPARK-22197][SQL] push down operators to data so...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19424#discussion_r142800670 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownOperatorsToDataSource.scala --- @@ -0,0 +1,99 @@ +/* +

[GitHub] spark pull request #19424: [SPARK-22197][SQL] push down operators to data so...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19424#discussion_r142801593 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala --- @@ -32,13 +32,12 @@ import

[GitHub] spark pull request #19424: [SPARK-22197][SQL] push down operators to data so...

2017-10-04 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19424#discussion_r142806719 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownOperatorsToDataSource.scala --- @@ -0,0 +1,99 @@ +/* +

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19432 cc @gatorsmile

[GitHub] spark pull request #19432: [SPARK-22203][SQL]Add job description for file li...

2017-10-04 Thread zsxwing
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/19432 [SPARK-22203][SQL]Add job description for file listing Spark jobs ## What changes were proposed in this pull request? The user may be confused about some 1-task jobs. We can add a job

[GitHub] spark pull request #19428: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-04 Thread susanxhuynh
Github user susanxhuynh closed the pull request at: https://github.com/apache/spark/pull/19428

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142801623 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -26,6 +26,25 @@ import

[GitHub] spark issue #19108: [SPARK-21898][ML] Feature parity for KolmogorovSmirnovTe...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19108 Merged build finished. Test PASSed.

[GitHub] spark issue #19108: [SPARK-21898][ML] Feature parity for KolmogorovSmirnovTe...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19108 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82459/ Test PASSed. ---

[GitHub] spark issue #19108: [SPARK-21898][ML] Feature parity for KolmogorovSmirnovTe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19108 **[Test build #82459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82459/testReport)** for PR 19108 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142796899 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark issue #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions counts ...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18966 Merged build finished. Test PASSed.

[GitHub] spark issue #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions counts ...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18966 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82455/ Test PASSed. ---

[GitHub] spark issue #18966: [SPARK-21751][SQL] CodeGeneraor.splitExpressions counts ...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18966 **[Test build #82455 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82455/testReport)** for PR 18966 at commit
