[GitHub] spark issue #19434: [SPARK-21785][SQL]Support create table from a parquet fi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19434 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on the issue: https://github.com/apache/spark/pull/18924 Yes, I think a local test is enough for both correctness and performance. For consistency with the old LDA, some manual local testing would be sufficient. You may well just use the LDA example

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142835260 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,66 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82470 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82470/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test FAILed.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82465/ Test FAILed.

[GitHub] spark issue #19433: [SPARK-3162] [MLlib][WIP] Add local tree training for de...

2017-10-04 Thread smurching
Github user smurching commented on the issue: https://github.com/apache/spark/pull/19433 @WeichenXu123 would you be able to take an initial look at this?

[GitHub] spark issue #19272: [Spark-21842][Mesos] Support Kerberos ticket renewal and...

2017-10-04 Thread ArtRand
Github user ArtRand commented on the issue: https://github.com/apache/spark/pull/19272 @kalvinnchau I'm running Hadoop 2.6 on a DC/OS cluster with Mesos 1.4.0

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Merged build finished. Test FAILed.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142841543 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,67 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Merged build finished. Test FAILed.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142845456 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,67 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19436 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82468/ Test FAILed.

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82468 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82468/testReport)** for PR 19436 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82463 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82463/testReport)** for PR 18732 at commit

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142836611 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -26,6 +26,25 @@ import

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19435 **[Test build #82462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82462/testReport)** for PR 19435 at commit

[GitHub] spark pull request #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "ke...

2017-10-04 Thread lw-lin
Github user lw-lin commented on a diff in the pull request: https://github.com/apache/spark/pull/19435#discussion_r142836474 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -291,7 +291,7 @@ class

[GitHub] spark pull request #19416: [SPARK-22187][SS] Update unsaferow format for sav...

2017-10-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19416

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19418 **[Test build #82467 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82467/testReport)** for PR 19418 at commit

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142847702 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,12 @@ class CodegenContext {

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19435 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82462/ Test PASSed.

[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2017-10-04 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/13440 @srowen @thunterdb any more thoughts on this? how about @sethah @yanboliang @jkbradley?

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread rekhajoshm
Github user rekhajoshm closed the pull request at: https://github.com/apache/spark/pull/19418

[GitHub] spark pull request #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribut...

2017-10-04 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19436 [SQL][WIP] Fix FlatMapGroupsInR's child distribution when grouping attributes are empty ## What changes were proposed in this pull request? Looks like
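
A small illustration of why the child distribution can matter when there are no grouping attributes. This is only a sketch of one reading of the pull request title, written in Python rather than Spark's Scala. The function names and the list-of-lists model of partitions are hypothetical and are not taken from the patch, whose description is truncated above. If every partition forms its own group for the empty key, the user function runs once per partition; if all rows are brought together first, it runs exactly once.

```python
from typing import Callable, List, Sequence


def apply_per_partition(partitions: Sequence[Sequence[int]],
                        func: Callable[[Sequence[int]], int]) -> List[int]:
    """Hypothetical model: each non-empty partition is treated as one group."""
    return [func(rows) for rows in partitions if rows]


def apply_over_all_rows(partitions: Sequence[Sequence[int]],
                        func: Callable[[Sequence[int]], int]) -> List[int]:
    """Hypothetical model: all rows are gathered into a single group first."""
    rows = [row for partition in partitions for row in partition]
    return [func(rows)] if rows else []


# Three partitions of the same logical dataset (illustrative data only).
partitions = [[1, 2], [3], [4, 5, 6]]
per_partition = apply_per_partition(partitions, sum)   # one result per partition
single_group = apply_over_all_rows(partitions, sum)    # one result overall
```

With an empty grouping key, a caller would normally expect the second behaviour, a single group covering every row.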

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-04 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 @jkbradley, thank you! - Correctness: in order to test the equivalence of two versions of `submitMiniBatch` I have to bring both of them into the scope... One solution would be to derive a

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142840490 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82466 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82466/testReport)** for PR 18732 at commit

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/19432 Thanks! Merging to master.

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19418 **[Test build #82471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82471/testReport)** for PR 19418 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19436 @HyukjinKwon Yeah, give me a few minutes. Thanks.

[GitHub] spark pull request #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19436#discussion_r142850005 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala --- @@ -394,7 +394,11 @@ case class FlatMapGroupsInRExec( override def

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82461/testReport)** for PR 19424 at commit

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82461/testReport)** for PR 19424 at commit

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82461/ Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82463 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82463/testReport)** for PR 18732 at commit

[GitHub] spark pull request #19432: [SPARK-22203][SQL]Add job description for file li...

2017-10-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19432

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Merged build finished. Test FAILed.

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82464 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82464/testReport)** for PR 19424 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19436 LGTM. Mind opening a JIRA?

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread rekhajoshm
Github user rekhajoshm commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142848020 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,12 @@ class

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19435 **[Test build #82462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82462/testReport)** for PR 19435 at commit

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19418 From the stacktrace posted in the JIRA, the problematic code is: /* 151 */ comp = agg_bufValue.compare(smj_value3); `agg_bufValue` is a `long` but `smj_value3` is a

[GitHub] spark issue #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIn...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19435 Merged build finished. Test PASSed.

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread rekhajoshm
Github user rekhajoshm commented on the issue: https://github.com/apache/spark/pull/19418 @viirya You are correct, I am on the latest master, and I did not get it yet. This PR was to have a defensive condition. As, if this happens only under certain unique data/flow, this is the one

[GitHub] spark pull request #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work...

2017-10-04 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19436#discussion_r142849580 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/objects.scala --- @@ -394,7 +394,11 @@ case class FlatMapGroupsInRExec(

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82466 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82466/testReport)** for PR 18732 at commit

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19418 I think we don't want to add such a defensive condition and suppress the log trace. Without it, we wouldn't know that the problem has happened. We should identify the issue and fix it if it is really there,

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82466/ Test FAILed.

[GitHub] spark issue #19436: [SPARK-22206][SQL][SparkR] gapply in R can't work on emp...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82465 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82465/testReport)** for PR 19436 at commit

[GitHub] spark issue #19412: [SPARK-22142][BUILD][STREAMING] Move Flume support behin...

2017-10-04 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19412 I think the argument for it if anything is that it's a) deprecated, so should kinda be optional to build, and b) this would simply be consistent with how other external/* modules are handled. For

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82465 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82465/testReport)** for PR 19436 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19436 **[Test build #82468 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82468/testReport)** for PR 19436 at commit

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19418 **[Test build #82467 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82467/testReport)** for PR 19418 at commit

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19418 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82467/ Test FAILed.

[GitHub] spark issue #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGenerator: ...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19418 Merged build finished. Test FAILed.

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142844714 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,14 @@ class CodegenContext {

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82469/testReport)** for PR 18732 at commit

[GitHub] spark issue #19436: [SQL][WIP] Fix FlatMapGroupsInR's child distribution whe...

2017-10-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19436 Ok. The added test works to verify this is an issue. See the test result of https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82468/testReport.

[GitHub] spark issue #19392: [SPARK-22169][SQL] support byte length literal as identi...

2017-10-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19392 hmm, it's not a bug fix but a nice-to-have feature, do we want this in spark 2.2?

[GitHub] spark pull request #19434: [SPARK-21785][SQL]Support create table from a par...

2017-10-04 Thread CrazyJacky
GitHub user CrazyJacky opened a pull request: https://github.com/apache/spark/pull/19434 [SPARK-21785][SQL]Support create table from a parquet file schema ## Support create table from a parquet file schema As described in jira: ```sql CREATE EXTERNAL TABLE IF NOT EXISTS

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142826326 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +462,54 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142826379 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19432 **[Test build #82460 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82460/testReport)** for PR 19432 at commit

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19432 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82460/ Test PASSed.

[GitHub] spark issue #19432: [SPARK-22203][SQL]Add job description for file listing S...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19432 Merged build finished. Test PASSed.

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142831316 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142833374 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-10-04 Thread hhbyyh
Github user hhbyyh commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r142833499 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,36 +462,55 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #19435: [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "ke...

2017-10-04 Thread lw-lin
GitHub user lw-lin opened a pull request: https://github.com/apache/spark/pull/19435 [WIP][SS][MINOR] "keyWithIndexToNumValues" -> "keyWithIndexToValue" ## What changes were proposed in this pull request? This PR changes `keyWithIndexToNumValues` to `keyWithIndexToValue`.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142836245 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18732 I pushed a new commit addressing the comments. Let me scan through the comments again. I think there are some comments around worker.py that have not been addressed yet.

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142839010 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -26,6 +26,25 @@ import

[GitHub] spark pull request #19424: [SPARK-22197][SQL] push down operators to data so...

2017-10-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19424#discussion_r142838899 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownOperatorsToDataSource.scala --- @@ -0,0 +1,99 @@ +/* + *

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142836297 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -519,3 +519,18 @@ case class CoGroup(

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82464 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82464/testReport)** for PR 19424 at commit

[GitHub] spark pull request #19418: [SPARK-19984][SQL] Fix for ERROR codegen.CodeGene...

2017-10-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19418#discussion_r142841674 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -697,7 +697,14 @@ class CodegenContext {

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82464/ Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82463/ Test FAILed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test FAILed.

[GitHub] spark pull request #19422: [SPARK-22193][SQL] Minor typo fix

2017-10-04 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19422

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19083 **[Test build #82443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82443/testReport)** for PR 19083 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Merged build finished. Test PASSed.

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82444 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82444/testReport)** for PR 19424 at commit

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Merged build finished. Test FAILed.

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Merged build finished. Test FAILed.

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Merged build finished. Test FAILed.

[GitHub] spark pull request #19425: [SPARK-22196][Core] Combine multiple input splits...

2017-10-04 Thread vgankidi
GitHub user vgankidi opened a pull request: https://github.com/apache/spark/pull/19425 [SPARK-22196][Core] Combine multiple input splits into a HadoopPartition ## What changes were proposed in this pull request? Spark native read path allows tuning the partition size based
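
The idea in the title, combining several input splits into one partition, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pull request: the function name `combine_splits`, the `max_partition_bytes` parameter, and the greedy packing of consecutive splits are all hypothetical choices made here for clarity, and the example is in Python rather than Spark's Scala.

```python
from typing import List


def combine_splits(split_sizes: List[int], max_partition_bytes: int) -> List[List[int]]:
    """Greedily pack consecutive split indices into groups.

    A group is closed as soon as adding the next split would push its
    total size above max_partition_bytes. A split that is larger than
    the limit on its own still gets a group to itself.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for index, size in enumerate(split_sizes):
        if current and current_bytes + size > max_partition_bytes:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(index)
        current_bytes += size
    if current:
        groups.append(current)
    return groups


# Illustrative sizes in bytes; each inner list holds the split indices
# that would be read together as one partition.
grouped = combine_splits([40, 40, 40, 100, 10], max_partition_bytes=100)
```

Packing consecutive splits keeps the original read order; a size-sorted bin-packing strategy could produce fewer partitions at the cost of reordering.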

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19424 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82444/ Test FAILed.

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19083 **[Test build #82442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82442/testReport)** for PR 19083 at commit

[GitHub] spark issue #19061: [SPARK-21568][CORE] ConsoleProgressBar should only be en...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19061 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82441/ Test PASSed.

[GitHub] spark issue #19426: [SPARK-22190][CORE] Add Spark executor task metrics to D...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19426 Can one of the admins verify this patch?

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19424 retest this please

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19083 **[Test build #82446 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82446/testReport)** for PR 19083 at commit

[GitHub] spark issue #19425: [SPARK-22196][Core] Combine multiple input splits into a...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19425 Can one of the admins verify this patch?

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82442/ Test FAILed.

[GitHub] spark issue #19083: [SPARK-21871][SQL] Check actual bytecode size when compi...

2017-10-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19083 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82443/ Test FAILed.

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-10-04 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/18704 thanks, merging to master!

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-04 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r142586820 --- Diff: python/pyspark/worker.py --- @@ -74,17 +74,35 @@ def wrap_udf(f, return_type): def wrap_pandas_udf(f, return_type): -
