[GitHub] spark pull request #19294: [SPARK-21549][CORE] Respect OutputFormats with no...

2017-10-06 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19294#discussion_r143232229 --- Diff: core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala --- @@ -57,6 +60,15 @@ class

[GitHub] spark issue #19448: [SPARK-22217] [SQL] ParquetFileFormat to support arbitra...

2017-10-06 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/19448 + @rdblue --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19448: [SPARK-22217] [SQL] ParquetFileFormat to support ...

2017-10-06 Thread steveloughran
GitHub user steveloughran opened a pull request: https://github.com/apache/spark/pull/19448 [SPARK-22217] [SQL] ParquetFileFormat to support arbitrary OutputCommitters ## What changes were proposed in this pull request? `ParquetFileFormat` to relax its requirement of output

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19447 Instead of simply introducing a new conf, we need to answer three questions; otherwise, this is not useful to the users. - What is the perf impact of this conf? - When should users

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19082 The TPC-DS benchmark shows the perf of just one of the Spark workloads. To avoid introducing unexpected regressions, we have to be very careful when merging this PR and

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 Yes, with small values it will produce a lot of small `NestedClass`es, but it will work. Instead, if the value is too high, all the functions (methods) which are created are inlined in the

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143228937 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -934,6 +934,15 @@ object SQLConf { .intConf

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143228854 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -934,6 +934,15 @@ object SQLConf { .intConf

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19447 Do you mean Spark will work with a small value like 1000?

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-06 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143228054 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143227547 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -934,6 +934,15 @@ object SQLConf { .intConf

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-06 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143226823 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-06 Thread rdblue
Github user rdblue commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143226714 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 @dongjoon-hyun I am not sure how I can test it. The use case in which this was useful is quite complex and I have not been able to reproduce it in a simpler way.

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143224951 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -934,6 +934,15 @@ object SQLConf { .intConf

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/19447 Could you add a test case for this? Since it's configurable, you can set a small number and catch some exceptions?

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143223820 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -934,6 +934,15 @@ object SQLConf { .intConf

[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-06 Thread skonto
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/19437#discussion_r143221450 --- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/Utils.scala --- @@ -105,4 +108,108 @@ object Utils { def

[GitHub] spark pull request #19437: [SPARK-22131][MESOS] Mesos driver secrets

2017-10-06 Thread skonto
Github user skonto commented on a diff in the pull request: https://github.com/apache/spark/pull/19437#discussion_r143221337 --- Diff: resource-managers/mesos/src/test/scala/org/apache/spark/scheduler/cluster/mesos/Utils.scala --- @@ -105,4 +108,108 @@ object Utils { def

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/19447 Thank you for your comments, @kiszk and @dongjoon-hyun. I changed the approach a bit, following a similar approach in the same file:

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143220404 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -934,6 +934,15 @@ object SQLConf { .intConf

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143219623 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -934,6 +934,15 @@ object SQLConf { .intConf

[GitHub] spark issue #18931: [SPARK-21717][SQL] Decouple consume functions of physica...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18931 **[Test build #82516 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82516/testReport)** for PR 18931 at commit

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19370 Merged build finished. Test PASSed.

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19370 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82510/ Test PASSed.

[GitHub] spark pull request #19082: [SPARK-21870][SQL] Split aggregation code into sm...

2017-10-06 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19082#discussion_r143218854 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -944,6 +945,24 @@ class CodegenContext {

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19370 **[Test build #82510 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82510/testReport)** for PR 19370 at commit

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-06 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Sure! I'll check two more: - master + #18931 - master + this PR + #18931

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143217286 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,13 @@ class CodegenContext

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19082 I'm also curious about the performance of master + #18931 on q66. Based on @maropu's earlier benchmark, q66 shows a large performance improvement.

[GitHub] spark pull request #19082: [SPARK-21870][SQL] Split aggregation code into sm...

2017-10-06 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19082#discussion_r143216395 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -944,6 +945,24 @@ class CodegenContext {

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19370 I saw the same problem on github.com a couple of days ago. I think something is broken with the github integration but not sure how that is managed.

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143215289 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,13 @@ class

[GitHub] spark issue #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19442 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82511/ Test PASSed.

[GitHub] spark issue #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19442 Merged build finished. Test PASSed.

[GitHub] spark issue #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19442 **[Test build #82511 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82511/testReport)** for PR 19442 at commit

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82514 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82514/testReport)** for PR 19424 at commit

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82515 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82515/testReport)** for PR 18732 at commit

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143213850 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,13 @@ class CodegenContext {

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18732 Hi All, I think all comments should be addressed at this point, except for the naming comment from @rxin. If I missed something or if there is anything else you want me to address,

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143213000 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19444: [SPARK-22214][SQL] Refactor the list hive partitions cod...

2017-10-06 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19444 cc @gatorsmile @cloud-fan

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-06 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19082 Also, I checked the performance of the master + this pr; ``` OpenJDK 64-Bit Server VM 1.8.0_141-b16 on Linux 4.9.38-16.35.amzn1.x86_64 Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz TPCDS

[GitHub] spark pull request #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in W...

2017-10-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19090

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-06 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143207681 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark issue #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows ...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19090 Merged to master.

[GitHub] spark pull request #19394: [SPARK-22170][SQL] Reduce memory consumption in b...

2017-10-06 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19394#discussion_r143205255 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala --- @@ -73,25 +73,37 @@ case class

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143204826 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,13 @@ class

[GitHub] spark issue #19438: [SPARK-22208] [SQL] Improve percentile_approx by not rou...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19438 **[Test build #82512 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82512/testReport)** for PR 19438 at commit

[GitHub] spark issue #19424: [SPARK-22197][SQL] push down operators to data source be...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19424 **[Test build #82513 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82513/testReport)** for PR 19424 at commit

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-06 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143203632 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/ImputerSuite.scala --- @@ -43,7 +43,7 @@ class ImputerSuite extends SparkFunSuite with

[GitHub] spark pull request #19438: [SPARK-22208] [SQL] Improve percentile_approx by ...

2017-10-06 Thread wzhfy
Github user wzhfy commented on a diff in the pull request: https://github.com/apache/spark/pull/19438#discussion_r143202515 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/ApproximatePercentileQuerySuite.scala --- @@ -129,7 +144,7 @@ class ApproximatePercentileQuerySuite

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19447#discussion_r143202186 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala --- @@ -279,11 +279,13 @@ class CodegenContext {

[GitHub] spark issue #19444: [SPARK-22214][SQL] Refactor the list hive partitions cod...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19444 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82509/ Test PASSed.

[GitHub] spark issue #19444: [SPARK-22214][SQL] Refactor the list hive partitions cod...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19444 Merged build finished. Test PASSed.

[GitHub] spark issue #19444: [SPARK-22214][SQL] Refactor the list hive partitions cod...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19444 **[Test build #82509 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82509/testReport)** for PR 19444 at commit

[GitHub] spark issue #19443: [SPARK-22212][SQL][PySpark] Some SQL functions in Python...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19443 I think it's okay to wait a few more days for other committers who might support or like this idea before closing this. I won't stand against it. Providing more compelling reasons

[GitHub] spark pull request #19412: [SPARK-22142][BUILD][STREAMING] Move Flume suppor...

2017-10-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19412

[GitHub] spark issue #19412: [SPARK-22142][BUILD][STREAMING] Move Flume support behin...

2017-10-06 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19412 Merged to master

[GitHub] spark issue #19443: [SPARK-22212][SQL][PySpark] Some SQL functions in Python...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19443 > I think the argument about consistency here is valid, though, I agree with @jaceklaskowski that changes should go one way or the other, i.e. allow string column names or remove this option

[GitHub] spark issue #19447: [SPARK-22215][SQL] Add configuration to set the threshol...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19447 Can one of the admins verify this patch?

[GitHub] spark pull request #19447: [SPARK-22215][SQL] Add configuration to set the t...

2017-10-06 Thread mgaido91
GitHub user mgaido91 opened a pull request: https://github.com/apache/spark/pull/19447 [SPARK-22215][SQL] Add configuration to set the threshold for generated class ## What changes were proposed in this pull request? SPARK-18016 introduced an arbitrary threshold for the

[GitHub] spark pull request #18732: [SPARK-20396][SQL][PySpark] groupby().apply() wit...

2017-10-06 Thread icexelloss
Github user icexelloss commented on a diff in the pull request: https://github.com/apache/spark/pull/18732#discussion_r143198047 --- Diff: python/pyspark/sql/group.py --- @@ -192,7 +193,69 @@ def pivot(self, pivot_col, values=None): jgd = self._jgd.pivot(pivot_col)

[GitHub] spark issue #19442: [SPARK-8515][ML][WIP] Improve ML Attribute API

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19442 **[Test build #82511 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82511/testReport)** for PR 19442 at commit

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19370 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82508/ Test PASSed.

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19370 Merged build finished. Test PASSed.

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19370 **[Test build #82508 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82508/testReport)** for PR 19370 at commit

[GitHub] spark issue #19443: [SPARK-22212][SQL][PySpark] Some SQL functions in Python...

2017-10-06 Thread jsnowacki
Github user jsnowacki commented on the issue: https://github.com/apache/spark/pull/19443 @HyukjinKwon Thanks for pointing that out. I think the argument about consistency here is valid, though, I agree with @jaceklaskowski that changes should go one way or the other, i.e. allow

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 cc @ueshin

[GitHub] spark issue #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Timestamp ...

2017-10-06 Thread icexelloss
Github user icexelloss commented on the issue: https://github.com/apache/spark/pull/18664 Thanks @gatorsmile for the constructive feedback! I don't want to make this more complicated, but I also want to make sure we are aware that there is also a difference between

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-06 Thread szhem
Github user szhem commented on the issue: https://github.com/apache/spark/pull/19294 @mridulm sql-related tests were removed.

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19294 Merged build finished. Test PASSed.

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19294 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82504/ Test PASSed.

[GitHub] spark issue #19294: [SPARK-21549][CORE] Respect OutputFormats with no output...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19294 **[Test build #82504 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82504/testReport)** for PR 19294 at commit

[GitHub] spark issue #19443: [SPARK-22212][SQL][PySpark] Some SQL functions in Python...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19443 This might look okay on the Python side because the fix looks minimal and does not actually increase complexity much; however, I think we focus on API consistency between other languages in

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19370 Yup, it looks like it is triggering fine - https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1822-master although I wonder why the check mark does not appear. I think it is not specific to

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19370 **[Test build #82510 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82510/testReport)** for PR 19370 at commit

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread jsnowacki
Github user jsnowacki commented on the issue: https://github.com/apache/spark/pull/19370 @HyukjinKwon Commits squashed into one, as you requested.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Merged build finished. Test PASSed.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19082 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82502/ Test PASSed.

[GitHub] spark issue #19082: [SPARK-21870][SQL] Split aggregation code into small fun...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19082 **[Test build #82502 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82502/testReport)** for PR 19082 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #82506 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82506/testReport)** for PR 18924 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Merged build finished. Test PASSed.

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82506/ Test PASSed.

[GitHub] spark issue #19090: [SPARK-21877][DEPLOY, WINDOWS] Handle quotes in Windows ...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19090 Build started:

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19370 Otherwise, looks good to me.

[GitHub] spark issue #19370: [SPARK-18136] Fix setup of SPARK_HOME variable on Window...

2017-10-06 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19370 @jsnowacki, would you mind if I ask you to squash those commits into a single one so that we can check whether the squashed commit, having the changes in `appveyor.yml` and `*.cmd`, actually triggers

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Merged build finished. Test PASSed.

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #82505 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82505/testReport)** for PR 18924 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82505/ Test PASSed. ---

[GitHub] spark pull request #19446: Dataset optimization

2017-10-06 Thread sohum2002
Github user sohum2002 closed the pull request at: https://github.com/apache/spark/pull/19446

[GitHub] spark pull request #19446: Dataset optimization

2017-10-06 Thread sohum2002
GitHub user sohum2002 opened a pull request: https://github.com/apache/spark/pull/19446 Dataset optimization The two proposed functions help select all the columns in a Dataset except for given columns. You can merge this pull request into a Git repository by

[GitHub] spark pull request #19445: Dataset select all columns

2017-10-06 Thread sohum2002
Github user sohum2002 closed the pull request at: https://github.com/apache/spark/pull/19445

[GitHub] spark pull request #19444: [SPARK-22214][SQL] Refactor the list hive partiti...

2017-10-06 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request: https://github.com/apache/spark/pull/19444#discussion_r143168926 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala --- @@ -638,12 +638,14 @@ private[hive] class HiveClientImpl(

[GitHub] spark pull request #19445: Dataset select all columns

2017-10-06 Thread sohum2002
GitHub user sohum2002 opened a pull request: https://github.com/apache/spark/pull/19445 Dataset select all columns The two proposed functions help select all the columns in a Dataset except for given columns. You can merge this pull request into a Git

[GitHub] spark issue #19444: [SPARK-22214][SQL] Refactor the list hive partitions cod...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19444 **[Test build #82509 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82509/testReport)** for PR 19444 at commit

[GitHub] spark pull request #19444: [SPARK-22214][SQL] Refactor the list hive partiti...

2017-10-06 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request: https://github.com/apache/spark/pull/19444 [SPARK-22214][SQL] Refactor the list hive partitions code ## What changes were proposed in this pull request? In this PR we make a few changes to the list hive partitions code, to make

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82501/ Test PASSed. ---

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18732 Merged build finished. Test PASSed.

[GitHub] spark issue #18732: [SPARK-20396][SQL][PySpark] groupby().apply() with panda...

2017-10-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18732 **[Test build #82501 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82501/testReport)** for PR 18732 at commit
