[GitHub] spark issue #22160: Revert "[SPARK-24418][BUILD] Upgrade Scala to 2.11.12 an...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22160 @dbtsai Have you tried to run it in scala 2.12? We still can do the upgrade after Apache 2.4 RC. --- - To unsubscribe, e

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 @mallman Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews

[GitHub] spark issue #22160: Revert "[SPARK-24418][BUILD] Upgrade Scala to 2.11.12 an...

2018-08-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22160 Thank you! @dbtsai Let us see whether this can pass all the tests?

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-20 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 Try this when `spark.sql.nestedSchemaPruning.enabled` is on? ```SQL withTable("t1") { spark.sql( """ |Create table t

[GitHub] spark pull request #22123: [SPARK-25134][SQL] Csv column pruning with checki...

2018-08-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22123#discussion_r211287978 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala --- @@ -1603,6 +1603,39 @@ class CSVSuite extends

[GitHub] spark pull request #22123: [SPARK-25134][SQL] Csv column pruning with checki...

2018-08-20 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22123#discussion_r211283824 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala --- @@ -227,10 +210,9 @@ object

[GitHub] spark issue #21909: [SPARK-24959][SQL] Speed up count() for JSON and CSV

2018-08-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21909 LGTM. Thanks for patiently addressing all the comments! Merged to master.

[GitHub] spark issue #22123: [SPARK-25134][SQL] Csv column pruning with checking of h...

2018-08-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22123 cc @MaxGekk

[GitHub] spark issue #20226: [SPARK-23034][SQL] Override `nodeName` for all *ScanExec...

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20226 @maropu Could you take this over?

[GitHub] spark pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and ...

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21909#discussion_r211045699 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JsonDataSource.scala --- @@ -223,7 +224,8 @@ object

[GitHub] spark pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and ...

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21909#discussion_r211045061 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1492,6 +1492,15 @@ object SQLConf { "This us

[GitHub] spark issue #22121: [SPARK-25133][SQL][Doc]Avro data source guide

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22121 We also need to document the extra enhancements that are added in this release, compared with the databricks/spark-avro package

[GitHub] spark issue #22121: [SPARK-25133][SQL][Doc]Avro data source guide

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22121 We should do the same thing for the other native sources.

[GitHub] spark pull request #22121: [SPARK-25133][SQL][Doc]Avro data source guide

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22121#discussion_r210970616 --- Diff: docs/avro-data-source-guide.md --- @@ -0,0 +1,267 @@ +--- +layout: global +title: Avro Data Source Guide +--- + +Since

[GitHub] spark issue #22134: [SPARK-25143][SQL] Support data source name mapping conf...

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22134 Do we need it in the current stage? Regarding UX, it looks complex to end users. I am unable to remember the names. It is very easy to provide a wrong class name

[GitHub] spark pull request #22133: [SPARK-25129][SQL]Make the mapping of com.databri...

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22133#discussion_r210959550 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -637,6 +638,17 @@ object DataSource extends

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 @kiszk Please create a JIRA and we can post more ideas there. Thanks!

[GitHub] spark issue #22121: [SPARK-25133][SQL][Doc]AVRO data source guide

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22121 @gengliangwang Could you also post the screenshot in your PR description?

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 @kiszk The initial prototype or proof of concept can be in any personal branch. When we merge it to the master branch, we still need to separate it from the current codegen and make

[GitHub] spark pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and ...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21909#discussion_r210767018 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -2223,21 +2223,31 @@ class JsonSuite extends

[GitHub] spark pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and ...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21909#discussion_r210765672 --- Diff: docs/sql-programming-guide.md --- @@ -1894,6 +1894,7 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22108: [SPARK-25092][SQL][FOLLOWUP] Add RewriteCorrelatedScalar...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22108 Thanks! Merged to master.

[GitHub] spark pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and ...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21909#discussion_r210693829 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -2223,21 +2223,31 @@ class JsonSuite extends

[GitHub] spark pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and ...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21909#discussion_r210666117 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1492,6 +1492,15 @@ object SQLConf { "This us

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 @HyukjinKwon I am worried about a design that mixes the representations s"" and code"". When the design is not good, it is hard to maintain it and add new code based on

[GitHub] spark pull request #21868: [SPARK-24906][SQL] Adaptively enlarge split / par...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21868#discussion_r210494497 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -425,12 +426,44 @@ case class FileSourceScanExec

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22119 For details, see the discussion in the JIRA https://issues.apache.org/jira/browse/SPARK-24924

[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...

2018-08-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22111 Let us hold these codegen PRs until we see the design doc for building IR for the codegen?

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 We are fully swamped by the hotfixes and regressions of the 2.3 release and the new features that are targeting 2.4. We should have posted some comments in this PR earlier. Designing an IR

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 To Spark users, introducing AnalysisBarrier is a disaster. However, to the developers of Spark internals, this is just a bug. If you served the customers who are heavily using Spark, you

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 I am fine to not revert it since it is too late. So many related PRs have been merged, but we need to seriously consider writing and reviewing the design docs before changing the code generation

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 AnalysisBarrier does not introduce a behavior change. However, it requires our analyzer rules to be idempotent. The most recent correctness bug also shows another big potential hole https
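
The idempotence requirement mentioned in this comment can be illustrated with a minimal sketch. It assumes a toy plan representation (a tuple of operator names) and two hypothetical rules; none of these names come from Spark, and this is not how Catalyst's analyzer is implemented. A rule is idempotent when applying it a second time leaves the result of the first application unchanged.

```python
from typing import Callable, Tuple

# Toy stand-in for a query plan: a tuple of operator names.
# This is NOT Spark's LogicalPlan; it only illustrates the property.
Plan = Tuple[str, ...]
Rule = Callable[[Plan], Plan]


def is_idempotent_on(rule: Rule, plan: Plan) -> bool:
    """Return True if applying `rule` twice equals applying it once."""
    once = rule(plan)
    return rule(once) == once


def collapse_adjacent_duplicates(plan: Plan) -> Plan:
    """Hypothetical idempotent rule: a second pass finds nothing to collapse."""
    out = []
    for op in plan:
        if not out or out[-1] != op:
            out.append(op)
    return tuple(out)


def wrap_in_project(plan: Plan) -> Plan:
    """Hypothetical non-idempotent rule: each pass adds another Project."""
    return ("Project",) + plan
```

A rule like `wrap_in_project` is harmless only if it is guaranteed to run exactly once over a given subtree; if the same subtree can be analyzed again, each pass changes the plan further.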

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 This is not related to the credit or not. I think we need to introduce an IR like what the compiler is doing instead of continuous improvement on the existing one, which is already very hacky

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 This is bad. The design needs to be carefully reviewed before implementing it. Basically, we break the basic principles of software engineering. It is very strange to write the design doc after

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 @kiszk You are a JVM expert with a very strong IR background. Could you lead the efforts and drive the IR design

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 Another example is the AnalysisBarrier, which became a disaster for the Spark 2.3 release. Many blockers, correctness bugs, and performance regressions are caused by that. Thus, I think we should revert

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21537 Based on this PR, so many changes will be made in the codegen. The codegen is very fundamental to Spark SQL. I do not think we should merge this PR at this stage. To be more disciplined, we need

[GitHub] spark issue #22007: [SPARK-25033] Bump Apache commons.{httpclient, httpcore}

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22007 The bump is fine but this is not for making it congruent with Stocator, which is just an external connector

[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17185 Let us discuss it in the JIRA https://issues.apache.org/jira/browse/SPARK-25121

[GitHub] spark issue #17185: [SPARK-19602][SQL] Support column resolution of fully qu...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17185 @skambha @dilipbiswal How about the hint resolution after supporting multi part names?

[GitHub] spark issue #17400: [SPARK-19981][SQL] Update output partitioning info. when...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/17400 @maropu We should fix it. How about doing it after the code freeze? So far, all are swamped by different tasks

[GitHub] spark issue #22107: [SPARK-25117] Add EXEPT ALL and INTERSECT ALL support in...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22107 cc @felixcheung

[GitHub] spark pull request #22102: [SPARK-25051][SQL] FixNullability should not stop...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22102#discussion_r210053264 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1704,6 +1704,7 @@ class Analyzer

[GitHub] spark issue #21123: [SPARK-24045][SQL]Create base class for file data source...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21123 @HyukjinKwon @rdblue V2 data source APIs should not break FileFormat V1 compatibility, IMO. Based on my experience, it is not the right thing to force users to change their Spark applications

[GitHub] spark issue #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in FileSo...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22104 @icexelloss Do we face the same issue for DataSourceStrategy?

[GitHub] spark pull request #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22104#discussion_r210044089 --- Diff: python/pyspark/sql/tests.py --- @@ -3367,6 +3367,24 @@ def test_ignore_column_of_all_nulls(self): finally

[GitHub] spark pull request #22101: [SPARK-25114][Core] Fix RecordBinaryComparator wh...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22101#discussion_r210043412 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/execution/RecordBinaryComparator.java --- @@ -27,7 +27,6 @@ public int compare

[GitHub] spark issue #22102: [SPARK-25051][SQL] FixNullability should not stop on Ana...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22102 @mgaido91 The PR is merged. Could you close it?

[GitHub] spark pull request #22102: [SPARK-25051][SQL] FixNullability should not stop...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22102#discussion_r210036342 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala --- @@ -2300,4 +2300,10 @@ class DataFrameSuite extends QueryTest

[GitHub] spark issue #22102: [SPARK-25051][SQL] FixNullability should not stop on Ana...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22102 Let me confirm it, and then I will merge it to 2.3.

[GitHub] spark issue #22102: [SPARK-25051][SQL] FixNullability should not stop on Ana...

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22102 This might not be the last one. Let us backport the fix of @maryannxue ?

[GitHub] spark issue #22082: [SPARK-24420][Build][FOLLOW-UP] Upgrade ASM6 APIs

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22082 @dbtsai Nope. I did not hit any issue. :)

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 @BryanCutler @shaneknapp Thanks for your work!

[GitHub] spark issue #21439: [SPARK-24391][SQL] Support arrays of any types by from_j...

2018-08-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21439 LGTM

[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

2018-08-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21439#discussion_r209461516 --- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql --- @@ -39,3 +39,8 @@ select from_json('{"a":1, "b&q

[GitHub] spark pull request #21439: [SPARK-24391][SQL] Support arrays of any types by...

2018-08-12 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21439#discussion_r209461334 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonParser.scala --- @@ -101,6 +102,21 @@ class JacksonParser

[GitHub] spark issue #22082: [SPARK-24420][Build][FOLLOW-UP] Upgrade ASM6 APIs

2018-08-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22082 cc @dbtsai @srowen

[GitHub] spark pull request #22082: [SPARK-24420][Build][FOLLOW-UP] Upgrade ASM6 APIs

2018-08-11 Thread gatorsmile
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/22082 [SPARK-24420][Build][FOLLOW-UP] Upgrade ASM6 APIs ## What changes were proposed in this pull request? Use ASM 6 APIs after upgrading to ASM 6. ## How was this patch tested

[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...

2018-08-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 The original fix https://github.com/apache/spark/pull/22079/commits/efccc028bce64bf4754ce81ee16533c19b4384b2 has been merged to Spark 2.3. After 5+ months, we have not received any correctness

[GitHub] spark issue #22079: [SPARK-23207][SQL][BACKPORT-2.2] Shuffle+Repartition on ...

2018-08-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 cc @jiangxb1987

[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...

2018-08-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22072 ok to test

[GitHub] spark issue #22072: [SPARK-25081][Core]Nested spill in ShuffleExternalSorter...

2018-08-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22072 retest this please

[GitHub] spark issue #22077: [SPARK-25084][SQL][BACKPORT-2.3] "distribute by" on mult...

2018-08-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22077 ok to test

[GitHub] spark issue #22077: [SPARK-25084][SQL][BACKPORT-2.3] "distribute by" on mult...

2018-08-11 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22077 test this please

[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.

2018-08-10 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21977 Why not use `resource.RLIMIT_RSS`?
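
For context on the question above, here is a minimal sketch of how a per-process limit can be set with Python's standard `resource` module (Unix only). The helper names `clamp_to_hard_limit` and `cap_soft_limit` are hypothetical and are not taken from the pull request. One relevant caveat: the Linux `getrlimit(2)` man page notes that `RLIMIT_RSS` is not enforced on current kernels, so a sketch like this would more likely be used with `resource.RLIMIT_AS`; which limit the PR actually uses is not shown in this snippet.

```python
import resource


def clamp_to_hard_limit(requested_bytes: int, hard_limit: int) -> int:
    """Return a soft limit that never exceeds the existing hard limit."""
    if hard_limit == resource.RLIM_INFINITY:
        return requested_bytes
    return min(requested_bytes, hard_limit)


def cap_soft_limit(kind: int, requested_bytes: int) -> None:
    """Set the soft limit for `kind` (e.g. resource.RLIMIT_AS) on this process.

    The hard limit is left unchanged, so the call needs no extra privileges.
    """
    _, hard = resource.getrlimit(kind)
    soft = clamp_to_hard_limit(requested_bytes, hard)
    resource.setrlimit(kind, (soft, hard))
```

A worker process could call something like `cap_soft_limit(resource.RLIMIT_AS, 2 * 1024**3)` before running user code; allocations beyond the limit would then typically surface as `MemoryError` in Python.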

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21889 retest this please

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 retest this please

[GitHub] spark pull request #21087: [SPARK-23997][SQL] Configurable maximum number of...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21087#discussion_r209077023 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -580,6 +580,11 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #21087: [SPARK-23997][SQL] Configurable maximum number of...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21087#discussion_r209076282 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -580,6 +580,11 @@ object SQLConf { .booleanConf

[GitHub] spark issue #22060: [DO NOT MERGE][TEST ONLY] Add once-policy rule check

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22060 `AliasViewChild` is the only rule? Can we whitelist it first? It sounds like many tests are skipped.

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21889 retest this please

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 retest this please

[GitHub] spark issue #21608: [SPARK-24626] [SQL] Improve location size calculation in...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21608 Thanks! Merged to master.

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21889 retest this please

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 retest this please

[GitHub] spark issue #22049: [SPARK-25063][SQL] Rename class KnowNotNull to KnownNotN...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22049 Thanks! Merged to master.

[GitHub] spark pull request #21994: [SPARK-24529][Build][test-maven][follow-up] Add s...

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21994#discussion_r20881 --- Diff: pom.xml --- @@ -2609,6 +2609,28 @@ + +com.github.spotbugs +spotbugs

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-09 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 Really thank you for your help! @shaneknapp

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21889 I hit the following error in my local environment. ``` sbt.ForkMain$ForkError: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 220.0 failed 1 times

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 @shaneknapp That is great! I think putting Jenkins in quiet mode looks fine, as long as we send out a note to the dev list

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 https://lists.apache.org/thread.html/5b0836e44f9386fae2f99deed0a01441c699040c991d833faf520357@%3Cdev.arrow.apache.org%3E Arrow 0.10.0 release is officially announced. @shaneknapp Could

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-08 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 We need to upgrade it to 0.10.0 in this release, if possible. It resolves some bugs, e.g., https://issues.apache.org/jira/browse/ARROW-1973

[GitHub] spark pull request #20611: [SPARK-23425][SQL]Support wildcard in HDFS path f...

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20611#discussion_r208412882 --- Diff: core/src/main/scala/org/apache/spark/util/Utils.scala --- @@ -1976,6 +1976,49 @@ private[spark] object Utils extends Logging

[GitHub] spark issue #21596: [SPARK-24601] Bump Jackson version

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21596 @jerryshao This is for 3.0

[GitHub] spark pull request #21608: [SPARK-24626] [SQL] Improve location size calcula...

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21608#discussion_r208408858 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala --- @@ -49,4 +51,11 @@ object DataSourceUtils

[GitHub] spark issue #22028: [SPARK-25046][SQL] Fix Alter View can excute sql like "A...

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22028 Thanks! Merged to master.

[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21977 cc @jiangxb1987 @cloud-fan @jerryshao @vanzin

[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21977 @rdblue Is this for YARN only?

[GitHub] spark issue #21977: SPARK-25004: Add spark.executor.pyspark.memory limit.

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21977 test cases?

[GitHub] spark issue #22006: [SPARK-25031][SQL] Fix MapType schema print

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22006 Thanks! Merged to master.

[GitHub] spark issue #22028: [SPARK-25046][SQL] Fix Alter View can excute sql like "A...

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22028 ok to test

[GitHub] spark pull request #22030: [SPARK-25048][SQL] Pivoting by multiple columns i...

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22030#discussion_r208326382 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala --- @@ -403,20 +415,29 @@ class RelationalGroupedDataset protected

[GitHub] spark pull request #21608: [SPARK-24626] [SQL] Improve location size calcula...

2018-08-07 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21608#discussion_r208269963 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala --- @@ -49,4 +51,11 @@ object DataSourceUtils

[GitHub] spark issue #21970: [SPARK-24996][SQL] Use DSL in DeclarativeAggregate

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21970 Thanks! Merged to master.

[GitHub] spark issue #22012: [SPARK-25036][SQL] Should compare ExprValue.isNull with ...

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22012 Thanks! Merged to master.

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 It sounds like the vote can pass soon. https://lists.apache.org/thread.html/9900da1540be5aafce27691fd40395bb53f465302db29979c154d99a@%3Cdev.arrow.apache.org%3E

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 To get this in, we might need to delay the code freeze. Can you reply to the dev list email http://apache-spark-developers-list.1001551.n3.nabble.com/code-freeze-and-branch-cut-for-Apache-Spark-2

[GitHub] spark issue #21939: [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10....

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21939 After the code freeze, dependency changes are not allowed. Hopefully, we can make it before that.

[GitHub] spark pull request #21909: [SPARK-24959][SQL] Speed up count() for JSON and ...

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21909#discussion_r207850329 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala --- @@ -2225,19 +2225,21 @@ class JsonSuite extends

[GitHub] spark issue #21970: [SPARK-24996][SQL] Use DSL in DeclarativeAggregate

2018-08-06 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21970 LGTM pending Jenkins
