[GitHub] spark issue #22197: [SPARK-25207][SQL] Case-insensitve field resolution for ...

2018-08-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22197 ``` Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Duplicate column name c1 in the table definition. at org.apache.hadoop.hive.ql.metadata.Table.validateColumns

[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

2018-08-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r213826584 --- Diff: python/pyspark/context.py --- @@ -494,10 +494,14 @@ def f(split, iterator): c = list(c)# Make it a list so we can compute

[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

2018-08-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r213825858 --- Diff: python/pyspark/context.py --- @@ -494,10 +494,14 @@ def f(split, iterator): c = list(c)# Make it a list so we can compute

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-29 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r213569148 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22261: [SPARK-25248.1][PYSPARK] update barrier Python API

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22261 What is `.1` in the title `[SPARK-25248.1]`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream format for c...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21546 @BryanCutler The worst case is to turn off `spark.sql.execution.arrow.enabled`, if the new code path has a bug, right

[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22205 Thanks! Merged to master. The JIRA is created to resolve the issues regarding the tests.

[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22112 retest this please

[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22205 retest this please

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r213426988 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r213426538 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22237: [SPARK-25243][SQL] Use FailureSafeParser in from_json

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22237 If we can finish it before the code freeze, it will be 2.4; otherwise it is 3.0

[GitHub] spark issue #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER PARTITI...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22233 Basically, this PR is to revert the code to the original .par -based solution. LGTM Thanks! Merged to master

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r213423805 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala --- @@ -60,7 +60,8 @@ class HiveCatalogedDDLSuite extends

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r213406078 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala --- @@ -52,23 +52,24 @@ class InMemoryCatalogedDDLSuite extends

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21087 @kiszk Please submit a follow-up PR to address your comment?

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21087 Thanks! Merged to master.

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213397871 --- Diff: core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala --- @@ -32,12 +32,16 @@ import org.apache.spark.{Partition, TaskContext

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213397293 --- Diff: core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala --- @@ -32,12 +32,16 @@ import org.apache.spark.{Partition, TaskContext

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213390708 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1918,3 +1991,19 @@ object RDD { new DoubleRDDFunctions(rdd.map(x

[GitHub] spark pull request #22112: [SPARK-23243][Core] Fix RDD.repartition() data co...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22112#discussion_r213387147 --- Diff: core/src/main/scala/org/apache/spark/rdd/LocalCheckpointRDD.scala --- @@ -37,11 +37,12 @@ import org.apache.spark.storage.RDDBlockId

[GitHub] spark pull request #22205: [SPARK-25212][SQL] Support Filter in ConvertToLoc...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22205#discussion_r213370527 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizerRuleExclusionSuite.scala --- @@ -21,7 +21,9 @@ import

[GitHub] spark pull request #22205: [SPARK-25212][SQL] Support Filter in ConvertToLoc...

2018-08-28 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22205#discussion_r213370444 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/test/SharedSparkSession.scala --- @@ -35,10 +36,18 @@ trait SharedSparkSession

[GitHub] spark issue #22149: [SPARK-25158][SQL]Executor accidentally exit because Scr...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22149 ok to test

[GitHub] spark issue #22149: [SPARK-25158][SQL]Executor accidentally exit because Scr...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22149 Is that possible to add a test case?

[GitHub] spark issue #22211: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22211 Thanks! Merged to 2.1

[GitHub] spark issue #22188: [SPARK-25164][SQL] Avoid rebuilding column and path list...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22188 Normally, we do not backport such improvement PRs. However, the risk of this PR is pretty small. I think it is fine. Let me do

[GitHub] spark issue #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field ...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22183 For Hive tables, column resolution is always case insensitive. However, when `spark.sql.hive.convertMetastoreParquet` is true, users might face inconsistent behaviors when they use native

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r213135626 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22188: [SPARK-25164][SQL] Avoid rebuilding column and path list...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22188 @bersprockets The risk is pretty small I think. I am fine to backport it to the previous versions. Why 2.2 only

[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22205 Yes. Disable this rule for testing only.

[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22249#discussion_r213120096 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2459,6 +2459,26 @@ object functions { StringTrimLeft(e.expr

[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22205 It would be safer to turn off this rule, since it will skip the actual query execution. Normally, the tests are introduced for testing end-to-end scenarios instead of applying this rule

[GitHub] spark pull request #22205: [SPARK-25212][SQL] Support Filter in ConvertToLoc...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22205#discussion_r213113632 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1349,6 +1353,12 @@ object ConvertToLocalRelation

[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22249#discussion_r213109101 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -2459,6 +2459,26 @@ object functions { StringTrimLeft(e.expr

[GitHub] spark issue #22205: [SPARK-25212][SQL] Support Filter in ConvertToLocalRelat...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22205 Many test cases will be invalid after this rule is applied, since they are built on LocalRelation. Thus, how about turning off the rule `ConvertToLocalRelation` by using the conf

[GitHub] spark pull request #22205: [SPARK-25212][SQL] Support Filter in ConvertToLoc...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22205#discussion_r213105696 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1349,6 +1353,12 @@ object ConvertToLocalRelation

[GitHub] spark pull request #22205: [SPARK-25212][SQL] Support Filter in ConvertToLoc...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22205#discussion_r213101120 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1349,6 +1353,12 @@ object ConvertToLocalRelation

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r213057049 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -671,7 +674,7 @@ case class AlterTableRecoverPartitionsCommand

[GitHub] spark issue #21330: [SPARK-22234] Support distinct window functions

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21330 cc @jiangxb1987

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r213041684 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -671,7 +674,7 @@ case class AlterTableRecoverPartitionsCommand

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-26 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r212834530 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-26 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r212834477 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r212814249 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala --- @@ -44,7 +45,12 @@ private[parquet] class

[GitHub] spark issue #22197: [SPARK-25207][SQL] Case-insensitve field resolution for ...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22197 This PR is basically trying to resolve case sensitivity when the logical schema and physical schema do not match. This sounds like a general issue in all the data sources. Could any of you do us

[GitHub] spark pull request #22197: [SPARK-25207][SQL] Case-insensitve field resoluti...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22197#discussion_r212814157 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala --- @@ -350,25 +356,38 @@ private[parquet

[GitHub] spark issue #22197: [SPARK-25207][SQL] Case-insensitve field resolution for ...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22197 @dongjoon-hyun Do you think we face the same issue in ORC?

[GitHub] spark pull request #22233: [SPARK-25240][SQL] Fix for a deadlock in RECOVER ...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22233#discussion_r212813730 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala --- @@ -671,7 +674,7 @@ case class AlterTableRecoverPartitionsCommand

[GitHub] spark issue #22204: [SPARK-25196][SQL] Analyze column statistics in cached q...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22204 cc @dongjoon-hyun Try to review this PR?

[GitHub] spark issue #22198: [SPARK-25121][SQL] Supports multi-part table names for b...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22198 cc @dongjoon-hyun Try to review this PR?

[GitHub] spark issue #22226: [SPARK-24391][SQL] Support arrays of any types by to_jso...

2018-08-25 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22226 cc @dongjoon-hyun Try to review this PR?

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-24 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r212533857 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-24 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r212533706 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark pull request #21087: [SPARK-23997][SQL] Configurable maximum number of...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21087#discussion_r212521169 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -164,9 +165,12 @@ case class BucketSpec

[GitHub] spark issue #21087: [SPARK-23997][SQL] Configurable maximum number of bucket...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21087 retest this please

[GitHub] spark pull request #21087: [SPARK-23997][SQL] Configurable maximum number of...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21087#discussion_r212521145 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -164,9 +165,12 @@ case class BucketSpec

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 @bersprockets Could you please close this PR?

[GitHub] spark issue #22203: [SPARK-25029][BUILD][CORE] Janino "Two non-abstract meth...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22203 Thanks! Merged to master

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 Thanks! Merged to master. BTW, we can keep thinking whether there are other better solutions for nested column pruning. Also cc @dongjoon-hyun If you are interested

[GitHub] spark issue #22211: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22211 cc @jiangxb1987

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 LGTM, as I explained above. https://github.com/apache/spark/pull/21320#issuecomment-415526369 Thanks for your patience and great work! @mallman Sorry, it takes two years to merge

[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21320#discussion_r212476709 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -31,6 +32,7 @@ import

[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21320#discussion_r212476400 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -31,6 +32,7 @@ import

[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21320#discussion_r212476268 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruningSuite.scala --- @@ -31,6 +32,7 @@ import

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 https://github.com/apache/spark/commit/d7c3aae2074b3dd3923dd754c0a3c97308c66893 Done

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 Thanks! Merged to 2.2

[GitHub] spark pull request #22203: [SPARK-25029][BUILD][CORE] Janino "Two non-abstra...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22203#discussion_r212419913 --- Diff: dev/deps/spark-deps-hadoop-2.6 --- @@ -98,7 +98,7 @@ jackson-module-jaxb-annotations-2.6.7.jar jackson-module-paranamer-2.7.9.jar

[GitHub] spark pull request #22203: [SPARK-25029][BUILD][CORE] Janino "Two non-abstra...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22203#discussion_r212418839 --- Diff: dev/deps/spark-deps-hadoop-2.6 --- @@ -98,7 +98,7 @@ jackson-module-jaxb-annotations-2.6.7.jar jackson-module-paranamer-2.7.9.jar

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 The feature has already been developed for almost two years. I am feeling sorry to merge it in Spark 2.4 release. Personally, I think we should not block merging this PR to Spark 2.4 release

[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21320#discussion_r212414888 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -202,11 +204,15 @@ private

[GitHub] spark pull request #22188: [SPARK-25164][SQL] Avoid rebuilding column and pa...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22188#discussion_r212199355 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java --- @@ -270,21 +270,23 @@ public

[GitHub] spark pull request #21320: [SPARK-4502][SQL] Parquet nested column pruning -...

2018-08-23 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21320#discussion_r212194825 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala --- @@ -202,11 +204,15 @@ private

[GitHub] spark issue #21749: [SPARK-24785] [SHELL] Making sure REPL prints Spark UI i...

2018-08-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21749 Talked with @zsxwing . We do not need to revert the version bump, as long as this PR does not introduce a new regression. BTW, merging to the RC branches should be treated

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 let me take a look this today. Thanks!

[GitHub] spark issue #22183: [SPARK-25132][SQL][BACKPORT-2.3] Case-insensitive field ...

2018-08-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22183 See my comment: https://github.com/apache/spark/pull/22184/files#r212006137

[GitHub] spark pull request #22184: [SPARK-25132][SQL][DOC] Add migration doc for cas...

2018-08-22 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22184#discussion_r212006137 --- Diff: docs/sql-programming-guide.md --- @@ -1895,6 +1895,10 @@ working with timestamps in `pandas_udf`s to get the best performance, see

[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-22 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22009 It sounds like the sync between apache and github is down. Although it has been merged, the PR has not been closed

[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22165 @xuanyuanking thanks for helping the test coverage!

[GitHub] spark issue #22165: [SPARK-25017][Core] Add test suite for BarrierCoordinato...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22165 cc @jiangxb1987 @mengxr

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 cc @jiangxb1987

[GitHub] spark issue #22152: [SPARK-25159][SQL] json schema inference should only tri...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22152 Thanks! Merged to master.

[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22154 LGTM Thanks! Merged to master.

[GitHub] spark pull request #22141: [SPARK-25154][SQL] Support NOT IN sub-queries ins...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22141#discussion_r211788303 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/subquery.scala --- @@ -137,13 +137,21 @@ object RewritePredicateSubquery

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SPARK-24564][SPARK-25114][SQL...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 ok to test

[GitHub] spark issue #22148: [SPARK-25132][SQL] Case-insensitive field resolution whe...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22148 @seancxmao Please submit a follow-up PR to document the behavior changes in the migration guide of Spark SQL

[GitHub] spark pull request #22148: [SPARK-25132][SQL] Case-insensitive field resolut...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22148#discussion_r211783541 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala --- @@ -430,6 +430,48 @@ class FileBasedDataSourceSuite extends

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211730772 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala --- @@ -17,24 +17,10

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211730445 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala --- @@ -180,7 +180,10 @@ object UnsafeProjection

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211729573 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala --- @@ -63,7 +49,10

[GitHub] spark pull request #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exc...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22154#discussion_r211729173 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/CodeGeneratorWithInterpretedFallback.scala --- @@ -63,7 +49,10

[GitHub] spark pull request #22174: [SPARK-22779][SQL] Fallback config defaults shoul...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/22174#discussion_r211727767 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -1954,14 +1954,7 @@ class SQLConf extends Serializable

[GitHub] spark issue #22174: [SPARK-22779] Fallback config defaults should behave lik...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22174 ok to test

[GitHub] spark issue #22154: [SPARK-23711][SPARK-25140][SQL] Catch correct exceptions...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22154 retest this please

[GitHub] spark pull request #20725: [SPARK-23555][PYTHON] Add BinaryType support for ...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20725#discussion_r211715497 --- Diff: python/pyspark/sql/types.py --- @@ -1597,6 +1598,12 @@ def to_arrow_type(dt): arrow_type = pa.decimal128(dt.precision, dt.scale

[GitHub] spark issue #22009: [SPARK-24882][SQL] improve data source v2 API

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22009 @cloud-fan Post these follow-up tasks in the PR description?

[GitHub] spark issue #22166: [2.3][SPARK-25114][Core][FOLLOWUP] Fix RecordBinaryCompa...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22166 @jiangxb1987 Please close it

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 Add some test cases when turning on `spark.sql.caseSensitive`?

[GitHub] spark issue #18146: [SPARK-20924] [SQL] Unable to call the function register...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/18146 @rajeshcode 2.1 branch is not actively maintained. Please use the 2.2 branch

[GitHub] spark issue #22079: [SPARK-23207][SPARK-22905][SQL][BACKPORT-2.2] Shuffle+Re...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22079 The PR https://github.com/apache/spark/pull/22101 has been merged. Please merge the latest one to this PR. Thanks

[GitHub] spark issue #22101: [SPARK-25114][Core] Fix RecordBinaryComparator when subt...

2018-08-21 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22101 Thanks! Merged to master and 2.3
