[GitHub] spark pull request #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13756#discussion_r69541474 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -248,4 +281,15 @@ private[sql] case class PreWriteCheck

[GitHub] spark pull request #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13756#discussion_r69541158 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -206,7 +207,39 @@ private[sql] case class PreWriteCheck

[GitHub] spark pull request #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13756#discussion_r69540936 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -248,4 +281,15 @@ private[sql] case class PreWriteCheck

[GitHub] spark pull request #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13756#discussion_r69540754 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala --- @@ -206,7 +207,39 @@ private[sql] case class PreWriteCheck

[GitHub] spark pull request #13756: [SPARK-16041][SQL] Disallow Duplicate Columns in ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13756#discussion_r69528941 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala --- @@ -867,8 +865,8 @@ class SparkSqlAstBuilder(conf: SQLConf

[GitHub] spark issue #13389: [SPARK-9876][SQL][FOLLOWUP] Enable string and binary tes...

2016-07-05 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13389 LGTM, merging to master. Thanks! --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark pull request #13389: [SPARK-9876][SQL][FOLLOWUP] Enable string and bin...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13389#discussion_r69527639 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystWriteSupport.scala --- @@ -150,7 +150,8 @@ private[parquet

[GitHub] spark issue #14044: [SPARK-16360][SQL] Speed up SQL query performance by rem...

2016-07-05 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14044 Merged to master. Thanks!

[GitHub] spark pull request #14012: [SPARK-16343][SQL] Improve the PushDownPredicate ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14012#discussion_r69522538 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1106,12 +1106,15 @@ object PushDownPredicate

[GitHub] spark pull request #14012: [SPARK-16343][SQL] Improve the PushDownPredicate ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14012#discussion_r69522366 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1106,12 +1106,15 @@ object PushDownPredicate

[GitHub] spark issue #14012: [SPARK-16343][SQL] Improve the PushDownPredicate rule to...

2016-07-05 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14012 add to whitelist

[GitHub] spark pull request #14038: [SPARK-16317][SQL] Add a new interface to filter ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14038#discussion_r69520473 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala --- @@ -437,11 +442,26 @@ private[sql] object

[GitHub] spark issue #14038: [SPARK-16317][SQL] Add a new interface to filter files i...

2016-07-05 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14038 Left some comments, the overall structure looks good. Thanks!

[GitHub] spark pull request #14038: [SPARK-16317][SQL] Add a new interface to filter ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14038#discussion_r69520156 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala --- @@ -172,6 +171,13 @@ case class

[GitHub] spark pull request #14038: [SPARK-16317][SQL] Add a new interface to filter ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14038#discussion_r69519931 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala --- @@ -172,6 +171,13 @@ case class

[GitHub] spark pull request #14038: [SPARK-16317][SQL] Add a new interface to filter ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14038#discussion_r69519846 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala --- @@ -230,6 +236,15 @@ trait FileFormat

[GitHub] spark pull request #14038: [SPARK-16317][SQL] Add a new interface to filter ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14038#discussion_r69519641 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala --- @@ -172,6 +171,13 @@ case class

[GitHub] spark pull request #14038: [SPARK-16317][SQL] Add a new interface to filter ...

2016-07-05 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14038#discussion_r69518421 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala --- @@ -230,6 +236,15 @@ trait FileFormat

[GitHub] spark issue #12972: [SPARK-15198][SQL] Support for pushing down filters for ...

2016-07-04 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/12972 LGTM, merging to master. Sorry for leaving this PR for so long...

[GitHub] spark issue #14044: [SPARK-16360][SQL] Speed up SQL query performance by rem...

2016-07-04 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14044 LGTM pending Jenkins.

[GitHub] spark issue #14044: [SPARK-16360][SQL] Speed up SQL query performance by rem...

2016-07-04 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14044 Agree with @hvanhovell. Analysis should never take so long a time for such a simple query. We should avoid duplicated analysis work, but fixing performance issue(s) within the analyzer seems

[GitHub] spark issue #13818: [SPARK-15968][SQL] Nonempty partitioned metastore tables...

2016-07-04 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13818 Shall we also have this in branch-2.0? This seems to be a pretty serious bug. cc @rxin.

[GitHub] spark issue #13818: [SPARK-15968][SQL] Nonempty partitioned metastore tables...

2016-07-04 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13818 LGTM except for minor styling issues. Thanks!

[GitHub] spark pull request #13818: [SPARK-15968][SQL] Nonempty partitioned metastore...

2016-07-04 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13818#discussion_r69448268 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala --- @@ -425,6 +425,28 @@ class ParquetMetastoreSuite extends

[GitHub] spark pull request #13818: [SPARK-15968][SQL] Nonempty partitioned metastore...

2016-07-04 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13818#discussion_r69448259 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala --- @@ -425,6 +425,28 @@ class ParquetMetastoreSuite extends

[GitHub] spark pull request #13818: [SPARK-15968][SQL] Nonempty partitioned metastore...

2016-07-04 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13818#discussion_r69448203 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala --- @@ -425,6 +425,28 @@ class ParquetMetastoreSuite extends

[GitHub] spark issue #14025: [DOC][SQL] update out-of-date code snippets using SQLCon...

2016-07-04 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14025 @tdas Would you please help review the streaming example code changes? Thanks!

[GitHub] spark pull request #14025: [DOC][SQL] update out-of-date code snippets using...

2016-07-04 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14025#discussion_r69415295 --- Diff: docs/configuration.md --- @@ -1564,8 +1564,8 @@ spark.sql("SET -v").show(n=200, truncate=False) {% h

[GitHub] spark issue #14025: [WIP][DOC] update out-of-date code snippets using SQLCon...

2016-07-03 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14025 @WeichenXu123 Is this ready for review? If yes, please remove the WIP tag in the PR description.

[GitHub] spark issue #14009: [SPARK-16311][SQL] Metadata refresh should work on tempo...

2016-07-01 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14009 LGTM except for a minor styling issue.

[GitHub] spark pull request #14009: [SPARK-16311][SQL] Metadata refresh should work o...

2016-07-01 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14009#discussion_r69316671 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/MetadataCacheSuite.scala --- @@ -77,12 +77,12 @@ class MetadataCacheSuite extends QueryTest

[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `PropagateEmptyRelation` optimize...

2016-07-01 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13906 Merged to master. @cloud-fan Sorry that I didn't notice your comment while merging it. We may address it in follow-up ones.

[GitHub] spark issue #14013: [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of...

2016-07-01 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14013 @rdblue Verified that parquet-avro also suffers from this issue. Filed [PARQUET-651][1] to track it. [1]: https://issues.apache.org/jira/browse/PARQUET-651

[GitHub] spark issue #14013: [SPARK-16344][SQL][BRANCH-1.6] Decoding Parquet array of...

2016-07-01 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14013 @rdblue Would you mind helping review this one? My initial investigation suggested that parquet-avro probably suffers from the same issue. Will file a parquet-mr JIRA ticket soon.

[GitHub] spark issue #14014: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14014 cc @yhuai

[GitHub] spark pull request #14014: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-01 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/14014 [SPARK-16344][SQL] Decoding Parquet array of struct with a single field named "element" ## What changes were proposed in this pull request? This PR ports #14013 to master and

[GitHub] spark issue #14013: [SPARK-16344][SQL] Decoding Parquet array of struct with...

2016-07-01 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/14013 cc @yhuai

[GitHub] spark pull request #14013: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-01 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14013#discussion_r69283877 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/CatalystRowConverter.scala --- @@ -481,13 +481,106 @@ private

[GitHub] spark pull request #14013: [SPARK-16344][SQL] Decoding Parquet array of stru...

2016-07-01 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/14013 [SPARK-16344][SQL] Decoding Parquet array of struct with a single field named "element" ## What changes were proposed in this pull request? Please refer to [SPARK-16344][1] f

[GitHub] spark pull request #14006: [SPARK-13015][MLlib][DOC] Replace example code in...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/14006#discussion_r69242197 --- Diff: docs/_plugins/include_example.rb --- @@ -85,20 +85,20 @@ def select_lines(code) .select { |l, i| l.include? "$exampl

[GitHub] spark pull request #13974: [SPARK-16296][SQL] add null check for key when cr...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13974#discussion_r69231758 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayBasedMapData.scala --- @@ -19,6 +19,11 @@ package

[GitHub] spark issue #13558: [SPARK-15820][PySpark][SQL]Add Catalog.refreshTable into...

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13558 LGTM, merging to master and branch-2.0.

[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13906 LGTM except for those comments @cloud-fan brought up. Thanks for working on this!

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69140767 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala --- @@ -0,0 +1,173 @@ +/* + * Licensed

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69140206 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,70 @@ +/* + * Licensed

[GitHub] spark issue #13994: [BUILD] Fix version in poms related to kafka-0-10

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13994 Merging to branch-2.0.

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69111075 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlanSuite.scala --- @@ -0,0 +1,173 @@ +/* + * Licensed

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69110745 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,70 @@ +/* + * Licensed

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69109823 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,70 @@ +/* + * Licensed

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69109611 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,70 @@ +/* + * Licensed

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69109340 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,70 @@ +/* + * Licensed

[GitHub] spark issue #13992: [SPARK-12177][TEST] Removed test to avoid compilation is...

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13992 Merging to master and branch-2.0. Thanks!

[GitHub] spark issue #13989: [SPARK-16311][SQL] Improve metadata refresh

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13989 In general, I think reconstructing a DataFrame/Dataset or using `REFRESH TABLE` may be a better approach to solve the problem this PR tries to solve. Did I miss some context here?

[GitHub] spark issue #13989: [SPARK-16311][SQL] Improve metadata refresh

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13989 One concern of mine is that the analyzed plan, optimized plan, and executed (physical) plan stored in `QueryExecution` are all lazy vals, which means that they won't be re-optimized/planned

[GitHub] spark pull request #13989: [SPARK-16311][SQL] Improve metadata refresh

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13989#discussion_r69090469 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -2307,6 +2307,19 @@ class Dataset[T] private[sql]( def distinct

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69089371 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,49 @@ +/* + * Licensed

[GitHub] spark issue #13992: [SPARK-12177][TEST] Removed test to avoid compilation is...

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13992 LGTM pending Jenkins.

[GitHub] spark pull request #13989: [SPARK-16311][SQL] Improve metadata refresh

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13989#discussion_r69083486 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala --- @@ -265,6 +265,11 @@ abstract class LogicalPlan

[GitHub] spark pull request #13989: [SPARK-16311][SQL] Improve metadata refresh

2016-06-30 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13989#discussion_r69081636 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala --- @@ -265,6 +265,11 @@ abstract class LogicalPlan

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 Other example snippets in the SQL programming guide will be updated in follow-up PRs.

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-30 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 Thanks for the review!

[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13906 @cloud-fan Yea, that's a good point.

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-29 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69065541 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,49 @@ +/* + * Licensed

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-29 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69065425 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,49 @@ +/* + * Licensed

[GitHub] spark issue #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimizer

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13906 My feeling is that this optimization rule is mostly useful for binary plan nodes like inner join and intersection, where we can avoid scanning the output of the non-empty side. On the other

[GitHub] spark pull request #13906: [SPARK-16208][SQL] Add `CollapseEmptyPlan` optimi...

2016-06-29 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13906#discussion_r69064054 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CollapseEmptyPlan.scala --- @@ -0,0 +1,49 @@ +/* + * Licensed

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 @yinxusen Thanks!

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 @yinxusen @mengxr Actually I found overlapped labelling is far easier than I expected earlier... So did it in the last commit. Made the following experiment to illustrate the effect

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 @yinxusen Thanks for the review! Also discussed with @mengxr. IIUC, overlapped labels are most useful for handling imports, since sometimes we may want to include one import line in multiple

[GitHub] spark issue #13846: [SPARK-16134][SQL] optimizer rules for typed filter

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13846 Reverted the commit on branch-2.0.

[GitHub] spark issue #13846: [SPARK-16134][SQL] optimizer rules for typed filter

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13846 LGTM, merged to master. (Also merged to branch-2.0 by mistake, will revert it ASAP. Sorry for the trouble.)

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 @yinxusen Could you please help review this one since you're the original author of this plugin?

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 Also cc @yhuai and @rxin. The background here is that I'm going to extract snippets from actual Scala/Java/Python/R source files rather than hard-code them in the SQL programming guide

[GitHub] spark issue #13972: [SPARK-16294][SQL] Labelling support for the include_exa...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13972 @mengxr Could you please help review this PR? Thanks!

[GitHub] spark pull request #13972: [SPARK-16294][SQL] Labelling support for the incl...

2016-06-29 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/13972 [SPARK-16294][SQL] Labelling support for the include_example Jekyll plugin ## What changes were proposed in this pull request? This PR adds labelling support for the `include_example

[GitHub] spark issue #13968: [SPARK-16291][SQL] CheckAnalysis should capture nested a...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13968 cc @yhuai @cloud-fan @clockfly

[GitHub] spark issue #13893: [SPARK-14172][SQL] Hive table partition predicate not pa...

2016-06-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13893 @jiangxb1987 Please feel free to create a new JIRA ticket and PR for this, thanks!

[GitHub] spark pull request #13968: [SPARK-16291][SQL] CheckAnalysis should capture n...

2016-06-29 Thread liancheng
GitHub user liancheng opened a pull request: https://github.com/apache/spark/pull/13968 [SPARK-16291][SQL] CheckAnalysis should capture nested aggregate functions that reference no input attributes ## What changes were proposed in this pull request? `MAX(COUNT

[GitHub] spark issue #13835: [SPARK-16100][SQL] fix bug when use Map as the buffer ty...

2016-06-28 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13835 LGTM, merging to master and branch-2.0.

[GitHub] spark issue #13933: [SPARK-16236] [SQL] Add Path Option back to Load API in ...

2016-06-28 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13933 LGTM.

[GitHub] spark pull request #13720: [SPARK-16004] [SQL] Correctly display "Last Acces...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13720#discussion_r68855824 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala --- @@ -522,7 +523,7 @@ case class DescribeTableCommand(table

[GitHub] spark pull request #13720: [SPARK-16004] [SQL] Correctly display "Last Acces...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13720#discussion_r68855663 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala --- @@ -180,7 +180,8 @@ case class CatalogTable

[GitHub] spark pull request #13846: [SPARK-16134][SQL] optimizer rules for typed filt...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13846#discussion_r68753881 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/TypedFilterOptimizationSuite.scala --- @@ -23,54 +23,111 @@ import

[GitHub] spark pull request #13846: [SPARK-16134][SQL] optimizer rules for typed filt...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13846#discussion_r68753763 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/TypedFilterOptimizationSuite.scala --- @@ -23,54 +23,111 @@ import

[GitHub] spark pull request #13846: [SPARK-16134][SQL] optimizer rules for typed filt...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13846#discussion_r68751774 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -167,6 +169,43 @@ case class MapElements

[GitHub] spark pull request #13846: [SPARK-16134][SQL] optimizer rules for typed filt...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13846#discussion_r68751775 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/object.scala --- @@ -167,6 +169,43 @@ case class MapElements

[GitHub] spark pull request #13846: [SPARK-16134][SQL] optimizer rules for typed filt...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13846#discussion_r68750076 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ReferenceToExpressions.scala --- @@ -45,6 +45,7 @@ case class

[GitHub] spark pull request #13846: [SPARK-16134][SQL] optimizer rules for typed filt...

2016-06-28 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13846#discussion_r68750055 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala --- @@ -1637,55 +1654,31 @@ case class GetCurrentDatabase

[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13918 Thanks, merged to master. @rxin Shall we have this in branch-2.0 at this stage?

[GitHub] spark issue #13913: [SPARK-10591][SQL][TEST] Add a testcase to ensure if `ch...

2016-06-27 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13913 Thanks! Merged to master and branch-2.0.

[GitHub] spark issue #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger via SLF4J...

2016-06-27 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13918 LGTM except for one minor comment. Thanks for fixing this annoying issue!

[GitHub] spark pull request #13918: [SPARK-16221][SQL] Redirect Parquet JUL logger vi...

2016-06-27 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13918#discussion_r68557487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -129,6 +129,8 @@ private[sql] class

[GitHub] spark pull request #13893: [SPARK-14172][SQL] Hive table partition predicate...

2016-06-27 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13893#discussion_r68556327 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/PruningSuite.scala --- @@ -141,6 +141,14 @@ class PruningSuite extends

[GitHub] spark issue #13889: [SQL][minor] Simplify data source predicate filter trans...

2016-06-24 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13889 LGTM

[GitHub] spark pull request #13865: [SPARK-13709][SQL] Initialize deserializer with b...

2016-06-23 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13865#discussion_r68352349 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala --- @@ -65,4 +68,77 @@ class QueryPartitionSuite extends QueryTest

[GitHub] spark pull request #13865: [SPARK-13709][SQL] Initialize deserializer with b...

2016-06-23 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13865#discussion_r68352258 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/QueryPartitionSuite.scala --- @@ -65,4 +68,77 @@ class QueryPartitionSuite extends QueryTest

[GitHub] spark pull request #13865: [SPARK-13709][SQL] Initialize deserializer with b...

2016-06-23 Thread liancheng
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/13865#discussion_r68335825 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -230,10 +234,21 @@ class HadoopTableReader( // Fill all

[GitHub] spark issue #13865: [SPARK-13709][SQL] Initialize deserializer with both tab...

2016-06-23 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13865 This is ready for review. cc @yhuai @cloud-fan @clockfly

[GitHub] spark issue #13870: [SPARK-16165][SQL] Fix the update logic for InMemoryTabl...

2016-06-23 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13870 LGTM, merging to master and branch-2.0. Thanks!

[GitHub] spark issue #13871: [SPARK-16163] [SQL] Cache the statistics for logical pla...

2016-06-23 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/13871 LGTM except for the compilation error.
