[ https://issues.apache.org/jira/browse/SPARK-16032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342216#comment-15342216 ]

Ryan Blue commented on SPARK-16032:
-----------------------------------

bq. I don't think the package matters, the pre-insert is still an analyzer rule

It isn't a big problem, but it's confusing to have analyzer rules scattered 
everywhere. Where reasonable, we should keep the analysis rules together. I 
think this should be in the analyzer for CatalogRelation, with any special 
cases linked in from elsewhere.
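
To make that concrete, what I have in mind is a single analysis rule living next 
to the other {{CatalogRelation}} handling, roughly shaped like the sketch below 
(the rule and identifier names are illustrative, and the internal plan classes 
vary across versions):

{code:scala}
import org.apache.spark.sql.catalyst.plans.logical.{InsertIntoTable, LogicalPlan}
import org.apache.spark.sql.catalyst.rules.Rule

// Illustrative sketch only: a pre-insert rule written as an ordinary analyzer
// rule, kept alongside the rest of the analysis rules instead of in another
// package.
object PreInsertAlignment extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
    case insert: InsertIntoTable if insert.childrenResolved =>
      // this is where the query's output would be cast/aligned to the target
      // table's schema, before the optimizer ever sees the plan
      insert
  }
}
{code}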

bq. I think we must move forward to fix the behaviour first.

I think behavior that's already in 1.6.1 should remain. Yes, it's confusing, 
but jobs rely on it and I think it's too late to responsibly change it before 
2.0.

bq. It doesn't matter. This rule runs before optimizer.

Why add something we know will need to be removed by the optimizer? The 
original patch already handled this case; why change that?

bq. the usage of partitionBy is quite confusing

I disagree that it is confusing; I think it is more consistent with Hive's 
behavior. It was also verified in the same way it is currently checked in 
{{saveAsTable}} with {{APPEND}}, so it is no more confusing than that. I've 
written other comments on why I think this is useful, so I won't repeat those 
arguments here.
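
For concreteness, the 1.6.1-style call shape being discussed looks like this 
(table and column names here are made up), mirroring Hive's 
{{INSERT INTO ... PARTITION (...)}} syntax:

{code:scala}
import org.apache.spark.sql.SparkSession

// Illustrative names only; this mirrors Hive's
//   INSERT INTO TABLE events PARTITION (year, month) SELECT ...
val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

spark.table("staging_events")      // assumed source of the rows to insert
  .write
  .partitionBy("year", "month")    // checked against the target table's partitioning
  .insertInto("events")            // appends into the partitioned table
{code}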

bq. {{saveAsTable}} has a different code path from {{insertInto}}

{{saveAsTable}} [ends up using {{InsertIntoTable}}|https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala#L90], 
so we have the option of making it work with by-name resolution rather than 
disabling it.
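
As an illustration, by-name resolution can be approximated at the API level by 
reordering the query's columns to the target table's schema before the insert; 
a minimal sketch with hypothetical names:

{code:scala}
import org.apache.spark.sql.{DataFrame, SparkSession}

// Minimal sketch, not a Spark API: reorder the query's columns to match the
// target table's schema, then fall back to the positional insert.
def insertByName(spark: SparkSession, df: DataFrame, table: String): Unit = {
  val targetColumns = spark.table(table).schema.fieldNames
  df.select(targetColumns.map(df.col): _*)
    .write
    .insertInto(table)
}
{code}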

> Audit semantics of various insertion operations related to partitioned tables
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-16032
>                 URL: https://issues.apache.org/jira/browse/SPARK-16032
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0
>            Reporter: Cheng Lian
>            Assignee: Wenchen Fan
>            Priority: Blocker
>         Attachments: [SPARK-16032] Spark SQL table insertion auditing - 
> Google Docs.pdf
>
>
> We found that semantics of various insertion operations related to partitioned 
> tables can be inconsistent. This is an umbrella ticket for all related 
> tickets.


