[jira] [Updated] (SPARK-27943) Implement default constraint with Column for Hive table

Reece Robinson (Jira) Tue, 03 Oct 2023 15:57:32 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-27943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Reece Robinson updated SPARK-27943:
-----------------------------------
    Attachment: Screenshot 2023-10-04 at 11.11.28 AM.png

> Implement default constraint with Column for Hive table
> -------------------------------------------------------
>
>                 Key: SPARK-27943
>                 URL: https://issues.apache.org/jira/browse/SPARK-27943
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Jiaan Geng
>            Priority: Major
>         Attachments: Screenshot 2023-10-04 at 11.11.28 AM.png
>
>
>  
>  *Background*
> A default constraint on a column is part of the ANSI SQL standard.
> Hive 3.0+ supports default constraints 
> (ref: https://issues.apache.org/jira/browse/HIVE-18726),
> but Spark SQL does not implement this feature yet.
> *Design*
> Hive is widely used in production environments and is the de facto standard 
> in the big data field.
> However, many Hive versions are in production use, and the features 
> available differ between versions.
> Spark SQL needs to implement default constraints, and the design must pay 
> attention to three points:
> _First_, Spark SQL should reduce its coupling with Hive.
> _Second_, default constraints should be compatible with different versions of Hive.
> _Third_, which default-constraint expressions should Spark SQL support? I 
> think it should support `literal`, `current_date()`, and `current_timestamp()`. 
> Other expressions could perhaps be supported as well, such as 
> `Cast(1 as float)`, `1 + 2`, and so on.
> We want to save the default-constraint metadata in the Hive table 
> properties and restore it from those properties after the client fetches 
> the newest metadata. The implementation is the same as for other metadata 
> (e.g. partition, bucket, statistics).
> Because a default constraint is part of a column, I think we could reuse the 
> metadata of StructField. The default constraint would be cached in the 
> StructField metadata.
>  
> *Tasks*
> This is a large piece of work, so I want to split it into sub-tasks, as 
> follows:
>  

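The third design point in the quoted description asks which default expressions should be accepted. As a rough illustration only, the sketch below checks an expression string against the three forms the description names (`literal`, `current_date()`, `current_timestamp()`). The function name, the regular expressions, and the whitelist are all assumptions made for this example and are not taken from Spark or Hive.

```python
import re

# Hypothetical whitelist: the zero-argument functions named in the description.
ALLOWED_FUNCTIONS = {"current_date", "current_timestamp"}

# Numeric literal such as 1, -2, 3.5
_NUMBER = re.compile(r"[+-]?\d+(\.\d+)?")
# Single-quoted string literal without embedded quotes, such as 'abc'
_STRING = re.compile(r"'[^']*'")
# Zero-argument call such as current_date()
_CALL = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\(\s*\)")


def is_supported_default(expr: str) -> bool:
    """Return True if expr is a simple literal or a whitelisted zero-argument call."""
    text = expr.strip()
    if text.lower() in {"null", "true", "false"}:
        return True
    if _NUMBER.fullmatch(text) or _STRING.fullmatch(text):
        return True
    call = _CALL.fullmatch(text)
    return bool(call) and call.group(1).lower() in ALLOWED_FUNCTIONS
```

Under this deliberately narrow check, `Cast(1 as float)` and `1 + 2` are rejected; supporting them, as the description suggests might be desirable, would need a real expression parser rather than pattern matching.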

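The description proposes saving default-constraint metadata in the Hive table properties and restoring it when the table metadata is read back. The sketch below illustrates that round trip with plain Python dictionaries. The property-key prefix, the tuple layout for columns, and both function names are hypothetical; the issue does not specify the actual key format or the StructField integration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical property-key prefix; the issue does not define the real key layout.
PREFIX = "example.default.constraint."

# (column name, data type, default expression or None)
Column = Tuple[str, str, Optional[str]]


def to_table_properties(columns: List[Column]) -> Dict[str, str]:
    """Serialize each column's default expression into string-valued table properties."""
    return {
        PREFIX + name: default
        for name, _dtype, default in columns
        if default is not None
    }


def restore_defaults(
    columns: List[Tuple[str, str]], properties: Dict[str, str]
) -> List[Column]:
    """Re-attach defaults read from table properties to a list of (name, type) columns."""
    return [(name, dtype, properties.get(PREFIX + name)) for name, dtype in columns]


columns = [
    ("id", "int", None),
    ("created", "date", "current_date()"),
    ("score", "int", "0"),
]
props = to_table_properties(columns)
restored = restore_defaults([(name, dtype) for name, dtype, _ in columns], props)
```

Storing the expression as a string keeps the table properties readable by Hive versions that have no notion of default constraints, which is in line with the first two design points about coupling and version compatibility.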

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

