[jira] [Commented] (FLINK-22827) Hive dialect supports CLUSTERED BY clause of CREATE TABLE DDL

2022-08-16 Thread luoyuxia (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580544#comment-17580544
 ] 

luoyuxia commented on FLINK-22827:
--

[~aidenma] Thanks for your attention. If you run into any problems with it, 
please let me know.

> Hive dialect supports CLUSTERED BY clause of CREATE TABLE DDL
> -
>
> Key: FLINK-22827
> URL: https://issues.apache.org/jira/browse/FLINK-22827
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Hive
>Affects Versions: 1.13.1
>Reporter: Ma Jun
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> {code:java}
> # hive syntax:
> CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
> [ ( col_name1[:] col_type1 [ COMMENT col_comment1 ], ... ) ]
> [ COMMENT table_comment ]
> [ PARTITIONED BY ( col_name2[:] col_type2 [ COMMENT col_comment2 ], ... ) 
> | ( col_name1, col_name2, ... ) ]
> [ CLUSTERED BY ( col_name1, col_name2, ...) 
> [ SORTED BY ( col_name1 [ ASC | DESC ], col_name2 [ ASC | DESC ], ... 
> ) ] 
> INTO num_buckets BUCKETS ]
> [ ROW FORMAT row_format ]
> [ STORED AS file_format ]
> [ LOCATION path ]
> [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ]
> [ AS select_statement ]
> {code}
>  
> {code:java}
> [ CLUSTERED BY ( col_name1, col_name2, ...) 
> [ SORTED BY ( col_name1 [ ASC | DESC ], col_name2 [ ASC | DESC ], ... ) ] 
> INTO num_buckets BUCKETS ]
> {code}
> Will a later version of Flink support creating tables with the CLUSTERED BY, 
> SORTED BY, and INTO num_buckets BUCKETS clauses?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-22827) Hive dialect supports CLUSTERED BY clause of CREATE TABLE DDL

2022-08-16 Thread Ma Jun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17580540#comment-17580540
 ] 

Ma Jun commented on FLINK-22827:


[~luoyuxia] Thank you! I merged your branch into my branch.






[jira] [Commented] (FLINK-22827) Hive dialect supports CLUSTERED BY clause of CREATE TABLE DDL

2022-07-08 Thread Ma Jun (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564198#comment-17564198
 ] 

Ma Jun commented on FLINK-22827:


close issue






[jira] [Commented] (FLINK-22827) Hive dialect supports CLUSTERED BY clause of CREATE TABLE DDL

2021-06-01 Thread Rui Li (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355059#comment-17355059
 ] 

Rui Li commented on FLINK-22827:


Hi [~aidenma], Flink currently doesn't support the semantics of bucketed/sorted 
Hive tables. So even if the DDL lets you create such a table, you probably 
won't get the desired behavior when writing to it with Flink: the data won't 
be shuffled or sorted according to the bucket/sort keys you specified.
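
To make the missing semantics concrete, here is a rough sketch of what a writer that honors CLUSTERED BY ... SORTED BY ... INTO n BUCKETS would have to do: route each row to a bucket derived from the bucket key, then sort the rows within each bucket. The function names and the row layout are made up for illustration, and the bucket function is a simplification for non-negative integer keys, not Hive's or Flink's actual code.

```python
from collections import defaultdict


def bucket_of(key: int, num_buckets: int) -> int:
    # Simplified stand-in for a bucket hash: clear the sign bit, then take
    # the value modulo the bucket count. Real engines hash by column type.
    return (key & 0x7FFFFFFF) % num_buckets


def write_bucketed(rows, bucket_col, sort_col, num_buckets):
    # "Shuffle" step: group rows by the bucket their key maps to.
    buckets = defaultdict(list)
    for row in rows:
        buckets[bucket_of(row[bucket_col], num_buckets)].append(row)
    # "Sort" step: order the rows inside each bucket by the sort column.
    return {b: sorted(rs, key=lambda r: r[sort_col]) for b, rs in buckets.items()}
```

A writer that skips both steps produces files in which rows with the same key can land anywhere, which is the behavior described above.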






[jira] [Commented] (FLINK-22827) Hive dialect supports CLUSTERED BY clause of CREATE TABLE DDL

2021-06-01 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-22827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355050#comment-17355050
 ] 

Jark Wu commented on FLINK-22827:
-

I updated the title to reflect that this is about supporting the syntax in the 
Hive dialect. We recently improved Hive DDL and DQL compatibility in version 
1.13. I'm not sure whether the CLUSTERED BY syntax is supported. Maybe 
[~lirui] can answer here.



