[ 
https://issues.apache.org/jira/browse/HIVE-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17855270#comment-17855270
 ] 

Zhihua Deng commented on HIVE-28316:
------------------------------------

The documentation is somewhat outdated; let me fix it. Thank you for pointing 
it out, [~linghengqian]!

> The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of `STORED BY` and `STORED AS`
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-28316
>                 URL: https://issues.apache.org/jira/browse/HIVE-28316
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Qiheng He
>            Priority: Major
>
> - The documentation provides an ambiguous explanation regarding the mutually 
> exclusive nature of {*}STORED BY{*} and {*}STORED AS{*}.
> - As mentioned on 
> https://cwiki.apache.org/confluence/display/Hive/StorageHandlers , when the 
> {*}CREATE TABLE{*} statement specifies {*}STORED BY{*}, it should not also 
> specify {*}STORED AS{*}. The content in question is as follows.
> {code:bash}
> When STORED BY is specified, then row_format (DELIMITED or SERDE) and STORED 
> AS cannot be specified. Optional SERDEPROPERTIES can be specified as part of 
> the STORED BY clause and will be passed to the serde provided by the storage 
> handler.
> See CREATE TABLE and Row Format, Storage Format, and SerDe for more 
> information.
> Example:
> CREATE TABLE hbase_table_1(key int, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> "hbase.columns.mapping" = "cf:string",
> "hbase.table.name" = "hbase_table_0"
> );
> {code}
> - This is similarly reflected in the documentation at 
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL , where 
> {*}|{*} separates {*}STORED BY{*} from {*}STORED AS{*}, indicating their 
> distinct usage and mutual exclusivity.
> {code:bash}
> [
>    [ROW FORMAT row_format] 
>    [STORED AS file_format]
>      | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]  
> -- (Note: Available in Hive 0.6.0 and later)
> ]
> {code}
> - However, this contradicts the information provided in the Hive-Iceberg 
> Integration documentation at 
> https://cwiki.apache.org/confluence/display/Hive/Hive-Iceberg+Integration , 
> which explicitly gives examples showing that {*}STORED BY{*} can coexist 
> with {*}STORED AS{*}. This leaves the documentation open to conflicting 
> interpretations.
> {code:bash}
> The iceberg table currently supports three file formats: PARQUET, ORC & AVRO. 
> The default file format is Parquet. The file format can be explicitly 
> provided by using STORED AS <Format> while creating the table
> Example-1:
> CREATE TABLE ORC_TABLE (ID INT) STORED BY ICEBERG STORED AS ORC;
> {code}
> - Earlier discussion of this topic can be found at 
> https://github.com/apache/shardingsphere/pull/31526 .
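>
> For comparison, here is a minimal sketch that places the three forms 
> discussed above side by side. The table names t_orc, t_hbase, and t_iceberg 
> are hypothetical, and the clauses are copied from the documentation quoted 
> above. Whether the third statement is accepted presumably depends on the 
> Hive version and on Iceberg support being available, so treat it as an 
> illustration of the documented syntax rather than verified behavior.
> {code:sql}
> -- Native table: STORED AS only (first branch of the DDL grammar).
> CREATE TABLE t_orc (id INT) STORED AS ORC;
>
> -- Storage-handler table: STORED BY only, as the StorageHandlers page describes.
> -- The column mapping is taken as-is from the quoted example.
> CREATE TABLE t_hbase (key int, value string)
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = "cf:string");
>
> -- Iceberg table: STORED BY together with STORED AS, as shown on the
> -- Hive-Iceberg Integration page.
> CREATE TABLE t_iceberg (id INT) STORED BY ICEBERG STORED AS ORC;
> {code}
> The first two statements each fit one branch of the grammar quoted from 
> the LanguageManual DDL page. The third combines both clauses, which that 
> grammar does not appear to allow.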



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
