deniskuzZ commented on code in PR #70:
URL: https://github.com/apache/hive-site/pull/70#discussion_r2483540441

##########
content/docs/latest/language/writeordering.md:
##########
@@ -0,0 +1,171 @@
+---
+title: "Apache Hive : Write Ordering"
+date: 2025-10-31
+---
+
+# Apache Hive : Write Ordering
+
+## Overview
+
+Write ordering controls the physical layout of data within table files. Unlike `SORT BY` which orders data during query execution, write ordering is applied at write time and persists in the stored files.
+
+Write ordering is supported for Iceberg tables and can be specified during table creation.
+
+Hive supports two write ordering strategies:
+* **Regular Ordering**: Sort by one or more columns in a specified order
+* **Z-Ordering**: Multi-dimensional clustering using space-filling curves
+
+---
+
+## Regular Column Ordering
+
+### Version
+
+Introduced in Hive version [4.1.0](https://issues.apache.org/jira/browse/HIVE-28586)
+
+### Syntax
+
+```sql
+CREATE TABLE table_name (column_definitions)
+WRITE [LOCALLY] ORDERED BY column_name [ASC | DESC] [NULLS FIRST | NULLS LAST]
+    [, column_name [ASC | DESC] [NULLS FIRST | NULLS LAST] ]*
+STORED BY ICEBERG
+[STORED AS file_format];
+```
+
+### Options
+
+* Sort Order
+  * `ASC`: Ascending order (default)
+  * `DESC`: Descending order
+* Null Order
+  * `NULLS FIRST`: Null values sorted before non-null values
+  * `NULLS LAST`: Null values sorted after non-null values
+
+### Examples
+
+Single column:
+
+```sql
+CREATE TABLE events (
+  event_id BIGINT,
+  event_date DATE,
+  event_type STRING
+)
+WRITE LOCALLY ORDERED BY event_date DESC
+STORED BY ICEBERG
+STORED AS ORC;
+```
+
+Multiple columns with null handling:
+
+```sql
+CREATE TABLE orders (
+  order_id BIGINT,
+  order_date DATE,
+  customer_id INT,
+  amount DECIMAL(10,2)
+)
+WRITE ORDERED BY order_date DESC NULLS FIRST, order_id ASC
+STORED BY ICEBERG;
+```
+
+### Use Cases
+
+Regular ordering is most effective for:
+
+* Time-series data with temporal access patterns
+* Range queries on sorted columns
+* Queries with consistent ORDER BY clauses
+* Single-dimensional access patterns
+
+---
+
+## Z-Ordering
+
+### Version
+
+Introduced in Hive version [4.2.0](https://issues.apache.org/jira/browse/HIVE-29133)
+
+### Overview
+
+Z-order applies a multi-dimensional clustering technique based on space-filling curves. This approach interleaves column values to co-locate related records across multiple dimensions, enabling efficient filtering on various column combinations.
+
+### Syntax
+
+```sql
+CREATE TABLE table_name (column_definitions)
+WRITE [LOCALLY] ORDERED BY ZORDER(column_name [, column_name ]*)
+STORED BY ICEBERG
+[STORED AS file_format];
+```
+
+### Examples
+
+Two columns:

Review Comment:
   maybe 1 column? what's the diff in syntax between 2 and multiple?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
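
For context on the Z-Ordering overview in the diff, which says the approach "interleaves column values": the following is a minimal sketch of bit interleaving (a Morton code) for two small non-negative integers. It is only an illustration of the general technique, not Hive's or Iceberg's actual implementation; the function name `interleave_bits` and the 16-bit width are assumptions made for this example.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order key.

    Hypothetical helper for illustration; assumes non-negative integers.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # bits of x land on even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # bits of y land on odd positions
    return key


# Sorting a 4x4 grid by the interleaved key keeps points that are close
# in both dimensions near each other in the resulting order.
points = [(x, y) for x in range(4) for y in range(4)]
z_sorted = sorted(points, key=lambda p: interleave_bits(*p))
```

In this sketch, the key for a single dimension alone would just be the value itself, so the interleaving only changes the order once a second column is involved. Whether Hive accepts a single-column `ZORDER(...)` is not stated in the diff.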
