kokila-19 commented on code in PR #6138:
URL: https://github.com/apache/hive/pull/6138#discussion_r2452718381
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -929,9 +934,64 @@ public DynamicPartitionCtx createDPContext(
addCustomSortExpr(table, hmsTable, writeOperation, dpCtx,
getSortTransformSpec(table));
}
+ // Even if table has no explicit sort order, honor z-order if configured
+ Map<String, String> props = table.properties();
+ if ("ZORDER".equalsIgnoreCase(props.getOrDefault(SORT_ORDER, ""))) {
+ createZOrderCustomSort(props, dpCtx, table, hmsTable, writeOperation);
+ }
+
return dpCtx;
}
+ /**
+ * Adds a custom sort expression to the DynamicPartitionCtx that performs local Z-ordering on write.
+ *
+ * Behavior:
+ * - Reads Z-order properties from 'sort.order' and 'sort.columns' (comma-separated).
+ * - Resolves the referenced columns to their positions in the physical row (taking into account
+ * ACID virtual columns offset for overwrite/update operations).
+ * - Configures a single ASC sort key with NULLS FIRST and injects a custom key expression for
+ * Z-order
+ */
+ private void createZOrderCustomSort(Map<String, String> props, DynamicPartitionCtx dpCtx, Table table,
+ org.apache.hadoop.hive.ql.metadata.Table hmsTable, Operation writeOperation) {
+ String colsProp = props.get(SORT_COLUMNS);
+ if (StringUtils.isNotBlank(colsProp)) {
+ List<String> zCols = Arrays.stream(colsProp.split(",")).map(String::trim)
+ .filter(s -> !s.isEmpty()).collect(Collectors.toList());
+
+ Map<String, Integer> fieldOrderMap = Maps.newHashMap();
+ List<Types.NestedField> fields = table.schema().columns();
+ for (int i = 0; i < fields.size(); ++i) {
+ fieldOrderMap.put(fields.get(i).name(), i);
+ }
+ int offset = (shouldOverwrite(hmsTable, writeOperation) ?
+ ACID_VIRTUAL_COLS_AS_FIELD_SCHEMA : acidSelectColumns(hmsTable, writeOperation)).size();
+
+ List<Integer> zIndices = zCols.stream().map(col -> {
+ Integer base = fieldOrderMap.get(col);
+ Preconditions.checkArgument(base != null, "Z-order column not found in schema: %s", col);
+ return base + offset;
+ }).collect(Collectors.toList());
+
+ dpCtx.setCustomSortOrder(Lists.newArrayList(Collections.singletonList(1)));
Review Comment:
It's not strictly necessary to set the sort order and null order explicitly.
```
Default values:
Sort order: ascending
Null order: nulls last
```
However, in the Z-order logic, nulls effectively come first because we encode
them as 0s. That said, where a row actually lands can still vary depending on
the values of the other Z-order columns.
I set both explicitly for clarity and control: this states ASC sorting and the
null handling outright instead of relying on defaults.
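
To illustrate the point about nulls, here is a minimal, standalone sketch. It is not the code from this PR: the class `ZOrderNullSketch` and its methods are hypothetical, it assumes two non-negative int columns, and it uses plain bit interleaving as a stand-in for whatever key the real Z-order expression builds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ZOrderNullSketch {

  // Assumed encoding for this sketch: null maps to 0, so it shares the
  // smallest possible code with a real 0. Inputs are assumed non-negative.
  static long encode(Integer v) {
    return v == null ? 0L : (v & 0xFFFFFFFFL);
  }

  // Interleaves the low 32 bits of x and y:
  // bit i of x lands at bit 2i+1, bit i of y lands at bit 2i.
  static long interleave(long x, long y) {
    long z = 0L;
    for (int i = 0; i < 32; i++) {
      z |= ((x >>> i) & 1L) << (2 * i + 1);
      z |= ((y >>> i) & 1L) << (2 * i);
    }
    return z;
  }

  // Hypothetical Z-value for a two-column row.
  static long zValue(Integer a, Integer b) {
    return interleave(encode(a), encode(b));
  }

  public static void main(String[] args) {
    List<Integer[]> rows = new ArrayList<>(Arrays.asList(
        new Integer[] {2, 1},
        new Integer[] {null, 3},
        new Integer[] {1, null},
        new Integer[] {null, null}));

    // Single ascending sort on the combined key, as with one custom sort expression.
    rows.sort(Comparator.comparingLong((Integer[] r) -> zValue(r[0], r[1])));

    for (Integer[] r : rows) {
      System.out.println(Arrays.toString(r) + " -> " + zValue(r[0], r[1]));
    }
  }
}
```

Under these assumptions the all-null row gets key 0 and sorts first, but the row (null, 3) gets key 5 and sorts after (1, null) with key 2. A null in one column therefore does not pin the row to the front; the other column's bits still decide.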
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]