kasakrisz commented on code in PR #5627:
URL: https://github.com/apache/hive/pull/5627#discussion_r1954304603
##########
iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java:
##########
@@ -857,7 +858,12 @@ public DynamicPartitionCtx createDPContext(
if (sortField.sourceId() == field.fieldId()) {
customSortPositions.add(pos);
customSortOrder.add(sortField.direction() == SortDirection.ASC ? 1
: 0);
- customSortNullOrder.add(sortField.nullOrder() ==
NullOrder.NULLS_FIRST ? 0 : 1);
+
+ NullOrdering nullOrdering = NullOrdering.NULLS_LAST;
+ if (sortField.nullOrder() == NullOrder.NULLS_FIRST) {
+ nullOrdering = NullOrdering.NULLS_FIRST;
+ }
+ customSortNullOrder.add(nullOrdering.getCode());
Review Comment:
There were two failing test without this change.
```
TestHiveIcebergInserts.testSortedInsert
TestHiveIcebergInserts.testSortedAndTransformedInsertIntoPartitionedTable
```
When I analyzed the cause of the failure of these tests I bumped into this
bug:
```
sortField.nullOrder() == NullOrder.NULLS_FIRST ? 0 : 1
```
maps `NULLS_FIRST` to `0` which actually means nulls last in Hive
https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/ql/src/java/org/apache/hadoop/hive/ql/util/NullOrdering.java#L30-L31
> We added a feature to configure the sort order of Iceberg tables. I
wondered if we can keep the semantics of WRITE LOCALLY ORDERED BY a
I found that null sort order is set here when it was not specified
explicitly at create table:
https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java#L292
and the defaults are defined at the parser level
https://github.com/apache/hive/blob/fa6813f725542bae127f9bec0921ebd01c946947/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g#L2205-L2206
This is not affected by the setting `hive.default.nulls.last`.
> sortField.nullOrder() is non-null and this change never takes effect in
the real world.
I don't understand this part. Could you please elaborate.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]