szehon-ho commented on code in PR #11555:
URL: https://github.com/apache/iceberg/pull/11555#discussion_r1903293655
##########
core/src/main/java/org/apache/iceberg/io/DeleteSchemaUtil.java:
##########
@@ -43,4 +43,15 @@ public static Schema pathPosSchema() {
public static Schema posDeleteSchema(Schema rowSchema) {
return rowSchema == null ? pathPosSchema() : pathPosSchema(rowSchema);
}
+
+ public static Schema posDeleteReadSchema(Schema rowSchema) {
Review Comment:
After the rebase, this method is needed to fix the unit tests, which fail
with an error saying 'row' is required (there must have been some intervening
change related to delete readers). Previously the code used the method above,
`pathPosSchema(rowSchema)`, which declares 'row' as required.
Note that Spark and all readers no longer seem to include the 'row' field in
the read schema:
https://github.com/apache/iceberg/blob/main/data/src/main/java/org/apache/iceberg/data/BaseDeleteLoader.java#L70
Since this is a rewrite, though, I do want to read the 'row' field and
preserve it if some tool has set it.
So I am following the strategy of RewritePositionDelete and actually reading
this field, but declaring it optional to avoid the assertion error when it is
not found:
https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/PositionDeletesTable.java#L118
(The reader there is derived from the schema of the PositionDeletesTable
metadata table.)
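
To illustrate the required-versus-optional distinction, here is a minimal,
self-contained sketch. It deliberately does not use the Iceberg API: `Field`,
`readSchema`, and `missingRequired` are hypothetical names made up for this
example, and the check is only a rough model of how a reader might validate a
read schema against the columns actually present in a delete file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model only; these are NOT Iceberg classes or methods.
final class ReadSchemaSketch {
  // Hypothetical stand-in for a schema field: a name plus a required flag.
  static final class Field {
    final String name;
    final boolean required;

    Field(String name, boolean required) {
      this.name = name;
      this.required = required;
    }
  }

  // Read schema for a position delete file: file_path and pos are always
  // required; the caller decides whether 'row' is required or optional.
  static List<Field> readSchema(boolean rowRequired) {
    return Arrays.asList(
        new Field("file_path", true),
        new Field("pos", true),
        new Field("row", rowRequired));
  }

  // Returns the names of required fields that the file does not contain.
  // In this model a reader rejects any such field, while a missing optional
  // field would simply be read as null.
  static List<String> missingRequired(List<Field> schema, List<String> fileColumns) {
    List<String> missing = new ArrayList<>();
    for (Field field : schema) {
      if (field.required && !fileColumns.contains(field.name)) {
        missing.add(field.name);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // A delete file that was written without the 'row' column.
    List<String> fileColumns = Arrays.asList("file_path", "pos");

    // 'row' required: the read is rejected. Prints [row]
    System.out.println(missingRequired(readSchema(true), fileColumns));

    // 'row' optional: nothing is missing. Prints []
    System.out.println(missingRequired(readSchema(false), fileColumns));
  }
}
```

With 'row' optional, the same read schema works for delete files that carry
the column (so the rewrite can preserve it) and for files that do not.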
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]