aokolnychyi commented on code in PR #4812:
URL: https://github.com/apache/iceberg/pull/4812#discussion_r921571789
##########
core/src/main/java/org/apache/iceberg/MetadataColumns.java:
##########
@@ -53,6 +53,8 @@ private MetadataColumns() {
public static final String DELETE_FILE_ROW_FIELD_NAME = "row";
public static final int DELETE_FILE_ROW_FIELD_ID = Integer.MAX_VALUE - 103;
public static final String DELETE_FILE_ROW_DOC = "Deleted row values";
+  public static final int POSITION_DELETE_TABLE_PARTITION_FIELD_ID =
+      Integer.MAX_VALUE - 104;
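For context, a rough sketch (not the PR's actual code) of how a reserved ID like this could be wired into a position deletes table schema; the `partitionType`/`rowType` parameters, column names, and nullability are illustrative assumptions:

```java
import org.apache.iceberg.MetadataColumns;
import org.apache.iceberg.Schema;
import org.apache.iceberg.types.Types;

public class PositionDeletesSchemaSketch {
  // Builds a hypothetical schema for the position deletes metadata table,
  // reusing the reserved delete-file field IDs plus the new reserved ID
  // for a static "partition" column.
  public static Schema positionDeletesSchema(Types.StructType partitionType,
                                             Types.StructType rowType) {
    return new Schema(
        MetadataColumns.DELETE_FILE_PATH,
        MetadataColumns.DELETE_FILE_POS,
        Types.NestedField.optional(
            MetadataColumns.DELETE_FILE_ROW_FIELD_ID, "row", rowType,
            MetadataColumns.DELETE_FILE_ROW_DOC),
        Types.NestedField.required(
            MetadataColumns.POSITION_DELETE_TABLE_PARTITION_FIELD_ID,
            "partition", partitionType,
            "Partition that the deleted row belongs to"));
  }
}
```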
Review Comment:
It seems we have a static column in the metadata table that we plan to
populate via the metadata column mechanism. That looks a bit suspicious
to me.
I feel we should pick one of these options:
- Have only the `path`, `pos`, and `row` columns in the table and rely on the
`_partition` and `_spec_id` metadata columns. That means we would have to
support filter pushdown on metadata columns (see the sketch after this list).
The Spark side is easy to handle, but we would also have to adapt ALL of our
binding code to allow binding predicates that reference metadata columns,
which would be a big change.
- Make `partition` and `spec_id` static columns and use `DataTask`.
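To make the first option concrete, here is a hedged sketch of a scan carrying predicates on metadata columns. The `_partition.data_bucket` field is hypothetical, and whether `newScan()` is the right entry point for the metadata table is an assumption; the point is only that such references must bind:

```java
import org.apache.iceberg.Table;
import org.apache.iceberg.TableScan;
import org.apache.iceberg.expressions.Expressions;

public class MetadataColumnPushdownSketch {
  // Builds a scan with filters on metadata columns. Today, binding these
  // references fails because `_spec_id` and `_partition` are not part of
  // the table schema; option 1 would require our binding code to accept them.
  public static TableScan scanWithMetadataFilters(Table positionDeletesTable) {
    return positionDeletesTable.newScan()
        .filter(Expressions.equal("_spec_id", 0))
        // binding a nested reference into the `_partition` struct is the part
        // the current binding code does not allow for metadata columns
        .filter(Expressions.equal("_partition.data_bucket", 3));
  }
}
```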
I'd prefer to keep using `FileScanTask` for this effort so that we can
support vectorized reads, so the first option seems preferable.
Thoughts, @szehon-ho @RussellSpitzer @rdblue?