[GitHub] [iceberg] rdblue commented on pull request #1175: Generilize the BasePartitionkey to abstract the common codes for spark and flink.

GitBox Fri, 10 Jul 2020 15:20:27 -0700


rdblue commented on pull request #1175:
URL: https://github.com/apache/iceberg/pull/1175#issuecomment-656919000



   @openinx, I opened an alternative to this PR, #1195. Please take a look.
   
   This solution looks fairly clean for producing a `PartitionKey` for a 
specific format, but it requires building a subclass of `PartitionKey` for 
every row representation as well as new `Accessor` classes. I'd like to make it 
possible to reuse the existing `PartitionKey` class as well as the existing 
`Accessor` implementations (produced by `Schema.accessorForField(id)`) that are 
currently used for expression evaluation.
   
   The approach I took in the other PR is to reuse the existing accessors, 
which accept a `StructLike`. To make that work, I just needed to add a wrapper 
class that adapts Spark's `InternalRow` to `StructLike`, and that converts 
Spark objects to Iceberg's internal representation. I think that's going to be 
a better long-term approach than multiple `PartitionKey` classes.
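
   To illustrate the wrapper idea, here is a minimal, self-contained sketch. It is not the code from #1195: `StructLike` below is a simplified local stand-in for Iceberg's interface (read path only), and `EngineRow`, `EngineString`, `RowWrapper`, and `accessor` are hypothetical names standing in for Spark's `InternalRow`, an engine-specific string type, the adapter class, and an existing position-based accessor.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class StructLikeWrapperSketch {

  // Simplified stand-in for Iceberg's StructLike (assumption: read path only).
  interface StructLike {
    int size();
    <T> T get(int pos, Class<T> javaClass);
  }

  // Hypothetical stand-in for an engine row such as Spark's InternalRow.
  static class EngineRow {
    private final Object[] values;
    EngineRow(Object... values) { this.values = values; }
    int numFields() { return values.length; }
    Object get(int ordinal) { return values[ordinal]; }
  }

  // Hypothetical stand-in for an engine-specific string stored as UTF-8 bytes.
  static class EngineString {
    final byte[] bytes;
    EngineString(String s) { this.bytes = s.getBytes(StandardCharsets.UTF_8); }
  }

  // Reusable adapter: presents an EngineRow as a StructLike and converts
  // engine-specific values to the generic representation on read.
  static class RowWrapper implements StructLike {
    private EngineRow row;

    RowWrapper wrap(EngineRow newRow) {
      this.row = newRow;
      return this;
    }

    @Override
    public int size() {
      return row.numFields();
    }

    @Override
    public <T> T get(int pos, Class<T> javaClass) {
      Object value = row.get(pos);
      if (value instanceof EngineString) {
        value = new String(((EngineString) value).bytes, StandardCharsets.UTF_8);
      }
      return javaClass.cast(value);
    }
  }

  // Stand-in for an existing accessor that only knows about StructLike.
  static <T> Function<StructLike, T> accessor(int pos, Class<T> type) {
    return struct -> struct.get(pos, type);
  }

  public static void main(String[] args) {
    Function<StructLike, String> category = accessor(1, String.class);
    RowWrapper wrapper = new RowWrapper();

    // The same accessor and the same wrapper instance serve every row.
    EngineRow first = new EngineRow(7, new EngineString("books"));
    EngineRow second = new EngineRow(8, new EngineString("games"));
    System.out.println(category.apply(wrapper.wrap(first)));
    System.out.println(category.apply(wrapper.wrap(second)));
  }
}
```

   The point of the sketch is that only the adapter is engine-specific: the accessor is written once against `StructLike`, so supporting another row representation would mean adding another wrapper, not another `PartitionKey` subclass.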


