Hey Ryan,

Thanks for the help! I managed to make the PartitionSpecVisitor approach work to get the 'numBuckets' and 'width' parameters for the BUCKET and TRUNCATE transforms; however, I found some shortcomings. For example, even if I want to know the parameter of one specific transform, I have to collect the parameters for all the partition fields in the given PartitionSpec. What is more concerning is that I can only retrieve a (sourceId/sourceName -> transform param) mapping, which works only as long as each column is used in at most one partition transform. Even for that case there might be a workaround, e.g. returning a (sourceName + <transform type prefix> -> transform param) mapping, but this is getting way more complicated than it should be just to get a single param from an object.
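
To illustrate the collision, here is a minimal, self-contained sketch. It does not use the Iceberg classes: `Field`, `Visitor` and `visit` are hypothetical stand-ins that only mimic callbacks receiving (sourceName, sourceId, param), and the "bucket:" / "truncate:" key prefixes are just one possible naming assumption for the workaround.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TransformParamsSketch {

  /** Hypothetical stand-in for one parameterized partition field (not an Iceberg class). */
  static final class Field {
    final String sourceName;
    final int sourceId;
    final String transform; // "bucket" or "truncate"
    final int param;        // numBuckets or width

    Field(String sourceName, int sourceId, String transform, int param) {
      this.sourceName = sourceName;
      this.sourceId = sourceId;
      this.transform = transform;
      this.param = param;
    }
  }

  /** Hypothetical visitor: callbacks see only the source column and the parameter. */
  interface Visitor<T> {
    T bucket(String sourceName, int sourceId, int numBuckets);
    T truncate(String sourceName, int sourceId, int width);
  }

  /** Calls the matching callback for every field, the way a visitor-style API would. */
  static <T> List<T> visit(List<Field> spec, Visitor<T> visitor) {
    List<T> results = new ArrayList<>();
    for (Field f : spec) {
      if ("bucket".equals(f.transform)) {
        results.add(visitor.bucket(f.sourceName, f.sourceId, f.param));
      } else if ("truncate".equals(f.transform)) {
        results.add(visitor.truncate(f.sourceName, f.sourceId, f.param));
      } else {
        throw new IllegalArgumentException("Unsupported transform: " + f.transform);
      }
    }
    return results;
  }

  /** Keyed by source column only: a second transform on the same column overwrites the first. */
  static Map<String, Integer> paramsBySource(List<Field> spec) {
    Map<String, Integer> params = new LinkedHashMap<>();
    visit(spec, new Visitor<Void>() {
      public Void bucket(String name, int id, int numBuckets) { params.put(name, numBuckets); return null; }
      public Void truncate(String name, int id, int width) { params.put(name, width); return null; }
    });
    return params;
  }

  /** Workaround: key by "<transform>:<sourceName>" so both entries survive. */
  static Map<String, Integer> paramsBySourceAndTransform(List<Field> spec) {
    Map<String, Integer> params = new LinkedHashMap<>();
    visit(spec, new Visitor<Void>() {
      public Void bucket(String name, int id, int numBuckets) { params.put("bucket:" + name, numBuckets); return null; }
      public Void truncate(String name, int id, int width) { params.put("truncate:" + name, width); return null; }
    });
    return params;
  }

  public static void main(String[] args) {
    // The same column is used by two transforms.
    List<Field> spec = Arrays.asList(
        new Field("id", 1, "bucket", 16),
        new Field("id", 1, "truncate", 10));
    System.out.println(paramsBySource(spec));             // the bucket entry is lost
    System.out.println(paramsBySourceAndTransform(spec)); // both entries are kept
  }
}
```

In both variants every field in the spec is still visited, even when only one parameter is needed.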

I see two ways to make life easier when querying transform params:
1) Modify the PartitionSpecVisitor functions to accept a fieldId along with sourceName and sourceId.
2) Modify the accessibility of the Bucket and Truncate classes so that the numBuckets() and width() functions can be accessed from outside the package.

What do you think?

Gabor

On Tue, Sep 22, 2020 at 8:55 PM Ryan Blue <[email protected]> wrote:

> Hi Gabor,
>
> Right now, I think the only way to get those parameters is to implement a
> `PartitionSpecVisitor`, which will be passed the parameters. We can
> definitely improve the API here where we need to. Initially, I wanted to
> avoid having code that would special case transforms instead of delegating
> to the Transform API. That's why it is so locked down.
>
> rb
>
> On Tue, Sep 22, 2020 at 7:33 AM Gabor Kaszab <[email protected]>
> wrote:
>
>> Hey,
>>
>> I'm working on the integration of Apache Iceberg project into Apache
>> Impala. Currently, I'm investigating how to implement partition transforms
>> that have parameters (Bucket and Truncate) and I haven't found a way to
>> retrieve their parameters (numBuckets and width) from table metadata
>> through the Iceberg API.
>>
>> I see that there are functions for this purpose (numBuckets()
>> <https://github.com/apache/iceberg/blob/master/api/src/main/java/org/apache/iceberg/transforms/Bucket.java#L75>
>> and width()
>> <https://github.com/apache/iceberg/blob/master/api/src/main/java/org/apache/iceberg/transforms/Truncate.java#L56>)
>> but I found that the classes Bucket and Truncate are only accessible within
>> their packages and I'm not able to import them into Impala project. I can
>> import the base class Transforms but that doesn't provide an interface for
>> my needs.
>>
>> Without this support I won't be able to implement a few things, e.g. SHOW
>> CREATE TABLE just to name one.
>>
>> Am I missing something? Is there a way to get the parameters of a
>> partition transform through the API?
>>
>> Cheers,
>> Gabor
>>
>>
>>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>
