Hi Dyno,

You’re right that we need a matrix for *bindByTransformedValue*. My concern with *toSourceTypeValue* is that the conversion is not one-to-one, and exposing an API with such loosely defined return semantics is generally not a good design choice.
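
To make the concern concrete, here is a minimal sketch in plain java.time. It does not use the Iceberg Transform classes: `toHour`, `toDay`, and `hourToSource` are hypothetical stand-ins for the hour transform, the day transform, and what `toSourceTypeValue` might do for an hour value, and UTC is assumed throughout. It shows two different timestamps mapping to the same hour value, so the reverse direction can only return a representative (here, the start of the hour), while reapplying the coarser day transform to that representative still yields the expected day.

```java
import java.time.Instant;
import java.time.LocalDate;

// Illustrative only: hypothetical helpers, not the Iceberg API.
class TransformRoundTrip {
  // Stand-in for an hour transform: whole hours since the Unix epoch (UTC).
  static int toHour(Instant ts) {
    return (int) Math.floorDiv(ts.getEpochSecond(), 3600L);
  }

  // Stand-in for a day transform: whole days since the Unix epoch (UTC).
  static int toDay(Instant ts) {
    return (int) Math.floorDiv(ts.getEpochSecond(), 86400L);
  }

  // Stand-in for the proposed reverse conversion on an hour value:
  // returns the start of the hour as the representative source value.
  static Instant hourToSource(int hour) {
    return Instant.ofEpochSecond(hour * 3600L);
  }

  public static void main(String[] args) {
    Instant a = Instant.parse("2025-10-18T22:05:00Z");
    Instant b = Instant.parse("2025-10-18T22:59:59Z");

    // Both timestamps fall into the same hour partition.
    System.out.println(toHour(a) + " " + toHour(b)); // 489118 489118

    // The reverse direction cannot recover a or b, only a representative.
    Instant representative = hourToSource(toHour(a));
    System.out.println(representative); // 2025-10-18T22:00:00Z

    // Reapplying the coarser transform to the representative gives the day.
    int day = toDay(representative);
    System.out.println(LocalDate.ofEpochDay(day)); // 2025-10-18
  }
}
```

The round trip works here only because the start of an hour always lies in the same day as every timestamp in that hour, which is the situation *satisfiesOrderOf* describes. The sketch says nothing about transforms where no such representative exists.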

Given that this is a low-level API change, I would prefer to involve more people in the discussion before moving forward.

Thanks,
Peter

Dyno Fu <[email protected]> wrote (on Tue, Jan 27, 2026, 4:08):

> thanks Peter,
> For `bindByTransformedValue`, from the signature, it would need to
> encode a conversion matrix between different transformations. we can
> also use `toSourceTypeValue` to implement `bindByTransformedValue` if you
> think `bindByTransformedValue` as an api is semantically more meaningful.
> -Dyno
>
> On Tue, Dec 16, 2025 at 2:57 AM Péter Váry <[email protected]>
> wrote:
>
>> Hi Team,
>>
>> Thanks Dyno for bringing this up on the dev list!
>>
>> For the others, the original goal is that if we have two transformations
>> where *T1.satisfiesOrderOf(T2)*, then given a partition value P1 for T1,
>> we should be able to derive the corresponding partition value P2 for T2
>> (for example, the day 2025-10-18 exactly determines the month 2025-10). One
>> possible approach is the API Dyno proposed, which would be part of the
>> Transform interface. I’ve included your suggested Javadoc at the end of
>> this message for reference.
>>
>> The alternative we discussed was something like:
>>
>> <P> SerializableFunction<S, T> bindByTransformedValue(Transform<?, P>
>> otherTransform, P otherOutput)
>>
>> This is a very low-level API, and I’d prefer to extend it only if no
>> better alternative exists. If you have other ideas or suggestions, we’d be
>> happy to hear them.
>>
>> Thanks,
>> Peter
>>
>> The javadoc for the API proposed by Dyno:
>>
>> /**
>>  * Converts a transformed partition value back to a representative source type value.
>>  *
>>  * <p>This method returns a source value that would produce the given transformed value when this
>>  * transform is applied. For temporal transforms, this returns the start of the period (e.g.,
>>  * start of hour, day, month, or year). For truncate transforms, this returns the truncated value
>>  * as-is since it preserves the source type.
>>  *
>>  * <p>This is useful for chaining transforms when {@link #satisfiesOrderOf(Transform)} is true,
>>  * allowing conversion from a finer granularity to a coarser one by converting back to source type
>>  * and reapplying the coarser transform.
>>  *
>>  * @param sourceType the source type for this transform
>>  * @param transformedValue the transformed partition value
>>  * @return a source value that would produce this transformed value, or null if the input is null
>>  * @throws UnsupportedOperationException if this transform does not support conversion back to
>>  *     source type
>>  */
>> default S toSourceTypeValue(Type sourceType, T transformedValue) {
>>
>> Dyno Fu <[email protected]> wrote (on Mon, Dec 15, 2025, 20:53):
>>
>>> Hello Iceberg devs,
>>>
>>> I’d like to reopen the discussion on
>>> https://github.com/apache/iceberg/pull/14281 (“Core: Group binpack
>>> fileGroup by output partitionSpec”) that was marked as stable last week.
>>>
>>> This patch introduces an enhancement to the rewrite_data_files action:
>>> instead of grouping files by the current table partition spec, it groups
>>> them by the output partition spec provided in the rewrite parameters. This
>>> behavior enables more efficient bin-packing of small files when rolling
>>> data up into a coarser or alternate partition layout.
>>>
>>> the current concern for the implementation is the introduction of the
>>> new api
>>>
>>> default S toSourceTypeValue(Type sourceType, T transformedValue)
>>>
>>> which is used to normalize the partition value back to the source type,
>>> for example an hour transform value of `489118` to a timestamp `2025-10-18
>>> 22:00:00`, so that a different partition transform (e.g. day transform) can
>>> be applied to it.
>>>
>>> what's your opinion on whether this is the right abstraction, or is there
>>> any alternative?
>>> @pvary please share your thoughts, as in our discussion over slack.
>>> appreciated. thanks.
>>>
>>> regards,
>>> Dyno
>>>
>>> --
>>> reality, with all its ambiguities, does the job just fine.
>>>
>>
>
> --
> reality, with all its ambiguities, does the job just fine.
>
