I think it would be fine to have an int64 with a duration logical type
parameterized by unit. That fits with the timestamp types that we have
defined, which are also parameterized by unit. Iceberg is going to be
stricter about which units can be produced, but this is probably a good idea
for Parquet.
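
For anyone weighing the range trade-off raised in the quoted thread, here is a
rough sketch of the arithmetic. It assumes a signed int64 count and a Julian
year of 365.25 days; the unit names and the `max_years` helper are only
illustrative, not part of any proposed spec text.

```python
# Approximate range of a signed int64 duration, per unit.
# Unit names and the helper below are illustrative assumptions.
INT64_MAX = 2**63 - 1
UNITS_PER_SECOND = {"MILLIS": 10**3, "MICROS": 10**6, "NANOS": 10**9}
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # Julian year, an approximation


def max_years(unit: str) -> float:
    """Largest duration, in years, that fits in an int64 of the given unit."""
    return INT64_MAX / UNITS_PER_SECOND[unit] / SECONDS_PER_YEAR


if __name__ == "__main__":
    for unit in UNITS_PER_SECOND:
        print(f"{unit}: about +/-{max_years(unit):,.0f} years")
```

Nanoseconds top out at roughly 292 years in either direction, while
microseconds and milliseconds go well past 10k years.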

On Thu, Jul 10, 2025 at 10:38 AM Micah Kornfield <[email protected]>
wrote:

> > Quick question: is there a good reason not to just have a logical Duration
> > that annotates an int64 and let the unit be parameterized instead of hard
> > coding it to be nanoseconds?
>
>
> It adds effort and complexity for engines to handle each unit (though
> perhaps only a little). I do think it's reasonable to parameterize
> `TimeUnit` for consistency and future-proofing, but for now we should say it
> only supports nanoseconds (unless someone is signing up to support the other
> units in the implementation).
>
> I think the point was raised previously that hard-coded names were
> preferred but I don't recall if that was when we were still calling this
> DayTime?
>
> Cheers,
> Micah
>
> On Thu, Jul 10, 2025 at 9:58 AM Matt Topol <[email protected]> wrote:
>
> > Quick question: is there a good reason not to just have a logical Duration
> > that annotates an int64 and let the unit be parameterized instead of hard
> > coding it to be nanoseconds?
> >
> > That would at least allow the full 10k years for other units, and allow
> > better compression if nanosecond precision isn't needed.
> >
> > Thoughts?
> >
> > On Wed, Jul 9, 2025, 11:02 AM Micah Kornfield <[email protected]>
> > wrote:
> >
> > > OK to summarize what I think the current proposal for interval type is two
> > > new logical types:
> > >
> > > 1.  YearMonth interval annotates an int32.
> > > 2.  DurationNanos annotates an int64.
> > >
> > > There is now a separate thread, on int128 vs FLBA. Given the current
> > > proposal I don't think this blocks anything.  The main difficulty in adding
> > > a newly annotated physical type would be API design allowing a potentially
> > > wider type in the future.  I think this is tractable but any blockers could
> > > be discovered in the implementation phase?
> > >
> > > > +1 to FLBA and VLBA. What would BIT represent? Could you elaborate?
> > >
> > > I think the intent would be boolean.
> > >
> > > On Wednesday, July 9, 2025, Alkis Evlogimenos
> > > <[email protected]> wrote:
> > >
> > > > On Wed, Jul 9, 2025 at 11:05 AM Antoine Pitrou <[email protected]>
> > > wrote:
> > > >
> > > > > But if we were designing a new Parquet format from scratch, I would
> > > > > definitely advocate for a reduced set of 3 physical types: BIT, FLBA
> > > > > and VLBA.
> > > > >
> > > >
> > > > +1 to FLBA and VLBA. What would BIT represent? Could you elaborate?
> > > >
> > >
> >
>
