Re: [DISCUSS] INT96 stats

Alkis Evlogimenos Wed, 25 Jun 2025 05:16:16 -0700

Spark first needs to be able to write INT64 nanos before it can replace
INT96, which has nanosecond precision, for data at nano granularity. This is
why I linked that ticket: it is a prerequisite to switching to INT64 in many
cases.


I understand the concerns around changing a deprecated aspect of the
parquet spec. We decided to bring this forward because:
1. there are a lot of parquet files with the right INT96 stats out there
(Photon has been writing them for years)
2. all engines ignore the INT96 stats, so Photon writing them didn't break
anyone
3. Spark is (slowly) moving away from INT96
4. our change is very narrow, backwards compatible, and can improve current
workloads while (3) is ongoing

Let's discuss more at the sync tonight.

> If we are going to standardize an ordering for INT96, rather than parsing
> "created_by" fields, wouldn't it make more sense to add a new ColumnOrder
> value (like what's proposed for PARQUET-2249 [1])? Then we don't need to
> maintain a list of known good writers.

We do not have to add another ColumnOrder value since INT96 is a *physical*
type and can only take timestamps in the specified format. This was
arguably a design wart, as it should have been a FIXED_LEN_BYTE_ARRAY(12)
with logical type INT96_TIMESTAMP, for which a different ColumnOrder would
make sense. In this case we are lucky that this is a physical type without
a logical type attached; otherwise we could not have made this change in a
backwards compatible way as easily.
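
To make the ordering concrete, here is a rough sketch of what a logical
comparison over INT96 values looks like. It assumes the conventional INT96
timestamp layout (8 bytes of nanoseconds within the day followed by 4 bytes
of Julian day, both little-endian). The function names are made up for
illustration, and this is not the code from the parquet-java PR.

```python
import struct
from typing import Iterable, Tuple


def encode_int96(julian_day: int, nanos_of_day: int) -> bytes:
    """Pack a timestamp into the assumed 12-byte INT96 layout."""
    # "<qi": little-endian int64 (nanos of day) then int32 (Julian day).
    return struct.pack("<qi", nanos_of_day, julian_day)


def decode_int96(raw: bytes) -> Tuple[int, int]:
    """Return (julian_day, nanos_of_day), which sorts in timestamp order."""
    if len(raw) != 12:
        raise ValueError("INT96 values must be exactly 12 bytes")
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return julian_day, nanos_of_day


def logical_min_max(values: Iterable[bytes]) -> Tuple[bytes, bytes]:
    """Compute min/max stats by comparing decoded values, not raw bytes."""
    ordered = sorted(values, key=decode_int96)
    return ordered[0], ordered[-1]


# 'earlier' is one nanosecond into day 2460000; 'later' is the start of the
# following day. Raw byte order and timestamp order disagree for this pair.
earlier = encode_int96(2460000, 1)
later = encode_int96(2460001, 0)
stats_min, stats_max = logical_min_max([later, earlier])
```

In this pair the raw bytes of `earlier` compare greater than those of
`later` because the nanos field comes first, which is why a plain byte
comparison can produce wrong min/max values.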

On Sat, Jun 21, 2025 at 12:57 AM Ed Seidl <[email protected]> wrote:

> If we are going to standardize an ordering for INT96, rather than parsing
> "created_by" fields, wouldn't it make more sense to add a new ColumnOrder
> value (like what's proposed for PARQUET-2249 [1])? Then we don't need to
> maintain a list of known good writers.
>
> Ed
>
> [1] https://github.com/apache/parquet-format/pull/221
>
> On 2025/06/19 10:15:13 Andrew Lamb wrote:
> > > While INT96 is now deprecated, it's still the default timestamp type in
> > > Spark, resulting in a significant amount of existing data written in
> this
> > > format.
> >
> > I agree with Gang and Antoine that the better solution is to change Spark
> > to write non deprecated parquet data types.
> >
> > It seems there is an issue in the Spark JIRA to do this[1] but the only
> > feedback on the associated PR [2] is that it is a breaking change.
> >
> > If Spark is going to keep writing INT96 timestamps indefinitely, I
> suggest
> > we un-deprecate the INT96 timestamps to reflect the ecosystem reality
> that
> > they will be here for a while rather than pretending they are really
> > deprecated.
> >
> > Andrew
> >
> > [1]: https://issues.apache.org/jira/browse/SPARK-51359
> > [2]: https://github.com/apache/spark/pull/50215#issuecomment-2715147840
> >
> > p.s. as an aside, is anyone from DataBricks pushing spark to change
> > timestamp type? Or will the focus be to  improve INT96 timestamps
> instead?
> >
> >
> > On Wed, Jun 18, 2025 at 10:50 PM Gang Wu <[email protected]> wrote:
> >
> > > It seems not adding too much value to improve a deprecated feature
> > > especially
> > > when there are abundant Parquet implementations in the wild. IIRC,
> > > parquet-java
> > > is planning to release 1.16.0 for new data types like variant and
> geometry.
> > > It is
> > > also the last version to support Java 8. All deprecated APIs might get
> > > removed
> > > from 2.0.0 so I'm not sure if older Spark versions are able to
> leverage the
> > > int96
> > > stats. The right way to go is to push forward the adoption of timestamp
> > > logical
> > > types.
> > >
> > > Best,
> > > Gang
> > >
> > > On Thu, Jun 19, 2025 at 12:31 AM Micah Kornfield <
> [email protected]>
> > > wrote:
> > >
> > > > Hi Alkis,
> > > > Is this the right thread link?  It seems to be a discussion on
> Timestamp
> > > > Nano support (which IIUC won't use int96, but I'm not sure this
> covers
> > > > changing the behavior for existing timestamps, which I think are at
> > > either
> > > > millisecond or microsecond granularity)?
> > > >
> > > > there will be customers that want to interface with legacy systems
> > > > > with INT96. This is why we decided in doing both.
> > > >
> > > >
> > > > It might help to elaborate on the time-frame here.  Since it appears
> > > > reference implementations of parquet are not currently writing
> > > statistics,
> > > > if we merge these changes when they will be picked up in Spark?
> Would the
> > > > plan be to backport the parquet-java to older version of Spark
> (otherwise
> > > > the legacy systems wouldn't really make use or emit stats anyways)?
> What
> > > > is the delta between Spark picking up these changes and
> transitioning off
> > > > of Int96 by default?   Is the expectation that even once the default
> is
> > > > changed in spark to not use int96, there will be a large number of
> users
> > > > that will override the default to write int96?
> > > >
> > > > Thanks,
> > > > Micah
> > > >
> > > > On Wed, Jun 18, 2025 at 1:35 AM Alkis Evlogimenos
> > > > <[email protected]> wrote:
> > > >
> > > > > We are also driving that in parallel:
> > > > > https://lists.apache.org/thread/y2vzrjl1499j5dvbpg3m81jxdhf4b6of.
> > > > >
> > > > > Even when Spark defaults to INT64 there will be old versions of
> Spark
> > > > > running, there will be customers that want to interface with legacy
> > > > systems
> > > > > with INT96. This is why we decided in doing both.
> > > > >
> > > > > On Wed, Jun 18, 2025 at 9:53 AM Antoine Pitrou <[email protected]
> >
> > > > wrote:
> > > > >
> > > > > >
> > > > > > Can we get Spark to stop emitting INT96? They are not being an
> > > > > > extremely good community player here.
> > > > > >
> > > > > > Regards
> > > > > >
> > > > > > Antoine.
> > > > > >
> > > > > >
> > > > > > On Fri, 13 Jun 2025 15:17:51 +0200
> > > > > > Alkis Evlogimenos
> > > > > > <[email protected]>
> > > > > > wrote:
> > > > > > > Hi folks,
> > > > > > >
> > > > > > > While INT96 is now deprecated, it's still the default timestamp
> > > type
> > > > in
> > > > > > > Spark, resulting in a significant amount of existing data
> written
> > > in
> > > > > this
> > > > > > > format.
> > > > > > >
> > > > > > > Historically, parquet-mr/java has not emitted or read
> statistics
> > > for
> > > > > > INT96.
> > > > > > > This was likely due to the fact that standard byte comparison
> on
> > > the
> > > > > > INT96
> > > > > > > representation doesn't align with logical comparisons,
> potentially
> > > > > > leading
> > > > > > > to incorrect min/max values. This is unfortunate because
> timestamp
> > > > > > filters
> > > > > > > are extremely common and lack of stats limits optimization
> > > > > opportunities.
> > > > > > >
> > > > > > > Since its inception Photon <
> > > > https://www.databricks.com/product/photon>
> > > > > > emitted
> > > > > > > and utilized INT96 statistics by employing a logical
> comparator,
> > > > > ensuring
> > > > > > > their correctness. We have now implemented
> > > > > > > <https://github.com/apache/parquet-java/pull/3243> the same
> > > support
> > > > > > within
> > > > > > > parquet-java.
> > > > > > >
> > > > > > > We'd like to get the community's thoughts on this addition. We
> > > > > anticipate
> > > > > > > that most users may not be directly affected due to the
> declining
> > > use
> > > > > of
> > > > > > > INT96. However, we are interested in identifying any potential
> > > > > drawbacks
> > > > > > or
> > > > > > > unforeseen issues with this approach.
> > > > > > >
> > > > > > > Cheers
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
