Re: [DISCUSS] Partition tuples in v4

Anoop Johnson Mon, 04 May 2026 08:00:30 -0700

Amogh,

That is a good point. But the partition and stats-based evaluation paths
are typically separate. For partition evaluation, we compare against an
exact value, and for stats-based pruning, we look at the range of values in
the column stats.
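
To make the distinction concrete, here is a minimal Python sketch of the two
paths for an equality predicate. This is not Iceberg code: ColumnStats,
partition_may_match and stats_may_match are hypothetical names used only for
illustration, and integer values are assumed for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnStats:
    # Hypothetical stand-in for per-column stats; None means "unknown".
    lower_bound: Optional[int]
    upper_bound: Optional[int]


def partition_may_match(partition_value: int, literal: int) -> bool:
    # Partition path: every row in the file shares one partition value,
    # so an equality predicate is an exact comparison.
    return partition_value == literal


def stats_may_match(stats: ColumnStats, literal: int) -> bool:
    # Stats path: the file can be skipped only if the literal falls
    # outside a known [lower_bound, upper_bound] range. A missing bound
    # is treated as unknown, so it can never be used to prune.
    if stats.lower_bound is not None and literal < stats.lower_bound:
        return False
    if stats.upper_bound is not None and literal > stats.upper_bound:
        return False
    return True
```

In this sketch a missing upper bound on the stats path keeps the file, while
the partition path never consults an upper bound at all.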

Even if we store partition values in the content stats, they would still
follow the partition evaluation path. The new V4 manifest reader would just
need to look at the partition value's lower_bound in the content stats
instead of an explicit partition tuple field. The partition evaluator itself
would be unchanged.
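
As a rough sketch of what that reader change could look like (Python, purely
illustrative): the entry layouts, the field IDs 1000 and 1001, and the function
names below are assumptions, not the actual V4 manifest format, which is still
under discussion.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

Partition = Tuple[Any, ...]


def read_partition_v3(entry: Dict[str, Any]) -> Partition:
    # Current layout (assumed shape): an explicit partition tuple field.
    return tuple(entry["partition"])


def read_partition_v4(entry: Dict[str, Any],
                      partition_field_ids: Sequence[int]) -> Partition:
    # Proposed layout (assumed shape): the partition value is the
    # lower_bound stored in content_stats for each partition field.
    stats = entry["content_stats"]
    return tuple(stats[fid]["lower_bound"] for fid in partition_field_ids)


def evaluate_partition(partition: Partition,
                       predicate: Callable[[Partition], bool]) -> bool:
    # The evaluator only ever sees a partition tuple, so it does not
    # need to know which layout the tuple was read from.
    return predicate(partition)


# Hypothetical entries describing the same file in both layouts.
v3_entry = {"partition": ("2026-05-04", 7)}
v4_entry = {"content_stats": {1000: {"lower_bound": "2026-05-04"},
                              1001: {"lower_bound": 7}}}


def wanted(p: Partition) -> bool:
    return p[0] == "2026-05-04" and p[1] == 7
```

Both readers produce the same tuple here, so the same evaluate_partition call
works on either one; only the reader differs.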

This is conceptually no different from the current partition tuple. Storing
it in content_stats with only lower_bound preserves the same semantics, but
aligns with how the rest of the column stats are stored.

But let's discuss the tradeoffs of the various options. Looking forward to
the discussion in an hour.

Best,
Anoop

On Sun, May 3, 2026 at 6:45 PM Amogh Jahagirdar <[email protected]> wrote:

> I realized I gave a poor example of the semantic issue with removing upper
> bound for partition outputs, but the crux is that in that modeling the
> stats on partition outputs would be treated in a special way where upper
> bound being null means it's partitioned rather than "unknown", which is
> inconsistent with the other stats.