Hi,

I'm +1 to making v2 the default, say after this release.

It seems most of the features brought up as concerns on the Spark side in
the thread Gabor linked have been implemented (like the position delete
lifecycle).

But Anton's point is also a good one.  Even if some delete file features
are missing, V2 is not only about delete files, which are not produced by
default in Spark (and, I believe, Flink); it also brings the fixes for
partition spec evolution and snapshot ID inheritance.  Hence it makes
sense to me from that angle.
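
For reference, a table can already be created as v2 explicitly (or upgraded
in place) via the 'format-version' property.  A rough Spark SQL sketch, with
placeholder catalog/table names:

  CREATE TABLE my_catalog.db.events (id BIGINT, data STRING)
  USING iceberg
  TBLPROPERTIES ('format-version' = '2');

  -- upgrade an existing v1 table in place
  ALTER TABLE my_catalog.db.events SET TBLPROPERTIES ('format-version' = '2');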

Thanks
Szehon

On Wed, May 24, 2023 at 12:34 AM Gabor Kaszab
<gaborkas...@cloudera.com.invalid> wrote:

> Hey Anton,
>
> Just adding a note that back around January the same topic was brought up
> on this mailing list. The conclusion there was to use the 'table-default.'
> catalog-level property prefix to create V2 tables by default.
> https://lists.apache.org/thread/9ct0p817qxqqdnv7nb35kghsfygjkqdf
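>
> (If I remember that approach correctly, it looks roughly like the following
> in a Spark catalog config; the catalog name is just a placeholder:
>
>   spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog
>   spark.sql.catalog.my_catalog.type=hive
>   spark.sql.catalog.my_catalog.table-default.format-version=2
>
> New tables created through that catalog then pick up format-version 2
> unless it is overridden at table creation.)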
>
> I'm not saying that we shouldn't default to V2, just drawing attention to
> this previous conversation.
>
> Cheers,
> Gabor
>
> On Wed, May 24, 2023 at 12:04 AM Anton Okolnychyi
> <aokolnyc...@apple.com.invalid> wrote:
>
>> Hi folks,
>>
>> Would it be appropriate for us to consider changing the default table
>> format version for new tables from v1 to v2?
>>
>> I don’t think defaulting to v2 tables means all readers have to support
>> delete files. DELETE, UPDATE, MERGE operations will only produce delete
>> files if configured explicitly.
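>>
>> (For example, in Spark a table only starts producing delete files once it
>> is switched to merge-on-read, roughly like this, with a placeholder table
>> name:
>>
>>   ALTER TABLE db.events SET TBLPROPERTIES (
>>     'write.delete.mode' = 'merge-on-read',
>>     'write.update.mode' = 'merge-on-read',
>>     'write.merge.mode' = 'merge-on-read'
>>   );
>>
>> The default for all three modes is copy-on-write, which rewrites data
>> files instead of writing delete files.)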
>>
>> The primary reason I am starting this thread is to avoid our v1
>> workarounds for spec evolution and snapshot ID inheritance. The latter is
>> critical for the performance of rewriting manifests.
>>
>> Any thoughts?
>>
>> - Anton
>
>