Hi, devs

> This PIP is missing a very important link, how to maintain compatibility?
> What is behavior for old tags?

I think that, to maintain compatibility with previous versions, the new
fields tagCreateTime and tagTimeRetained in Tag should fall back to default
values when an old-version snapshot/tag JSON is deserialized into a
new-version Tag object.
I have updated the PR (https://github.com/apache/paimon/pull/3159) and the
"Compatibility, Deprecation, and Migration Plan" section of PIP-20.

On Sun, Apr 7, 2024 at 2:06 PM Jingsong Li <[email protected]> wrote:
>
> Hi,
>
> This PIP is missing a very important link, how to maintain
> compatibility? What is behavior for old tags?
>
> Please note, Paimon is a storage system, and we need to constantly
> maintain its previous compatibility (even some backward compatibility)
> during design.
>
> Best,
> Jingsong
>
> On Wed, Apr 3, 2024 at 4:42 PM wj wang <[email protected]> wrote:
> >
> > Thank you very much for your valuable suggestions.
> > Next, I will provide a PR.
> >
> > On Wed, Apr 3, 2024 at 4:18 PM yu zelin <[email protected]> wrote:
> >
> > > Agree.
> > >
> > > On Wed, Apr 3, 2024 at 11:14 AM Jingsong Li <[email protected]> wrote:
> > >
> > > > > reuse the 'Snapshot#timeMillis' field
> > > >
> > > > Don't do this, tag is just snapshot reference, it cannot alter snapshot
> > > > fields.
> > > >
> > > > > the TTL has higher priority
> > > >
> > > > We should maintain a behavior similar to snapshot expiration, as long
> > > > as one of the conditions hits, then delete it without setting any
> > > > priority
> > > >
> > > > Best,
> > > > Jingsong
> > > >
> > > > On Wed, Apr 3, 2024 at 10:33 AM wj wang <[email protected]> wrote:
> > > >
> > > > > Thanks jingsong li and yu zelin for reply.
> > > > >
> > > > > > I think it's similar to the snapshot expire, where both the number and
> > > > > > time are used to determine whether it should be deleted. This is
> > > > > > reasonable, and the hit should be deleted.
> > > > >
> > > > > OK, I will do.
> > > > >
> > > > > > Java API 'createTag': Use 'Duration' as parameter instead of 'String'.
> > > > > > I think it's better.
> > > > >
> > > > > OK
> > > > >
> > > > > > For the field 'tagCreateTime' in class 'Tag': I think we can just use the
> > > > > > 'Snapshot#timeMillis' field. The 'timeMillis' is the create time of the
> > > > > > snapshot, I think the time won't be used when we read the corresponding
> > > > > > tag. So I think we can just reuse the field, what do you think? And if do
> > > > > > so,
> > > > >
> > > > > I think it's possible to reuse the 'Snapshot#timeMillis' field for auto
> > > > > created tags, but I don't think 'Snapshot#timeMillis' field can be used for
> > > > > non-auto created tags.
> > > > > what do you think?
> > > > >
> > > > > > in the tags system table, 'commit_time' can be renamed to 'create_time'
> > > > > > or 'tag_create_time' or other name.
> > > > >
> > > > > I think create-time and time-retained is good.
> > > > >
> > > > > > Should we add TTL to auto-created tags? I think we should. Users can set
> > > > > > the same TTL for all auto-created tags by table options. My suggestion of
> > > > > > how to handle `tag.num-retained-max` and TTL is: the TTL has higher
> > > > > > priority. When we try to expire an auto-created tag, we first found
> > > > > > candidates by `tag.num-retained-max`, then if the candidate's survival time
> > > > > > is less than TTL, we don't expire it.
> > > > >
> > > > > OK, I will do.
> > > > >
> > > > > On Tue, Apr 2, 2024 at 5:26 PM yu zelin <[email protected]> wrote:
> > > > >
> > > > > > Thanks wj for driving this! I'd like to give some inputs:
> > > > > >
> > > > > > 1. Java API 'createTag': Use 'Duration' as parameter instead of 'String'. I
> > > > > > think it's better.
> > > > > >
> > > > > > 2. For the field 'tagCreateTime' in class 'Tag': I think we can just use
> > > > > > the 'Snapshot#timeMillis' field.
> > > > > > The 'timeMillis' is the create time of the snapshot, I think the time won't
> > > > > > be used when we read the corresponding tag. So I think we can just reuse the
> > > > > > field, what do you think? And if do so, in the tags system table,
> > > > > > 'commit_time' can be renamed to 'create_time' or 'tag_create_time' or
> > > > > > other name.
> > > > > >
> > > > > > 3. Should we add TTL to auto-created tags? I think we should. Users can set
> > > > > > the same TTL for all auto-created tags by table options. My suggestion of
> > > > > > how to handle `tag.num-retained-max` and TTL is: the TTL has higher
> > > > > > priority. When we try to expire auto-created tag, we first found
> > > > > > candidates by `tag.num-retained-max`, then if the candidate's survival time
> > > > > > is less than TTL, we don't expire it.
> > > > > >
> > > > > > Best regards,
> > > > > > Zelin Yu
> > > > > >
> > > > > > On Mon, Apr 1, 2024 at 9:54 AM <[email protected]> wrote:
> > > > > >
> > > > > > > Hi devs:
> > > > > > >
> > > > > > > I would like to start a discussion of PIP-20: Introduce TTL for tags which
> > > > > > > are not auto-created. [1]. Currently, Paimon has automatic clearing
> > > > > > > mechanisms for tags created by TagAutoCreation, but not for other tags. It
> > > > > > > can't meet our demands. For example:
> > > > > > > 1. The current tag cleanup mechanism may lead to resource-wasting.
> > > > > > > 2. Tag does not support TTL, so it is not flexible to use.
> > > > > > > This PIP aims to support each Tag has its own TTL, so that the user can use
> > > > > > > the tag more flexibly and reduce the probability of resource waste. And
> > > > > > > Paimon keep up with other data lake products such as Iceberg.
> > > > > > > Looking forward to your feedback, thanks.
> > > > > > >
> > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=300026341
> > > > > > >
> > > > > > > Best,
> > > > > > > wangwj
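
To make the expiration rule from the thread concrete ("as long as one of the
conditions hits, then delete it without setting any priority"), here is a
minimal Java sketch. It is only an illustration, not Paimon's implementation:
the class TagExpireSketch, the TagInfo holder and the selectExpired method are
hypothetical names, and treating a null time-retained as "no TTL" is an
assumption made for the example.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of "expire when either condition hits"; not Paimon code. */
final class TagExpireSketch {

    /** Minimal stand-in for a tag: name, creation time and optional TTL. */
    static final class TagInfo {
        final String name;
        final LocalDateTime createTime;
        final Duration timeRetained; // assumption: null means the tag has no TTL

        TagInfo(String name, LocalDateTime createTime, Duration timeRetained) {
            this.name = name;
            this.createTime = createTime;
            this.timeRetained = timeRetained;
        }
    }

    /**
     * Returns the names of tags to expire. The input list is expected to be
     * ordered from oldest to newest. A tag is selected when it falls outside
     * the newest numRetainedMax tags OR when its TTL has elapsed; neither
     * condition takes priority over the other.
     */
    static List<String> selectExpired(
            List<TagInfo> tags, int numRetainedMax, LocalDateTime now) {
        List<String> expired = new ArrayList<>();
        // Number of oldest tags that exceed the retained-count limit.
        int overflow = Math.max(0, tags.size() - numRetainedMax);
        for (int i = 0; i < tags.size(); i++) {
            TagInfo tag = tags.get(i);
            boolean hitByCount = i < overflow;
            boolean hitByTtl =
                    tag.timeRetained != null
                            && !now.isBefore(tag.createTime.plus(tag.timeRetained));
            if (hitByCount || hitByTtl) {
                expired.add(tag.name);
            }
        }
        return expired;
    }
}
```

With three tags, a limit of two and a one-day TTL on the two newer tags, the
oldest tag would be selected by the count condition, and a two-day-old tag
would be selected by the TTL condition even though it is within the count
limit.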

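The compatibility point in the top reply (old tag JSON has no tagCreateTime or
tagTimeRetained, so the new fields take default values) could look roughly
like the sketch below. This is an assumption-laden illustration rather than
the code in the PR: TagCompatSketch, TagFields and fromParsedJson are
hypothetical names, the JSON is represented as an already-parsed map of
strings, null is assumed to be the default, and the ISO-8601 text formats for
the two fields are also assumptions.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.Map;

/** Hypothetical sketch of defaulting new tag fields for old files; not Paimon code. */
final class TagCompatSketch {

    /** Holds only the two fields introduced by the proposal. */
    static final class TagFields {
        final LocalDateTime tagCreateTime; // null when written by an older version
        final Duration tagTimeRetained;    // null when no TTL was recorded

        TagFields(LocalDateTime tagCreateTime, Duration tagTimeRetained) {
            this.tagCreateTime = tagCreateTime;
            this.tagTimeRetained = tagTimeRetained;
        }

        /** A tag can only expire by TTL when both new fields are present. */
        boolean hasTtl() {
            return tagCreateTime != null && tagTimeRetained != null;
        }
    }

    /**
     * Reads the two new fields from already-parsed JSON key/value pairs.
     * Keys that are absent (as in files written before the change) fall back
     * to null instead of failing, so old tags remain readable.
     */
    static TagFields fromParsedJson(Map<String, String> json) {
        String created = json.get("tagCreateTime");
        String retained = json.get("tagTimeRetained");
        return new TagFields(
                created == null ? null : LocalDateTime.parse(created),
                retained == null ? null : Duration.parse(retained));
    }
}
```

Under these assumptions an old tag simply reports hasTtl() as false and is
never removed by the TTL check, which keeps the behavior for old tags
unchanged.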