rdblue commented on issue #845: Add persistent IDs to partition fields
URL: https://github.com/apache/incubator-iceberg/pull/845#issuecomment-608612302
 
 
   I'm only about halfway done reviewing this, but I wanted to capture some 
thoughts about the forward-compatibility concern raised by @chenjunjiedada.
   
   If a table already has multiple partition specs, then the IDs may be reused 
across specs and can even conflict. This isn't something we can change because 
manifest files embed the field IDs in their schemas. That means that when a 
spec has no IDs, assignment must start at 1000 and should be independent 
across different partition specs.
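   
   As a rough sketch of that assignment (not the actual implementation: the 
names below are hypothetical, and handing out IDs sequentially by position is 
my assumption about the current strategy), each spec is numbered on its own 
starting at 1000:

```python
# Hypothetical sketch; these names are illustrative, not Iceberg's API.
FIRST_PARTITION_FIELD_ID = 1000  # starting ID stated in the comment above


def assign_field_ids(field_names):
    """Pair each partition field with an ID, counting up from 1000.

    Assumption: IDs are handed out sequentially by position in the spec.
    """
    return [
        (FIRST_PARTITION_FIELD_ID + position, name)
        for position, name in enumerate(field_names)
    ]


# Two specs of the same table, each numbered independently of the other.
spec_0 = assign_field_ids(["ts_day", "id_bucket"])
spec_1 = assign_field_ids(["category"])

print(spec_0)  # [(1000, 'ts_day'), (1001, 'id_bucket')]
print(spec_1)  # [(1000, 'category')]
```

   Both specs use ID 1000 for different fields, which is the reuse described 
above.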
   
   If an older version writes to the table, then it may remove any assigned 
partition IDs. That means that for any format v1 table, we must remain 
compatible with the current assignment strategy. That way, IDs can be removed 
by an old writer and will be the same when they are reassigned.
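   
   To illustrate the round trip, here is a minimal sketch. It assumes that 
the old writer keeps field names and order but drops the IDs, and that IDs are 
reassigned by position from 1000; the function names are made up for the 
example.

```python
# Hypothetical sketch of an old writer dropping IDs and a new reader
# reassigning them. None of these names come from the Iceberg codebase.
FIRST_ID = 1000


def assign_ids(field_names):
    """Assumed strategy: ID = 1000 + position of the field in the spec."""
    return {name: FIRST_ID + i for i, name in enumerate(field_names)}


def rewrite_with_old_writer(spec_with_ids):
    """An old writer does not know about IDs, so only the ordered names survive."""
    return list(spec_with_ids)


original = assign_ids(["ts_day", "id_bucket"])
stripped = rewrite_with_old_writer(original)  # IDs are gone here
restored = assign_ids(stripped)               # deterministic reassignment

print(restored == original)  # True
```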
   
   This also means that evolution is limited in v1 tables. To ensure that IDs 
can be reassigned correctly if they are removed, existing partition fields 
must keep their positions in the spec. Otherwise, reassignment would be 
incorrect. That means no removing partition fields, no reordering partition 
fields, and no adding partition fields unless they are added at the end of the 
spec.
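   
   Put as a check, the rule is that the old field list must be a prefix of 
the new one. This is only a sketch under the same assumption of positional 
assignment from 1000, and `is_safe_v1_change` is a hypothetical helper, not an 
existing method:

```python
# Hypothetical helper: a v1 spec change is safe only if every existing
# field keeps its position, so positional IDs would be reassigned identically.
def is_safe_v1_change(old_fields, new_fields):
    """Return True when old_fields is a prefix of new_fields."""
    return new_fields[: len(old_fields)] == old_fields


old = ["ts_day", "id_bucket"]

appended = is_safe_v1_change(old, ["ts_day", "id_bucket", "category"])
dropped = is_safe_v1_change(old, ["id_bucket"])
reordered = is_safe_v1_change(old, ["id_bucket", "ts_day"])
inserted = is_safe_v1_change(old, ["category", "ts_day", "id_bucket"])

print(appended, dropped, reordered, inserted)  # True False False False
```

   In the dropped case, for example, `id_bucket` would move from position 1 to 
position 0 and be reassigned 1000 instead of 1001.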
   
   We will be able to make more evolution changes when we can guarantee that 
all partition fields have IDs that won't be removed. We'll make the IDs a 
requirement in v2 tables.
