[jira] [Commented] (THRIFT-5340) Document schema evolution features

Jens Geyer (Jira) Sat, 23 Jan 2021 02:13:04 -0800


    [ 
https://issues.apache.org/jira/browse/THRIFT-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270606#comment-17270606
 ]


Jens Geyer commented on THRIFT-5340:
------------------------------------

{quote}
Adding an optional field to the end of a struct is fully compatible 
{quote}

First, there are three kinds of fields: required and optional (both with 
explicit keywords) and the implicit "default". Default is comparable to 
optional, but not exactly the same.

Adding a non-required field at any place is fully compatible, as long as the ID 
has not been used before in the context of that struct with a different 
meaning. 
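
To illustrate why the position does not matter but the ID does, here is a toy 
Python sketch. It is not Thrift's wire format or generated code: the 
dict-based message, the {{decode}} helper and all field names are made up for 
illustration. The only assumption carried over is that readers match fields by 
ID and skip IDs they do not know.

```python
# Toy model: a serialized struct is a dict of {field_id: value}.
# A schema maps field IDs to names. Hypothetical, not Thrift's real API.

def decode(schema, message):
    """Keep fields whose ID the schema knows; silently skip the rest."""
    result = {name: None for name in schema.values()}
    for field_id, value in message.items():
        if field_id in schema:
            result[schema[field_id]] = value
    return result

OLD_SCHEMA = {1: "name", 3: "email"}
NEW_SCHEMA = {1: "name", 2: "nickname", 3: "email"}  # ID 2 added "in the middle"
REUSED_SCHEMA = {1: "name", 3: "phone"}              # ID 3 re-used with a new meaning

old_message = {1: "Ada", 3: "ada@example.org"}
new_message = {1: "Ada", 2: "ada_l", 3: "ada@example.org"}

# Old reader, new writer: the unknown field 2 is skipped.
old_reads_new = decode(OLD_SCHEMA, new_message)

# New reader, old writer: the non-required field 2 is simply unset.
new_reads_old = decode(NEW_SCHEMA, old_message)

# Re-used ID: the old e-mail address is misread as a phone number.
reused_reads_old = decode(REUSED_SCHEMA, old_message)
```

The first two cases decode cleanly in both directions. The third one also 
decodes without any error, which is exactly the problem with re-using an ID.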

{quote}
Removing an optional field (while not re-using the identifier) is fully 
compatible
{quote}

Removing a non-required field ...

{quote}
Adding a required field to the end of a struct is forwards compatible. The new 
schema encodes the new field and the old schema gracefully omits it as its 
unknown
Conversely, removing a required field from the end of the struct is backwards 
compatible. The old schema encodes the new field and the new schema omits it as 
its unknown
{quote}

Adding a required field *at any place* is not fully compatible. Older *readers* 
will know nothing about the newly added field, including the fact that the 
field is required. So they will not only skip the data because they cannot 
process it, it will also pass unnoticed if the now-required field is not set. 
Other than that, nothing bad will happen. Older *writers*, however, will write 
messages that newer readers regard as incomplete due to the missing required 
field, so this is *not* compatible. 

Same with removing a required field from any place, just in the opposite 
direction. If a new writer writes a message without the required field, the 
other end will fail and discard the entire message, because it is incomplete. 
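
Both directions can be played through with a small Python sketch. Again this 
is a toy model and not Thrift code: {{decode}}, {{MissingRequiredField}}, the 
dict-based messages and the field names are hypothetical, and the only 
behaviour assumed is that a reader rejects a message when a field it considers 
required is absent.

```python
# Toy model: schema is {field_id: (name, required)}, message is {field_id: value}.
# Hypothetical names throughout; this is not Thrift's API.

class MissingRequiredField(Exception):
    pass

def decode(schema, message):
    """Decode by field ID; fail if a required field is absent."""
    result = {}
    for field_id, (name, required) in schema.items():
        if field_id in message:
            result[name] = message[field_id]
        elif required:
            raise MissingRequiredField(name)
        else:
            result[name] = None
    return result  # IDs unknown to the schema are never looked at, i.e. skipped

def accepts(schema, message):
    try:
        decode(schema, message)
        return True
    except MissingRequiredField:
        return False

WITHOUT_TENANT = {1: ("name", False)}
WITH_TENANT = {1: ("name", False), 2: ("tenant", True)}

# Adding required field 2: WITHOUT_TENANT is the old schema.
old_reader_new_writer = accepts(WITHOUT_TENANT, {1: "Ada", 2: "acme"})  # skipped, fine
new_reader_old_writer = accepts(WITH_TENANT, {1: "Ada"})                # incomplete

# Removing required field 2: WITH_TENANT is the old schema.
old_reader_after_removal = accepts(WITH_TENANT, {1: "Ada"})             # incomplete
```

In this model the message is only rejected when the reader is the side that 
still (or already) believes the field is required.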

That's why the saying goes "required is forever". 

{quote}
Changing a field from optional to required is forwards compatible. The new 
schema always sets the field and the old schema can parse it correctly.
Conversely, following the same reasoning, changing a field from required to 
optional is backwards compatible
{quote}

As long as both sides know about the field and actually populate it, yes. If 
any side does not populate a value for the field, the side checking for a 
required field being properly filled will throw.

{quote}
Adding a choice to an existing union is backwards compatible. The new schema 
can always parse the data produced by the old schema as the new choices are a 
superset of the previous choices. Conversely, removing a choice from an 
existing union is forwards compatible
{quote}

Mainly because there are no required fields in unions.

{quote}
Changing an enumeration into a 32-bit integer is backwards compatible as the 
set of valid enumeration constants are a subset of the range of values in the 
integer. Conversely, turning a 32-bit integer into an enumeration is forwards 
compatible only. The data produced by the new schema will always become a valid 
int32 value according to the old schema
{quote}

Interesting use case, but agree.

{quote}
Similar to the union case, adding a new enumeration constant is backward 
compatible and removing an enumeration constant (without re-using it in the 
future) is forwards compatible
{quote}

That's the idea.
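
The subset argument behind both enum cases can be sketched with Python's 
standard {{enum}} module. This says nothing about what the generated Thrift 
code does with an unknown enum value, which may differ between language 
bindings; {{Status}} and its constants are made up for illustration.

```python
from enum import IntEnum

# Hypothetical enum as an "old" schema would know it.
class Status(IntEnum):
    ACTIVE = 0
    SUSPENDED = 1

# Every enum constant is a valid integer, so reading it as an int always works.
as_int = int(Status.SUSPENDED)

# A known integer maps back to its constant.
known = Status(1)

# A value the enum does not define (for example a constant added later by a
# newer schema, or any other int) has no mapping here.
try:
    Status(42)
    unknown_accepted = True
except ValueError:
    unknown_accepted = False
```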

{quote}
Changing a string field into a "binary" field is backwards compatible. Every 
UTF-8 string produced by the old schema is of course a valid byte-string from 
the point of view of the new schema
{quote}

Will probably work. Never tested. The recommendation is not to fiddle with data 
types; if you change the type, it should be considered a different 
field. 

{quote}
Conversely, changing a "binary" field into a string field is forwards 
compatible. It is not backwards compatible as not every byte-array is a valid 
UTF-8 string
{quote}

Same here. You know, *relying on undocumented behaviour is what breaks 
applications in the long run*, so I personally would not recommend such 
practice to anyone. Same with the enums above.
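
The asymmetry claimed in the two string/binary quotes can at least be checked 
outside of Thrift. The following plain Python sketch only shows the UTF-8 
property itself; it makes no statement about how any Thrift language binding 
treats a string field that carries invalid UTF-8.

```python
# Every UTF-8 encoded string is a valid byte sequence ...
text = "Zürich"
as_bytes = text.encode("utf-8")
round_trip = as_bytes.decode("utf-8")

# ... but not every byte sequence is valid UTF-8 (0xFF can never start a
# UTF-8 sequence), so a strict decoder rejects it.
arbitrary = b"\xff\xfe\x00"
try:
    arbitrary.decode("utf-8")
    decodes_as_text = True
except UnicodeDecodeError:
    decodes_as_text = False
```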


> Document schema evolution features
> ----------------------------------
>
>                 Key: THRIFT-5340
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5340
>             Project: Thrift
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Juan Cruz Viotti
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I could not find a section in the documentation outlining the schema 
> evolution/versioning features that Thrift provides.
> In case there is none, I volunteer to write the first draft, as I've been 
> writing a paper involving Apache Thrift as part of my MSc at University of 
> Oxford, and ran plenty of schema evolution experiments.
> Please let me know your thoughts and where would this section fit!
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
