leerho commented on code in PR #162:
URL: 
https://github.com/apache/datasketches-website/pull/162#discussion_r1509520250


##########
docs/Architecture/LargeScale.md:
##########
@@ -33,7 +33,9 @@ layout: doc_page
        * The C++ Core is extended using the python binding library 
[pybind11](https://github.com/pybind/pybind11) enabling high performance 
operation from Python.
 
 ### Cross Language Binary Compatibility
-* Sketches serialized from C++ or Python can be interpreted by compatible Java 
sketches and visa versa. 
+* Sketches serialized from C++ or Python can be interpreted by compatible Java 
sketches and visa versa.
+
+* All sketches have a serialized form which is able to be deserialized by any 
version of the library since the sketch was introduced.
 

Review Comment:
   “Prediction is very difficult, especially if it's about the future!” -- 
Niels Bohr.  
   
   I don't know of any software that guarantees forward compatibility forever, 
which would mean that old code can always read structures created by future 
code.  Even international standards bodies don't guarantee that.  The Java 
language doesn't guarantee that with its class version IDs.  Non-compatible 
changes can occur for lots of reasons, including changes required for security, 
obsolescence of language features, or new capabilities that were not imagined 
when the original code was created.  
   
   We recognize the challenge in large system environments, with different 
languages and different platforms, all potentially using different versions of 
the software.  We are doing our best to at least allow these large environments 
to interchange serialized sketches across languages and platforms efficiently.  
We are not aware of any other open-source sketch library that even provides 
this capability.  Cross-version compatibility of software is a challenge that 
all platforms face in general.  It is up to the platform maintainers to keep 
their software up to date; this is not new, and it is no different here.
   
   Nonetheless, to put your mind somewhat at ease, realize that we have two 
levels of versioning in our library (this is true across all of our languages):
   
   - **Software Version**: This is the release version, published via Apache 
and specified in the POM file or equivalent.  It can change relatively 
frequently based on bug fixes and the introduction of new capabilities.  Here, 
we try very hard to obey the principles of Semantic Versioning as specified by 
[semver.org](https://semver.org).  
   
   - **Serialization Version** (SerVer): This is a small integer placed in the 
preamble of the serialized byte array that indicates the version of the 
serialized structure for the sketch.  A single SerVer may represent multiple 
structures, all based on the same sketch, when stored in different states 
(e.g., Single Item, Compact, Updatable, etc.).  This SerVer changes VERY 
rarely, if at all.  Of all of our sketches, only 3 (Theta, KLL and Sampling) 
have more than one SerVer.  There are and will be many Software Versions of the 
same sketch that still use the same SerVer.  When we are forced (rarely) to 
update the SerVer, the Software Version that introduces the new SerVer also 
provides the ability to read the old SerVer and convert it to the new SerVer.  
This is why our newest Software Versions can still read and interpret older 
SerVer serialized sketches that go back to when our project was started at 
Yahoo (2012), and before we went open-source (2015).  
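
   As a rough illustration of that read-and-convert step, here is a minimal 
Python sketch.  The one-byte SerVer position, the two layouts, and the names 
`deserialize`, `serialize`, and `_READERS` are all hypothetical; they do not 
reflect the actual DataSketches preamble or API.

```python
import struct

CURRENT_SER_VER = 2  # hypothetical: the only layout this build writes


def _read_v1(body: bytes) -> dict:
    # Hypothetical SerVer 1 layout: a single little-endian uint32 count.
    (count,) = struct.unpack_from("<I", body, 0)
    # SerVer 1 had no seed field, so convert by filling in a default.
    return {"count": count, "seed": 0}


def _read_v2(body: bytes) -> dict:
    # Hypothetical SerVer 2 layout: uint32 count followed by uint16 seed.
    count, seed = struct.unpack_from("<IH", body, 0)
    return {"count": count, "seed": seed}


# One reader per SerVer this build understands.
_READERS = {1: _read_v1, 2: _read_v2}


def deserialize(image: bytes) -> dict:
    ser_ver = image[0]  # assumption: SerVer is the first preamble byte
    reader = _READERS.get(ser_ver)
    if reader is None:
        raise ValueError(f"unsupported SerVer {ser_ver}")
    return reader(image[1:])


def serialize(state: dict) -> bytes:
    # Always write the newest layout, whichever SerVer the state was read from.
    return bytes([CURRENT_SER_VER]) + struct.pack(
        "<IH", state["count"], state["seed"]
    )
```

   Reading an old image and writing it back out yields the new layout, so old 
data is never stranded.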
   
   This means that as long as the SerVer is the same, older Software Versions 
should be able to read sketch images created by newer Software Versions.  But 
the APIs may be different, obviously.  An older Software Version will not be 
able to take advantage of new features introduced in newer Software Versions, 
but it should be able to do what it did before.  In other words, there will be 
no loss of access to the serialized sketch and its older Software Version 
capabilities.  As a user, you don't need to worry about the SerVer or be able 
to access it.  If a sketch is presented with a new SerVer that it is not 
compatible with, the sketch should throw an exception and say what the problem 
is, just like Java does.
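
   The failure case might look something like the following Python sketch of an 
older build that only knows SerVer 1.  The exception class, the 
`check_ser_ver` helper, and the assumption that the SerVer sits in the first 
byte are hypothetical and are not the library's actual API.

```python
class UnsupportedSerVerError(ValueError):
    """Raised when a serialized image uses a SerVer this build cannot read."""


# Hypothetical: an older build that understands only SerVer 1.
SUPPORTED_SER_VERS = frozenset({1})


def check_ser_ver(image: bytes) -> int:
    if len(image) == 0:
        raise ValueError("empty sketch image")
    ser_ver = image[0]  # assumption: SerVer is the first preamble byte
    if ser_ver not in SUPPORTED_SER_VERS:
        # Fail loudly and say exactly what is wrong, rather than misread bytes.
        raise UnsupportedSerVerError(
            f"sketch image has SerVer {ser_ver}, but this build supports only "
            f"{sorted(SUPPORTED_SER_VERS)}; upgrade the library to read it"
        )
    return ser_ver
```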



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

