ZacBlanco commented on code in PR #162:
URL:
https://github.com/apache/datasketches-website/pull/162#discussion_r1509226108
##########
docs/Architecture/LargeScale.md:
##########
@@ -33,7 +33,9 @@ layout: doc_page
* The C++ Core is extended using the python binding library
[pybind11](https://github.com/pybind/pybind11) enabling high performance
operation from Python.
### Cross Language Binary Compatibility
-* Sketches serialized from C++ or Python can be interpreted by compatible Java
sketches and visa versa.
+* Sketches serialized from C++ or Python can be interpreted by compatible Java
sketches and visa versa.
+
+* All sketches have a serialized form which is able to be deserialized by any
version of the library since the sketch was introduced.
Review Comment:
Chiming in to add some more motivation for why we want this clarified.
It's common in our world to have many different systems writing and querying
data, e.g. Apache Spark, Flink, [Presto](https://github.com/prestodb/presto),
etc. Our concern is that some of these systems may use different versions of
the datasketches library to generate serialized sketches. We need to know
exactly what guarantees the library makes about the binary format so that we
can guarantee compatibility to the best of our ability, not just across
version-to-version upgrades of one piece of software, but also so that each
system can potentially read sketches generated by the other systems.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]