+1

In <cajpuwmcqpcapjgqwrsfpluzn4smfmlhhrek+qyvnnjdze-j...@mail.gmail.com>
  "[VOTE] Formalizing "Extension Type" metadata in Arrow binary protocol" on Mon, 10 Jun 2019 15:28:22 -0500,
  Wes McKinney <wesmck...@gmail.com> wrote:

> hi folks,
>
> In two mailing list threads [1] [2] we have discussed adding an
> "extension type" mechanism to the Arrow binary/IPC protocol. The idea
> is to be able to "annotate" built-in Arrow data types with a type name
> and serialized type data/metadata so that users can implement their
> own custom columnar data containers that contain application-defined
> business logic not built-in to the Arrow libraries. This is designed
> to be non-obtrusive: readers who are not aware of an extension type
> can interact with the built-in Arrow type opaquely, and propagate the
> extension metadata unmodified
>
> As two examples:
>
> * "uuid" may annotate "fixed size binary of value width 16 bytes"
> * "latitude-longitude" may annotate "struct<lat: double, lon: double>"
> or similar
>
> An implementation may provide specialized columnar containers with
> additional business logic around manipulating such data in-memory as
> required for application development
>
> We also have prototype implementations of this mechanism ready to go
> in C++ and Java. I have proposed language additions to the
> specification [3] and the C++ implementation with the following
> tenets:
>
> - The custom_metadata Flatbuffers field shall use the colon character
> ":" as a namespace separator
> - "ARROW" is designated as a reserved namespace in custom_metadata,
> for example "ARROW:property"
> - There may be multiple levels of namespacing, for example:
> "ARROW:myorg:property_name"
> - Extension type fields "ARROW:extension:name" and
> "ARROW:extension:metadata" are reserved in custom_metadata to enable
> serialization of extension type information
> - The details of implementation and how extension types are exposed to
> library users is implementation dependent
>
> Please vote to accept these changes (see [3] for the actual changes).
> The vote will be open for at least 72 hours
>
> [ ] +1: Adopt these changes into the Arrow columnar format specification
> [ ] +0: . . .
> [ ] -1: I disagree because . . .
>
> Here is my vote: +1
>
> [1]:
> https://lists.apache.org/thread.html/96c3f5fe64f45a4c5ccac0562dbfd356b76cd722aa521100b5988d40@%3Cdev.arrow.apache.org%3E
> [2]:
> https://lists.apache.org/thread.html/f1fc039471a8a9c06f2f9600296a20d4eb3fda379b23685f809118ee@%3Cdev.arrow.apache.org%3E
> [3]: https://github.com/apache/arrow/pull/4332
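
For illustration only, here is a rough sketch of how the key conventions quoted above might look to a reader or writer. It is not the Arrow library API: a plain Python dict stands in for the Flatbuffers custom_metadata key-value list, and every helper name (annotate, split_key, is_reserved, read_extension) is hypothetical. Treating missing "ARROW:extension:metadata" as an empty string is also an assumption, not something the proposal states.

```python
# Hypothetical sketch of the proposed custom_metadata key conventions.
# A plain dict stands in for the Flatbuffers custom_metadata field.

NAMESPACE_SEPARATOR = ":"
RESERVED_NAMESPACE = "ARROW"
EXT_NAME_KEY = "ARROW:extension:name"
EXT_METADATA_KEY = "ARROW:extension:metadata"


def split_key(key):
    """Split a custom_metadata key into its namespace levels."""
    return key.split(NAMESPACE_SEPARATOR)


def is_reserved(key):
    """True if the key's top-level namespace is the reserved "ARROW"."""
    return split_key(key)[0] == RESERVED_NAMESPACE


def annotate(custom_metadata, name, serialized):
    """Return a copy of custom_metadata carrying the extension annotation."""
    out = dict(custom_metadata)
    out[EXT_NAME_KEY] = name
    out[EXT_METADATA_KEY] = serialized
    return out


def read_extension(custom_metadata):
    """Return (name, serialized metadata), or None if not annotated.

    Assumption: absent extension metadata is treated as an empty string.
    """
    name = custom_metadata.get(EXT_NAME_KEY)
    if name is None:
        return None
    return name, custom_metadata.get(EXT_METADATA_KEY, "")


# Annotate a field whose storage type is a 16-byte fixed size binary.
plain = {"myorg:owner": "team-a"}
annotated = annotate(plain, "uuid", "")

# A reader unaware of "uuid" can simply copy the metadata through.
propagated = dict(annotated)
```

A reader that knows "uuid" would call read_extension(propagated) and build its own container from the result; one that does not would keep working with the storage type and pass the keys along untouched.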