The discussion around adding another interval type to Schema.fbs raises
the question of when we should add a new type to Schema.fbs versus using
other means (primarily extension types [1]).

A few criteria come to mind that could help decide (feedback welcome):

1.  Is the type a new parameterization of an existing type?
    - If yes, and we believe the parameterization is useful and can be done
      in a forward/backward compatible manner, then we would update
      Schema.fbs.

2.  Does the type itself have its own specification for processing (e.g.
    JSON, BSON, Thrift, Avro, Protobuf)?
    - If yes, we would NOT add it to Schema.fbs.  I think accepting such
      types would potentially yield too many new types.

3.  Is the underlying encoding of the type already semantically supported
    by an existing type? (e.g. if we want to encode physical lengths like
    meters, these can be represented by an integer).
    - If yes, we would NOT update the specification.  This seems like the
      exact use-case that extension types are meant for.
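
To make criterion 3 concrete, here is a rough sketch of the meters example
expressed as an extension type rather than a Schema.fbs change.  The two
"ARROW:extension:*" keys are the ones the columnar format reserves for
extension types [1]; everything else (the Field class, the name
"example.length", and storing the unit as the metadata string) is a
hypothetical stand-in for illustration, not an actual Arrow API.

```python
from dataclasses import dataclass, field

# Metadata keys reserved by the Arrow columnar format for extension types.
EXT_NAME_KEY = "ARROW:extension:name"
EXT_METADATA_KEY = "ARROW:extension:metadata"


@dataclass
class Field:
    """Toy stand-in for an Arrow field: name, storage type, custom metadata."""
    name: str
    storage_type: str
    custom_metadata: dict = field(default_factory=dict)


def length_field(name: str, unit: str) -> Field:
    """Describe a length column as a plain int64 plus extension metadata."""
    return Field(
        name=name,
        storage_type="int64",  # the existing type that already fits the data
        custom_metadata={
            EXT_NAME_KEY: "example.length",  # hypothetical extension name
            EXT_METADATA_KEY: unit,          # hypothetical serialized parameters
        },
    )


def describe(f: Field) -> str:
    """Readers that know the extension use it; others fall back to storage."""
    if f.custom_metadata.get(EXT_NAME_KEY) == "example.length":
        return f"{f.name}: length in {f.custom_metadata[EXT_METADATA_KEY]}"
    return f"{f.name}: {f.storage_type}"
```

The point of the sketch is that a reader unaware of "example.length" still
sees a valid int64 column, which is why no new Schema.fbs type is needed.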

* How does this apply to Interval? *
Interval extends an existing type in the specification, and multiple "packed
fields" cannot be easily communicated with the current version of the
specification.  Hence, I feel comfortable making the addition to Schema.fbs.

* What does this mean for other common types? *

I think that as very common types come up that we don't want to add to
Schema.fbs, we should invest in formalizing them as "Well Known"
extension types.  In this scenario, we would update the specification to
include how to specify the extension type metadata (and still require that
at least two libraries support the extension type before inclusion as "Well
Known").

* Practical implications *

I think this means the type system in Schema.fbs is mostly closed (i.e.
there is a high bar for adding new types). One potentially useful type to
have would be a "packed struct" that supports something similar to Python's
struct module [2].  I think this would likely cover many extension type
use-cases.
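
As a rough illustration of what "similar to Python's struct module" could
look like, the sketch below packs several fields into one fixed-width value
using a format string.  The three-field layout (int32 months, int32 days,
int64 nanoseconds) and the helper names are assumptions made up for this
example; they are not a proposal for the actual encoding.

```python
import struct

# Hypothetical layout: int32 months, int32 days, int64 nanoseconds.
# "<" selects little-endian byte order with no padding between fields.
INTERVAL_FORMAT = "<iiq"


def pack_interval(months: int, days: int, nanos: int) -> bytes:
    """Pack the three fields into one fixed-width binary value."""
    return struct.pack(INTERVAL_FORMAT, months, days, nanos)


def unpack_interval(buf: bytes) -> tuple:
    """Recover (months, days, nanos) from a packed value."""
    return struct.unpack(INTERVAL_FORMAT, buf)


packed = pack_interval(1, 15, 500)
fields = unpack_interval(packed)
```

A "packed struct" type parameterized by such a format string would let a
fixed-width value carry several fields without a new Schema.fbs entry for
each combination.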

Thoughts?

-Micah

[1] https://arrow.apache.org/docs/format/Columnar.html#extension-types
[2] https://docs.python.org/3/library/struct.html
