I think "inheritance" and "composition" are more a concern for
implementations than for the spec (I could be wrong here).

So it seems it would be sufficient to write HLLSKETCH's canonical
definition as "this is an extension of the JSON logical type and supports
all the same storage types" and then allow implementations to use whatever
inheritance / composition scheme they want behind the scenes.
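
A rough sketch of how that could look on the wire, using plain Python dicts
in place of Arrow field metadata (no Arrow library involved). The helper
name hllsketch_field and the list of storage types are assumptions made for
illustration, not something taken from the spec:

```python
# Plain dicts stand in for Arrow field metadata; no Arrow library is used.
EXT_NAME_KEY = "ARROW:extension:name"
EXT_META_KEY = "ARROW:extension:metadata"

# Assumption: the storage types that the JSON extension type accepts.
JSON_STORAGE_TYPES = ("utf8", "large_utf8", "utf8_view")


def hllsketch_field(storage_type):
    """Describe a hypothetical HLLSKETCH field as it might cross IPC."""
    if storage_type not in JSON_STORAGE_TYPES:
        raise ValueError(f"unsupported storage type: {storage_type}")
    return {
        # The storage type is a core Arrow type, not another extension type.
        "storage_type": storage_type,
        # Only one extension name fits, because metadata keys are unique.
        "metadata": {EXT_NAME_KEY: "HLLSKETCH", EXT_META_KEY: ""},
    }


# Writing a second name under the same key overwrites the first one,
# which is the limitation described in the quoted message below.
metadata = {}
metadata[EXT_NAME_KEY] = "HLLSKETCH"
metadata[EXT_NAME_KEY] = "JSON"
names_kept = len(metadata)

field = hllsketch_field("utf8")
```

Here the relationship to JSON lives only in the written definition of
HLLSKETCH; the IPC metadata carries a single extension name over a core
storage type.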

On Tue, Apr 30, 2024 at 7:47 AM Matt Topol <zotthewiz...@gmail.com> wrote:

> I think the biggest blocker to doing this is the way that we pass extension
> types through IPC. Extension types are sent as their underlying storage
> type with metadata key-value pairs of specific keys "ARROW:extension:name"
> and "ARROW:extension:metadata". Since you can't have multiple values for
> the same key in the metadata, this would prevent the ability to define an
> extension type in terms of another extension type as you wouldn't be able
> to include the metadata for the second-level extension part.
>
> i.e. you'd be able to have "ARROW:extension:name" => "HLLSKETCH", but you
> wouldn't be able to *also* have "ARROW:extension:name" => "JSON" for its
> storage type. So the storage type needs to be a valid core Arrow data type
> for this reason.
>
> On Tue, Apr 30, 2024 at 10:16 AM Ian Cook <ianmc...@apache.org> wrote:
>
> > The vote on adding a JSON canonical extension type [1] got me wondering:
> > Is it possible to define an extension type that is based on a canonical
> > extension type? If so, how?
> >
> > For example, say I wanted to define a (non-canonical) HLLSKETCH extension
> > type that corresponds to the type that Redshift uses for HyperLogLog
> > sketches and is represented as JSON [2]. Is there a way to do this by
> > building on the JSON canonical extension type?
> >
> > [1] https://lists.apache.org/thread/4dw3dnz6rjp5wz2240mn299p51d5tvtq
> > [2] https://docs.aws.amazon.com/redshift/latest/dg/r_HLLSKTECH_type.html
> >
> > Ian
> >
>
