Re: UNION Type in Calcite

Jacques Nadeau Thu, 17 Dec 2015 11:36:08 -0800

Yes, the model is basically bounded dynamic typing. It means that there are
some things we can check but other things we can't do. For example, we
can't/shouldn't implicitly cast an ANY_OF(.,.) value.
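
To make the bounded check concrete, here is a minimal C++ sketch of the validation rule discussed in this thread: a function is accepted for an ANY_OF(...) argument only if its parameter type is one of the listed alternatives, and nothing is implicitly cast. The names AnyOf and acceptsArgument, and the use of plain strings as type names, are assumptions made for illustration; none of this is Calcite API.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of ANY_OF(t1, t2, ...): just the set of allowed types.
// Unlike a C union there are no named members, only the types themselves.
struct AnyOf {
    std::vector<std::string> alternatives;
};

// Validation-time check (hypothetical): a function whose parameter has type
// paramType may take an ANY_OF argument only if paramType is one of the
// alternatives. No implicit cast is attempted for any other type.
bool acceptsArgument(const std::string& paramType, const AnyOf& argType) {
    const auto& alts = argType.alternatives;
    return std::find(alts.begin(), alts.end(), paramType) != alts.end();
}
```

Under these assumptions, a column declared as ANY_OF(VARBINARY, ARRAY(VARBINARY)) passes the check for a function that takes VARBINARY and fails it for one that takes INT or VARCHAR. Which alternative a given row actually holds is still only known at run time.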


On Thu, Dec 17, 2015 at 11:13 AM, Julian Hyde <[email protected]> wrote:

> Alright, now I understand what you want. I don’t know of a programming
> language that has that type system and that is a red flag - we will
> struggle to find a consistent semantics for this.
>
> I know a little about algebraic data types [1] - they are used by Haskell,
> among other languages - and these seem to be the richest type system out
> there (albeit for static typing), but even these use tagged unions.
>
> Julian
>
> [1] https://en.wikipedia.org/wiki/Algebraic_data_type
>
>
> > On Dec 17, 2015, at 10:52 AM, andrew <[email protected]> wrote:
> >
> > Julian,
> >
> > I think the difference is that, whereas in a C union you have a number
> > of named variables, e.g. u.i, in our case we are dealing not with typed
> > variable declarations, but rather with the types themselves. You would not
> > be able to reference into the union’s members with a “.”; there are no
> > members to reference.
> >
> > As Jacques mentioned, I think it’s easier to think of this as akin to
> > ANY. Maybe thinking of it as ANY_OF(INT, ARRAY(INT)) makes it easier to
> > consider.
> >
> > If you want to keep the definition of UNION closer to that of a C
> > struct, then perhaps we can:
> >
> > A) Add the ANY_OF type
> > B) Modify ANY to be parametrizable with zero or more types
> >
> > - A
> >
> >
> >> On Dec 16, 2015, at 8:54 PM, Julian Hyde <[email protected]> wrote:
> >>
> >> Jacques,
> >>
> >> What kind of a union type were you thinking of? I was thinking of
> >> something like a C union, where you still need to use a field to
> >> indicate which sub-type you want. In C if you have
> >>
> >> typedef union {
> >>   int i;
> >>   double d;
> >> } u;
> >>
> >> void foo(u);
> >> void foo(int);
> >> void foo(double);
> >>
> >> and you write
> >>
> >> union u;
> >> foo(u);
> >>
> >> then the first "foo" gets called, and if you write
> >>
> >> foo(u.d);
> >>
> >> then the last "foo" gets called.
> >>
> >> The only difference between union and struct in C is that in the
> >> union, the members occupy the same storage. So what I'm proposing for
> >> Calcite is basically a struct.
> >>
> >> Julian
> >>
> >>
> >> On Wed, Dec 16, 2015 at 6:36 PM, Jacques Nadeau <[email protected]> wrote:
> >>> I don't think it would. We want to still do validation and we won't be
> >>> returning a struct, we'll be returning one or the other. Think of this as a
> >>> narrowing of the ANY type to a subset of known possibilities. Something
> >>> that fits a particular possibility should be allowed but for example, in
> >>> the (varbinary, varbinary[]) case, you should be able to use functions that
> >>> only support varbinary or varbinary[] but not functions that expect int or
> >>> varchar.
> >>>
> >>> Without specifically stating c.i or c.ai, you wouldn't get any validation.
> >>> Additionally, I'd expect the validator to reject valid expressions such as
> >>> c + 4 (where c is the union field).
> >>>
> >>> On Wed, Dec 16, 2015 at 4:23 PM, andrew <[email protected]> wrote:
> >>>
> >>>> That sounds like it might fit the bill. Thanks, Julian.
> >>>>
> >>>>
> >>>>> On Dec 16, 2015, at 1:30 PM, Julian Hyde <[email protected]> wrote:
> >>>>>
> >>>>> You could declare a STRUCT(i INT, ai ARRAY(INT)) and make sure exactly
> >>>>> one of i and ai is set.
> >>>>>
> >>>>>> On Dec 16, 2015, at 10:51 AM, andrew <[email protected]> wrote:
> >>>>>>
> >>>>>> I’m wondering if it would be possible to add a UNION type in Calcite.
> >>>>>>
> >>>>>> My use case is that I am developing a backend storage engine using
> >>>>>> Calcite that only partially declares its schema. Specifically, the engine
> >>>>>> declares columns such as ‘int’ when it actually means the column can
> >>>>>> contain ‘int’ or ‘array of int’. There is no way to tell without actually
> >>>>>> reading the table if the data is scalar or array. Indeed it can be both.
> >>>>>>
> >>>>>> My idea is that if Calcite had a UNION type, I could declare the
> >>>>>> columns as e.g. UNION(INT, ARRAY(INT)).
> >>>>>>
> >>>>>> Does this sound reasonable? Or is there a better way of handling this
> >>>>>> situation using the current features of Calcite?
> >>>>>>
> >>>>>> Thanks.
> >>>>>> - A
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >
>
>
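
The overloading example quoted in this thread can be tried out roughly as follows. C itself has no function overloading, so this sketch assumes C++; the union is named U, the variable is named u, and each overload returns a label so the chosen one can be observed. The helper names callWithUnion and callWithMember are hypothetical and are not part of the original example.

```cpp
#include <cassert>
#include <string>

// An untagged union: both members occupy the same storage, and the caller
// has to name a member (u.i or u.d) to say which sub-type is meant.
union U {
    int i;
    double d;
};

// Three overloads, as in the quoted example. Each returns a label
// identifying itself.
std::string foo(U) { return "foo(U)"; }
std::string foo(int) { return "foo(int)"; }
std::string foo(double) { return "foo(double)"; }

// Passing the union itself selects the overload that takes U.
std::string callWithUnion() {
    U u;
    u.d = 1.5;
    return foo(u);
}

// Naming the member d selects the overload that takes double.
std::string callWithMember() {
    U u;
    u.d = 1.5;
    return foo(u.d);
}
```

Here the choice of overload is fixed at compile time by the member the caller names. That is the difference from the ANY_OF idea, where there are no members to name and the actual alternative is only known when the data is read.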
