Yes, the model is basically bounded dynamic typing. That means there are some things we can check and others we can't. For example, we can't (and shouldn't) implicitly cast an ANY_OF(.,.) value.
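
To make "bounded dynamic typing" concrete, here is a minimal sketch in Python. It is not Calcite code: AnyOf, any_of, accepts, runtime_type and call are hypothetical names, and types are modelled as plain strings. The assumption is that the validator accepts a call only when the parameter type is exactly one of the declared alternatives (no implicit casts), and that the actual alternative is only known once the value is read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnyOf:
    """Hypothetical ANY_OF(...) type: a value is exactly one of the alternatives."""
    alternatives: frozenset


def any_of(*types):
    """Build an ANY_OF type from type names such as "INT" or "ARRAY(INT)"."""
    return AnyOf(frozenset(types))


def accepts(param_type, arg_type):
    """Validation-time check: may a function taking param_type be applied?"""
    if isinstance(arg_type, AnyOf):
        # Bounded: only the declared alternatives are allowed, and no
        # implicit cast (e.g. INT to BIGINT) is attempted.
        return param_type in arg_type.alternatives
    return param_type == arg_type


def runtime_type(value):
    """Type of an actual value; only known once the row has been read."""
    return "ARRAY(INT)" if isinstance(value, list) else "INT"


def call(param_type, declared_type, value):
    """Validate against the declared type, then check the actual value."""
    if not accepts(param_type, declared_type):
        raise TypeError("rejected by the validator")
    if runtime_type(value) != param_type:
        # The part the validator cannot check ahead of time.
        raise TypeError("rejected at run time")
    return "ok"
```

With c declared as any_of("INT", "ARRAY(INT)"), a function that takes INT or ARRAY(INT) passes validation, while one that takes VARCHAR or BIGINT is rejected up front. A call that passes validation can still fail at run time if the row holds the other alternative.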

On Thu, Dec 17, 2015 at 11:13 AM, Julian Hyde <[email protected]> wrote:

> Alright, now I understand what you want. I don’t know of a programming
> language that has that type system and that is a red flag - we will
> struggle to find a consistent semantics for this.
>
> I know a little about algebraic data types [1] - they are used by Haskell,
> among other languages - and these seem to be the richest type system out
> there (albeit for static typing), but even these use tagged unions.
>
> Julian
>
> [1] https://en.wikipedia.org/wiki/Algebraic_data_type
>
>
> > On Dec 17, 2015, at 10:52 AM, andrew <[email protected]> wrote:
> >
> > Julien,
> >
> > I think the difference is that, whereas in a C union you have a number
> > of named variables, e.g. u.i, in our case we are dealing not with typed
> > variable declarations, but rather with the types themselves. You would
> > not be able to reference into the union’s members with a “.”; there are
> > no members to reference.
> >
> > As Jacques mentioned, I think it’s easier to think of this as akin to
> > ANY. Maybe thinking of it as ANY_OF(INT, ARRAY(INT)) makes it easier to
> > consider.
> >
> > If you want to keep the definition of UNION closer to that of a C
> > struct, then perhaps we can:
> >
> > A) Add the ANY_OF type
> > B) Modify ANY to be parametrizable with zero of more types
> >
> > - A
> >
> >
> >> On Dec 16, 2015, at 8:54 PM, Julian Hyde <[email protected]> wrote:
> >>
> >> Jacques,
> >>
> >> What kind of a union type were you thinking of? I was thinking of
> >> something like a C union, where you still need to use a field to
> >> indicate which sub-type you want. In C if you have
> >>
> >> typedef union {
> >>   int i;
> >>   double d;
> >> } u;
> >>
> >> void foo(u);
> >> void foo(int);
> >> void foo(double);
> >>
> >> and you write
> >>
> >> union u;
> >> foo(u);
> >>
> >> then the first "foo" gets called, and if you write
> >>
> >> foo(u.d);
> >>
> >> then the last "foo" gets called.
> >>
> >> The only difference between union and struct in C is that in the
> >> union, the members occupy the same storage. So what I'm proposing for
> >> Calcite is basically a struct.
> >>
> >> Julian
> >>
> >>
> >> On Wed, Dec 16, 2015 at 6:36 PM, Jacques Nadeau <[email protected]> wrote:
> >>> I don't think it would. We want to still do validation and we won't be
> >>> returning a struct, we'll be returning one or the other. Think of this
> >>> as a narrowing of the ANY type to a subset of known possibilities.
> >>> Something that fits a particular possibility should be allowed but for
> >>> example, in the (varbinary, varbinary[]) case, you should be able to
> >>> use functions that only support varbinary or varbinary[] but not
> >>> functions that expect int or varchar.
> >>>
> >>> Without specifically stating c.i or c.ai, you wouldn't get any
> >>> validation. Additionally, I'd expect the validator to reject valid
> >>> expressions such as c + 4 (where c is the union field).
> >>>
> >>> On Wed, Dec 16, 2015 at 4:23 PM, andrew <[email protected]> wrote:
> >>>
> >>>> That sounds like it might fit the bill. Thanks Julien.
> >>>>
> >>>>
> >>>>> On Dec 16, 2015, at 1:30 PM, Julian Hyde <[email protected]> wrote:
> >>>>>
> >>>>> You could declare a STRUCT(i INT, ai ARRAY(INT)) and make sure
> >>>>> exactly one of i and ai is set.
> >>>>>
> >>>>>> On Dec 16, 2015, at 10:51 AM, andrew <[email protected]> wrote:
> >>>>>>
> >>>>>> I’m wondering if it would be possible to add a UNION type in
> >>>>>> Calcite.
> >>>>>>
> >>>>>> My use case is that I am developing a backend storage engine using
> >>>>>> Calcite that only partially declares its schema. Specifically, the
> >>>>>> engine declares columns such as ‘int’ when it actually means the
> >>>>>> column can contain ‘int’ or ‘array of int’. There is no way to tell
> >>>>>> without actually reading the table if the data is scalar or array.
> >>>>>> Indeed it can be both.
> >>>>>>
> >>>>>> My idea is that if Calcite had a UNION type, I could declare the
> >>>>>> columns as e.g. UNION(INT, ARRAY(INT)).
> >>>>>>
> >>>>>> Does this sound reasonable? Or is there a better way of handling
> >>>>>> this situation using the current features of Calcite?
> >>>>>>
> >>>>>> Thanks.
> >>>>>> - A
