Alright, now I understand what you want. I don’t know of a programming language that has that kind of type system, and that is a red flag: we will struggle to find consistent semantics for it.
I know a little about algebraic data types [1] - they are used by Haskell, among other languages - and these seem to be the richest type system out there (albeit for static typing), but even these use tagged unions.

Julian

[1] https://en.wikipedia.org/wiki/Algebraic_data_type

> On Dec 17, 2015, at 10:52 AM, andrew <[email protected]> wrote:
>
> Julian,
>
> I think the difference is that, whereas in a C union you have a number of
> named variables, e.g. u.i, in our case we are dealing not with typed variable
> declarations, but rather with the types themselves. You would not be able to
> reference into the union’s members with a “.”; there are no members to
> reference.
>
> As Jacques mentioned, I think it’s easier to think of this as akin to ANY.
> Maybe thinking of it as ANY_OF(INT, ARRAY(INT)) makes it easier to consider.
>
> If you want to keep the definition of UNION closer to that of a C struct,
> then perhaps we can:
>
> A) Add the ANY_OF type
> B) Modify ANY to be parametrizable with zero or more types
>
> - A
>
>
>> On Dec 16, 2015, at 8:54 PM, Julian Hyde <[email protected]> wrote:
>>
>> Jacques,
>>
>> What kind of a union type were you thinking of? I was thinking of
>> something like a C union, where you still need to use a field to
>> indicate which sub-type you want. In C if you have
>>
>> typedef union {
>>   int i;
>>   double d;
>> } u;
>>
>> void foo(u);
>> void foo(int);
>> void foo(double);
>>
>> and you write
>>
>> union u;
>> foo(u);
>>
>> then the first "foo" gets called, and if you write
>>
>> foo(u.d);
>>
>> then the last "foo" gets called.
>>
>> The only difference between union and struct in C is that in the
>> union, the members occupy the same storage. So what I'm proposing for
>> Calcite is basically a struct.
>>
>> Julian
>>
>>
>> On Wed, Dec 16, 2015 at 6:36 PM, Jacques Nadeau <[email protected]> wrote:
>>> I don't think it would. We want to still do validation and we won't be
>>> returning a struct, we'll be returning one or the other. Think of this as a
>>> narrowing of the ANY type to a subset of known possibilities. Something
>>> that fits a particular possibility should be allowed but for example, in
>>> the (varbinary, varbinary[]) case, you should be able to use functions that
>>> only support varbinary or varbinary[] but not functions that expect int or
>>> varchar.
>>>
>>> Without specifically stating c.i or c.ai, you wouldn't get any validation.
>>> Additionally, I'd expect the validator to reject valid expressions such as
>>> c + 4 (where c is the union field).
>>>
>>> On Wed, Dec 16, 2015 at 4:23 PM, andrew <[email protected]> wrote:
>>>
>>>> That sounds like it might fit the bill. Thanks Julian.
>>>>
>>>>
>>>>> On Dec 16, 2015, at 1:30 PM, Julian Hyde <[email protected]> wrote:
>>>>>
>>>>> You could declare a STRUCT(i INT, ai ARRAY(INT)) and make sure exactly
>>>>> one of i and ai is set.
>>>>>
>>>>>> On Dec 16, 2015, at 10:51 AM, andrew <[email protected]> wrote:
>>>>>>
>>>>>> I’m wondering if it would be possible to add a UNION type in Calcite.
>>>>>>
>>>>>> My use case is that I am developing a backend storage engine using
>>>>>> Calcite that only partially declares its schema. Specifically, the engine
>>>>>> declares columns such as ‘int’ when it actually means the column can
>>>>>> contain ‘int’ or ‘array of int’. There is no way to tell without actually
>>>>>> reading the table if the data is scalar or array. Indeed it can be both.
>>>>>>
>>>>>> My idea is that if Calcite had a UNION type, I could declare the
>>>>>> columns as e.g. UNION(INT, ARRAY(INT)).
>>>>>>
>>>>>> Does this sound reasonable? Or is there a better way of handling this
>>>>>> situation using the current features of Calcite?
>>>>>>
>>>>>> Thanks.
>>>>>> - A
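
To make the "tagged union" point in this thread concrete, here is a minimal C sketch of a value that holds either an int or an array of int, with an explicit tag recording which one is stored. All names (val_tag, int_or_array, make_int, make_array, sum) are hypothetical and exist only for illustration; none of this is Calcite API, and it is not a proposed implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration only; these names are not part of Calcite. */

/* The tag records which alternative is currently stored. */
typedef enum { VAL_INT, VAL_INT_ARRAY } val_tag;

typedef struct {
    val_tag tag;
    union {
        int i;                  /* valid when tag == VAL_INT */
        struct {
            const int *items;   /* borrowed pointer, not owned */
            size_t len;
        } ai;                   /* valid when tag == VAL_INT_ARRAY */
    } as;
} int_or_array;

/* Build the scalar alternative. */
int_or_array make_int(int i) {
    int_or_array v;
    v.tag = VAL_INT;
    v.as.i = i;
    return v;
}

/* Build the array alternative; the caller keeps ownership of items. */
int_or_array make_array(const int *items, size_t len) {
    int_or_array v;
    v.tag = VAL_INT_ARRAY;
    v.as.ai.items = items;
    v.as.ai.len = len;
    return v;
}

/* Sum the contents, dispatching on the tag to pick the right member. */
long sum(int_or_array v) {
    switch (v.tag) {
    case VAL_INT:
        return v.as.i;
    case VAL_INT_ARRAY: {
        long total = 0;
        for (size_t k = 0; k < v.as.ai.len; k++) {
            total += v.as.ai.items[k];
        }
        return total;
    }
    }
    assert(false && "unknown tag");
    return 0;
}
```

In this sketch every consumer has to inspect the tag before touching the payload, as the switch in sum does. That is roughly the sense in which algebraic data types are tagged, whereas the UNION(INT, ARRAY(INT)) idea discussed above would expose no members or tag to reference.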
