Re: uniontypes and Spark

Ryan Schachte Fri, 31 Jul 2020 11:57:43 -0700

Hi Owen,
Just to clarify: in my parser, as I map fields from Avro to their ORC
equivalents, any time I encounter a union type on the Avro side I should map
it to a struct type on the ORC side, and the N-1 out of the N fields is to get
rid of the null?
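
A minimal sketch of the mapping being discussed, working purely on ORC schema
strings with the Java standard library. The class name `UnionToStruct`, its
methods, and the generated field names `field0`, `field1`, ... are hypothetical
choices for illustration. They are not part of the ORC or Spark APIs, and a real
fix might name the fields differently or carry an extra tag field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of ORC or Spark.
class UnionToStruct {
    private static final String UNION = "uniontype<";

    // Rewrites every uniontype<...> in a schema string as a struct with
    // one field per variant (field names are an assumption of this sketch).
    static String convert(String schema) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < schema.length()) {
            if (schema.startsWith(UNION, i)) {
                int open = i + UNION.length() - 1;   // index of the '<'
                int close = matchingClose(schema, open);
                List<String> variants =
                    splitTopLevel(schema.substring(open + 1, close));
                out.append("struct<");
                for (int v = 0; v < variants.size(); v++) {
                    if (v > 0) out.append(',');
                    // Recurse so nested unions are rewritten as well.
                    out.append("field").append(v).append(':')
                       .append(convert(variants.get(v)));
                }
                out.append('>');
                i = close + 1;
            } else {
                out.append(schema.charAt(i++));
            }
        }
        return out.toString();
    }

    // Returns the index of the '>' that closes the '<' at position open.
    static int matchingClose(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '<') depth++;
            else if (c == '>' && --depth == 0) return i;
        }
        throw new IllegalArgumentException("unbalanced '<' in schema: " + s);
    }

    // Splits on commas that are not nested inside <...> or (...).
    static List<String> splitTopLevel(String s) {
        List<String> parts = new ArrayList<>();
        int depth = 0, start = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '<' || c == '(') depth++;
            else if (c == '>' || c == ')') depth--;
            else if (c == ',' && depth == 0) {
                parts.add(s.substring(start, i));
                start = i + 1;
            }
        }
        parts.add(s.substring(start));
        return parts;
    }

    // Builds the struct values for one row: only the slot of the variant
    // that is actually present is filled, the other N-1 slots stay null.
    static Object[] toStructRow(int variantCount, int tag, Object value) {
        Object[] row = new Object[variantCount];
        row[tag] = value;
        return row;
    }

    public static void main(String[] args) {
        // prints struct<taxPercent:struct<field0:int,field1:float>>
        System.out.println(convert("struct<taxPercent:uniontype<int,float>>"));
    }
}
```

This only rewrites the schema text. The writer would also have to emit each
row in the struct layout, for example `toStructRow(2, 1, 7.5f)` for a float
value of `uniontype<int,float>`, which leaves the `int` slot null.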


Best,
Ryan

On Fri, Jul 31, 2020 at 10:52 AM Owen O'Malley <owen.omal...@gmail.com>
wrote:

> Ryan,
>    I did just look at that code in Spark last week. The problem as you
> correctly surmised is that Spark doesn't have a uniontype. I think we
> probably need a fix that converts the uniontype into a struct for Spark.
> In such a translation, you would have fields for each variant of the
> union and N-1 of the N fields would be null for each row.
>
> .. Owen
>
> On Thu, Jul 30, 2020 at 9:19 AM Ryan Schachte <coderyanschac...@gmail.com>
> wrote:
>
> > I am writing ORC binaries in Java and they deserialize perfectly with the
> > Apache ORC jar from the docs that I've used to validate the data. The
> > schemas look good, etc.
> >
> > When reading this information via Spark, we are encountering failures -
> > in particular
> >
> > mismatched input '<' expecting '>'(line 1, pos 6569)
> > taxPercent:uniontype<int,float>,
> >
> > Does Spark support uniontypes like this? Just curious what some
> > plausible workarounds could be.
> > Thanks.
> >
>
