Re: [Spark SQL] error in performing dataset union with complex data type (struct, list)

Alessandro Solimando Sun, 03 Jun 2018 07:37:06 -0700

Hi Pranav,
I don't have an answer to your issue, but what I generally do in these cases
is first simplify the problem to the point where it is easier to see what's
going on, and then add the "pieces" back one by one until I spot the
error.


In your case I would suggest:

1) project the dataset to the problematic column only (column 21 from your
log)
2) use the explode function to get one array element per row
3) flatten the struct

At each step, use printSchema() to double-check that the types are what you
expect and that they are the same for both datasets.
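
The three steps and the schema check above could be sketched roughly as
follows. This is only an illustration, assuming PySpark and DataFrames for
both sides of the union; isolate_column and diff_schema are hypothetical
helper names, not part of Spark. The second helper compares the dictionaries
returned by schema.jsonValue(), which also carry nullability flags and field
metadata that the compact type string in the error message does not show.
Whether any of those differ in this particular case is not known.

```python
from typing import Any, List


def isolate_column(df, column: str):
    """Apply steps 1-3 to one array<struct<...>> column, printing each schema.

    `df` is assumed to be a PySpark DataFrame. pyspark is imported inside the
    function so that diff_schema below can be used without Spark installed.
    """
    from pyspark.sql.functions import col, explode

    only = df.select(col(column))                               # 1) project
    only.printSchema()
    exploded = only.select(explode(col(column)).alias("item"))  # 2) one element per row
    exploded.printSchema()
    flat = exploded.select("item.*")                            # 3) flatten the struct
    flat.printSchema()
    return flat


def diff_schema(a: Any, b: Any, path: str = "root") -> List[str]:
    """Recursively compare two schema fragments (dicts, lists, or leaves as
    produced by schema.jsonValue()) and list every path where they differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        out: List[str] = []
        for key in sorted(set(a) | set(b)):
            if key not in a or key not in b:
                out.append(f"{path}.{key}: present on one side only")
            else:
                out.extend(diff_schema(a[key], b[key], f"{path}.{key}"))
        return out
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return [f"{path}: {len(a)} entries vs {len(b)}"]
        out = []
        for i, (x, y) in enumerate(zip(a, b)):
            out.extend(diff_schema(x, y, f"{path}[{i}]"))
        return out
    # Leaves: type names, booleans such as nullable/containsNull, field names.
    return [] if a == b else [f"{path}: {a!r} != {b!r}"]
```

With two DataFrames ds1 and ds2, calling
diff_schema(ds1.schema.jsonValue(), ds2.schema.jsonValue()) returns an empty
list when the schemas are identical, and otherwise one line per differing
attribute. The same call can be repeated on the results of isolate_column for
each side to narrow the comparison to the single problematic column.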

Best regards,
Alessandro

On 2 June 2018 at 19:48, Pranav Agrawal <pranav.mn...@gmail.com> wrote:

> can't get around this error when performing union of two datasets
> (ds1.union(ds2)) having complex data type (struct, list),
>
>
> *18/06/02 15:12:00 INFO ApplicationMaster: Final app status: FAILED,
> exitCode: 15, (reason: User class threw exception:
> org.apache.spark.sql.AnalysisException: Union can only be performed on
> tables with the compatible column types.
> array<struct<id:int,booking_id:int,shifting_status:int,shifting_reason:int,shifting_metadata:string>>
> <>
> array<struct<id:int,booking_id:int,shifting_status:int,shifting_reason:int,shifting_metadata:string>>
> at the 21th column of the second table;;*
> As far as I can tell, they are the same. What am I doing wrong? Any help /
> workaround appreciated!
>
> spark version: 2.2.1
>
> Thanks,
> Pranav
>
