Unfortunately, this is a query for which we don't have an efficient
implementation yet.  You might try switching the table order.

Here is the JIRA for doing something more efficient:
https://issues.apache.org/jira/browse/SPARK-2212
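
One reading of "switching the table order" (an assumption, since the exact
rewrite is not spelled out here) is to express `a LEFT JOIN b` as the
equivalent `b RIGHT JOIN a`, so the result stays the same while the tables
appear in the opposite positions. The sketch below is plain Python with no
Spark involved; the helper names and the sample rows are hypothetical and
only illustrate that the two forms return the same rows. Whether the swap
actually avoids broadcasting the large table depends on how the planner
picks the broadcast side.

```python
from collections import Counter


def left_join(left, right, key):
    """Nested-loop LEFT OUTER JOIN over lists of dicts: keep every row of `left`."""
    right_cols = {c for row in right for c in row} - {key}
    out = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            out.extend({**l, **r} for r in matches)
        else:
            # No match: pad the right-hand columns with None.
            out.append({**l, **{c: None for c in right_cols}})
    return out


def right_join(left, right, key):
    """Nested-loop RIGHT OUTER JOIN over lists of dicts: keep every row of `right`."""
    left_cols = {c for row in left for c in row} - {key}
    out = []
    for r in right:
        matches = [l for l in left if l[key] == r[key]]
        if matches:
            out.extend({**l, **r} for l in matches)
        else:
            # No match: pad the left-hand columns with None.
            out.append({**{c: None for c in left_cols}, **r})
    return out


def canonical(rows):
    """Order-independent multiset of rows, so two results can be compared."""
    return Counter(frozenset(row.items()) for row in rows)


# Hypothetical sample tables sharing the join column "id".
users = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
orders = [{"id": 1, "item": "x"}, {"id": 1, "item": "y"}]

original = left_join(users, orders, "id")    # users LEFT JOIN orders
swapped = right_join(orders, users, "id")    # orders RIGHT JOIN users
same_rows = canonical(original) == canonical(swapped)
```

In SQL terms, this corresponds to changing `FROM users LEFT JOIN orders ON ...`
into `FROM orders RIGHT JOIN users ON ...` with the same join condition.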


On Fri, Jul 18, 2014 at 7:05 AM, Pei-Lun Lee <pl...@appier.com> wrote:

> Hi,
>
> We have a query with a left join and got this error:
>
> Caused by: org.apache.spark.SparkException: Job aborted due to stage
> failure: Task 1.0:0 failed 4 times, most recent failure: Exception failure
> in TID 5 on host ip-10-33-132-101.us-west-2.compute.internal:
> com.esotericsoftware.kryo.KryoException: Buffer overflow. Available: 0,
> required: 1
>
> It looks like Spark SQL tried to do a broadcast join by collecting one of
> the tables to the master, but the table is too large.
>
> How can we explicitly control the join behavior in a case like this?
>
> --
> Pei-Lun Lee
>
>
