Hi Mathew, thanks for answering this. I've also tried with a simple case
class, and it works fine.
I'm using this case class structure, which is failing:
import java.text.SimpleDateFormat
import java.util.Calendar
import scala.annotation.tailrec
trait TabbedToString {
_: Product =>
override
Hi,
I have tried a simple test like this:
case class A(id: Long)
val sample = spark.range(0,10).as[A]
sample.createOrReplaceTempView("sample")
val df = spark.emptyDataset[A]
val df1 = spark.sql("select * from sample").as[A]
df.union(df1)
It runs OK. As for nullability, I thought that issue has
Ok, great,
Well, I haven't provided a good example of what I'm doing. Let's assume that
my case class is
case class A(tons of fields, with sub classes)
val df = sqlContext.sql("select * from a").as[A]
val df2 = spark.emptyDataset[A]
df.union(df2)
This code will throw the exception.
Is this
Yes, unfortunately. This should actually be fixed: the union's schema
should take the less restrictive nullability of the two DataFrames.
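
Until that is fixed, one possible workaround is to relax nullability on both sides before the union. The sketch below is only an illustration under stated assumptions: `RelaxNullability`, `relax` and `withRelaxedSchema` are hypothetical names, not part of Spark, and nothing in this thread confirms that this resolves the error for the case class in question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types._

// Hypothetical helper object (not part of Spark).
object RelaxNullability {
  // Recursively mark every struct field, array element and map value as nullable.
  def relax(dt: DataType): DataType = dt match {
    case s: StructType =>
      StructType(s.fields.map(f => f.copy(dataType = relax(f.dataType), nullable = true)))
    case a: ArrayType =>
      ArrayType(relax(a.elementType), containsNull = true)
    case m: MapType =>
      MapType(relax(m.keyType), relax(m.valueType), valueContainsNull = true)
    case other => other
  }

  // Rebuild a DataFrame from its rows using the relaxed schema.
  def withRelaxedSchema(spark: SparkSession, df: DataFrame): DataFrame =
    spark.createDataFrame(df.rdd, relax(df.schema).asInstanceOf[StructType])
}
```

Applying `withRelaxedSchema` to both DataFrames should leave them with identical nullability flags, so the union then has one schema to agree on. Note that going through `df.rdd` adds a conversion step, so this may cost some performance on large inputs.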
On Mon, May 8, 2017 at 12:46 PM, Dirceu Semighini Filho <
dirceu.semigh...@gmail.com> wrote:
Hi Burak,
By nullability, do you mean that if I have exactly the same schema, but one
side supports null and the other doesn't, this exception (in the Dataset union)
will be thrown?
2017-05-08 16:41 GMT-03:00 Burak Yavuz:
I also want to add that generally these may be caused by the `nullability`
field in the schema.
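
To check whether nullability is the culprit, the two schemas can be compared field by field. This is a minimal sketch, assuming both schemas have the same fields in the same order; `SchemaDiff` and `nullabilityDiff` are hypothetical names introduced here, not Spark APIs.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper object (not part of Spark).
object SchemaDiff {
  // Lists the fields whose nullable flag differs between two schemas.
  // Assumes the same field order and count on both sides; extra trailing
  // fields on either side are ignored by zip. Only nested structs are
  // descended into, not arrays or maps.
  def nullabilityDiff(left: StructType, right: StructType, prefix: String = ""): Seq[String] =
    left.fields.toSeq.zip(right.fields.toSeq).flatMap { case (l, r) =>
      val path = prefix + l.name
      val here =
        if (l.nullable != r.nullable) Seq(s"$path: ${l.nullable} vs ${r.nullable}")
        else Seq.empty[String]
      val nested = (l.dataType, r.dataType) match {
        case (ls: StructType, rs: StructType) => nullabilityDiff(ls, rs, path + ".")
        case _ => Seq.empty[String]
      }
      here ++ nested
    }
}
```

With the datasets from the original question, this could be called as `SchemaDiff.nullabilityDiff(ds.schema, ds1.schema).foreach(println)`. Calling `ds.printSchema()` and `ds1.printSchema()` and comparing the output by eye gives the same information.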
On Mon, May 8, 2017 at 12:25 PM, Shixiong(Ryan) Zhu wrote:
This is because RDD.union doesn't check the schema, so you won't see the
problem unless you run the RDD and hit the incompatible column. With an
RDD, you may not see any error if you don't use the incompatible column.
Dataset.union requires a compatible schema. You can print ds.schema and
On May 8, 2017, at 11:07 AM, Dirceu Semighini Filho wrote:
Hello,
I have a very complex case class structure with a lot of fields.
When I try to union two datasets of this class, it fails with the
following error:
ds.union(ds1)
Exception in thread "main" org.apache.spark.sql.AnalysisException: Union
can only be performed on tables with the