Perfect! That's what I was looking for.
Thanks, Sun!
On Tue, Aug 2, 2016 at 6:58 PM, Sun Rui wrote:
> import org.apache.spark.sql.catalyst.encoders.RowEncoder
> implicit val encoder = RowEncoder(df.schema)
> df.mapPartitions(_.take(1))
>
> On Aug 3, 2016, at 04:55, Dragisa Krsmanovic wrote:
import org.apache.spark.sql.catalyst.encoders.RowEncoder
implicit val encoder = RowEncoder(df.schema)
df.mapPartitions(_.take(1))
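For completeness, a self-contained sketch of this approach, assuming Spark 2.x and a local SparkSession (the session setup and app name here are illustrative, not part of the original reply):

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.catalyst.encoders.RowEncoder

val spark = SparkSession.builder()
  .appName("row-encoder-example") // illustrative name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df: DataFrame = Seq((1, "one"), (2, "two")).toDF("id", "name")

// A DataFrame is a Dataset[Row], and mapPartitions needs an implicit
// Encoder[Row] in scope; RowEncoder derives one from the schema.
implicit val encoder = RowEncoder(df.schema)

// Keep only the first Row of each partition.
df.mapPartitions(_.take(1)).show()
```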
> On Aug 3, 2016, at 04:55, Dragisa Krsmanovic wrote:
>
> I am trying to use mapPartitions on DataFrame.
>
> Example:
>
> import spark.implicits._
> val df: DataFrame = Seq((1,"one"), (2, "two")).toDF("id", "name")
You are converting DataFrame to Dataset[Entry].
DataFrame is Dataset[Row].
mapPartitions works fine with a simple Dataset. Just not with DataFrame.
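For contrast, a minimal sketch of the two cases in the same spark-shell session (Spark 2.x assumed):

```scala
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import spark.implicits._

case class Entry(id: Integer, name: String)

// Typed Dataset: an Encoder[Entry] is derived automatically for case
// classes via spark.implicits._, so mapPartitions compiles as-is.
val ds = Seq((1, "one"), (2, "two")).toDF("id", "name").as[Entry]
ds.mapPartitions(_.take(1))

// Untyped DataFrame (Dataset[Row]): no Encoder[Row] is derived
// automatically, so this only compiles once one is supplied explicitly:
val df = ds.toDF()
implicit val rowEncoder = RowEncoder(df.schema)
df.mapPartitions(_.take(1))
```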
On Tue, Aug 2, 2016 at 4:50 PM, Ted Yu wrote:
> Using spark-shell of master branch:
>
> scala> case class Entry(id: Integer, name: String)
> defined class Entry
Using spark-shell of master branch:
scala> case class Entry(id: Integer, name: String)
defined class Entry
scala> val df = Seq((1,"one"), (2, "two")).toDF("id", "name").as[Entry]
16/08/02 16:47:01 DEBUG package$ExpressionCanonicalizer:
=== Result of Batch CleanExpressions ===
!assertnotnull(inpu
I am trying to use mapPartitions on DataFrame.
Example:
import spark.implicits._
val df: DataFrame = Seq((1,"one"), (2, "two")).toDF("id", "name")
df.mapPartitions(_.take(1))
I am getting:
Unable to find encoder for type stored in a Dataset. Primitive types (Int,
String, etc) and Product types (case classes) are supported by importing
spark.implicits._ Support for serializing other types will be added in future
releases.