Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Yang
Ah, thanks, your code also works for me. I figured the failure was because I tried to encode a tuple of (MyClass, Int):

package org.apache.spark

/** */
import org.apache.spark.sql.catalyst.util.{ArrayData, GenericArrayData}
import org.apache.spark.sql.types._
import org.apache.spark.sql.{Encoders,

Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Michael Armbrust
Must be a bug. This works for me in Spark 2.1.

On Tue, May 9, 2017 at 12:10 PM, Yang wrote:
> somehow the

Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Yang
Somehow the schema check is here:
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala#L697-L750

Supposedly beans are handled there, but it's not clear to me which line handles bean types. If that's clear, I could

Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Yang
2.0.2 with Scala 2.11

On Tue, May 9, 2017 at 11:30 AM, Michael Armbrust wrote:
> Which version of Spark?
>
> On Tue, May 9, 2017 at 11:28 AM, Yang wrote:
>> actually with var it's the same:
>>
>> scala> class Person4 {
>>      |
>>      |

Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Michael Armbrust
Which version of Spark?

On Tue, May 9, 2017 at 11:28 AM, Yang wrote:
> actually with var it's the same:
>
> scala> class Person4 {
>      |
>      |   @scala.beans.BeanProperty var X:Int = 1
>      | }
> defined class Person4
>
> scala> val personEncoder =

Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Yang
Actually, with var it's the same:

scala> class Person4 {
     |
     |   @scala.beans.BeanProperty var X:Int = 1
     | }
defined class Person4

scala> val personEncoder = Encoders.bean[Person4](classOf[Person4])
personEncoder: org.apache.spark.sql.Encoder[Person4] = class[x[0]: int]

scala> val

Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Yang
Thanks, Michael. I could not use a case class here, since I later need to modify getX() so that its output is dynamically generated. The bigger context is this: I want to implement topN() using a BoundedPriorityQueue. Basically, I include a queue in reduce() or aggregateByKey(), but
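
For readers following the topN() idea, here is a minimal sketch of the bounded-queue technique in plain Scala. It does not use Spark's BoundedPriorityQueue; it stands in scala.collection.mutable.PriorityQueue for it, and the names TopNSketch and topN are hypothetical, chosen only for this illustration. The queue is kept as a min-heap of at most n elements, so the smallest retained element is the one evicted when a larger one arrives.

```scala
import scala.collection.mutable

object TopNSketch {
  // Keep at most n of the largest elements seen so far.
  // The heap uses the reversed ordering, so its head is the smallest
  // retained element, which is the one to evict first.
  def topN[A](items: Iterator[A], n: Int)(implicit ord: Ordering[A]): List[A] = {
    require(n >= 0, "n must be non-negative")
    val heap = mutable.PriorityQueue.empty[A](ord.reverse)
    items.foreach { item =>
      if (heap.size < n) {
        heap.enqueue(item)
      } else if (n > 0 && ord.gt(item, heap.head)) {
        heap.dequeue()      // drop the current smallest
        heap.enqueue(item)  // keep the larger newcomer
      }
    }
    // Return the survivors, largest first.
    heap.toList.sorted(ord.reverse)
  }

  def main(args: Array[String]): Unit = {
    println(topN(Iterator(5, 1, 9, 3, 7), 3)) // List(9, 7, 5)
  }
}
```

In a reduce() or aggregateByKey() setting, two partial results could presumably be combined by feeding both lists back through topN, since the top n of a union is contained in the union of the two top-n lists.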

Re: how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Michael Armbrust
I think you are supposed to set BeanProperty on a var, as they do here. If you are using Scala, though, I'd consider using the case
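
The suggestion to put BeanProperty on a var can be checked without Spark. The sketch below uses only the standard java.beans.Introspector to list the properties that a class exposes as getter/setter pairs. It assumes that bean-based encoders discover fields through this kind of JavaBean introspection, which is not confirmed anywhere in this thread. PersonBean and BeanPropertyCheck are hypothetical names used only for this illustration.

```scala
import java.beans.Introspector
import scala.beans.BeanProperty

// Hypothetical class for illustration (not the Person4 from the thread).
class PersonBean {
  // On a var, @BeanProperty makes the compiler emit getX() and setX(Int)
  // in addition to the usual Scala accessors.
  @BeanProperty var x: Int = 1
}

object BeanPropertyCheck {
  // Names of JavaBean properties that have both a getter and a setter,
  // skipping the "class" property that every object inherits via getClass().
  def beanProperties(cls: Class[_]): Seq[String] =
    Introspector.getBeanInfo(cls).getPropertyDescriptors.toSeq
      .filter(p => p.getName != "class" && p.getReadMethod != null && p.getWriteMethod != null)
      .map(_.getName)

  def main(args: Array[String]): Unit = {
    println(beanProperties(classOf[PersonBean]).mkString(", ")) // x
  }
}
```

If the property shows up here, the class at least looks like a bean to standard Java tooling; whether a given Spark version then accepts it is a separate question.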

how to mark a (bean) class with schema for catalyst ?

2017-05-09 Thread Yang
I'm trying to use Encoders.bean() to create an encoder for my custom class, but it fails, complaining that it can't find the schema:

class Person4 {
  @scala.beans.BeanProperty def setX(x:Int): Unit = {}
  @scala.beans.BeanProperty def getX():Int = {1}
}

val personEncoder = Encoders.bean[