I simplified the code even further, but I am still getting the same error:

error: value toDF is not a member of Seq[com.whil.batch.Teamuser]
[ERROR] val df = Seq(Teamuser("t1","u1","r1")).toDF()
object Test {
  def main(args: Array[String]) {
    val spark = SparkSession
      .builder
      .appName(getClass.getSimpleName)
      .getOrCreate()
    import spark.implicits._
    val sqlContext = spark.sqlContext
    import sqlContext.implicits._
    val df = Seq(Teamuser("t1", "u1", "r1")).toDF()
    df.printSchema()
  }
}

case class Teamuser(teamid: String, userid: String, role: String)

On Thu, Mar 23, 2017 at 1:07 PM, Yong Zhang <java8...@hotmail.com> wrote:
> Not sure I understand this problem. Why can't I reproduce it?
>
> scala> spark.version
> res22: String = 2.1.0
>
> scala> case class Teamuser(teamid: String, userid: String, role: String)
> defined class Teamuser
>
> scala> val df = Seq(Teamuser("t1", "u1", "role1")).toDF
> df: org.apache.spark.sql.DataFrame = [teamid: string, userid: string ... 1 more field]
>
> scala> df.show
> +------+------+-----+
> |teamid|userid| role|
> +------+------+-----+
> |    t1|    u1|role1|
> +------+------+-----+
>
> scala> df.createOrReplaceTempView("teamuser")
>
> scala> val newDF = spark.sql("select teamid, userid, role from teamuser")
> newDF: org.apache.spark.sql.DataFrame = [teamid: string, userid: string ... 1 more field]
>
> scala> val userDS: Dataset[Teamuser] = newDF.as[Teamuser]
> userDS: org.apache.spark.sql.Dataset[Teamuser] = [teamid: string, userid: string ... 1 more field]
>
> scala> userDS.show
> +------+------+-----+
> |teamid|userid| role|
> +------+------+-----+
> |    t1|    u1|role1|
> +------+------+-----+
>
> scala> userDS.printSchema
> root
>  |-- teamid: string (nullable = true)
>  |-- userid: string (nullable = true)
>  |-- role: string (nullable = true)
>
> Am I missing anything?
>
> Yong
>
> ------------------------------
> From: shyla deshpande <deshpandesh...@gmail.com>
> Sent: Thursday, March 23, 2017 3:49 PM
> To: user
> Subject: Re: Converting dataframe to dataset question
>
> I realized my case class was inside the object. It should be defined
> outside the scope of the object.
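[Editor's note: for the archives, the fix shyla describes above can be sketched as follows. Moving the case class to the top level lets the compiler derive the implicit Encoder[Teamuser] that spark.implicits._ requires, so toDF() resolves. A minimal sketch, assuming Spark 2.x on the classpath; the local[*] master is an assumption for standalone testing and was not part of the original code:]

```scala
import org.apache.spark.sql.SparkSession

// Defined at top level, NOT inside Test: encoder derivation via
// spark.implicits._ cannot see a case class that is local to a method.
case class Teamuser(teamid: String, userid: String, role: String)

object Test {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName(getClass.getSimpleName)
      .master("local[*]") // assumption: local run for testing
      .getOrCreate()
    import spark.implicits._ // brings toDF/toDS syntax and Encoders into scope

    val df = Seq(Teamuser("t1", "u1", "r1")).toDF()
    df.printSchema()

    spark.stop()
  }
}
```

The single import of spark.implicits._ is enough; the extra sqlContext.implicits._ import in the original is redundant in Spark 2.x.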
> Thanks
>
> On Wed, Mar 22, 2017 at 6:07 PM, shyla deshpande <deshpandesh...@gmail.com> wrote:
>
>> Why is userDS a Dataset[Any] instead of Dataset[Teamuser]? Appreciate your
>> help. Thanks
>>
>> val spark = SparkSession
>>   .builder
>>   .config("spark.cassandra.connection.host", cassandrahost)
>>   .appName(getClass.getSimpleName)
>>   .getOrCreate()
>>
>> import spark.implicits._
>> val sqlContext = spark.sqlContext
>> import sqlContext.implicits._
>>
>> case class Teamuser(teamid: String, userid: String, role: String)
>>
>> spark
>>   .read
>>   .format("org.apache.spark.sql.cassandra")
>>   .options(Map("keyspace" -> "test", "table" -> "teamuser"))
>>   .load
>>   .createOrReplaceTempView("teamuser")
>>
>> val userDF = spark.sql("SELECT teamid, userid, role FROM teamuser")
>>
>> userDF.show()
>>
>> val userDS: Dataset[Teamuser] = userDF.as[Teamuser]
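[Editor's note: the Dataset[Any] symptom in this original post has the same root cause as the later toDF error. With the case class defined inside the method, no Encoder[Teamuser] is found in implicit scope, and userDF.as[Teamuser] does not resolve to Dataset[Teamuser]. With Teamuser at top level it works, as Yong's shell session shows. A minimal sketch, with an in-memory Seq standing in for the Cassandra table (the Cassandra connector setup is omitted; local[*] is an assumption for standalone testing):]

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Top-level definition makes Encoder[Teamuser] derivable.
case class Teamuser(teamid: String, userid: String, role: String)

object ConvertExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("ConvertExample")
      .master("local[*]") // assumption: local run, not the Cassandra setup
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the Cassandra-backed "teamuser" view in the original post
    Seq(Teamuser("t1", "u1", "r1")).toDF().createOrReplaceTempView("teamuser")

    val userDF = spark.sql("SELECT teamid, userid, role FROM teamuser")
    // Now infers Dataset[Teamuser], not Dataset[Any]
    val userDS: Dataset[Teamuser] = userDF.as[Teamuser]
    userDS.show()

    spark.stop()
  }
}
```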