Ryan, you are right. That was the issue. It works now. Thanks.
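For anyone who finds this thread in the archives, here is the failure distilled into a minimal, self-contained sketch (hedged: written against Spark 2.1.0 from the discussion below, not re-tested; the object name is made up):

import org.apache.spark.sql.SparkSession

// Top-level case class, so Spark can derive an Encoder for it.
case class Teamuser(teamid: String, userid: String, role: String)

object ImplicitsDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession
      .builder
      .master("local")
      .appName("ImplicitsDemo")
      .getOrCreate()

    // Import implicits from exactly one place. Uncommenting the second
    // import below puts two identical Seq-to-DatasetHolder conversions
    // in scope; ambiguous implicit conversions are discarded, and the
    // compiler then reports "value toDF is not a member of Seq[Teamuser]".
    import spark.implicits._
    // import spark.sqlContext.implicits._   // <- this is what broke the build

    val df = Seq(Teamuser("t1", "u1", "r1")).toDF()
    df.printSchema()
    spark.stop()
  }
}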
On Thu, Mar 23, 2017 at 8:26 PM, Ryan <ryan.hd....@gmail.com> wrote:

> You should import either spark.implicits or sqlContext.implicits, not
> both. Otherwise the compiler will be confused by the two implicit
> conversions.
>
> The following code works for me, Spark version 2.1.0:
>
> object Test {
>   def main(args: Array[String]) {
>     val spark = SparkSession
>       .builder
>       .master("local")
>       .appName(getClass.getSimpleName)
>       .getOrCreate()
>     import spark.implicits._
>     val df = Seq(TeamUser("t1", "u1", "r1")).toDF()
>     df.printSchema()
>     spark.close()
>   }
> }
>
> case class TeamUser(teamId: String, userId: String, role: String)
>
> On Fri, Mar 24, 2017 at 5:23 AM, shyla deshpande <deshpandesh...@gmail.com> wrote:
>
>> I made the code even simpler and am still getting the error:
>>
>> error: value toDF is not a member of Seq[com.whil.batch.Teamuser]
>> [ERROR] val df = Seq(Teamuser("t1","u1","r1")).toDF()
>>
>> object Test {
>>   def main(args: Array[String]) {
>>     val spark = SparkSession
>>       .builder
>>       .appName(getClass.getSimpleName)
>>       .getOrCreate()
>>     import spark.implicits._
>>     val sqlContext = spark.sqlContext
>>     import sqlContext.implicits._
>>     val df = Seq(Teamuser("t1","u1","r1")).toDF()
>>     df.printSchema()
>>   }
>> }
>> case class Teamuser(teamid:String, userid:String, role:String)
>>
>> On Thu, Mar 23, 2017 at 1:07 PM, Yong Zhang <java8...@hotmail.com> wrote:
>>
>>> Not sure I understand this problem; why can't I reproduce it?
>>>
>>> scala> spark.version
>>> res22: String = 2.1.0
>>>
>>> scala> case class Teamuser(teamid: String, userid: String, role: String)
>>> defined class Teamuser
>>>
>>> scala> val df = Seq(Teamuser("t1", "u1", "role1")).toDF
>>> df: org.apache.spark.sql.DataFrame = [teamid: string, userid: string ... 1 more field]
>>>
>>> scala> df.show
>>> +------+------+-----+
>>> |teamid|userid| role|
>>> +------+------+-----+
>>> |    t1|    u1|role1|
>>> +------+------+-----+
>>>
>>> scala> df.createOrReplaceTempView("teamuser")
>>>
>>> scala> val newDF = spark.sql("select teamid, userid, role from teamuser")
>>> newDF: org.apache.spark.sql.DataFrame = [teamid: string, userid: string ... 1 more field]
>>>
>>> scala> val userDS: Dataset[Teamuser] = newDF.as[Teamuser]
>>> userDS: org.apache.spark.sql.Dataset[Teamuser] = [teamid: string, userid: string ... 1 more field]
>>>
>>> scala> userDS.show
>>> +------+------+-----+
>>> |teamid|userid| role|
>>> +------+------+-----+
>>> |    t1|    u1|role1|
>>> +------+------+-----+
>>>
>>> scala> userDS.printSchema
>>> root
>>>  |-- teamid: string (nullable = true)
>>>  |-- userid: string (nullable = true)
>>>  |-- role: string (nullable = true)
>>>
>>> Am I missing anything?
>>>
>>> Yong
>>>
>>> ------------------------------
>>> *From:* shyla deshpande <deshpandesh...@gmail.com>
>>> *Sent:* Thursday, March 23, 2017 3:49 PM
>>> *To:* user
>>> *Subject:* Re: Converting dataframe to dataset question
>>>
>>> I realized my case class was inside the object. It should be defined
>>> outside the scope of the object. Thanks.
>>>
>>> On Wed, Mar 22, 2017 at 6:07 PM, shyla deshpande <deshpandesh...@gmail.com> wrote:
>>>
>>>> Why is userDS Dataset[Any] instead of Dataset[Teamuser]? Appreciate your help.
>>>> Thanks
>>>>
>>>> val spark = SparkSession
>>>>   .builder
>>>>   .config("spark.cassandra.connection.host", cassandrahost)
>>>>   .appName(getClass.getSimpleName)
>>>>   .getOrCreate()
>>>>
>>>> import spark.implicits._
>>>> val sqlContext = spark.sqlContext
>>>> import sqlContext.implicits._
>>>>
>>>> case class Teamuser(teamid:String, userid:String, role:String)
>>>> spark
>>>>   .read
>>>>   .format("org.apache.spark.sql.cassandra")
>>>>   .options(Map("keyspace" -> "test", "table" -> "teamuser"))
>>>>   .load
>>>>   .createOrReplaceTempView("teamuser")
>>>>
>>>> val userDF = spark.sql("SELECT teamid, userid, role FROM teamuser")
>>>>
>>>> userDF.show()
>>>>
>>>> val userDS: Dataset[Teamuser] = userDF.as[Teamuser]
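P.S. For completeness, here is that original Cassandra program with both fixes folded in: the case class moved to the top level and a single implicits import. This is a sketch reconstructed from the messages above rather than a tested drop-in; it assumes the spark-cassandra-connector package is on the classpath, and the Cassandra host is a placeholder.

import org.apache.spark.sql.{Dataset, SparkSession}

// Defined at the top level, outside any object or method, so the
// Encoder[Teamuser] needed by .as[Teamuser] can be materialized.
case class Teamuser(teamid: String, userid: String, role: String)

object TeamuserJob {
  def main(args: Array[String]): Unit = {
    val cassandrahost = "127.0.0.1"  // placeholder; use your cluster's host

    val spark = SparkSession
      .builder
      .config("spark.cassandra.connection.host", cassandrahost)
      .appName(getClass.getSimpleName)
      .getOrCreate()

    // Exactly one implicits import; no sqlContext.implicits._ alongside it.
    import spark.implicits._

    spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "test", "table" -> "teamuser"))
      .load()
      .createOrReplaceTempView("teamuser")

    val userDF = spark.sql("SELECT teamid, userid, role FROM teamuser")
    userDF.show()

    // With Teamuser visible at the top level, this is inferred as
    // Dataset[Teamuser] rather than Dataset[Any].
    val userDS: Dataset[Teamuser] = userDF.as[Teamuser]
    userDS.show()

    spark.stop()
  }
}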