Hi, I have been working a bit with RDDs, and am now taking a look at DataFrames. The schema definition using case classes looks very attractive:
https://spark.apache.org/docs/1.4.0/sql-programming-guide.html#inferring-the-schema-using-reflection

Is a DataFrame more efficient (space-wise) than an RDD for the same case class? And in general, when should DataFrames be preferred over RDDs, and vice versa?