Thank you for the suggestions!

From: Reynold Xin [mailto:r...@databricks.com]
Sent: Friday, May 08, 2015 11:10 AM
To: Will Benton
Cc: Ulanov, Alexander; dev@spark.apache.org
Subject: Re: Easy way to convert Row back to case class

In 1.4, you can do row.getInt("colName").

In 1.5, some variant of this will come to allow you to turn a DataFrame into a typed RDD, where the case class's field names match the column names.

https://github.com/apache/spark/pull/5713

On Fri, May 8, 2015 at 11:01 AM, Will Benton <wi...@redhat.com> wrote:

This might not be the easiest way, but it's pretty easy: you can use Row(field_1, ..., field_n) as a pattern in a case match. So if you have a data frame with foo as an Int column and bar as a String column, and you want to construct instances of a case class that wraps these up, you can do something like this:

    import org.apache.spark.sql.Row

    // assuming Record is declared as case class Record(foo: Int, bar: String)
    // and df is a data frame
    df.map { case Row(foo: Int, bar: String) => Record(foo, bar) }

best,
wb

----- Original Message -----
> From: "Alexander Ulanov" <alexander.ula...@hp.com>
> To: dev@spark.apache.org
> Sent: Friday, May 8, 2015 11:50:53 AM
> Subject: Easy way to convert Row back to case class
>
> Hi,
>
> I created a dataset RDD[MyCaseClass], converted it to a DataFrame and saved
> it to a Parquet file, following
> https://spark.apache.org/docs/latest/sql-programming-guide.html#interoperating-with-rdds
>
> When I load this dataset with sqlContext.parquetFile, I get a DataFrame with
> the same column names as in the initial case class. I want to convert this
> DataFrame to an RDD to perform RDD operations. However, when I convert it I
> get an RDD[Row] and all information about row names gets lost. Could you
> suggest an easy way to convert a DataFrame to an RDD[MyCaseClass]?
>
> Best regards, Alexander
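
For reference, the two approaches discussed in this thread (matching on Row positionally, and reading columns by name) can be sketched as standalone functions. This is only a minimal sketch under assumptions: Record is the hypothetical case class from the example above, the object and method names (RowConversions, fromPattern, fromNames) are made up for illustration, and the by-name version uses Row.getAs[T](fieldName), which needs a row that carries a schema, as rows obtained from a DataFrame typically do.

```scala
import org.apache.spark.sql.Row

// Hypothetical case class, mirroring the Record example in the thread.
case class Record(foo: Int, bar: String)

// Hypothetical helper object; the names are illustrative only.
object RowConversions {
  // Positional conversion: relies on the columns being ordered (foo, bar).
  // Throws scala.MatchError if the row has a different shape or types,
  // or if foo is null (a null does not match the Int type pattern).
  def fromPattern(row: Row): Record = row match {
    case Row(foo: Int, bar: String) => Record(foo, bar)
  }

  // By-name conversion: independent of column order, but only works on
  // rows that carry a schema (assumed here; a bare Row(1, "a") has none).
  def fromNames(row: Row): Record =
    Record(row.getAs[Int]("foo"), row.getAs[String]("bar"))
}
```

Assuming df is a DataFrame with those two columns, something like df.rdd.map(RowConversions.fromPattern) should give an RDD[Record]. The positional version breaks silently into a MatchError if the column order changes, so the by-name version may be the safer choice when the schema is not under your control.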