Hi, Mark! Which kind of error message do you get?
The simplest way to convert RDD to DF is just importing implicits and use toDF import spark.implicits._ val df = rdd.toDF :-) 2016년 11월 29일 (화) 오전 1:26, Mark Mikolajczak <m...@flayranalytics.co.uk>님이 작성: > > > Hi All, > > Hoping you can help: > > > I have created an RDD from a NOSQL database and I want to convert the RDD > to a data frame. I have tried many options but all result in errors. > > val df = sc.couchbaseQuery(test).map(_.value).collect().foreach(println) > > > {"accountStatus":"AccountOpen","custId":"140034"} > {"accountStatus":"AccountOpen","custId":"140385"} > {"accountStatus":"AccountClosed","subId":"10795","custId":"139698","subStatus":"Active"} > {"accountStatus":"AccountClosed","subId":"11364","custId":"140925","subStatus":"Paused"} > {"accountStatus":"AccountOpen","subId":"10413","custId":"138842","subStatus":"Active"} > {"accountStatus":"AccountOpen","subId":"10414","custId":"138842","subStatus":"Active"} > {"accountStatus":"AccountClosed","subId":"11314","custId":"140720","subStatus":"Paused"} > {"accountStatus":"AccountOpen","custId":"139166"} > {"accountStatus":"AccountClosed","subId":"10735","custId":"139558","subStatus":"Paused"} > {"accountStatus":"AccountOpen","custId":"139575"} > df: Unit = () > > I have tried adding .toDF() to the end of my code and also creating a > schema and using createDataFrame but receive errors. Whats the best > approach to converting the RDD to Dataframe? > > import org.apache.spark.sql.types._ > > // The schema is encoded in a string > val schemaString = "accountStatus subId custId subStatus" > > // Generate the schema based on the string of schema > val fields = schemaString.split(" ") > .map(fieldName => StructField(fieldName, StringType, nullable = true)) > val schema = StructType(fields) > > // > > val peopleDF = spark.createDataFrame(df,schema) > > > -- Taejun Kim Data Mining Lab. School of Electrical and Computer Engineering University of Seoul