I have a SchemaRDD with the following schema:

scala> mainRDD.printSchema
root
 |-- COL1: integer (nullable = false)
 |-- COL2: integer (nullable = false)
 |-- COL3: string (nullable = true)
 |-- COL4: double (nullable = false)
 |-- COL5: string (nullable = true)

Now I transform mainRDD like this:

scala> import java.text.SimpleDateFormat
scala> import java.util.Calendar
scala> // MM = month and HH = hour of day (0-23); lowercase mm/hh mean minutes and 12-hour clock
scala> val sdf1 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS")
scala> val calendar = Calendar.getInstance()
scala> val mappedRDD: SchemaRDD = mainRDD.map { r =>
     |   val end_time = sdf1.parse(r(2).toString)            // COL3 holds the timestamp string
     |   calendar.setTime(end_time)
     |   val r2 = new java.sql.Timestamp(end_time.getTime)
     |   val hour: Long = calendar.get(Calendar.HOUR_OF_DAY)
     |   (r(0).toString.toInt, r(1).toString.toInt, r2, hour, r(3).toString.toDouble, r(4).toString)
     | }

scala> mappedRDD.printSchema
root
 |-- _1: integer (nullable = false)
 |-- _2: integer (nullable = false)
 |-- _3: timestamp (nullable = true)
 |-- _4: long (nullable = false)
 |-- _5: double (nullable = false)
 |-- _6: string (nullable = true)

The issue is that, even though mainRDD is a SchemaRDD and mappedRDD is declared as one, the result behaves like a plain RDD of tuples: notice that the column names are lost in mappedRDD and replaced by _1 through _6.

So how can I apply the above transformation to one SchemaRDD (mainRDD) to get another SchemaRDD (mappedRDD) with a different schema?

Please help me out.
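
One possible direction, offered only as a sketch: map each row to a case class instead of a tuple, so that the field names are available as column names when the result is converted to a SchemaRDD. The per-row conversion below is plain Scala and does not need Spark to run. The names MappedRecord, RowMapping, and toRecord, as well as the field names, are hypothetical, and the row is assumed to be usable as a Seq[Any].

```scala
import java.sql.Timestamp
import java.text.SimpleDateFormat
import java.util.Calendar

// Hypothetical record type; its field names are meant to become column names.
case class MappedRecord(col1: Int, col2: Int, endTime: Timestamp,
                        hour: Long, col4: Double, col5: String)

object RowMapping {
  // Assumed input format: MM = month, HH = hour of day (0-23).
  private val Pattern = "yyyy-MM-dd HH:mm:ss.SSS"

  // Converts one row laid out as (COL1, COL2, COL3, COL4, COL5).
  def toRecord(r: Seq[Any]): MappedRecord = {
    // SimpleDateFormat and Calendar are mutable and not thread-safe,
    // so this sketch creates fresh instances on every call.
    val fmt = new SimpleDateFormat(Pattern)
    val endTime = fmt.parse(r(2).toString)
    val cal = Calendar.getInstance()
    cal.setTime(endTime)
    MappedRecord(
      r(0).toString.toInt,
      r(1).toString.toInt,
      new Timestamp(endTime.getTime),
      cal.get(Calendar.HOUR_OF_DAY).toLong,
      r(3).toString.toDouble,
      r(4).toString)
  }
}
```

In a Spark 1.x shell that provides the implicit conversion (import sqlContext.createSchemaRDD), something like mainRDD.map(r => RowMapping.toRecord(r)) should then carry the names col1, col2, endTime, hour, col4, col5 instead of _1 through _6. This is an assumption about the Spark version in use; in releases where Row is not a Seq, r.toSeq may be needed.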