I am trying to parse a tab-separated file, whose third field is a JSON section, in Spark 1.5 as efficiently as possible. The file looks as follows:

value1<tab>value2<tab>{json}

How can I parse all fields, including the JSON fields, into an RDD directly? If I use this piece of code:

val jsonCol = sc.textFile("/data/input").map(l => l.split("\t", 3)).map(x => x(2).trim()).cache()
val json = sqlContext.read.json(jsonCol).rdd

I lose value1 and value2! I'm open to any ideas.

-----
I'm using Spark 1.5

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Parse-tab-seperated-file-inc-json-efficent-tp24691.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
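One way to keep value1 and value2 is to fold them into the JSON string per line, so a single sqlContext.read.json pass parses everything at once. A minimal sketch, assuming Spark 1.5 with sc and sqlContext in scope; the field names "value1", "value2" and "payload" are illustrative, and it assumes the two flat values contain no characters that would need JSON escaping (quotes, backslashes):

```scala
// Sketch: embed the two flat columns into each line's JSON, then let
// read.json infer the schema and parse everything in one pass.
// Assumes value1/value2 need no JSON escaping.
val jsonLines = sc.textFile("/data/input")
  .map(_.split("\t", 3))
  .filter(_.length == 3)                 // drop malformed lines
  .map { case Array(v1, v2, json) =>
    s"""{"value1":"$v1","value2":"$v2","payload":$json}"""
  }

// DataFrameReader.json(RDD[String]) exists in Spark 1.5.
val df = sqlContext.read.json(jsonLines)
val rdd = df.rdd                         // RDD[Row], if you need a plain RDD
```

This avoids zipping two RDDs back together (which is fragile under repartitioning) at the cost of one extra string concatenation per line; if the values can contain quotes or backslashes, escape them (or build the JSON with a library like Jackson) before interpolating.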