This seems to work:

    import org.apache.spark.sql._

    val rdd = df2.rdd.map { case Row(j: String) => j }
    spark.read.json(rdd).show()

However, I wonder if there is any inefficiency here, since I have to apply this function to a billion rows.
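For reference, here is a sketch of the alternative I am considering: parsing the JSON column in place with from_json, which avoids the round trip through an RDD and the extra scan that spark.read.json performs. The schema and the column name "value" below are placeholders for illustration; they would need to match the actual data.

    import org.apache.spark.sql.functions.{col, from_json}
    import org.apache.spark.sql.types._

    // Placeholder schema for the JSON strings; replace with the real structure.
    val schema = new StructType()
      .add("id", LongType)
      .add("name", StringType)

    // Parse the single JSON string column directly inside the DataFrame API.
    // Assumes the column in df2 is named "value"; adjust as needed.
    val parsed = df2
      .select(from_json(col("value"), schema).as("data"))
      .select("data.*")

    parsed.show()

The trade-off, as I understand it, is that from_json needs the schema up front, whereas spark.read.json infers it by reading the data (an extra pass over the billion rows).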