Hi all, I am working with DataFrames and have been struggling with the following problem; any pointers would be helpful.
I have a JSON file with a schema like this:

links: array (nullable = true)
 |-- element: struct (containsNull = true)
 |    |-- desc: string (nullable = true)
 |    |-- id: string (nullable = true)

I want to fetch id and desc as an RDD of type RDD[(String, String)]. I am currently using

df.select("links.desc", "links.id").rdd

but this returns an RDD like RDD[(List(String), List(String))], that is, two lists per row instead of one pair per array element.

So the JSON links:[{"one","1"},{"two","2"},{"three","3"}] should give an RDD containing (one,1), (two,2), (three,3).

Can anyone tell me how the DataFrame select should be modified?
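
One possible direction, offered only as a rough sketch and not as a confirmed answer: explode the links array first, so that each struct becomes its own row, and only then select desc and id. The helper name linkPairs and the alias "link" are made up for illustration, and the sketch assumes a Spark version where explode is available in org.apache.spark.sql.functions and where the desc and id fields are plain strings, as in the schema above.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, explode}

object LinkPairs {
  // Hypothetical helper (not part of Spark): turns every element of the
  // `links` array into one (desc, id) tuple.
  def linkPairs(df: DataFrame): RDD[(String, String)] =
    df.select(explode(col("links")).as("link"))   // one output row per array element
      .select(col("link.desc"), col("link.id"))   // pull the two struct fields out
      .rdd                                        // RDD[Row]
      .map(row => (row.getString(0), row.getString(1)))
}
```

Note that explode produces no rows for records whose links array is null or empty, and getString returns null for a null field, so the result may need filtering depending on the data.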