Re: Split a row into multiple rows Java

2018-08-08 Thread Manu Zhang
The following may help, although it is in Scala. The idea is to first concatenate each value with its time, assemble all the time_value strings into an array and explode it, and finally split each time_value back into time and value.
val ndf = df.select(col("name"), col("otherName"), explode( array(concat_ws(":", col("v1"),
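For readers following along in Java, here is a minimal sketch of the same three steps (concatenate, explode, split) using only JDK collections. It is an analogy, not Spark code: the class `ConcatExplodeSplitSketch`, the method `unpivot`, and the time labels are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical plain-Java illustration; none of these names come from Spark.
class ConcatExplodeSplitSketch {

    // Expands one wide row into one {name, otherName, time, value} row per value.
    static List<String[]> unpivot(String name, String otherName, String[] times, int[] values) {
        // Steps 1 and 2: concatenate each value with its time label (the role
        // concat_ws(":", ...) plays) and gather the results in one list (the array).
        List<String> timeValues = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            timeValues.add(times[i] + ":" + values[i]);
        }
        // Step 3: "explode" the list into one output row per element and
        // split each "time:value" string back into its two parts.
        return timeValues.stream()
                .map(tv -> tv.split(":", 2))
                .map(parts -> new String[] {name, otherName, parts[0], parts[1]})
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String[] times = {"0100", "0200", "0300"}; // assumed labels, for illustration only
        for (String[] row : unpivot("bob", "b1", times, new int[] {1, 2, 3})) {
            System.out.println(String.join(",", row));
        }
    }
}
```

One consequence of going through a concatenated string is that the value comes back as text after the split, so it would need to be cast again if a numeric column is wanted.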

Re: Split a row into multiple rows Java

2018-08-07 Thread nookala
+-----+---------+----+----+----+
| name|otherName|val1|val2|val3|
+-----+---------+----+----+----+
|  bob|       b1|   1|   2|   3|
|alive|       c1|   3|   4|   6|
|  eve|       e1|   7|   8|   9|
+-----+---------+----+----+----+

I need this to become
+-+-++- |

Re: Split a row into multiple rows Java

2018-08-01 Thread Anton Puzanov
You can always use array + explode. I don't know if it's the most elegant or optimal solution (I would be happy to hear from the experts). Code example:
//create data
Dataset<Row> test = spark.createDataFrame(Arrays.asList(new InternalData("bob", "b1", 1,2,3), new InternalData("alive", "c1", 3,4,6),

RE: Split a row into multiple rows Java

2018-08-01 Thread nookala
Pivot seems to do the opposite of what I want: it converts rows to columns. I was able to get this done in Python, but would like to do it in Java:
idfNew = idf.rdd.flatMap((lambda row: [(row.Name, row.Id, row.Date, "0100", row["0100"]), (row.Name, row.Id, row.Date, "0200", row["0200"]), (row.Name, row.Id,
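The shape of that flatMap can be sketched in plain Java with `java.util.stream`, which may help when porting it. This is not the Spark Java API, only an illustration of emitting several output rows per input row. The class and method names are hypothetical, and the sample rows and `val1`..`val3` column labels are borrowed from the example table elsewhere in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Spark: plain-Java illustration of the flatMap idea.
class FlatMapUnpivotSketch {

    // Turns one wide row (name, otherName, val1, val2, val3) into three
    // long rows of the form (name, otherName, columnLabel, value).
    static List<List<Object>> unpivot(List<Object> row) {
        String[] labels = {"val1", "val2", "val3"}; // assumed value-column names
        List<List<Object>> out = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            out.add(Arrays.<Object>asList(row.get(0), row.get(1), labels[i], row.get(2 + i)));
        }
        return out;
    }

    // Applies unpivot to every row and flattens the results into one list.
    static List<List<Object>> unpivotAll(List<List<Object>> rows) {
        return rows.stream()
                .flatMap(r -> unpivot(r).stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Object>> wide = Arrays.asList(
                Arrays.<Object>asList("bob", "b1", 1, 2, 3),
                Arrays.<Object>asList("alive", "c1", 3, 4, 6),
                Arrays.<Object>asList("eve", "e1", 7, 8, 9));
        unpivotAll(wide).forEach(System.out::println);
    }
}
```

Unlike an approach that packs label and value into one string, this keeps each value in its original type, at the cost of writing the per-row expansion by hand.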

RE: Split a row into multiple rows Java

2018-07-31 Thread Patil, Prashasth
Hi,
Have you tried using Spark DataFrame's pivot feature?

-----Original Message-----
From: nookala [mailto:srinook...@gmail.com]
Sent: Thursday, July 26, 2018 7:33 AM
To: user@spark.apache.org
Subject: Split a row into multiple rows Java

I'm trying to generate multiple rows from a single row