Hi all,

I'm trying to group every X consecutive rows into a single row without shuffling the data.

I was thinking of doing something like this:
val myDF = Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11).toDF("myColumn")
myDF.withColumn("myColumn", expr("sliding(myColumn, 3)"))

Expected result:
myColumn
[1,2,3]
[4,5,6]
[7,8,9]
[10,11]
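
To be precise about the semantics I'm after: despite the name I used in the expr, the expected result is consecutive, non-overlapping chunks, which on a plain Scala collection is what grouped(3) gives (sliding(3) would return overlapping windows instead). A small sketch on a local Seq only, not on a DataFrame; the names values, chunks and windows are just for illustration:

```scala
// Plain Scala collections, no Spark involved.
val values = Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)

// grouped(3): consecutive, non-overlapping chunks; the last chunk may be shorter.
val chunks = values.grouped(3).toList

// sliding(3): overlapping windows that advance by one element each step.
val windows = values.sliding(3).toList

// Print each chunk in the same bracketed style as the expected result.
chunks.foreach(c => println(c.mkString("[", ",", "]")))
```

So what I'm looking for is the DataFrame equivalent of chunks, computed within each partition so that no shuffle is needed.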

Any insight on how to implement this?
I saw that MLlib has a SlidingRDD, but I would like to stay at the DataFrame level of abstraction:
https://spark.apache.org/docs/1.3.1/api/java/org/apache/spark/mllib/rdd/SlidingRDD.html

Thanks
