Hi,

Sorry for this Scala/Spark newbie question. I am creating an RDD that represents a large time series this way:

    val data = sc.textFile("somefile.csv")

    case class Event(time: Double, x: Double, vztot: Double)

    val events = data.filter(s => !s.startsWith("GMT")).map { s =>
      val r = s.split(";")
      ...
      Event(time, x, vztot)
    }

I would like to process this RDD in order to reduce it by some filtering. I noticed that sliding could help with this, but I have not been able to use it so far. Here is what I did:

    import org.apache.spark.mllib.rdd.RDDFunctions._

    val eventsfiltered = events.sliding(3).map(Seq(e0, e1, e2) => Event(e0.time, (e0.x+e1.x+e2.x)/3.0, (e0.vztot+e1.vztot+e2.vztot)/3.0))

Thanks for your help

-- 
PGP KeyID: 2048R/EA31CFC9 subkeys.pgp.net
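
P.S. To make the goal concrete, here is a minimal sketch of the three-point moving average I am after, written against an ordinary Scala collection rather than an RDD. The object name SlidingSketch and the sample values are made up for illustration. On a plain collection the window has to be destructured with a `case` pattern inside braces; I am only assuming the RDD version needs the same form, and the window type returned by MLlib's sliding (Array or Seq) may depend on the Spark version, so the pattern would have to match it.

```scala
object SlidingSketch {
  // Same shape as the Event class used for the RDD.
  case class Event(time: Double, x: Double, vztot: Double)

  // Hypothetical sample values, for illustration only.
  val sample: Seq[Event] = Seq(
    Event(0.0, 1.0, 10.0),
    Event(1.0, 2.0, 20.0),
    Event(2.0, 3.0, 30.0),
    Event(3.0, 4.0, 40.0)
  )

  // sliding(3) yields every run of three consecutive events.
  // Braces plus `case` form a pattern-matching function literal;
  // collect skips any window that does not hold exactly three elements.
  val smoothed: List[Event] = sample.sliding(3).collect {
    case Seq(e0, e1, e2) =>
      Event(
        e0.time,
        (e0.x + e1.x + e2.x) / 3.0,
        (e0.vztot + e1.vztot + e2.vztot) / 3.0
      )
  }.toList

  def main(args: Array[String]): Unit = smoothed.foreach(println)
}
```

Each output event keeps the time of the first event in its window, which is the convention in my attempt above.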