Re: Async RDD saves

2020-08-08 Thread Antonin Delpeuch (lists)
On Fri, Aug 7, 2020 at 10:38 AM Sean Owen <sro...@gmail.com> wrote: > > Why do you need to do it, and can you just use a future in your > driver code? > > On Fri, Aug 7, 2020 at 9:01 AM Antonin Delpeuch (lists) <li...@antonin.delpeuch.eu>
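
A minimal sketch of the future-based approach Sean suggests, wrapping the existing blocking saveAsTextFile in a driver-side Future; the output path and the rdd value are hypothetical, and this is an illustration of the workaround, not the API change the thread proposes:

    import java.util.concurrent.Executors
    import scala.concurrent.{Await, ExecutionContext, Future}
    import scala.concurrent.duration.Duration

    // Dedicated executor so the save does not compete with other driver work.
    implicit val ec: ExecutionContext =
      ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())

    // The blocking save runs off the main driver thread; the driver can
    // keep submitting other jobs in the meantime.
    val save: Future[Unit] = Future {
      rdd.saveAsTextFile("hdfs:///tmp/output")  // hypothetical path
    }

    Await.result(save, Duration.Inf)  // join before stopping the SparkContext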

Async RDD saves

2020-08-07 Thread Antonin Delpeuch (lists)
Hi all, Following my request on the user mailing list [1], there does not seem to be any simple way to save RDDs to the file system in an asynchronous way. I am looking into implementing this, so I am first checking whether there is consensus around the idea. The goal would be to add methods
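
For context, Spark already exposes asynchronous variants of a few actions through AsyncRDDActions, but none for saves, which is the gap this proposal targets. A short illustration of the existing API (not part of the proposal), assuming an rdd of strings:

    import org.apache.spark.FutureAction

    // These async actions exist today, via an implicit conversion on RDD:
    val count: FutureAction[Long] = rdd.countAsync()
    val sample: FutureAction[Seq[String]] = rdd.takeAsync(10)
    // A FutureAction can also be cancelled, e.g. count.cancel()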

Re: Map with state for RDDs

2020-05-24 Thread Antonin Delpeuch (lists)
On 24/05/2020 11:27, Antonin Delpeuch (lists) wrote: > With this formulation, zipWithIndex would be a special case of > mapWithState (so it could be refactored to be expressed as such). Forget about this part: it obviously would not be, since zipWithIndex can compute the size of each par

Re: Map with state for RDDs

2020-05-24 Thread Antonin Delpeuch (lists)
With this formulation, zipWithIndex would be a special case of mapWithState (so it could be refactored to be expressed as such). Antonin On 24/05/2020 10:58, Antonin Delpeuch (lists) wrote: > Hi, > > Spark Streaming has a `mapWithState` API to run a map on a stream while > maintaining a state as el

Map with state for RDDs

2020-05-24 Thread Antonin Delpeuch (lists)
Hi, Spark Streaming has a `mapWithState` API to run a map on a stream while maintaining state as elements are read. The core RDD API does not seem to have anything similar. Given an RDD of elements of type T, an initial state of type S and a map function (S,T) -> (S,T), return an RDD of Ts
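
A minimal sketch of the proposed primitive, assuming per-partition state is acceptable (the zipWithIndex discussion in this thread shows why a single global state cannot be threaded across partitions without an extra pass over the data); the name statefulMap and its exact shape are illustrative, not an actual Spark API:

    import scala.reflect.ClassTag
    import org.apache.spark.rdd.RDD

    // Threads a state of type S through each partition in order, emitting
    // one output element per input element.
    def statefulMap[S, T: ClassTag](rdd: RDD[T], init: S)(f: (S, T) => (S, T)): RDD[T] =
      rdd.mapPartitions { it =>
        var state = init  // each partition starts from the initial state
        it.map { t =>
          val (next, out) = f(state, t)
          state = next
          out
        }
      }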

Re: RDD order guarantees

2020-05-06 Thread Antonin Delpeuch (lists)
Thanks a lot for the reply, Steve! If you don't see a way to fix this in Spark itself, then I will try to improve the docs. Antonin On 06/05/2020 17:19, Steve Loughran wrote: > > On Tue, 7 Apr 2020 at 15:26, Antonin Delpeuch wrote: > Hi, >