Yeah, it's awkward: the transforms being done are fairly time-sensitive, so I don't want them to wait 60 seconds or more.
I might have to move the code from a transform into a custom receiver instead, so they'll be processed outside the window length. A buffered writer is a good idea too, thanks.

Thanks,
Ewan

From: Ashic Mahtab [mailto:as...@live.com]
Sent: 31 December 2015 13:50
To: Ewan Leith <ewan.le...@realitymine.com>; Apache Spark <user@spark.apache.org>
Subject: RE: Batch together RDDs for Streaming output, without delaying execution of map or transform functions

Hi Ewan,

Transforms are definitions of what needs to be done - they don't execute until an action is triggered. For what you want, I think you might need to have an action that writes out RDDs to some sort of buffered writer.

-Ashic.

________________________________
From: ewan.le...@realitymine.com
To: user@spark.apache.org
Subject: Batch together RDDs for Streaming output, without delaying execution of map or transform functions
Date: Thu, 31 Dec 2015 11:35:37 +0000

Hi all,

I'm sure this must have been solved already, but I can't see anything obvious.

Using Spark Streaming, I'm trying to execute a transform function on a DStream at short batch intervals (e.g. 1 second), but only write the resulting data to disk using saveAsTextFiles in a larger batch after a longer delay (say 60 seconds).

I thought the ReceiverInputDStream window function might be a good help here, but instead, applying it to a transformed DStream causes the transform function to only execute at the end of the window too.

Has anyone got a solution to this?

Thanks,
Ewan
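One way to approach the problem in the thread, combining Ashic's point about actions with the windowing idea: force the transform each short batch with a per-batch action on a cached DStream, then window the already-computed stream for the slow write. This is only a minimal sketch, not a tested solution; the socket source, output path, and `expensiveTransform` are placeholder assumptions, and it assumes the Spark Streaming 1.x DStream API from the thread's era.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowedOutputSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("windowed-output-sketch")
      .setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1)) // short 1-second batch interval

    // Hypothetical source; substitute your actual ReceiverInputDStream.
    val lines = ssc.socketTextStream("localhost", 9999)

    // cache() persists each batch's RDD so the window below can reuse it
    // instead of recomputing the transform at window end.
    val transformed = lines.map(expensiveTransform).cache()

    // Because transformations are lazy, an action is needed every batch to
    // make the transform actually run at the 1-second interval. count() is
    // a cheap action that materialises (and caches) each batch RDD.
    transformed.foreachRDD(rdd => rdd.count())

    // The 60-second window then unions the already-cached batch RDDs, so
    // saveAsTextFiles only pays the write cost, not the transform cost.
    transformed
      .window(Seconds(60), Seconds(60))
      .saveAsTextFiles("/tmp/out/batch")

    ssc.start()
    ssc.awaitTermination()
  }

  // Placeholder for the real per-record transform.
  def expensiveTransform(line: String): String = line
}
```

Whether the extra `count()` pass is acceptable depends on the workload; the alternative mentioned in the reply (a `foreachRDD` that appends to some buffered writer and flushes every 60 seconds) avoids the window entirely but moves the batching logic into your own code.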