Hi,

I have an RDD that crashes the driver when I collect it, so I want to
send the data from its partitions to S3 without bringing it back to the
driver. I tried calling rdd.foreachPartition, but the data that gets sent
has not gone through the chain of transformations I need; it is the data
as it was initially ingested. After specifying the chain of
transformations, but before calling foreachPartition, I call rdd.count to
force the RDD to be evaluated. The data sent out is still not
transformed. How do I get foreachPartition to send out the transformed
data?
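
For concreteness, here is a minimal PySpark sketch of the pattern described
above. It is illustrative only: transform, build_payload, upload_partition,
run_job and the input path are hypothetical names, not taken from the actual
job, and the S3 write is left as a placeholder. The sketch relies on one
property of the RDD API: a transformation such as map does not change the RDD
it is called on, it returns a new RDD, so an action only sees the
transformation if it is called on that returned RDD.

```python
from typing import Iterable


def transform(record: str) -> str:
    """Hypothetical placeholder for the real chain of transformations."""
    return record.strip().upper()


def build_payload(records: Iterable[str]) -> str:
    """Join one partition's records into a single newline-separated body."""
    return "\n".join(records)


def upload_partition(records: Iterable[str]) -> None:
    """Called once per partition on the executors; nothing is returned to the driver."""
    payload = build_payload(records)
    # Assumption: a real job would write `payload` to S3 here, using a client
    # created inside this function so that it lives on the executor.
    print(f"would upload {len(payload)} characters")


def run_job(sc, input_path: str) -> None:
    """`sc` is an existing SparkContext; `input_path` is a hypothetical location."""
    raw = sc.textFile(input_path)
    # map() leaves `raw` untouched and returns a new RDD that carries the transformation.
    transformed = raw.map(transform)
    # The action is called on the returned RDD, so each partition is transformed
    # on the executors before upload_partition receives it.
    transformed.foreachPartition(upload_partition)
    # By contrast, raw.foreachPartition(upload_partition) would send the records
    # as ingested, regardless of any count() called beforehand.
```

In this sketch, calling count is not needed to make the transformation take
effect; foreachPartition is itself an action and triggers evaluation of the
RDD it is called on.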

Thanks
