Hi, I have an RDD that crashes the driver when I collect it, so I want to send the data in its partitions out to S3 without bringing it back to the driver. I tried calling rdd.foreachPartition, but the data that gets sent has not gone through the chain of transformations I need; it is the data as it was initially ingested. After specifying my chain of transformations, but before calling foreachPartition, I call rdd.count to force the RDD to evaluate. The data it sends out is still not transformed. How do I get foreachPartition to send out the transformed data?
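One thing that may be worth checking in a setup like this is which RDD reference foreachPartition is called on. RDDs are immutable: a transformation returns a new RDD, and an action such as count does not change the RDD it was derived from. The sketch below is a toy model in plain Python, not Spark itself. `MiniRDD`, `foreach_partition`, and `collect_sent` are hypothetical stand-ins for illustration, and appending to a list stands in for the S3 upload.

```python
from typing import Callable, Iterator, List


class MiniRDD:
    """Hypothetical toy stand-in for an RDD: immutable, lazy, partitioned."""

    def __init__(self, partitions: List[Callable[[], Iterator]]):
        # Each partition is a zero-argument function that yields its rows on demand.
        self._partitions = partitions

    @classmethod
    def parallelize(cls, data, num_partitions=2):
        data = list(data)
        # Deal the rows round-robin into num_partitions chunks.
        chunks = [data[i::num_partitions] for i in range(num_partitions)]
        return cls([(lambda c=c: iter(c)) for c in chunks])

    def map(self, fn):
        # Returns a NEW MiniRDD; self is left untouched.
        return MiniRDD(
            [(lambda p=p: (fn(x) for x in p())) for p in self._partitions]
        )

    def count(self):
        # An action: evaluates the rows, but changes neither this object
        # nor the one it was derived from.
        return sum(1 for p in self._partitions for _ in p())

    def foreach_partition(self, fn):
        # Modeled on foreachPartition: fn receives an iterator per partition.
        for p in self._partitions:
            fn(p())


def collect_sent(rdd):
    """Record what would be uploaded, one list per partition (S3 stand-in)."""
    sent = []
    rdd.foreach_partition(lambda rows: sent.append(list(rows)))
    return sent


raw = MiniRDD.parallelize([1, 2, 3, 4])
transformed = raw.map(lambda x: x * 10)  # new object; raw is unchanged
transformed.count()                      # forces evaluation, mutates nothing

sent_from_raw = collect_sent(raw)                  # untransformed rows
sent_from_transformed = collect_sent(transformed)  # transformed rows
```

In this toy model, only the call made on `transformed` sends transformed rows; the call on `raw` still sends the ingested data, whether or not count was run first. If the real job keeps calling foreachPartition on the original variable, assigning the result of the transformation chain to a variable and calling foreachPartition on that would be the thing to try.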
Thanks