I have a streaming job that writes data to S3. I know there are `saveAs` functions that help write data to S3, but they bundle all elements together and then write the whole bundle out to S3. So my first question: is there any way to have the `saveAs` functions write the data in small batches, or even as single elements, instead of one whole bundle?
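What I have in mind is something like the sketch below: instead of `saveAsTextFiles`, use `foreachRDD` plus `foreachPartition` and push each record to S3 directly with the AWS SDK. The socket source, bucket name, and key scheme are placeholders I made up for illustration; I have not actually run this:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import com.amazonaws.services.s3.AmazonS3ClientBuilder

val conf = new SparkConf().setAppName("per-record-s3-writes")
val ssc  = new StreamingContext(conf, Seconds(10))

// Placeholder source; any DStream[String] would work the same way.
val lines = ssc.socketTextStream("localhost", 9999)

lines.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    // Build the client on the executor, once per partition: it is not
    // serializable and is too expensive to create per record.
    val s3 = AmazonS3ClientBuilder.defaultClient()
    records.foreach { record =>
      // Hypothetical bucket and key scheme, just for illustration.
      s3.putObject("my-bucket", s"records/${java.util.UUID.randomUUID}.txt", record)
    }
  }
}

ssc.start()
ssc.awaitTermination()
```

Is something along these lines the intended approach, or do the `saveAs` functions support this directly?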
I am playing with some data in a (standalone) spark-shell (Spark version 1.6.0), launched by executing `spark-shell`. The flow is simple, a bit like `cp`: basically moving 100k local files (the maximum file size is 190k) to S3; a rough sketch of the job follows the memory settings below. Memory is configured as follows:
```sh
export SPARK_DRIVER_MEMORY=8192M
export
```
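For reference, the job itself is roughly the sketch below, run inside `spark-shell` (the local path, bucket name, and output layout are placeholders; `sc` is the shell's predefined SparkContext):

```scala
// Run inside spark-shell, where sc is already defined.
// Read the 100k small local files as (path, content) pairs...
val files = sc.wholeTextFiles("file:///data/input")
// ...and write them out to S3 (s3n:// was the usual scheme on Spark 1.6).
files.saveAsTextFile("s3n://my-bucket/output")
```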