Dear all, I need to run a series of transformations that map one RDD into another. The computation changes over time and so does the resulting RDD. Each result is then saved to disk for further analysis (for example, how the result varies over time).
My question is: if I save the RDDs to the same file, will the new data be appended to the existing file or not? And if I write to a different file each time I save a result, I may end up with many little files, and I read everywhere that Hadoop doesn't like many little files. Is Spark OK with that? Cheers, Jaonary
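For concreteness, the save step I have in mind looks roughly like the sketch below: one output directory per snapshot, since (as far as I understand) `saveAsTextFile` does not append and fails if the path already exists. The helper `outputPathFor` and the paths are just placeholders of mine, not Spark API.

```scala
// Build a distinct output directory per snapshot; saveAsTextFile
// refuses to write into an existing path, so each run gets its own.
def outputPathFor(base: String, timestampMillis: Long): String =
  s"$base/result-$timestampMillis"

// Intended usage with Spark (assumes an existing SparkContext `sc`
// and a computed RDD `result`):
//   val path = outputPathFor("hdfs:///user/jaonary/results", System.currentTimeMillis)
//   result.coalesce(1).saveAsTextFile(path) // coalesce(1): one part file per snapshot
```

The `coalesce(1)` is my attempt to keep the number of small part files down, at the cost of losing write parallelism for that step.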