Thank you Piotr, that's what happened. In fact, there are about 100 files on each worker node in a directory corresponding to the write.
Any way to tone that down a bit (maybe 1 file per worker)? Or, write a single file somewhere?

On Mon, Sep 26, 2016 at 12:44 AM, Piotr Smoliński <piotr.smolinski...@gmail.com> wrote:

> Hi Peter,
>
> The blank file _SUCCESS indicates a properly finished output operation.
>
> What is the topology of your application?
> I presume you write to a local filesystem and have more than one worker
> machine. In that case Spark will write the result files for each partition
> (in the worker which holds it) and complete the operation by writing
> _SUCCESS from the driver node.
>
> Cheers,
> Piotr
>
> On Mon, Sep 26, 2016 at 4:56 AM, Peter Figliozzi <pete.figlio...@gmail.com> wrote:
>
>> Both
>>
>> df.write.csv("/path/to/foo")
>>
>> and
>>
>> df.write.format("com.databricks.spark.csv").save("/path/to/foo")
>>
>> result in a *blank* file called "_SUCCESS" under /path/to/foo.
>>
>> My df has stuff in it; I tried this with both my real df and a quick df
>> constructed from literals.
>>
>> Why isn't it writing anything?
>>
>> Thanks,
>>
>> Pete
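For anyone finding this thread later: the number of output files tracks the number of partitions of the DataFrame at write time, so the usual way to tone it down is to reduce the partition count just before the write. A minimal sketch, assuming a live SparkSession and an existing DataFrame `df` as in the snippets above (`numWorkers` is a hypothetical value you would supply):

```
// Collapse everything into a single partition before writing.
// Produces one part file (plus _SUCCESS) under /path/to/foo, but all rows
// flow through a single task, so only do this for modestly sized results.
df.coalesce(1)
  .write
  .csv("/path/to/foo")

// Or aim for roughly one file per worker by repartitioning first:
// df.repartition(numWorkers).write.csv("/path/to/foo")
```

If the result is too large to funnel through one task, another option is to leave the write as-is and merge the part files afterwards, e.g. with `hadoop fs -getmerge` on HDFS or a plain concatenation of the `part-*` files on a local filesystem.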