Thank you Piotr, that's what happened. In fact, there are about 100 files on each worker node in a directory corresponding to the write.
Any way to tone that down a bit (maybe 1 file per worker)? Or, write a single file somewhere?

On Mon, Sep 26, 2016 at 12:44 AM, Piotr Smoliński <piotr.smolinski...@gmail.com> wrote:

> Hi Peter,
>
> The blank file _SUCCESS indicates a properly finished output operation.
>
> What is the topology of your application?
> I presume you write to a local filesystem and have more than one worker
> machine. In that case Spark will write the result files for each partition
> (in the worker which holds it) and complete the operation by writing
> _SUCCESS from the driver node.
>
> Cheers,
> Piotr
>
> On Mon, Sep 26, 2016 at 4:56 AM, Peter Figliozzi <pete.figlio...@gmail.com> wrote:
>
>> Both
>>
>> df.write.csv("/path/to/foo")
>>
>> and
>>
>> df.write.format("com.databricks.spark.csv").save("/path/to/foo")
>>
>> result in a *blank* file called "_SUCCESS" under /path/to/foo.
>>
>> My df has stuff in it; I tried this with both my real df and a quick df
>> constructed from literals.
>>
>> Why isn't it writing anything?
>>
>> Thanks,
>>
>> Pete
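For anyone finding this thread later: the number of output files tracks the number of partitions of the DataFrame at write time, so the usual way to tone it down is to reduce the partition count just before the write. A minimal sketch, assuming a live SparkSession and an existing DataFrame `df` as in the snippets above (`numWorkers` is a hypothetical value you would supply):

```
// Collapse everything into a single partition before writing.
// Produces one part file (plus _SUCCESS) under /path/to/foo, but all rows
// flow through a single task, so only do this for modestly sized results.
df.coalesce(1)
  .write
  .csv("/path/to/foo")

// Or aim for roughly one file per worker by repartitioning first:
// df.repartition(numWorkers).write.csv("/path/to/foo")
```

If the result is too large to funnel through one task, another option is to leave the write as-is and merge the part files afterwards, e.g. with `hadoop fs -getmerge` on HDFS or a plain concatenation of the `part-*` files on a local filesystem.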