Yes, that is correct: without persisting, calling count() and then write
computes the dataframe twice, once per action. If you want the computation
to happen only once, cache the dataframe and call count() and write on the
cached dataframe.
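
Here is a minimal PySpark sketch of that approach. It assumes you already
have a DataFrame and want Parquet output; the function name
count_then_write and the output path are hypothetical, chosen only for
illustration.

```python
def count_then_write(df, output_path):
    """Count the rows of df and write it out, computing its plan only once.

    df is expected to be a pyspark.sql.DataFrame; output_path is a
    hypothetical destination used for illustration.
    """
    # cache() is lazy: it only marks the DataFrame for caching and
    # returns the same DataFrame. Nothing is computed yet.
    cached = df.cache()
    try:
        # First action: runs the computation and populates the cache.
        row_count = cached.count()
        # Second action: should read the cached partitions instead of
        # recomputing the lineage from the source.
        cached.write.mode("overwrite").parquet(output_path)
        return row_count
    finally:
        # Release the cached blocks once both actions are done.
        cached.unpersist()
```

With an existing SparkSession named spark, this could be called as
count_then_write(spark.range(1000), "/tmp/out"). Keep in mind that the
cache itself takes executor memory (and disk, if it spills), so for a
very large dataframe it is worth checking that caching is actually
cheaper than recomputing.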

Regards,
Keith.

http://keith-chapman.com


On Mon, May 20, 2019 at 6:43 PM Rishi Shah <rishishah.s...@gmail.com> wrote:

> Hi All,
>
> Just wanted to confirm my understanding of actions on a dataframe. If the
> dataframe is not persisted at any point, and count() is called on it
> followed by a write action, this would trigger the dataframe computation
> twice (which could be a performance hit for a larger dataframe). Could
> anyone please help confirm?
>
> --
> Regards,
>
> Rishi Shah
>
