... To note that if I execute collectAsList on the dataset at the beginning
of the program....

What do you think collectAsList does?
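
Some background on why this matters: in Spark, column transformations are lazy, and actions such as show(), count() and collectAsList() are what trigger execution of everything defined before them. collectAsList() also brings every row back to the driver as a java.util.List. So a freeze at the action usually points at the work queued up before it, not at the action call itself. The sketch below is not Spark code. It is only an analogy using plain Java streams, where map() plays the role of a transformation and collect() plays the role of the action. The class name LazyActionSketch and the sample values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyActionSketch {

    // Returns a log of events in the order they actually happened,
    // so that the lazy behaviour is visible.
    public static List<String> run() {
        List<String> log = new ArrayList<>();

        // "Transformation": only describes the work, nothing runs yet.
        Stream<Integer> transformed = Stream.of(1, 2, 3)
                .map(x -> {
                    // Stands in for a heavy column transformation (assumption).
                    log.add("transform " + x);
                    return x * 10;
                });
        log.add("pipeline defined");

        // "Action": comparable to collectAsList(), this forces every pending
        // step to run and materialises the full result in local memory.
        List<Integer> rows = transformed.collect(Collectors.toList());
        log.add("collected " + rows);
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In the log, "pipeline defined" comes before any "transform" entry: the expensive step only runs once the terminal call asks for results. By the same reasoning, calling collectAsList at the start of the program succeeds because there is nothing heavy queued up yet.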



View my LinkedIn profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Sat, 11 Mar 2023 at 18:29, sam smith <qustacksm2123...@gmail.com> wrote:

> Hello guys,
>
> I am launching a Spark program from code (client mode) to run on
> Hadoop. If I call methods such as show(), count() or collectAsList()
> on the dataset (they are displayed in the Spark UI) after
> performing heavy transformations on the columns, these methods
> cause the execution to freeze on Hadoop, regardless of the
> dataset size (an intriguing issue for small datasets!).
> Any idea what could be causing this type of issue?
> Note that if I execute collectAsList on the dataset at the beginning of
> the program (before performing the transformations on the columns), the
> method returns results correctly.
>
> Thanks.
> Regards
>
>
