I needed the same thing for debugging, and I just added a "count" action in
debug mode for every step I was interested in. It's very time-consuming, but
I don't debug very often.
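For reference, a minimal sketch of that approach in a Spark shell (the input path and the RDD name are placeholders, not from the original mail; `sc` is the usual shell SparkContext):

```scala
import org.apache.spark.storage.StorageLevel

// Placeholder pipeline step; substitute your own transformation chain.
val step1 = sc.textFile("input.txt").map(_.length)

// count() forces the lazy computation, so the stage actually runs
// and its shuffle/output sizes show up in the web UI.
val n = step1.count()

// Persisting and re-counting materializes the intermediate result into
// the cache; getRDDStorageInfo then reports its in-memory/on-disk size.
step1.persist(StorageLevel.MEMORY_ONLY)
step1.count()
sc.getRDDStorageInfo.foreach { info =>
  println(s"${info.name}: memSize=${info.memSize} B, diskSize=${info.diskSize} B")
}
```

The same numbers are visible without any code under the "Storage" and "Stages" tabs of the Spark web UI once the RDD is persisted and an action has run.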

2016-10-20 2:17 GMT-07:00 Andreas Hechenberger <inter...@hechenberger.me>:

> Hey awesome Spark-Dev's :)
>
> I am new to Spark and I have read a lot, but now I am stuck :( so please
> be kind if I ask silly questions.
>
> I want to analyze some algorithms and strategies in Spark, and for one
> experiment I want to know the size of the intermediate results between
> iterations/jobs. Some of them are written to disk and some are in the
> cache, I guess. I am not afraid of looking into the code (I already did),
> but it's complex and I have no clue where to start :( It would be nice if
> someone could point me in the right direction, or to where I can find more
> information about the structure of Spark core development :)
>
> I have already set up the development environment and I can compile Spark.
> It was really awesome how smooth the setup was :) Thanks for that.
>
> Servus
> Andy
>


-- 
Sincerely yours,
Egor Pakhomov
