Hello,

sparkMeasure is a great tool and has indeed been helpful to me, but
unfortunately it doesn't measure the network communication time/cost.

This is stated as a limitation on the GitHub page:


   - The currently available Spark task metrics can give you precious
   quantitative information on resources used by the executors, however they
   do not allow to fully perform time-based analysis of the workload
   performance, notably they do not expose the time spent doing I/O or network
   traffic.


Can anyone suggest another way to get the network cost/time?

Best regards,

On Thu, Mar 21, 2019 at 16:00, Saikat Kanjilal <sxk1...@hotmail.com>
wrote:

> How about using this:  https://github.com/LucaCanali/sparkMeasure
>
> Sent from my iPhone
>
> On Mar 21, 2019, at 7:46 AM, asma zgolli <zgollia...@gmail.com> wrote:
>
> Hello,
>
> Is there a way to get network statistics, server and distribution
> statistics from Spark?
>
> I'm looking for that information in order to work on network communication
> performance.
>
> Thank you very much for your help.
> Kind regards,
> Asma ZGOLLI
>
> PhD student in data engineering - computer science
>
>
>

-- 
Asma ZGOLLI

PhD student in data engineering - computer science
Email : zgollia...@gmail.com
email alt:  asma.zgo...@univ-grenoble-alpes.fr <zgollia...@gmail.com>
Tel : (+33) 07 52 95 04 45
        (+216) 50 126 797
Skype : asma_zgolli
