Using Spark 1.6.2, I want to understand what "Duration" really means (and why
it is slow).

I ran a simple SELECT COUNT against a Parquet file stored in HDFS. Here is one task's row from the Spark UI:

NODE_LOCAL | 1 / DATA02 | 2018/02/19 09:54:27 | 5 s | 30 ms | 8.8 MB (hadoop) / 3010830 | 8 ms | 77.2 KB / 1666

Does this mean it took 5 seconds to read 8.8 MB from HDFS?
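
To put numbers on that reading, here is a rough sketch of the throughput it would imply. It assumes the whole 5 s went into reading the input, which is exactly the part I am unsure about; the variable names are mine, and the values are copied from the task row.

```python
# Figures copied from the task row; units are as the UI prints them.
duration_s = 5.0      # "5 s"
input_mb = 8.8        # "8.8 MB (hadoop)"
records = 3010830     # input record count

# Implied throughput IF the entire duration were spent reading input
# (an assumption, not something the UI states).
mb_per_s = input_mb / duration_s
records_per_s = records / duration_s

print(f"{mb_per_s:.2f} MB/s, {records_per_s:.0f} records/s")
```

If that assumption held, the read rate would be under 2 MB/s, which seems far too low for a node-local HDFS read.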

Thomas Decaux
