Hi xiefeng,

Even if your RDDs are tiny and reduced to one partition, there is always
orchestration overhead: sending tasks to the executor(s), collecting and
reducing the results, and so on. These steps are not free.

If you need fast, [near] real-time processing, look towards Spark
Streaming instead of submitting many small batch jobs.

Regards,
-- 
  Bedrytski Aliaksandr
  sp...@bedryt.ski

On Mon, Sep 5, 2016, at 04:36, xiefeng wrote:
> The spark context will be reused, so the spark context initialization
> won't
> affect the throughput test.
> 
> 
> 
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Why-does-spark-take-so-much-time-for-simple-task-without-calculation-tp27628p27657.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> 
