that the cluster is made of identical nodes in terms of HW, so it's
not like one of the nodes just "works" quicker.
Thanks,
Borislav
to be less. To verify this, use an even
bigger input for your job.
1. Any ideas what could be the cause for this behaviour?
2. Any ideas how to achieve a more balanced performance?
Thanks,
Borislav
Hi,
I'm running Spark 1.6.0 on YARN on a Hadoop 2.6.0 cluster.
I observe a very strange issue.
I run a simple job that reads about 1TB of JSON logs from a remote HDFS
cluster and converts them to Parquet, then saves them to the local HDFS of
the Hadoop cluster.
I run it with 25 executors with
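For context, a minimal sketch of the kind of job described here, using the
Spark 1.6 SQLContext API, might look like the following. The HDFS URLs,
paths, and the JsonToParquet name are hypothetical placeholders, not the
original poster's actual settings:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object JsonToParquet {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("JsonToParquet"))
    val sqlContext = new SQLContext(sc)

    // Read the JSON logs from the remote HDFS cluster (placeholder URL).
    val logs = sqlContext.read.json("hdfs://remote-nn:8020/logs/")

    // Convert to Parquet and save to the local HDFS of the Hadoop cluster.
    logs.write.parquet("hdfs://local-nn:8020/logs-parquet/")

    sc.stop()
  }
}

A job like this would be submitted with something along the lines of
spark-submit --master yarn --num-executors 25 --class JsonToParquet job.jar,
plus whatever per-executor memory and core settings were used (truncated
above).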