Re: writing to hdfs on master node much faster

2015-04-20 Thread Sean Owen
What machines are HDFS data nodes -- just your master? That would explain it. Otherwise, is it actually the write that's slow, or is something else you're doing much faster on the master for other reasons? For example, are you actually shipping data via the master first in some local computation? so

Re: writing to hdfs on master node much faster

2015-04-20 Thread Tamas Jambor
Not sure what would slow it down, as the repartition completes equally fast on all nodes, implying that the data is available on all of them. After that there are a few computation steps, none of them local to the master.

On Mon, Apr 20, 2015 at 12:57 PM, Sean Owen so...@cloudera.com wrote:
What machines are

RE: writing to hdfs on master node much faster

2015-04-20 Thread Evo Eftimov
on the other 2 nodes

-Original Message-
From: Sean Owen [mailto:so...@cloudera.com]
Sent: Monday, April 20, 2015 12:57 PM
To: jamborta
Cc: user@spark.apache.org
Subject: Re: writing to hdfs on master node much faster

What machines are HDFS data nodes -- just your master? that would explain