TD, Thanks. This helps a lot.
So in that case, increasing the number of workers in case 2 to 8 or 12 should increase the aggregate read bandwidth, at least conceptually?

-Soumya

On Fri, Dec 5, 2014 at 10:51 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
> That depends! See inline. I am assuming that when you said replacing
> local disk with HDFS in case 1, you are connected to a separate HDFS
> cluster (as in case 2) with a single 10G link. Also assuming that all
> nodes (1 in case 1, and 6 in case 2) are worker nodes, and the Spark
> application driver is running somewhere else.
>
> On Fri, Dec 5, 2014 at 7:31 PM, Soumya Simanta <soumya.sima...@gmail.com> wrote:
> > I'm trying to understand the conceptual difference between these two
> > configurations in terms of performance (using a Spark standalone cluster).
> >
> > Case 1:
> >
> > 1 node
> > 60 cores
> > 240G of memory
> > 50G of data on the local file system
> >
> > Case 2:
> >
> > 6 nodes
> > 10 cores per node
> > 40G of memory per node
> > 50G of data on HDFS
> > nodes are connected by a 10G network
> >
> > I just wanted to validate my understanding.
> >
> > 1. Reads in case 1 will be slower than in case 2 because, in case 2,
> > all 6 nodes can read the data in parallel from HDFS. However, if I
> > change the file system to HDFS in case 1, my read speeds will be
> > conceptually the same as in case 2. Correct?
>
> If the filesystem were HDFS in case 1, it still may not be conceptually
> the same as case 2. It depends on where the bottleneck of the system is.
> In case 2, the total network b/w with which you can read from the HDFS
> cluster is 6 x 10G, but in case 1 it is still 10G. So if the HDFS
> cluster has very high aggregate read b/w and the network is the
> bottleneck in case 2, then case 1 will have lower throughput than
> case 2. And vice versa: if the HDFS read b/w is the bottleneck, then
> conceptually case 1 will be the same as case 2.
>
> And there are other issues, like memory read b/w, as well.
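TD's bottleneck argument (and the follow-up question about adding more workers) can be sanity-checked with a quick back-of-envelope sketch. All the bandwidth figures below are illustrative assumptions, not measurements:

```python
# Rough model of the read path described above: achievable aggregate read
# bandwidth is capped by the slower of the HDFS cluster's serving capacity
# and the workers' total NIC capacity. All numbers assumed, in Gbit/s.

def effective_read_bw(hdfs_aggregate_bw, per_node_link_bw, num_workers):
    network_bw = per_node_link_bw * num_workers
    return min(hdfs_aggregate_bw, network_bw)

# Case 1 (one node, one 10G link) vs. case 2 (six nodes, 10G each),
# assuming the HDFS cluster can serve 50 Gbit/s in aggregate:
print(effective_read_bw(50, 10, 1))   # 10 -> network-bound
print(effective_read_bw(50, 10, 6))   # 50 -> HDFS-bound

# Going from 6 to 8 or 12 workers only helps until HDFS becomes the cap:
print(effective_read_bw(50, 10, 8))   # still 50
print(effective_read_bw(50, 10, 12))  # still 50
```

So yes, adding workers raises aggregate read bandwidth conceptually, but only up to whichever binds first: the HDFS cluster's serving capacity or the workers' combined link speed.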
> Say the HDFS read b/w is the bottleneck, so case 1 seems to be the same
> as case 2, and the aggregate read b/w is, say, 50Gbps (less than the
> 6 x 10Gbps network b/w). Can the single node of case 1 do memory
> transfers (reading from NIC to memory for processing) at 50Gbps? That
> depends on memory speed, the number of memory channels, etc.
>
> So all of these need to be considered, and that's how hardware needs
> to be designed so that all these parameters are balanced :)
>
> > 2. Once the data is loaded, case 1 will execute operations faster
> > because there is no network overhead and all shuffle operations are
> > local.
>
> Assuming case 1 = case 2 on reads, yes, potentially. If you are using
> local disk for shuffle files, some file systems are not happy with 60
> threads trying to read and write from disk at once. That can give rise
> to weird behavior. Ignoring that, conceptually, yes.
>
> > 3. Obviously, case 1 is bad from a fault-tolerance point of view
> > because we have a single point of failure.
>
> Yeah. If that single node dies, there are no resources left to
> continue the processing.
>
> > Thanks
> > -Soumya
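TD's memory-bandwidth caveat fits the same back-of-envelope style: each node can only ingest data as fast as the slower of its NIC and its memory subsystem. A minimal sketch, with all bandwidth figures assumed (Gbit/s):

```python
def aggregate_ingest_bw(hdfs_bw, nic_bw, mem_bw, num_workers):
    # Each worker ingests at the slower of its NIC and its memory
    # subsystem; the cluster total is further capped by HDFS serving b/w.
    per_node = min(nic_bw, mem_bw)
    return min(hdfs_bw, per_node * num_workers)

# Suppose the single node of case 1 somehow had a 50G link, but its
# memory subsystem could only move 40 Gbit/s from NIC to RAM:
print(aggregate_ingest_bw(50, 50, 40, 1))   # 40 -> memory-bound
# The six-node cluster, 10G links, same assumed memory speed per node:
print(aggregate_ingest_bw(50, 10, 40, 6))   # 50 -> HDFS-bound
```

This is the sense in which the parameters need to be balanced: the narrowest stage of the pipeline (HDFS serving capacity, total NIC capacity, or per-node memory bandwidth) sets the effective throughput.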