Probably parallel disk I/O. That said, the effect should be less noticeable than with Hadoop, which reads and writes to disk a lot. Much depends on how often Spark persists to disk, and on the specifics of the RAID controller as well.
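For what it's worth, here is a minimal sketch of how separate mount points are typically handed to Spark, assuming the scratch space is set through `spark.local.dir`, which accepts a comma-separated list of directories on different disks. The mount paths and the helper names are made up for illustration and are not from the documentation being quoted; on YARN or standalone clusters the cluster manager's own local-directory settings take precedence over this property.

```python
# Hypothetical mount points, one per physical disk (illustrative only).
MOUNT_POINTS = ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"]


def local_dirs(mount_points, subdir="spark"):
    """Build a comma-separated spark.local.dir value, one scratch dir per disk."""
    return ",".join(f"{m.rstrip('/')}/{subdir}" for m in mount_points)


def spark_defaults_line(mount_points):
    """Format the setting as a line for conf/spark-defaults.conf."""
    return f"spark.local.dir {local_dirs(mount_points)}"


if __name__ == "__main__":
    print(spark_defaults_line(MOUNT_POINTS))
```

With one directory per disk, shuffle and spill files can be spread over the disks independently, which is the parallelism a single RAID volume would hide behind the controller.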
If you write to HDFS as opposed to the local file system, this may be a big factor as well.

On Tue, Mar 8, 2016 at 8:34 AM, Eddie Esquivel <eduardo.esqui...@gmail.com> wrote:

> Hello All,
> In the Spark documentation under "Hardware Requirements" it very clearly
> states:
>
> We recommend having *4-8 disks* per node, configured *without* RAID (just
> as separate mount points)
>
> My question is why not raid? What is the argument\reason for not using
> Raid?
>
> Thanks!
> -Eddie

--
Alex Kozlov