Hello, I understand that Spark can be used either with Hadoop or standalone. I have a few questions about choosing the right filesystem for Spark data.
What is the efficiency trade-off in feeding data to Spark from NFS versus HDFS? If one is not using Hadoop, is it still common to keep data in HDFS for Spark to read, given its better reliability compared to NFS? Should data be stored on the local filesystem (rather than NFS) only for Spark jobs that run on a single machine?

Regards,
Ashish