Hello,

I understand Spark can be used with Hadoop or standalone. I have a few
questions about choosing the right file system for Spark data.

What is the efficiency trade-off in feeding data to Spark from NFS versus HDFS?

If one is not using Hadoop otherwise, is it still common to store data in
HDFS for Spark to read, given its better reliability compared to NFS?

Should data be stored on the local file system (not NFS) only for Spark jobs
that run on a single machine?

Regards,
Ashish