Spark can create distributed datasets from any storage source supported
by Hadoop, including your local file system, HDFS, Cassandra, HBase,
Amazon S3 <http://wiki.apache.org/hadoop/AmazonS3>, etc. Spark supports
text files, SequenceFiles
<http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/SequenceFileInputFormat.html>,
and any other Hadoop InputFormat
<http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/InputFormat.html>.
Text file RDDs can be created using |SparkContext|’s |textFile| method.
This method takes a URI for the file (either a local path on the
machine, or a |hdfs://|, |s3n://|, etc. URI) and reads it as a collection
of lines. Here is an example invocation:
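A minimal sketch in spark-shell, assuming an already-created SparkContext |sc| and a hypothetical file |data.txt| reachable from the workers:

```scala
// Read the file as an RDD of lines (path is a placeholder):
val distFile = sc.textFile("data.txt")

// Each element of the RDD is one line; e.g. sum up all line lengths:
val totalChars = distFile.map(line => line.length).reduce((a, b) => a + b)
```

The returned RDD is lazy: no data is read until an action such as |reduce| runs.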
I could not find a concrete statement saying whether reading more than
one file this way is distributed or not.
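One way to probe this, sketched below assuming a SparkContext |sc| and a hypothetical directory of text files: |textFile| also accepts directories, globs, and comma-separated paths, and the partition count of the resulting RDD shows how the input was split up for parallel reads.

```scala
// Glob over several files (path is a placeholder):
val lines = sc.textFile("hdfs://namenode/data/*.txt")

// Number of partitions the input was split into; each partition can be
// read by a different executor, so more than one partition implies the
// read itself is parallelized:
println(lines.getNumPartitions)
```

An optional second argument to |textFile| requests a minimum number of partitions, e.g. |sc.textFile(path, 8)|.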
On 26.04.2016 18:00, Hyukjin Kwon wrote:
then this would not be distributed