Is JavaSparkContext.wholeTextFiles distributed?

2016-04-26 Thread Vadim Vararu
Hi guys, I'm trying to read many files from S3 using JavaSparkContext.wholeTextFiles(...). Is that executed in a distributed manner? Please point me to the place in the documentation where this is specified. Thanks, Vadim.

Re: Is JavaSparkContext.wholeTextFiles distributed?

2016-04-26 Thread Vadim Vararu
Spark can create distributed datasets from any storage source supported by Hadoop, including your local file system, HDFS, Cassandra, HBase, Amazon S3, etc. Spark supports text files, SequenceFiles

unsubscribe

2016-05-04 Thread Vadim Vararu
unsubscribe

Lifecycle of a map function

2020-04-07 Thread Vadim Vararu
Hi all, I'm trying to understand the lifecycle of a map function in a Spark/YARN context. My understanding is that the function is instantiated on the master and then passed to each executor (serialized/deserialized). What I'd like to confirm is that the function is
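
For illustration only, here is a minimal plain-Java sketch of the serialize/deserialize round trip described in the message above. It uses no Spark APIs: `MapFunctionLifecycle`, `TaggingFunction`, and `simulate` are hypothetical names, and the two deserialized copies merely stand in for two tasks, on the assumption that the function reaches each task as a Java-serialized object. It shows two properties of standard Java serialization: the constructor of a `Serializable` class is not re-run on deserialization, and each deserialized copy carries its own state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.function.Function;

public class MapFunctionLifecycle {

    // Hypothetical map function; not taken from the original thread.
    static class TaggingFunction implements Function<String, String>, Serializable {
        // Counts constructor runs in this JVM (statics are never serialized).
        static int constructorCalls = 0;

        private final String tag;
        // transient: not serialized, so every deserialized copy starts at 0.
        private transient int calls;

        TaggingFunction(String tag) {
            this.tag = tag;
            constructorCalls++;
        }

        @Override
        public String apply(String value) {
            calls++;
            return tag + ":" + value;
        }

        int calls() {
            return calls;
        }
    }

    static byte[] serialize(Object obj) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns {constructor runs, calls on original, calls on copy 1, calls on copy 2}.
    static int[] simulate() {
        TaggingFunction.constructorCalls = 0;

        // "Driver" side: the function is instantiated exactly once.
        TaggingFunction original = new TaggingFunction("x");
        byte[] shipped = serialize(original);

        // "Task" side (simulated): each task rebuilds its own copy from the bytes.
        TaggingFunction copy1 = (TaggingFunction) deserialize(shipped);
        TaggingFunction copy2 = (TaggingFunction) deserialize(shipped);

        copy1.apply("a");
        copy1.apply("b");
        copy2.apply("c");

        return new int[] {
            TaggingFunction.constructorCalls, original.calls(), copy1.calls(), copy2.calls()
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(simulate())); // prints [1, 0, 2, 1]
    }
}
```

Because everything here runs in a single JVM, the static counter is shared; on a real cluster each executor JVM would have its own statics, so this sketch says nothing about how many copies Spark actually creates per executor or per task.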