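From S3, because the lineage of df still points to the S3 source: saveAsParquetFile only writes a copy and does not change what df depends on. And because RDDs are not replicated, lost partitions have to be recomputed from that source.

On 8 May 2015 23:02, "Peter Rudenko" <petro.rude...@gmail.com> wrote: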
> Hi, I have the following question:
>
> val data = sc.textFile("s3:///")
> val df = data.toDF
> df.saveAsParquetFile("hdfs://")
> df.someAction(...)
>
> If some workers die during someAction, would the recomputation download
> the files from S3 or read them from the HDFS Parquet output?
>
> Thanks,
> Peter Rudenko
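
If the goal is for recovery to read from HDFS instead, one option is to load the saved Parquet output back into a new DataFrame and run the action on that. This is only a minimal sketch: it assumes the Spark 1.3 API (sqlContext.parquetFile) and a spark-shell session where sc and sqlContext already exist. The name dfFromHdfs is made up for illustration, count() stands in for someAction, and the paths are left elided as in the question.

```scala
// Assumes a spark-shell session (Spark 1.3) where `sc` and `sqlContext` already exist.
import sqlContext.implicits._   // enables data.toDF on an RDD[String]

val data = sc.textFile("s3:///")   // path elided in the original question
val df = data.toDF
df.saveAsParquetFile("hdfs://")    // writes a copy; df's lineage still starts at S3

// Hypothetical name: a new DataFrame whose lineage starts at the Parquet files on HDFS.
val dfFromHdfs = sqlContext.parquetFile("hdfs://")   // same elided path as above

// Stand-in for someAction: partitions lost here would be re-read from HDFS, not S3.
dfFromHdfs.count()
```

From Spark 1.4 on, the corresponding calls are df.write.parquet(...) and sqlContext.read.parquet(...).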