Hm, thanks.
Do you know what this setting means: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala#L1178 ?

Thanks,
Peter Rudenko

On 2015-05-08 17:48, ayan guha wrote:

From S3, because the lineage of df depends on the S3 source and RDDs are not replicated, so lost partitions are recomputed from that source.
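
A minimal sketch of the alternative, assuming the Spark 1.3-era API used in the quoted snippet: if recomputation should read from HDFS instead, the action has to run on a DataFrame loaded back from the Parquet output, so that its lineage starts at HDFS. `ReloadFromParquet`, `reload`, `s3Path` and `hdfsPath` are hypothetical names for illustration; `sqlContext.parquetFile` is the 1.3-era reader (later releases use `sqlContext.read.parquet`).

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper, not part of Spark: writes the S3 data to Parquet on
// HDFS and returns a DataFrame whose lineage starts at the Parquet files.
object ReloadFromParquet {
  def reload(sc: SparkContext,
             sqlContext: SQLContext,
             s3Path: String,    // assumed: location of the source text files
             hdfsPath: String   // assumed: target location for the Parquet output
            ): DataFrame = {
    import sqlContext.implicits._

    // Lineage of `df`: S3 text files -> RDD[String] -> DataFrame
    val data = sc.textFile(s3Path)
    val df = data.toDF

    // Writing Parquet is an output action; it does not change the lineage of `df`.
    df.saveAsParquetFile(hdfsPath)

    // Lineage of the returned DataFrame: Parquet files on HDFS only.
    sqlContext.parquetFile(hdfsPath)
  }
}
```

With this shape, an action on the returned DataFrame would recompute lost partitions from the Parquet files, while an action on the original `df` would still go back to S3.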

On 8 May 2015 23:02, "Peter Rudenko" <petro.rude...@gmail.com> wrote:

    Hi, I have the following question:

    val data = sc.textFile("s3:///")
    val df = data.toDF
    df.saveAsParquetFile("hdfs://")
    df.someAction(...)

    If some workers die during someAction, would the recomputation
    re-read the files from S3 or from the Parquet files on HDFS?

    Thanks,
    Peter Rudenko


