In that case, the file exists in parts across machines. No, tasks won't
re-read the whole file; no task does or can do that. Failed partitions are
reprocessed, but as in the first pass, the same partition is processed.

On Fri, Jan 21, 2022 at 12:00 PM Siddhesh Kalgaonkar <
kalgaonkarsiddh...@gmail.com> wrote:

> Hello team,
>
> I am aware that in case of memory issues when a task fails, it will try to
> restart 4 times since it is a default number and if it still fails then it
> will cause the entire job to fail.
>
> But suppose if I am reading a file that is distributed across nodes in
> partitions. So, what will happen if a partition fails that holds some data?
> Will it re-read the entire file and get that specific subset of data since
> the driver has the complete information? or will it copy the data to the
> other working nodes or tasks and try to run it?
>

Reply via email to