Okay, so suppose I have 10 records distributed across 5 nodes and the
partition of the first node holding 2 records failed. I understand that it
will re-process this partition but how will it come to know that XYZ
partition was holding XYZ data so that it will pick again only those
records and reprocess it? In case of failure of a partition, is there a
data loss? or is it stored somewhere?

Maybe my question is very naive but I am trying to understand it in a
better way.

On Fri, Jan 21, 2022 at 11:32 PM Sean Owen <sro...@gmail.com> wrote:

> In that case, the file exists in parts across machines. No, tasks won't
> re-read the whole file; no task does or can do that. Failed partitions are
> reprocessed, but as in the first pass, the same partition is processed.
>
> On Fri, Jan 21, 2022 at 12:00 PM Siddhesh Kalgaonkar <
> kalgaonkarsiddh...@gmail.com> wrote:
>
>> Hello team,
>>
>> I am aware that in case of memory issues when a task fails, it will try
>> to restart 4 times since it is a default number and if it still fails then
>> it will cause the entire job to fail.
>>
>> But suppose if I am reading a file that is distributed across nodes in
>> partitions. So, what will happen if a partition fails that holds some data?
>> Will it re-read the entire file and get that specific subset of data since
>> the driver has the complete information? or will it copy the data to the
>> other working nodes or tasks and try to run it?
>>
>

Reply via email to