beyond1920 commented on issue #10803: URL: https://github.com/apache/hudi/issues/10803#issuecomment-2016687994
@nsivabalan @Ytimetravel Here is another data loss case, this one caused by a whole-stage retry.

There are 4 cases in which a task is retried:

* The task is slow, and a speculative attempt is launched
* The task fails and is retried
* The stage fails and is retried
* The executor fails and the task is retried

In the third case, where the whole stage is retried, the task attempt number might go back to its original value. Using the attempt number alone to identify a block is not enough to handle this case, and comparing the block sizes of each attempt number might lead to a wrong result.

We might need to use both stageAttemptNumber and AttemptNumber to identify a block, or find another solution. WDYT?
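To make the stage-retry concern concrete, here is a minimal sketch in plain Python with no Spark or Hudi dependency. The `Block` type, its fields, and both grouping helpers are hypothetical names invented for illustration; they are not Hudi's actual log block format or its deduplication logic. The sketch only assumes what the comment describes: after a stage retry, the task attempt number can start again from its original value.

```python
from collections import defaultdict
from typing import NamedTuple


class Block(NamedTuple):
    """Hypothetical stand-in for a log block written by one task attempt."""
    stage_attempt: int  # which attempt of the stage wrote the block
    task_attempt: int   # task attempt number within that stage attempt
    payload: str


def group_by_task_attempt(blocks):
    """Group block payloads by task attempt number only."""
    groups = defaultdict(list)
    for block in blocks:
        groups[block.task_attempt].append(block.payload)
    return dict(groups)


def group_by_stage_and_task_attempt(blocks):
    """Group block payloads by the (stage attempt, task attempt) pair."""
    groups = defaultdict(list)
    for block in blocks:
        groups[(block.stage_attempt, block.task_attempt)].append(block.payload)
    return dict(groups)


# Assumed scenario: the first stage attempt wrote one block before failing,
# then the retried stage wrote two blocks with task attempt number 0 again.
blocks = [
    Block(stage_attempt=0, task_attempt=0, payload="a1"),
    Block(stage_attempt=1, task_attempt=0, payload="a1"),
    Block(stage_attempt=1, task_attempt=0, payload="a2"),
]

by_task = group_by_task_attempt(blocks)
by_both = group_by_stage_and_task_attempt(blocks)

print(by_task)  # blocks from both stage attempts share the single key 0
print(by_both)  # each stage attempt gets its own key
```

With the attempt number as the only key, blocks from the failed stage attempt and from the retried one fall into the same group, so a size comparison per attempt number would mix them. Keying on the pair keeps them apart, which is the direction the comment suggests; whether that is the right fix for Hudi is still the open question above.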