nsivabalan commented on PR #7517: URL: https://github.com/apache/hudi/pull/7517#issuecomment-1360405299
Let's ignore MDT for now. I have a basic doubt about the inner workings of MOR tables: how are extraneous log files ignored when reading committed data from the DT? Say we make commit3, which had some Spark retries. Instead of just logFile1, we now have logFile1 and logFile2, where only logFile2 is valid. Our marker-based reconciliation is not going to delete logFile1, nor add a rollback block for it. So when someone does a snapshot read, where exactly do we skip logFile1? It should be part of AbstractLogRecordReader, right? I could not locate it.