xushiyan commented on issue #3714: URL: https://github.com/apache/hudi/issues/3714#issuecomment-927529180
@jainpriyansh786 Although deduplicating read paths can reduce the risk of re-reading data, this is a design choice: a platform product like Hudi may not want to interfere with user input. That is, the responsibility for specifying the intended input paths stays with users, and Hudi simply loads data as instructed. What if users sometimes want to load the same data multiple times and specify duplicate paths on purpose, for example for testing? Extra dedup logic behind the scenes could produce surprises. Closing this for now. Feel free to leave follow-up comments. Thanks.
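
For users who do want deduplicated input, the check can live on the caller's side before the paths are handed to the reader. The following is only a minimal sketch of that idea, not part of Hudi: the helper name `dedup_paths`, the trailing-slash normalization, and the example bucket paths are all hypothetical.

```python
from typing import Iterable, List


def dedup_paths(paths: Iterable[str]) -> List[str]:
    """Drop repeated input paths while keeping first-seen order."""
    seen = set()
    unique = []
    for p in paths:
        # Assumption: "a/b/" and "a/b" name the same location, so strip
        # trailing slashes (but keep a bare "/" unchanged).
        key = p.rstrip("/") or p
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Hypothetical example paths; the third entry repeats the first.
input_paths = [
    "s3://bucket/data/2021/09/01",
    "s3://bucket/data/2021/09/02/",
    "s3://bucket/data/2021/09/01",
]
unique_paths = dedup_paths(input_paths)
```

The resulting `unique_paths` list would then be passed to whichever load call the job already uses. Keeping this step in user code means a job that intentionally repeats a path can simply skip it.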