vinothchandar commented on issue #1798: URL: https://github.com/apache/hudi/issues/1798#issuecomment-655880193
@zherenyu831 one thing I don’t understand from your original description is what you mean by 4000+ files vs. 600+ files. If the result is the same, how can the files be different when you are just loading the entire table? I suspect that filtering is happening in one case and not in the other. Your query is on the Hudi commit_time, which will be the same regardless. Can you confirm that you can run df.count() with both paths and that the result is the same?

----------------------------------------------------------------
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org