Antauri commented on issue #1394: [HUDI-656][Performance] Return a dummy Spark relation after writing the DataFrame
URL: https://github.com/apache/incubator-hudi/pull/1394#issuecomment-611032718

This is present in 0.5.2-incubating, which we're using. We're developing a framework that does S3-to-S3 ingestion using Hudi through the Spark SQL writers (not RDDs). We use year=x/month=y/day=z/bin=q partitioning. For 3 days with 575 paths each, it takes 3+ minutes between the repeated "listing leaf files and directories" steps, which adds up to some 9 minutes for just 3 days.

Any idea when 0.6.0 will be released? And does adding "Hive" as the metastore help reduce this listing, or does it not matter?

Thank you kindly!
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services