[ https://issues.apache.org/jira/browse/HUDI-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinoth Chandar updated HUDI-637:
--------------------------------
    Fix Version/s:     (was: 0.5.2)
                   0.6.0

> Investigate slower hudi queries in S3 vs HDFS
> ---------------------------------------------
>
>                 Key: HUDI-637
>                 URL: https://issues.apache.org/jira/browse/HUDI-637
>             Project: Apache Hudi (incubating)
>          Issue Type: Task
>          Components: Performance
>            Reporter: Balaji Varadarajan
>            Priority: Major
>             Fix For: 0.6.0
>
>
> Hudi queries on S3 take an abnormally long time compared to HDFS.
> The S3 listing itself does not take that long.
> PERFORMANCE BUG:
> The metadata list performance is likely causing performance issues with Hudi.
>
> {{scala> stopwatch(\{ sql("SELECT * FROM ap_invoices_all_compacted_s3").count})}}
> {{Elapsed time: 1m 55.078473113s
> res2: Long = xxxxxxxxxxxx}}
>
> {{scala> stopwatch(\{ sql("SELECT * FROM ap_invoices_all_compacted").count}) – this is the exact same table in HDFS}}
> {{Elapsed time: 6.581217052s
> res3: Long = xxxxxxxxxxx}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)