machadoluiz commented on issue #8824: URL: https://github.com/apache/hudi/issues/8824#issuecomment-1572195137
@ad1happy2go, the runtime increases gradually. In one concrete example, it reached 2 minutes and 30 seconds at around 300 commits (about 10 months of data). This is a challenge for us, since that represents less than a year's worth of data. Is there any way to improve this performance, or is this a trade-off we have to accept?

Does Hudi perform these background operations against the actual data, or only against metadata? Does this mean that if the table grows, the cost/runtime of managing the metadata will grow proportionally? Or is the cost tied only to the filenames, in which case it should stay roughly constant regardless of the table size?
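For example, would tuning the archival and cleaner settings help keep the active timeline bounded as commits accumulate? Something along these lines (the specific values below are illustrative guesses on our side, not a tested recommendation):

```properties
# Archival: keep between min and max completed commits on the active timeline,
# moving older ones to the archived timeline (values here are illustrative).
hoodie.keep.min.commits=20
hoodie.keep.max.commits=30

# Cleaner: how many commits' worth of older file versions to retain
# (must be smaller than hoodie.keep.min.commits).
hoodie.cleaner.commits.retained=10
```

Would settings like these be expected to cap the per-commit overhead we are seeing, or does the cost come from somewhere else?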