deniskuzZ commented on PR #5929: URL: https://github.com/apache/hive/pull/5929#issuecomment-3451199713
> I don't think retry is a good approach. It introduces another issue: `COLUMN_STATS` in `TABLE_PARAMS` would get overwritten, causing some column markers to go missing, which could make the stats import useless. For updating the stats we already have an in-process lock, but it doesn't solve parallel imports across HMS instances. We need some kind of distributed lock to achieve that, and the DB lock is one way.

Locking TBLS and PARTITIONS isn't necessarily a better solution. In large-scale, highly concurrent systems, locking often becomes a scalability bottleneck. Instead, most systems adopt optimistic locking strategies: each operation assumes no conflict and only validates at commit, reducing contention and avoiding blocking. The idea is to balance consistency with throughput.

Extending this principle to stats import (which, by the way, is not core functionality), we could consider a mechanism based on versioning rather than traditional locking. Such an approach generally scales far better in multi-instance HMS environments than depending solely on table or partition locks.
