deniskuzZ commented on PR #5929:
URL: https://github.com/apache/hive/pull/5929#issuecomment-3451199713

   > I don't think a retry is a good approach; it introduces another issue: the 
`COLUMN_STATS` entry in `TABLE_PARAMS` would get overwritten, leaving some 
column markers missing, which could make the stats import useless. For stats 
updates we already have an in-process lock, but it doesn't cover parallel 
imports across HMS instances. We need some form of distributed lock for that, 
and a DB lock is one option.
   
   Locking TBLS and PARTITIONS isn’t necessarily a better solution. In 
large-scale, highly concurrent systems, locking often becomes a scalability 
bottleneck. Instead, many systems adopt optimistic locking: each operation 
assumes no conflict and validates only at commit, which reduces contention 
and avoids blocking.
   
   The idea is to balance consistency with throughput. Extending this principle 
to stats import (which, by the way, is not core functionality), we could 
consider a mechanism based on versioning rather than traditional locking. Such 
an approach generally scales far better in multi-instance HMS environments than 
relying solely on table or partition locks.
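
   To make the versioning idea concrete, here is a minimal sketch of a 
version-checked update. It is not the HMS schema or any existing Hive API: the 
`tbl_stats` table, its `column_stats` and `version` columns, and the function 
names are all hypothetical, and SQLite only stands in for the backing RDBMS. 
The writer reads the row and its version, merges its changes into what it read, 
and commits only if the version is unchanged; on a conflict it re-reads and 
merges again, so column markers written by another instance are not 
overwritten.

```python
import json
import sqlite3


def make_demo_db():
    """Create an in-memory DB with a hypothetical stats table (not the HMS schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_stats ("
        "tbl_id INTEGER PRIMARY KEY, column_stats TEXT NOT NULL, "
        "version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO tbl_stats VALUES (1, '{}', 0)")
    conn.commit()
    return conn


def update_stats_optimistic(conn, tbl_id, merge_fn, max_attempts=5):
    """Apply merge_fn to the current stats; write only if the version is unchanged.

    merge_fn takes the current stats dict and returns the merged dict.
    Returns the new version number on success.
    """
    for _ in range(max_attempts):
        row = conn.execute(
            "SELECT column_stats, version FROM tbl_stats WHERE tbl_id = ?",
            (tbl_id,),
        ).fetchone()
        if row is None:
            raise KeyError(f"no stats row for table {tbl_id}")
        current, seen_version = json.loads(row[0]), row[1]

        merged = merge_fn(current)

        # Compare-and-swap: the WHERE clause matches only if nobody else
        # bumped the version since we read it.
        cur = conn.execute(
            "UPDATE tbl_stats SET column_stats = ?, version = ? "
            "WHERE tbl_id = ? AND version = ?",
            (json.dumps(merged), seen_version + 1, tbl_id, seen_version),
        )
        conn.commit()
        if cur.rowcount == 1:
            return seen_version + 1
        # Conflict: another writer committed first. Loop to re-read and re-merge.
    raise RuntimeError("gave up after repeated version conflicts")
```

   A caller would pass a merge function such as 
`lambda stats: {**stats, "col_a": True}`. Because the merge is recomputed from 
a fresh read after every conflict, a retry here does not have the overwrite 
problem raised in the quoted comment, and no row is locked between the read 
and the write.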


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

