[ https://issues.apache.org/jira/browse/IMPALA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933617#comment-15933617 ]
Sailesh Mukil commented on IMPALA-4456:
---------------------------------------

[~henryr] True, but we need to solve this some other way (i.e. without RW locks). I updated the description and the title.

> Address scalability issues of query_exec_state_lock_
> ----------------------------------------------------
>
>                 Key: IMPALA-4456
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4456
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.7.0
>            Reporter: Sailesh Mukil
>              Labels: performance, scalability
>
> The process-wide query_exec_state_map_lock_ can be a highly contended lock.
> It is currently a plain mutex.
> Changing it to a reader-writer lock, and taking it in writer mode strictly
> only when necessary, could give us some big wins with relatively little effort.
> On running some tests, I've also noticed that changing it to a RW lock
> reduces the number of clients created for use between nodes. The reason is that
> ReportExecStatus(), which runs in the context of a thrift connection thread,
> waits for fairly long periods of time for the query_exec_state_map_lock_ on
> heavily loaded systems, causing the RPC sender to be blocked on the RPC. This in
> turn requires other threads on the sender node to create new client
> connections to send RPCs instead of reusing old ones, which contributes to
> nodes hitting their connection limit.
> So, this relatively small change can give us wins not just in terms of
> performance, but scalability too.
> EDIT: Trying a WIP patch with RW locks on a larger cluster has shown that it
> works well (very well) in some cases, but regresses at other times. The
> reason is that we can't decide whether to starve readers or writers, and
> performance varies (drastically) depending on the workload and the starve
> option. We need to come up with other ideas to address this issue (a
> suggestion is to have some sort of bucketed locking scheme, where we split
> the query_exec_state_map_ into buckets where each bucket has its own lock,
> thereby reducing lock contention).

-- 
This message was sent by Atlassian JIRA
(v6.3.15#6346)
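The bucketed (lock-striped) scheme suggested above could look roughly like the following sketch. This is not Impala code: the names `ShardedQueryStateMap`, `QueryExecState`, and the bucket count are hypothetical, and the point is only to show how hashing a query id to a per-bucket mutex limits contention to queries that land in the same bucket.

```cpp
#include <array>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Illustrative stand-in for Impala's per-query execution state.
struct QueryExecState {
  std::string query_id;
};

// Hypothetical sketch of a bucketed map: the single map and its one
// process-wide lock are split into kNumBuckets shards, each guarded by
// its own mutex. Threads operating on different queries usually hash to
// different buckets and therefore do not contend.
class ShardedQueryStateMap {
 public:
  static constexpr size_t kNumBuckets = 16;

  void Insert(const std::string& query_id,
              std::shared_ptr<QueryExecState> state) {
    Bucket& b = GetBucket(query_id);
    std::lock_guard<std::mutex> l(b.lock);  // Locks only this bucket.
    b.map[query_id] = std::move(state);
  }

  std::shared_ptr<QueryExecState> Get(const std::string& query_id) {
    Bucket& b = GetBucket(query_id);
    std::lock_guard<std::mutex> l(b.lock);
    auto it = b.map.find(query_id);
    return it == b.map.end() ? nullptr : it->second;
  }

  void Erase(const std::string& query_id) {
    Bucket& b = GetBucket(query_id);
    std::lock_guard<std::mutex> l(b.lock);
    b.map.erase(query_id);
  }

 private:
  struct Bucket {
    std::mutex lock;
    std::map<std::string, std::shared_ptr<QueryExecState>> map;
  };

  // Hash the query id to pick a bucket; a uniform hash keeps buckets
  // (and therefore lock contention) roughly balanced.
  Bucket& GetBucket(const std::string& query_id) {
    return buckets_[std::hash<std::string>{}(query_id) % kNumBuckets];
  }

  std::array<Bucket, kNumBuckets> buckets_;
};
```

One trade-off worth noting: unlike a RW lock, this scheme has no reader/writer starvation policy to tune, but operations that need a consistent view of the whole map (e.g. listing all in-flight queries) must lock every bucket, so those should be rare.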