Dan Hecht has posted comments on this change.

Change subject: IMPALA-1972/IMPALA-3882: Fix client_request_state_map_lock_ 
contention
......................................................................


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6707/7/be/src/service/impala-beeswax-server.cc
File be/src/service/impala-beeswax-server.cc:

PS7, Line 296: NULL
nit: change this one too, so the functions at least stay consistent


http://gerrit.cloudera.org:8080/#/c/6707/7/be/src/service/impala-http-handler.cc
File be/src/service/impala-http-handler.cc:

PS7, Line 721: just return
That's not what the code does (it also sets plan_metadata_unavailable); please
rephrase. You could rephrase the whole comment as:

If the query plan isn't generated, avoid waiting for the lock, which could take 
a while if catalog metadata is being loaded.


PS7, Line 730: adopt_lock_t
shouldn't that be deleted?


http://gerrit.cloudera.org:8080/#/c/6707/7/tests/custom_cluster/test_query_concurrency.py
File tests/custom_cluster/test_query_concurrency.py:

PS7, Line 32: The intention here is to check contention on the 
query_exec_state_map_lock_
This is talking about how the old code worked, which won't make sense to people 
reading the current code (after this change). It should say something like:

The intention is to check that the webserver does not hold any global locks or 
otherwise prevent impalad from servicing new requests.


PS7, Line 54: This creates lock contention on the coordinator by
            :     calling QuerySummaryHandler() method
This is no longer true with your fix. How about saying:

This is to verify that QuerySummaryHandler() does not hold any global locks 
that would, for example, prevent another query from starting.


PS7, Line 74: time.sleep(2)
I'm worried that this will be flaky, especially with ASAN. Instead of this
delay, couldn't we just wait for in_flight_queries to become 1? You could do
that by passing some largish value as the parameter to
get_in_flight_queries(). That has the advantage that we wait only as long as
necessary for the value to change to 1, so we can afford a relatively long
timeout (rather than a fixed delay).
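
A minimal sketch of the poll-with-timeout pattern suggested above, not the
test framework's actual code. wait_for_value() and fake_in_flight_count() are
hypothetical names used only for illustration, and the one-second default
polling interval is an assumption:

```python
import time


def wait_for_value(fetch, expected, timeout_s=30.0, interval_s=1.0):
    """Poll fetch() until it returns `expected` or timeout_s elapses.

    Returns True as soon as the value matches and False on timeout, so a
    generous timeout costs nothing when the value changes quickly.
    (Hypothetical helper; time.monotonic() requires Python 3.3+.)
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch() == expected:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Hypothetical stand-in for the in-flight query count read from the web UI:
# it reports 0 twice and then 1.
_samples = iter([0, 0, 1])


def fake_in_flight_count():
    return next(_samples)


# Returns after three polls instead of sleeping for a fixed two seconds.
reached = wait_for_value(fake_in_flight_count, 1, timeout_s=5.0, interval_s=0.01)
assert reached
```

Unlike a fixed time.sleep(), the wait ends on the first matching poll, so a
slow build (for example under ASAN) only uses more of the timeout rather than
failing the test.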


PS7, Line 83: time.sleep(2)
This delay is a bit harder to eliminate. How about we increase
--stress_metadata_loading_pause_injection_ms to something really large, say
1000 seconds? The exact value doesn't matter, since we don't actually need the
queries to finish planning to end the test, right?

And then we can use a larger timeout here, but we don't need to delay for it. 
We can just do:

inflight_query_ids = impalad.service.get_in_flight_queries(30)

which will poll the webui once per second and give up after 30 seconds.


-- 
To view, visit http://gerrit.cloudera.org:8080/6707
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ie44daa93e3ae4d04d091261f3ec4891caffe8026
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-HasComments: Yes
