[ https://issues.apache.org/jira/browse/IMPALA-6662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on IMPALA-6662 stopped by Tim Armstrong.
---------------------------------------------

> Make stress test resilient to hangs due to client crashes
> ---------------------------------------------------------
>
>                 Key: IMPALA-6662
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6662
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Infrastructure
>            Reporter: Sailesh Mukil
>            Assignee: Sailesh Mukil
>            Priority: Critical
>
> The concurrent_select.py process starts multiple subprocesses (called query
> runners) to run the queries. It also starts 2 threads called the query
> producer thread and the query consumer thread. The query producer thread adds
> queries to a query queue, and the query consumer thread pulls queries off the
> queue and feeds them to the query runners.
> The query runner, once it gets queries, does the following:
> {code:java}
> (pseudo code. Real code here:
> https://github.com/apache/impala/blob/d49f629c447ea59ad73ceeb0547fde4d41c651d1/tests/stress/concurrent_select.py#L583-L595)
> with _submit_query_lock:
>     increment(num_queries_started)
> run_query()    # One runner crashes here.
> increment(num_queries_finished)
> {code}
> One of the runners crashes inside run_query(), thereby never incrementing
> num_queries_finished.
> Another thread that's supposed to check for memory leaks (but actually
> doesn't) periodically acquires '_submit_query_lock' and waits for the number
> of running queries to reach 0 before releasing the lock:
> https://github.com/apache/impala/blob/d49f629c447ea59ad73ceeb0547fde4d41c651d1/tests/stress/concurrent_select.py#L449-L511
> However, in the above case, the number of running queries will never reach 0,
> because one of the query runners exited without incrementing
> 'num_queries_finished'. Therefore, the poll_mem_usage() function will hold the
> lock indefinitely, so no new queries are submitted and the stress test never
> completes.
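
One possible way to avoid the hang described above is sketched below. This is a minimal illustration, not the code in concurrent_select.py and not necessarily the fix that was adopted. The names QueryCounters, run_one_query and wait_until_idle are hypothetical, and threads stand in for the real subprocess-based runners. The sketch combines two ideas: the finished counter is updated in a finally block, so a Python exception raised inside run_query() still gets counted, and the waiter gives up after a deadline instead of holding up the test forever.

```python
import threading
import time


class QueryCounters:
    """Hypothetical stand-in for the shared started/finished counters."""

    def __init__(self):
        self.lock = threading.Lock()
        self.started = 0
        self.finished = 0

    def running(self):
        # Number of queries that have started but not yet finished.
        with self.lock:
            return self.started - self.finished


def run_one_query(counters, run_query):
    """Run a query and count it as finished even if run_query() raises."""
    with counters.lock:
        counters.started += 1
    try:
        return run_query()
    finally:
        # Runs on normal return and on exceptions, but NOT if the whole
        # process is killed; the deadline below covers that case.
        with counters.lock:
            counters.finished += 1


def wait_until_idle(counters, timeout_s, poll_interval_s=0.01):
    """Return True once no queries are running, False if the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while counters.running() > 0:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
    return True
```

A caller in the position of poll_mem_usage() could treat a False return as a sign that a runner died, then release the lock and report the failure rather than wait indefinitely. The finally block alone is not enough when a runner process dies outright, which is why the deadline is included.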