[ https://issues.apache.org/jira/browse/LENS-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rajat Khandelwal updated LENS-1169:
-----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

> Stopping Query Service is incorrect
> -----------------------------------
>
>                 Key: LENS-1169
>                 URL: https://issues.apache.org/jira/browse/LENS-1169
>             Project: Apache Lens
>          Issue Type: Bug
>            Reporter: Rajat Khandelwal
>            Assignee: Rajat Khandelwal
>             Fix For: 2.6
>
>         Attachments: LENS-1169.01.patch, LENS-1169.02.patch
>
>
> Stopping the lens server basically stops all services. For the query service,
> the current flow is this:
> * Prepare stopping:
> ** Interrupt all threads (query submitter, purger, status poller, etc.)
> * Persist state
> * Stop
> ** Join all threads (as mentioned above)
> Each of the threads is basically running in a large loop like the following:
> {noformat}
> while (!stopped && !this.isInterrupted()) {
>   try {
>     // one iteration of work, which may wait or sleep
>   } catch (InterruptedException e) {
>     return;
>   }
> }
> {noformat}
> Now, interrupting a thread will cause an InterruptedException in the thread
> only when the thread is waiting/sleeping; otherwise only its interrupt flag
> is set.
> So, the thread can exit in two ways:
> * By receiving the InterruptedException
> * If the exception isn't received, it'll complete the current loop iteration
> and then exit.
> So there can be a scenario like the following (I faced such a scenario while
> working on LENS-904):
> * Stop is called from outside
> * Prepare stopping. Let's say QuerySubmitter didn't receive the interrupt and
> will exit after completing its current iteration.
> * Persist:
> ** Persist part 1: Persisting driver states, e.g. HiveDriver keeps a map of
> query handle to hive operation handle.
> * QuerySubmitter submits the query to hive, changes the state of the query to
> LAUNCHED and exits.
> * Persist:
> ** Persist part 2: Persisting queries. This persists the query mentioned in
> the above point as LAUNCHED.
> Now, on start, the states will be read back, the query's state will be LAUNCHED,
> and HiveDriver won't have the operation handle corresponding to this query.
> This will cause the query to fail in the next status update.
> Proposed Solution:
> In the prepareStopping phase, interrupt and join the threads before
> persisting.
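
The proposed ordering (interrupt, join, then persist) could look roughly like the sketch below. It is only an illustration, not the code from the attached patches: the class QuiescingService, its counter, and all method names are hypothetical stand-ins for the query service, its worker threads, and the state that gets persisted. The point it tries to show is that once every worker has been joined inside prepareStopping, nothing can change state between the persist steps or after them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a service with background threads; not Lens code. */
final class QuiescingService {
    private volatile boolean stopped = false;
    // State mutated by the workers, standing in for "queries moved to LAUNCHED".
    private final AtomicInteger launched = new AtomicInteger();
    private final List<Thread> workers = new ArrayList<>();
    private int persistedLaunched = -1; // snapshot written by persist()

    void start(int workerCount) {
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(() -> {
                // Same loop shape as in the issue description.
                while (!stopped && !Thread.currentThread().isInterrupted()) {
                    try {
                        launched.incrementAndGet(); // one unit of work
                        Thread.sleep(5);            // interruptible wait
                    } catch (InterruptedException e) {
                        return; // interrupt arrived during (or just before) the sleep
                    }
                }
            }, "worker-" + i);
            workers.add(t);
            t.start();
        }
    }

    /** Interrupt AND join, so no worker can still mutate state afterwards. */
    void prepareStopping() throws InterruptedException {
        stopped = true;
        for (Thread t : workers) {
            t.interrupt();
        }
        for (Thread t : workers) {
            t.join(); // wait for the current iteration to finish
        }
    }

    void persist() {
        persistedLaunched = launched.get();
    }

    boolean persistedStateIsCurrent() {
        return persistedLaunched == launched.get();
    }

    /** Runs the stop sequence once; returns true if nothing changed after persist. */
    public static boolean demo() {
        QuiescingService service = new QuiescingService();
        service.start(3);
        try {
            Thread.sleep(50);          // let the workers make some progress
            service.prepareStopping(); // quiesce first
            service.persist();         // then snapshot
            Thread.sleep(20);          // a live worker would advance the counter here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return service.persistedStateIsCurrent();
    }
}
```

Calling QuiescingService.demo() should return true with this ordering. If the join were deferred until after persist(), as in the original flow, a worker that missed the interrupt could still finish an iteration after the snapshot, which is the inconsistency described above.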