>>> I prefer re-spawning the thread specifically because of the embedded 
>>> situation.   
Actually, I'd like to know how you are embedding the ZK server. Does the 
application hold a reference to the ZooKeeper server? If yes, it can check 
ZooKeeperServer#isRunning().
If not, it would be good to reach a common understanding or agreement on 
the embedded server approach. I see ZOOKEEPER-1072 is open for discussion.
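
For illustration, here is a minimal sketch of how an embedding application 
might poll such a check. EmbeddedServerMonitor and its parameters are 
hypothetical names, not part of ZooKeeper; the liveness check is passed in as 
a BooleanSupplier, so the sketch assumes the application holds a server 
reference and supplies something like () -> zkServer.isRunning().

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a liveness check and fires a callback once when it fails. */
final class EmbeddedServerMonitor implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "embedded-zk-monitor");
                t.setDaemon(true); // never keep the host JVM alive
                return t;
            });

    /**
     * @param isRunning    liveness check (assumption: wraps ZooKeeperServer#isRunning())
     * @param onStopped    application-defined reaction, run at most once
     * @param periodMillis polling interval in milliseconds
     */
    EmbeddedServerMonitor(BooleanSupplier isRunning, Runnable onStopped, long periodMillis) {
        scheduler.scheduleAtFixedRate(() -> {
            if (!isRunning.getAsBoolean()) {
                try {
                    onStopped.run();
                } finally {
                    scheduler.shutdown(); // report once, then stop polling
                }
            }
        }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The callback keeps the decision (restart, alert, or stop the host) in the 
application rather than in the server.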


>>> I mean ideally if we know how to resolve the issue we should just resolve 
>>> the issue instead of relying on an external system like monitoring. 
This is again a debatable topic, but in my opinion restarting the server is 
a simpler idea than pinpointing the actual cause and finding a remedy. The 
latter would mostly involve handling too many small conditions and would 
introduce complexity. What do you say?


-Rakesh

-----Original Message-----
From: Asta, Greg [mailto:greg.a...@omnigon.com] 
Sent: 01 April 2014 21:31
To: u...@zookeeper.apache.org; mi...@cs.stanford.edu; dev@zookeeper.apache.org
Subject: RE: Thread handling

I prefer re-spawning the thread specifically because of the embedded situation. 
  I mean ideally if we know how to resolve the issue we should just resolve the 
issue instead of relying on an external system like monitoring.
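
A minimal sketch of the respawn approach described earlier in the thread 
(keep a reference to the thread, check it periodically, respawn it if it has 
died). RespawningSupervisor is a hypothetical name, and the sketch assumes 
the worker can safely be recreated from scratch, which may not hold for 
threads that carry state.

```java
import java.util.function.Supplier;

/** Hypothetical supervisor: keeps a reference to a worker thread and respawns it if it died. */
final class RespawningSupervisor {
    private final Supplier<Thread> factory; // builds a fresh, unstarted worker
    private Thread worker;
    private int respawns;

    RespawningSupervisor(Supplier<Thread> factory) {
        this.factory = factory;
        this.worker = factory.get();
        this.worker.start();
    }

    /** Call periodically, e.g. from a scheduled task. Returns true if a respawn happened. */
    synchronized boolean checkAndRespawn() {
        if (worker.isAlive()) {
            return false;
        }
        // A Thread object cannot be started twice, so build a new one.
        worker = factory.get();
        worker.start();
        respawns++;
        return true;
    }

    synchronized int respawnCount() {
        return respawns;
    }

    synchronized Thread current() {
        return worker;
    }
}
```

Note that this only detects a dead thread; it does not detect one that is 
alive but stuck.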

-Greg

-----Original Message-----
From: mutsuz...@gmail.com [mailto:mutsuz...@gmail.com] On Behalf Of Michi 
Mutsuzaki
Sent: Tuesday, April 01, 2014 5:22 AM
To: dev@zookeeper.apache.org
Cc: u...@zookeeper.apache.org
Subject: Re: Thread handling

+1 for shutting down on a critical thread death.

Does 'shutdown' mean calling System.exit or throwing some kind of exception? 
Some applications use ZooKeeper embedded in their JVM, and they might not like 
ZooKeeper calling System.exit.
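
One way to avoid hard-coding either choice is to report the failure to a 
pluggable listener. The sketch below is only an illustration: 
CriticalThreadListener and CriticalThread are hypothetical names and are not 
taken from the ZK-1907 patch. It also assumes that run() lets unexpected 
exceptions propagate instead of catching and logging them.

```java
/** Hypothetical callback invoked when a critical thread dies unexpectedly. */
interface CriticalThreadListener {
    void onCriticalThreadDeath(String threadName, Throwable cause);
}

/** Hypothetical base class: reports any uncaught failure to a listener instead of dying silently. */
class CriticalThread extends Thread {
    CriticalThread(Runnable body, String name, CriticalThreadListener listener) {
        super(body, name);
        // Fires for any Throwable that escapes run(), including RuntimeException and Error.
        setUncaughtExceptionHandler(
                (thread, cause) -> listener.onCriticalThreadDeath(thread.getName(), cause));
    }
}
```

A standalone server could pass a listener that calls System.exit, while an 
embedded deployment could pass one that stops only the ZooKeeper server and 
notifies the host application.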

--Michi

On Mon, Mar 31, 2014 at 9:03 PM, Rakesh R <rake...@huawei.com> wrote:
>>>> This is how I handle the critical threads in my client apps that use 
>>>> Zookeeper.
>>>> Keep a reference to the thread and periodically make sure it's still alive 
>>>> and well - respawn it if it is not.
>
> Thanks Greg for the inputs. Please see ZK-1907; I've included an initial 
> proposal patch to kick-start the discussion.
> Another approach is to simply shut down if a critical thread dies, so the 
> monitoring tool can easily detect it and take the necessary action. The 
> proposed patch is based on this approach.
>
> -Rakesh
>
> -----Original Message-----
> From: Asta, Greg [mailto:greg.a...@omnigon.com]
> Sent: 31 March 2014 23:24
> To: u...@zookeeper.apache.org; dev@zookeeper.apache.org
> Subject: RE: Thread handling
>
> "If we have a 'DeathWatcher' or some other mechanism in place to monitor all 
> the critical threads, it can take a decision such as bringing down the 
> process if required, or shutting down the quorumpeer and going for LE again.
> The monitoring or management tool will then know about the situation and 
> can act upon it.
>
> Appreciate any thoughts?"
>
> This is how I handle the critical threads in my client apps that use 
> Zookeeper.  Keep a reference to the thread and periodically make sure it's 
> still alive and well - respawn it if it is not.
>
> Thanks,
> Greg
>
>
> -----Original Message-----
> From: Rakesh R [mailto:rake...@huawei.com]
> Sent: Thursday, March 27, 2014 10:39 AM
> To: dev@zookeeper.apache.org; u...@zookeeper.apache.org
> Subject: Thread handling
>
> Hi All,
>
> The server has many critical threads running and coordinating with each 
> other, such as the RequestProcessor chains. Going through each thread, most 
> of them have a similar structure:
>
> public void run() {
>         try {
>             while (running) {
>                 // processing logic
>             }
>         } catch (InterruptedException e) {
>             LOG.error("Unexpected interruption", e);
>         } catch (RequestProcessorException e) {
>             LOG.error("Unexpected exception", e);
>         } catch (Exception e) {
>             LOG.error("Unexpected exception", e);
>         }
>         LOG.info("...exited loop!");
> }
>
> I feel we could improve the thread handling in our system. From the design, 
> I can see that a thread may exit silently on any exception (abnormal or 
> functional). If this happens in production, the server would hang forever 
> and would not be able to fulfil its role.
>
> If we have a 'DeathWatcher' or some other mechanism in place to monitor all 
> the critical threads, it can take a decision such as bringing down the 
> process if required, or shutting down the quorumpeer and going for LE again.
> The monitoring or management tool will then know about the situation and 
> can act upon it.
>
> Appreciate any thoughts?
>
> Thanks in advance,
> Rakesh R
