[ 
https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samir Ahmic updated HBASE-7386:
-------------------------------
    Attachment: HBASE-7386-master-00.patch

Here is first fully operational patch for running hbase processes under python 
supervisor control. Patch creates new bin/supervisord and conf/supervisord 
directories first dir contains supporting scripts for managing cluster 
(start/stop/restart/check) and second one supervisor config files. 
In order to test this you will need to instal python supervisor (3.3.2) usually 
with "pip install supervisor==3.3.2". No additional steps are required, 
configure your hbase as usual and go to /bin/supervisor and run 
./start-supervisord-hbase.sh. 
There is also python script zk_cleaner.py which acts as process event listener 
in charge to remove mater/rs znode when process is in stoping or exit state.
This is first version of patch and whole idea will need more testing and code 
polishing, all comments and suggestions are welcome.

> Investigate providing some supervisor support for znode deletion
> ----------------------------------------------------------------
>
>                 Key: HBASE-7386
>                 URL: https://issues.apache.org/jira/browse/HBASE-7386
>             Project: HBase
>          Issue Type: Task
>          Components: master, regionserver, scripts
>            Reporter: Gregory Chanan
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 3.0.0
>
>         Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, 
> HBASE-7386-bin-v3.patch, HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, 
> HBASE-7386-conf-v3.patch, HBASE-7386-master-00.patch, HBASE-7386-src.patch, 
> HBASE-7386-v0.patch, supervisordconfigs-v0.patch
>
>
> There a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the 
> underlying process so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs 
> around and waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the 
> below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running in via something like supervisor.d can solve these issues 
> if we provide the right support.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to