[ 
https://issues.apache.org/jira/browse/TRAFODION-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15408186#comment-15408186
 ] 

ASF GitHub Bot commented on TRAFODION-2142:
-------------------------------------------

GitHub user DaveBirdsall opened a pull request:

    https://github.com/apache/incubator-trafodion/pull/640

    [TRAFODION-2142] Add script to restart HBase for developer regressions

    This check-in adds a script, keepHBaseUp.py, that can be run alongside 
developer regressions to keep them from hanging when the local_hadoop HMaster 
goes away (as it is wont to do on busy workstations). The script periodically 
checks whether the HMaster is up and, if it isn't, attempts to start it. It 
retries the start at geometrically longer intervals and gives up once more 
than an hour has passed without success.
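
    The check-and-restart behavior described above can be sketched roughly as 
follows. This is not the contents of keepHBaseUp.py; it is a minimal 
illustration under assumptions. The function names, the 30-second first wait, 
the doubling factor, the 60-second check interval, and the use of `pgrep` to 
detect the HMaster are all hypothetical, and "over an hour" is read here as 
cumulative wait time.

```python
import subprocess
import time


def hmaster_is_up():
    """Return True if some process command line mentions HMaster.

    Assumption: pgrep is available; the real script may check differently.
    """
    result = subprocess.run(["pgrep", "-f", "HMaster"],
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0


def retry_waits(first=30.0, factor=2.0, give_up_after=3600.0):
    """Yield geometrically growing waits until their sum exceeds give_up_after."""
    total, wait = 0.0, first
    while total <= give_up_after:
        yield wait
        total += wait
        wait *= factor


def restart_with_backoff(is_up, start, sleep=time.sleep, waits=None):
    """Call start(), wait, and re-check; return True once is_up() succeeds."""
    for wait in (waits if waits is not None else retry_waits()):
        start()
        sleep(wait)
        if is_up():
            return True
    return False  # more than give_up_after seconds of waiting, no success


def keep_up(is_up, start, check_every=60.0, sleep=time.sleep):
    """Poll is_up() forever; return False only when a restart is abandoned."""
    while True:
        if not is_up():
            if not restart_with_backoff(is_up, start, sleep=sleep):
                return False
        sleep(check_every)
```

    With these defaults the waits are 30, 60, 120, 240, 480, 960 and 1920 
seconds, which add up to a little over an hour. The `start` argument would 
wrap whatever command starts HBase in the local_hadoop setup; this message 
does not name one, so it is left as a caller-supplied function.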
    
    One way to use this script is to start it running in one shell window, then 
start the developer regressions in another.
    
    The script logs the times when it does its checks, so the developer can 
correlate HBase down scenarios with regression test failures.
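
    As an illustration of the kind of timestamped check log that makes this 
correlation possible, here is a small sketch. The function name and the log 
line format are assumptions for illustration only; the actual output format 
of keepHBaseUp.py is not shown in this message.

```python
from datetime import datetime


def log_check(is_up, now=None, out=print):
    """Emit one timestamped line per check and return it.

    Hypothetical format: 'YYYY-MM-DD HH:MM:SS HMaster is up|DOWN'.
    """
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    line = "{} HMaster is {}".format(stamp, "up" if is_up else "DOWN")
    out(line)
    return line
```

    A developer could then compare the timestamps of the DOWN lines against 
the times at which regression tests failed or hung.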

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/DaveBirdsall/incubator-trafodion Trafodion2142

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-trafodion/pull/640.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #640
    
----
commit 498dd2f1b1e8588e4bf56556a951164a4154ddec
Author: Dave Birdsall <dbirds...@apache.org>
Date:   2016-08-04T17:31:06Z

    [TRAFODION-2142] Add script to restart HBase for developer regressions

----


> Test script to restart HBase automatically in local_hadoop test settings
> ------------------------------------------------------------------------
>
>                 Key: TRAFODION-2142
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2142
>             Project: Apache Trafodion
>          Issue Type: Improvement
>          Components: foundation
>    Affects Versions: any
>            Reporter: David Wayne Birdsall
>            Assignee: David Wayne Birdsall
>            Priority: Minor
>             Fix For: 2.1-incubating
>
>
> In development environments, developers often use local_hadoop for unit and 
> developer regression testing. These test environments are often on 
> workstations shared among many developers. When regressions run overnight, 
> the HMaster process quite frequently dies due to timeouts if the workstation 
> is particularly busy. This sometimes causes HBase errors during the tests but 
> more often causes hangs. It would be nice to have a tool that monitors 
> HMaster and, if it goes away, tries to restart it. It has been observed that 
> restarting it often resolves the hangs, allowing the regression run to 
> continue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
