I would like to draw your attention to JIRA TRAFODION-2001
<https://issues.apache.org/jira/browse/TRAFODION-2001>, which specifies
changes in configuration and operational components to support elasticity in
Trafodion. My intent is to generate discussion, obtain feedback, correct
mistakes, add missing items, and reach consensus on when to integrate
these changes into the mainline code. Inherent in this capability is the
likelihood that other aspects of managing a Trafodion instance will require
changes and possibly enhancements. At a minimum, these enhancements change
the way key process components are currently configured and managed, and the
old way goes away. If you are an active contributor to Trafodion, you will
want to know the details of this JIRA.

 

I am adding the contents of this email as an initial comment in the
TRAFODION-2001 JIRA and request that all feedback be posted as comments in
the JIRA. Thank you in advance.

 

A little background: most of the implementation was done in the spring of
2015 and donated to the Apache Software Foundation at the end of September
2015. I am in the process of merging these changes into the current Trafodion
baseline in my private fork.

 

This is where I need your active participation. To help with that, here is
a brief summary:

 

First, please review the document attached to JIRA TRAFODION-2001
<https://issues.apache.org/jira/browse/TRAFODION-2001>, as you will
need its context for what follows here.

 

Current state:

Trafodion Foundation components:

'monitor/shell':

*         'persist config/exec/info' commands are implemented

o   A 'persist kill' command is not currently specified. I believe this is
an unintended omission and the command needs to be added. The story is
incomplete without it: persistent processes, whose number grows and shrinks
with node membership, cannot otherwise be stopped with one simple command.

o   Some important items to consider with a 'persist kill' command:

*  It will return an error when used with DTM persistent processes (the
transaction manager process should not be stopped in a haphazard way)

*         Are there other persistent processes that should also be protected
in this manner?

*  Should it return an error with TSID persistent processes?

o   The implementation of the 'persist kill' command corrects a problem with
the code generated in the 'sscpstop' and 'ssmpstop' scripts.

*  The code currently generated does not take into account new processes
created when nodes are added.

*         'node config' command is implemented

*         'node add/delete' commands - TODO - in progress
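
To make the proposed 'persist kill' behavior concrete, here is a minimal
Python sketch. It is an illustration only and not the monitor/shell
implementation: the function name, the PROTECTED_TYPES set, and the
'TYPE@node' target format are all hypothetical. It assumes DTM is the only
protected type, pending the discussion above, and it computes the targets
from the node membership at the time of the call, which is the property the
generated stop scripts currently lack.

```python
# Hypothetical sketch: none of these names come from the Trafodion code base.
# Assumption: only DTM is protected; TSID could be added after discussion.
PROTECTED_TYPES = frozenset({"DTM"})


def persist_kill(process_type, nodes, protected=PROTECTED_TYPES):
    """Return the kill targets for one persistent process type.

    Raises PermissionError for protected types instead of stopping them.
    """
    if process_type in protected:
        raise PermissionError(
            f"persist kill is not allowed for {process_type} processes"
        )
    # Targets are derived from the node list passed in at call time, so
    # processes on nodes added after instance startup are included.
    return [f"{process_type}@{node}" for node in nodes]
```

For example, calling persist_kill("SSCP", ["n1", "n2", "n3"]) after a third
node has been added yields a target on every current node, while
persist_kill("DTM", ["n1"]) raises an error.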

 

'scripts' changes implemented

*         Compilation of the Trafodion configuration file, 'sqconfig', with
the new 'persist' section is implemented ('sqgen' et al. scripts)

o   The generation of 'gomon.cold' is greatly simplified as are the
'<xxx>start' scripts

*         Creation and display of the configuration database is implemented

 

Location of merged changes:

git remote add zcorrea_fork [email protected]:zcorrea/incubator-trafodion

Branch: zcorrea_fork/TRAFODION-2001

 

Impact to other components:

Hadoop/Trafodion Installation

*         The ability to add and remove servers in an existing cluster
implies the provisioning and removal of operational resources of those
servers.

o   Trafodion depends on Hadoop and there is an implied order of
provisioning and operational readiness when adding servers to a cluster.

o   This order will be the reverse when removing servers from a cluster.
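
As a rough illustration of the ordering constraint, the following Python
sketch builds the step list for adding and removing a server. The layer
names and the plan_steps helper are assumptions made for this example; the
actual provisioning steps are not defined here.

```python
# Hypothetical layers, listed in the order they must become ready on add.
PROVISION_ORDER = ["hadoop", "trafodion"]


def plan_steps(action, server, order=PROVISION_ORDER):
    """Return the ordered steps for adding or removing one server."""
    if action == "add":
        return [f"provision {layer} on {server}" for layer in order]
    if action == "remove":
        # Removal walks the same layers in the reverse order.
        return [f"remove {layer} from {server}" for layer in reversed(order)]
    raise ValueError(f"unknown action: {action}")
```

With these assumptions, adding a server provisions Hadoop before Trafodion,
and removing it takes Trafodion down before Hadoop.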

Trafodion components

*         Existing functionality in Trafodion assumes that once an instance
is started, its static configuration does not change. Nodes may go down,
i.e., fail, but the number of configured nodes remains static. This will no
longer be true, as node membership will expand and contract over the lifetime
of an instance after initial instance startup.

 

I look forward to your feedback,

Zalo
Gonzalo Correa



 
