[ 
https://issues.apache.org/jira/browse/TRAFODION-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15278491#comment-15278491
 ] 

Gonzalo E Correa commented on TRAFODION-1980:
---------------------------------------------

The 'shell' invokes the 'sqnodestatus' script, which uses NODE_LIST as the 
working list of nodes in the cluster and returns the name of each node along 
with its state, as follows:

node-1 [ UP ]
node-2 [ UP ]
node-3 [ DOWN ]
node-4 [ UP ]

The above results are produced when NODE_LIST contains the following: 
NODE_LIST="node-1 node-2 node-3 node-4"
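
To make the output format above concrete, here is a minimal Python sketch that 
parses such lines into a name-to-state mapping. It is not the monitor's actual 
code: 'parse_node_status' is a hypothetical helper, and the pattern assumes the 
"name [ STATE ]" layout shown in the sample output.

```python
import re

# Assumed layout, taken from the sample output: "<name> [ UP ]" or "<name> [ DOWN ]".
STATUS_LINE = re.compile(r"^(\S+)\s+\[\s*(UP|DOWN)\s*\]$")


def parse_node_status(output):
    """Return {node_name: "UP" | "DOWN"} for each line matching the assumed layout."""
    states = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            states[match.group(1)] = match.group(2)
    return states


sample = """node-1 [ UP ]
node-2 [ UP ]
node-3 [ DOWN ]
node-4 [ UP ]"""

print(parse_node_status(sample))
```

Lines that do not match the assumed layout are skipped rather than treated as 
errors; a real consumer might prefer to report them.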

If the sqconfig node section were to contain an additional node, e.g., 
"node-5", and NODE_LIST only contained the above four nodes, the internal logic 
that processes the physical node state would process only the first four nodes 
and terminate early. Since the internal list is based on the contents of 
sqconfig, i.e., 'cluster.conf', the 'node-5' entry would keep the state it was 
initialized with. Prior to the fix, the initial state was 'StateUp', which did 
not reflect the state of a non-existent node; that state should be 'StateDown'.
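
The effect of the initial state can be illustrated with a small Python sketch. 
This is a simplified model under stated assumptions, not the monitor's 
implementation: 'resolve_states' is a hypothetical function, the configured 
node names stand in for the sqconfig contents, and 'reported' stands in for 
what 'sqnodestatus' returns for the nodes in NODE_LIST.

```python
def resolve_states(config_nodes, reported, initial_state):
    """Give every configured node initial_state, then apply the reported states."""
    states = {name: initial_state for name in config_nodes}
    for name, state in reported.items():
        if name in states:
            states[name] = "StateUp" if state == "UP" else "StateDown"
    return states


# Hypothetical data: five nodes configured, only four reported on.
config_nodes = ["node-1", "node-2", "node-3", "node-4", "node-5"]
reported = {"node-1": "UP", "node-2": "UP", "node-3": "DOWN", "node-4": "UP"}

before_fix = resolve_states(config_nodes, reported, "StateUp")
after_fix = resolve_states(config_nodes, reported, "StateDown")

print("before fix:", before_fix["node-5"])
print("after fix: ", after_fix["node-5"])
```

In this model "node-5" is never reported on, so it keeps whatever state it 
started with: 'StateUp' before the fix and 'StateDown' after it.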

I could have fixed this in a different manner, but it made more sense to assume 
that all nodes in the cluster are down unless the 'sqnodestatus' script 
returns an UP state for them. However, there is still a problem: "node-5" could 
exist while NODE_LIST does not contain its name. That gives the opposite 
error: a node that exists and is UP would be considered DOWN because it is not 
in the NODE_LIST environment variable.

So for everything to work, the NODE_LIST environment variable would have to be 
updated, as well as any other environment variables that contain information 
about the member nodes of the cluster.
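
One way to catch this kind of mismatch early is to compare the two lists 
directly. The Python sketch below is only an illustration: 'node_list_drift' 
is a hypothetical helper, the configured node names are assumed to have been 
read from sqconfig already, and the NODE_LIST value is passed in as a string 
(in practice it could come from os.environ.get("NODE_LIST", "")).

```python
def node_list_drift(config_nodes, node_list_value):
    """Return (configured but not listed, listed but not configured), both sorted."""
    listed = set(node_list_value.split())
    configured = set(config_nodes)
    return sorted(configured - listed), sorted(listed - configured)


# Hypothetical data matching the scenario described above.
config_nodes = ["node-1", "node-2", "node-3", "node-4", "node-5"]
node_list_value = "node-1 node-2 node-3 node-4"

missing, unknown = node_list_drift(config_nodes, node_list_value)
print("configured but missing from NODE_LIST:", missing)
print("in NODE_LIST but not configured:", unknown)
```

A non-empty first list corresponds to the "node-5" case: the node is 
configured, but nothing will ever report it as UP.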

This is a perfect example of how an environment-variable-based design is 
difficult to maintain in an elastic environment. You have to make coordinated 
changes in too many places, which is a maintenance nightmare.

Good question!



> Node up command (sqshell), when targeting a non-existing node, hangs.
> ---------------------------------------------------------------------
>
>                 Key: TRAFODION-1980
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-1980
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: foundation
>    Affects Versions: 2.1-incubating
>            Reporter: Gonzalo E Correa
>            Assignee: Gonzalo E Correa
>             Fix For: 2.1-incubating
>
>
> When testing JIRA TRAFODION-1885 and targeting a node that does not exist and 
> is not set in the TRAF_EXCLUDE_LIST environment variable, the 'up' command 
> attempts to process the request and hangs. It should return an error 
> indicating that the node is not available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
