Hi,

On Tue, Mar 30, 2010 at 11:54:07PM +0200, Helmut Weymann wrote:
> Hi Dejan,
> 
> this mail consists of two parts. The first part are the answers for your 
> questions. The second part is another proposal to redesign this plugin. That 
> proposal is based on your changes and some brainwork.
> 
> Here the answers to your questions and some simple comment:
> 
> During the review I detected lines 215 to 219, which are superfluous 
> comments. 
> They are still referencing the no longer needed device-name.

OK.

> Am Dienstag, 30. März 2010 schrieb Dejan Muhamedagic:
> > Hi,
> >
> > Sorry for the delay, finally found some time to take a look again
> > at your plugin.
> >
> > - in get_cookie_from_device and get_data_from_device there is:
> >
> >     if grep -qs "Invalid User name or Password" $MY_PATH/$MY_TEMPFILE
> >     if grep -qs "Cookie Time Out" $MY_PATH/$PORT_STATUS
> >
> >   Can we turn the logic around and look for a string one would
> >   expect to be found? There could probably be more different
> >   errors.
> 
> If the device ip-address is OK you will either get the expected page, invalid 
> login or cookie timeout. 
> 
> Whenever the script is called by stonithd the first call is for case branch 
> status).
> 
> within the case branch status) the procedure get_http_status is called.
> This call is the final validation of ip-address, username, password and 
> responsiveness of the device. And the response does always match the defined 
> pattern.

Right. There is a slim chance that if something goes wrong
beforehand the plugin would be able to get the list of ports.

> > - in get_http_status
> >
> >    
> > pattern="P60=[01],P61=[01],P62=[01],P63=[01],P64=[01],P65=[01],P66=[01],P67
> >= [01]"
> >
> >   Aren't there also 4-port devices? Would they return the same?
> 
> I have the 4-port version and the procedure works. This adheres to the device 
> documentation: The response does always match the defined pattern.

OK.

> > - on hostlist: How about _always_ checking if hostnames defined
> >   in the device match the provided hostlist. That way, there is
> >   no need to have holes in the list. Since the names seem to be
> >   limited to 16 chars, we should also cut the domain name part
> >   from the node name. I actually implemented that already, the
> >   new script attached.
> 
> Host names within the cluster (node-names) are always(?) without the domain. 

Not always, though that practice is discouraged, sometimes people
use FQDN names.

> Yes, we are interchanging the meaning of hostname and nodename. The domain-
> name of cluster-nodes is actually needed nowhere in the CIB. The only 
> reference to a domain could be in the stonith-device-ip-address (ip-address 
> or 
> hostname in the sense of DNS).

In this case the name is just a string which has to match what is
defined in the device.

> Your debug message  
> $LOG_DEBUG "Got new hostlist ($hostlist) from $deviceip" 
> inside build_device_hostlist() should probably be 
> $LOG_DEBUG "Got new hostlist ($device_hostlist) from $deviceip"
> or
> $LOG_DEBUG "Got new port-map ($device_hostlist) from $deviceip"

Yes.

> Proposal for adjusted procedure flow.
> 
> Reason for the proposed re-design is based on these thoughts and definitions:
> 
> The pacemaker CRM database needs to know the cluster-nodes which are 
> controlled by the device. That information shall be provided via the 
> parameter 
> "hostlist". This hostlist is a list of cluster-node-names. In order to avoid 
> further mis-interpretations we might want to change the parameter name to 
> "nodelist". Its your decision.

hostlist is, for better or worse, used in all other stonith
plugins.

> In any case this list must not be empty. The 
> information is needed by pacemaker. It must be in the CIB.

It's not needed by pacemaker. It is needed by stonithd. But
that's what it finds out by invoking gethosts.

> The device command "IPSetPower" needs the port-number to control the power-
> outlet. The plugin must know the mapping of node-names to port-numbers. The 
> device allows to assign a 16 character name to each of the power-outlets. 
> These port-names - you used device_hostlist - should define the mapping of 
> cluster nodes to power outlets. For that purpose the device should have 
> exactly one port for each node in the hostlist/nodelist with a matching port-
> name. An alternative would be to add an additional parameter defining the 
> device-port-number for each node.
> 
> The plugin command status could conduct two tests: get_http_status() and 
> compare the hostlist/nodelist from the CIB with the port-names from the 
> device. Corresponding messages could report the result as status to pacemaker.

Right.

> From this introduction we have two questions left:
> 
> Shall we just use the device to get a mapping between ports and cluster-nodes 
> or do we want to allow an independent definition of that mapping via an 
> addditional parameter?

No, we get the mapping from the device since it's already
available there. Otherwise, the parameter list (the resource
configuration) is going to get ugly.

> Shall the status command of the plugin validate the hostlist/nodelist against 
> the device-port-names?

Yes. The status is actually part of the resource start operation.
It should verify that the defined hostlist is a subset of the
list we get from the device.

> With regard to the flow of the procedure these definitions make the following 
> difference:
> 
> We have three types of commands $1 for the plugin:
> 
> 1. Requests for Meta data:
> 
> getconfignames)
> getinfo-devid)
> getinfo-devname)
> getinfo-devdescr)
> getinfo-devurl)
> getinfo-xml)
> gethosts)
> 
> These commands dont need any interaction with the actual device.

gethosts does, because the hostlist is optional. Alternatively,
the hostlist may be required as well. I don't really have a
preference.

> 2. Call for status
> 
> status)
> 
> The device status is fully detected by the call to get_http_status().
> It is implemented by the documented web-command "IPGetPower".
> We may want to include the validation of node-names in this call.

Definitely.

> 3. Power control commands
> 
> on)
> off)
> reset)
> 
> These commands are implemented by the documented web-command "IPSetPower". 
> They require a valid mapping between cluster-nodes and device-ports.
> stonithd does always send two commands to the device. The first command 
> issued 
> is status which is followed by the actual required control function.
> 
> 
> Comparing these design principles with the code you did attach, I'm convinced 
> this is exactly what you wanted to achieve. And I think I know your answers 
> for the two questions: The device shall define the mapping and status) shall 
> also validate node-names against the port-names.

Yes.

> The difference between these principles and the current code is the handling 
> of the hostlist.
> 
> If the plugin does update the CIB with the hostlist it gets 
> from the device, your code does achieve this. But there is a risk in 
> reporting 
> that list as hostlist: What happens if the list includes a port-name which is 
> not a member of the cluster? Might pacemaker get confused?

No, but anyway the non-assigned ports shouldn't be reported.

> procedure name2port() should probably translate hostnames using 
> device_hostlist instead of hostlist.

Right.

I can see now that the code is not right in another place, as you
noticed above:

        hostlist="$device_hostlist"

That should be replaced by a filter which would weed out the
non-assigned ports.

Cheers,

Dejan

> Regards,
> 
> Helmut
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Reply via email to