Hi,
On Tue, Mar 30, 2010 at 11:54:07PM +0200, Helmut Weymann wrote:
> Hi Dejan,
>
> this mail consists of two parts. The first part are the answers for your
> questions. The second part is another proposal to redesign this plugin. That
> proposal is based on your changes and some brainwork.
>
> Here the answers to your questions and some simple comment:
>
> During the review I detected lines 215 to 219, which are superfluous
> comments.
> They are still referencing the no longer needed device-name.
OK.
> Am Dienstag, 30. März 2010 schrieb Dejan Muhamedagic:
> > Hi,
> >
> > Sorry for the delay, finally found some time to take a look again
> > at your plugin.
> >
> > - in get_cookie_from_device and get_data_from_device there is:
> >
> > if grep -qs "Invalid User name or Password" $MY_PATH/$MY_TEMPFILE
> > if grep -qs "Cookie Time Out" $MY_PATH/$PORT_STATUS
> >
> > Can we turn the logic around and look for a string one would
> > expect to be found? There could probably be more different
> > errors.
>
> If the device ip-address is OK you will either get the expected page, invalid
> login or cookie timeout.
>
> Whenever the script is called by stonithd the first call is for case branch
> status).
>
> within the case branch status) the procedure get_http_status is called.
> This call is the final validation of ip-address, username, password and
> responsiveness of the device. And the response does always match the defined
> pattern.
Right. There is a slim chance that if something goes wrong
beforehand the plugin would be able to get the list of ports.
> > - in get_http_status
> >
> >
> > pattern="P60=[01],P61=[01],P62=[01],P63=[01],P64=[01],P65=[01],P66=[01],P67
> >= [01]"
> >
> > Aren't there also 4-port devices? Would they return the same?
>
> I have the 4-port version and the procedure works. This adheres to the device
> documentation: The response does always match the defined pattern.
OK.
> > - on hostlist: How about _always_ checking if hostnames defined
> > in the device match the provided hostlist. That way, there is
> > no need to have holes in the list. Since the names seem to be
> > limited to 16 chars, we should also cut the domain name part
> > from the node name. I actually implemented that already, the
> > new script attached.
>
> Host names within the cluster (node-names) are always(?) without the domain.
Not always, though that practice is discouraged, sometimes people
use FQDN names.
> Yes, we are interchanging the meaning of hostname and nodename. The domain-
> name of cluster-nodes is actually needed nowhere in the CIB. The only
> reference to a domain could be in the stonith-device-ip-address (ip-address
> or
> hostname in the sense of DNS).
In this case the name is just a string which has to match what is
defined in the device.
> Your debug message
> $LOG_DEBUG "Got new hostlist ($hostlist) from $deviceip"
> inside build_device_hostlist() should probably be
> $LOG_DEBUG "Got new hostlist ($device_hostlist) from $deviceip"
> or
> $LOG_DEBUG "Got new port-map ($device_hostlist) from $deviceip"
Yes.
> Proposal for adjusted procedure flow.
>
> Reason for the proposed re-design is based on these thoughts and definitions:
>
> The pacemaker CRM database needs to know the cluster-nodes which are
> controlled by the device. That information shall be provided via the
> parameter
> "hostlist". This hostlist is a list of cluster-node-names. In order to avoid
> further mis-interpretations we might want to change the parameter name to
> "nodelist". Its your decision.
hostlist is, for better or worse, used in all other stonith
plugins.
> In any case this list must not be empty. The
> information is needed by pacemaker. It must be in the CIB.
It's not needed by pacemaker. It is needed by stonithd. But
that's what it finds out by invoking gethosts.
> The device command "IPSetPower" needs the port-number to control the power-
> outlet. The plugin must know the mapping of node-names to port-numbers. The
> device allows to assign a 16 character name to each of the power-outlets.
> These port-names - you used device_hostlist - should define the mapping of
> cluster nodes to power outlets. For that purpose the device should have
> exactly one port for each node in the hostlist/nodelist with a matching port-
> name. An alternative would be to add an additional parameter defining the
> device-port-number for each node.
>
> The plugin command status could conduct two tests: get_http_status() and
> compare the hostlist/nodelist from the CIB with the port-names from the
> device. Corresponding messages could report the result as status to pacemaker.
Right.
> From this introduction we have two questions left:
>
> Shall we just use the device to get a mapping between ports and cluster-nodes
> or do we want to allow an independent definition of that mapping via an
> addditional parameter?
No, we get the mapping from the device since it's already
available there. Otherwise, the parameter list (the resource
configuration) is going to get ugly.
> Shall the status command of the plugin validate the hostlist/nodelist against
> the device-port-names?
Yes. The status is actually part of the resource start operation.
It should verify that the defined hostlist is a subset of the
list we get from the device.
> With regard to the flow of the procedure these definitions make the following
> difference:
>
> We have three types of commands $1 for the plugin:
>
> 1. Requests for Meta data:
>
> getconfignames)
> getinfo-devid)
> getinfo-devname)
> getinfo-devdescr)
> getinfo-devurl)
> getinfo-xml)
> gethosts)
>
> These commands dont need any interaction with the actual device.
gethosts does, because the hostlist is optional. Alternatively,
the hostlist may be required as well. I don't really have a
preference.
> 2. Call for status
>
> status)
>
> The device status is fully detected by the call to get_http_status().
> It is implemented by the documented web-command "IPGetPower".
> We may want to include the validation of node-names in this call.
Definitely.
> 3. Power control commands
>
> on)
> off)
> reset)
>
> These commands are implemented by the documented web-command "IPSetPower".
> They require a valid mapping between cluster-nodes and device-ports.
> stonithd does always send two commands to the device. The first command
> issued
> is status which is followed by the actual required control function.
>
>
> Comparing these design principles with the code you did attach, I'm convinced
> this is exactly what you wanted to achieve. And I think I know your answers
> for the two questions: The device shall define the mapping and status) shall
> also validate node-names against the port-names.
Yes.
> The difference between these principles and the current code is the handling
> of the hostlist.
>
> If the plugin does update the CIB with the hostlist it gets
> from the device, your code does achieve this. But there is a risk in
> reporting
> that list as hostlist: What happens if the list includes a port-name which is
> not a member of the cluster? Might pacemaker get confused?
No, but anyway the non-assigned ports shouldn't be reported.
> procedure name2port() should probably translate hostnames using
> device_hostlist instead of hostlist.
Right.
I can see now that the code is not right in another place, as you
noticed above:
hostlist="$device_hostlist"
That should be replaced by a filter which would weed out the
non-assigned ports.
Cheers,
Dejan
> Regards,
>
> Helmut
_______________________________________________________
Linux-HA-Dev: [email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/