On 10/12/13 12:31, Brian J. Murrell wrote:
On Tue, 2013-12-10 at 10:27 +0000, Christine Caulfield wrote:

Sadly you're not wrong.

That's what I was afraid of.

But it's actually no worse than updating
corosync.conf manually,

I think it is...

in fact it's pretty much the same thing,

Not really.  Updating corosync.conf on any given node means only having
to write that file on that node.  There is no cluster-wide
synchronization needed and therefore no last-write-wins race, so all
nodes can do that in parallel.  Plus, adding a new node means only having
to update the corosync.conf on that new node (and starting up corosync,
of course) and corosync then does the job of telling its peers about
the new node, rather than the administrator having to go out and
touch every node to inform them of the new member.
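For illustration, a minimal sketch of the kind of multicast-mode
corosync.conf being described here; the addresses are placeholders and
a real deployment needs more than this, but it shows why only the new
node has to be touched:

    totem {
        version: 2
        # Multicast mode: peers are discovered automatically,
        # no node list is needed in the file.
        interface {
            ringnumber: 0
            bindnetaddr: 192.168.1.0   # placeholder network address
            mcastaddr: 239.255.1.1     # placeholder multicast address
            mcastport: 5405
        }
    }

With no node list in the file, starting corosync on the new node is
enough for the existing members to learn about it.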

It's this removal of node auto-discovery and changing it to an operator
task that is really complicating the workflow.  Granted, it's not so
much complicating it for a human operator who is naturally only
single-threaded and mostly incapable of inducing the last-write-wins
races.

But when you are writing tools that now have to take what used to be a
very capable multithreaded task, free of races, and shove it down a
single-threaded pipe/queue just to eliminate races, this is a huge step
backwards in evolution.

so nothing is actually getting worse.

It is though.  See above.

All the CIB information is still
properly replicated.

Yeah.  I understand/understood that.  Pacemaker's actual operations go
mostly unchanged.  It's the cluster membership process that's gotten
needlessly complicated and regressed in functionality.

The main difficulty is in safely replicating information that's needed
to boot the system.

Do you literally mean starting the system up?  I guess the use-case you
are describing here is booting nodes from a clustered filesystem?  But
what if you don't need that complication?  This process is being made
more complicated to satisfy only a subset of the use-cases.

In general use we've not found it to be a huge problem (though, I'm
still not keen on it either TBH) because most management is done by one
person from one node.

Indeed.  As I said above, WRT single-threaded operators.  But when
you are writing a management system on top of all of this, which
naturally wants to be multi-threaded (because scalable systems avoid
bottlenecking through single choke points) and was able to be
multi-threaded when it was just corosync.conf, having to choke
everything back down into a single thread just sucks.

There is not really any concept of nodes trying to
"add themselves" to a cluster; it needs to be done by a person - which
may be what you're unhappy with.

Yes, not so much "add themselves" as being allowed to be added, in
parallel, without fear of racing.

This ccs tool wouldn't be so bad if it operated more like the CIB, where
modifications were replicated automatically and properly locked, so that
changes could be made anywhere on the cluster and all members got them
automatically, rather than pushing the work of locking, replication and
serialization off onto the caller.
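For contrast, a rough sketch of the CIB behaviour being referred to,
using the standard Pacemaker command-line tools (the property chosen is
just an example): a change made on any one node is replicated to every
member by Pacemaker itself, with no explicit push or sync step from the
caller.

    # Run on any cluster node; Pacemaker replicates the change cluster-wide.
    crm_attribute --name stonith-enabled --update false

    # Read it back from any other node; no manual synchronization was needed.
    cibadmin -Q -o crm_config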



This is not officially supported, but you might like to investigate the
command 'cman_tool join -X', which allows cman to do auto-discovery.
There is some brief documentation in the man page, but you might need to
play with it to see what it can do for you and which bits actually work
as you want them to ...
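A hedged sketch of what that could look like on a joining node (the
cluster and node names are placeholders, and exactly which flags are
honoured alongside -X is best confirmed against the cman_tool man page):

    # Join without using the local cluster configuration file;
    # -c and -n supply the cluster and node names on the command line.
    cman_tool join -X -c mycluster -n node3 -w

The trailing -w just makes the command wait until the join completes.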

Chrissie

