Ken Gaillot <kgail...@redhat.com> wrote:
On Thu, 2017-12-07 at 12:13 +0000, Adam Spiers wrote:
https://gocardless.com/blog/incident-review-api-and-dashboard-outage-on-10th-october/

It's a great write-up, although a little frustrating that it is still
not fully understood why a -inf colocation failed whereas a +inf
succeeded.  (I actually have a vague memory of discovering something
very similar a while back, but I can't find the details.)

That is an excellent post. I'll contact them directly to discuss it
further.

Cool, thanks!

IMHO this serves as a good example of the difficulty Pacemaker faces,
and consequently as valuable feedback for how Pacemaker needs to
improve: it's all too easy to make one tiny misconfiguration which can
potentially bring the whole house of cards tumbling down, and it's
often really hard to understand what went wrong.

So FWIW, my personal view is that more than anything else right now,
Pacemaker needs to be made easier to understand.  I know this is a

Agreed, but there are about a dozen things that are more important than
anything else right now ;)

Heheh yeah, I can relate to that feeling ;-)

Personally, my current focus is technical debt: stripping out all the
legacy features that were deprecated in 1.1.18, so we can release 2.0.0
with a smaller code base that is easier to maintain going forward. The
hope is that this pays off in greater time savings down the road, but
it sucks up a lot of time in the near term.

There are a large number of outstanding bug reports that bother me,
several of them quite serious, and I would like to spend more time on
those before new features, but ...

There is constant demand for new features from paying customers, and we
can't stay relevant without trying to keep up at least to an extent.
Several recent projects (bundles, alerts, versioned attributes) could
really benefit from some follow-up work, and more major projects are
right on the horizon (failure handling configuration overhaul, crm_mon
overhaul, containerization of pacemaker/corosync, corosync 3/knet
compatibility).

And of course usability is, indeed, an incredibly important area to be
addressed, spanning log messages, documentation, and tooling.

Yep, totally understood.

Which is to say, volunteers welcome :-)

... which is the cue for everyone to run away, leaving tumbleweed
silence ;-)

Seriously though, I acknowledge the lack of resources, so maybe just
aim for a few small steps forward here and there?

For example, making a few of the most crucial existing log messages
less cryptic could maybe go a long way.  Or if "dumbing down" log
messages would make life harder for developers who are familiar with
Pacemaker internals and need to be able to track all the gory details,
recognise the fact that the kind of logs which developers and users
need to read are vastly different, and consequently provide a way of
distinguishing between the two kinds.  Making all developer logs DEBUG
level and non-developer other levels might be one way, but there are
probably better approaches (e.g. tag all developer logs with a certain
string which can be filtered out).
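
To make the tagging idea concrete, here is a minimal sketch of what
filtering on such a string could look like.  The "[dev]" marker and the
sample log lines are made up for illustration; Pacemaker does not emit
anything like this today, and a real scheme might look quite different.

```python
# Hypothetical marker for developer-only log lines (an assumption for
# illustration, not something Pacemaker currently emits).
DEV_TAG = "[dev]"


def user_visible(lines, tag=DEV_TAG):
    """Return only the log lines that do not carry the developer tag."""
    return [line for line in lines if tag not in line]


# Made-up sample lines, not real Pacemaker output.
sample = [
    "notice: Resource db promoted on node1",
    "[dev] debug: some internal state detail",
    "warning: Node node2 is unclean",
]

for line in user_visible(sample):
    print(line)
```

Developers would read the unfiltered log, while users (or tooling aimed
at them) would only see the lines without the tag.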

Another simple change would be to adopt a policy of adding answers to
the documentation rather than only posting them on this list when
questions arise, and then just giving a short reply to the list saying
"here's the link to the documentation I just updated".  I'm sure that
the archives of this list are an absolute gold mine of useful
information, but list archives make for really poor documentation ...

And BTW, lest I come across as a constant whinger ... I think you're
doing an absolutely fantastic job as maintainer! ;-)
