On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:

> On Wed, Aug 11, 2010 at 05:22:56PM -0700, David Lang wrote:
>> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
>>
>>> On Wed, Aug 11, 2010 at 03:59:34PM -0700, David Lang wrote:
>>>> On Thu, 12 Aug 2010, Dejan Muhamedagic wrote:
>>>>
>>>>> On Wed, Aug 11, 2010 at 02:44:36PM -0700, David Lang wrote:
>>>>>
>>>>>> I've been watching things get more and more complicated over time,
>>>>>> and I recognise that to solve complex problems you sometimes need
>>>>>> that complexity, but there are a LOT of problems that aren't that
>>>>>> complex. Heartbeat has been making it harder and harder to do simple
>>>>>> things, and with the difficulty in figuring out what version 3.0.2
>>>>>> is doing that Igor is experiencing, and the inability to take a
>>>>>> simple config and convert it to the new format, it is sounding like
>>>>>> it may be time to fork.
>>>>>
>>>>> I completely agree that increased complexity is a problem and
>>>>> particularly in HA solutions. And it is possible to create very
>>>>> complex configurations with Pacemaker, and at the same time make
>>>>> it hard (or impossible) for humans to understand what the
>>>>> cluster does.
>>>>
>>>> and sometimes such complexity is needed, but sometimes it's not.
>>>
>>> I'd say that running something one can't understand is at least
>>> unmaintainable.
>>
>> but if all I'm doing is the simple stuff, I don't need to understand
>> all the complex stuff, I just need to learn the part that I'm using.
>
> Well, you said it. I'm not sure what "complex stuff" refers to
> exactly.

more than two machines, active-active to start with.

the simple haresources config (at startup, box X defaults to running the
following resources) covers a LOT of ground, especially if one of those
resources can be control of a shared drive (either physically shared or
logical via drbd)

>>>> the fact that we are on day 2 or 3 of Igor's problem and can't even
>>>> figure out what's happening because the logs aren't showing anything
>>>> is a very bad sign.
>>>
>>> Those logs have always been the same.
>>
>> Could you please take a look at what Igor has been posting and see if
>> you can figure out why the logs stop within a minute or so of heartbeat
>> starting (before it starts/stops any resources) and doesn't log
>> _anything_ for a long time (at least 40 min)
>>
>> the logs are not showing stuff that I (and others who have responded)
>> are used to seeing in the 2.x versions that we have deployed, so I
>> assumed that this was due to logging changes (I have never used logd,
>> so I didn't know what changes it had for example)
>
> Unfortunately, I forgot almost everything about v1 and can't
> provide any useful input. Don't know what kind of logging is
> missing.

he's running 3.0.x

he has one sample in e-mail where he started heartbeat manually and it did
the following

on the box that auto-failback pointed to

  initialization
  stop all services
  notice that it needed to be active
  start all services
  received an external kill signal
  stop all services
  exit

on the other box

  initialization
  stop all services
  received an external kill signal
  stop all services
  exit

what he's getting normally is

  initialization

with nothing else unless one of the boxes shuts down (at which point the
other takes over, but he hasn't posted logs from that scenario)

so what _should_ be happening after the first few seconds of startup? when
initdead expires something _should_ happen, but we don't see anything in the
logs.

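As an illustration of the simple haresources setup mentioned above, a single line of roughly this shape is enough for a two-node, DRBD-backed service. This is only a sketch: the hostname, IP address, DRBD resource name, mount point, and init script name are hypothetical and are not taken from Igor's configuration.

```
# /etc/ha.d/haresources (must be identical on every node)
# <preferred node> <resources, started left to right, stopped right to left>
boxX 192.168.1.50 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext3 httpd
```

Here boxX is the node that runs the resources by default, the bare address is shorthand for an IPaddr resource, and drbddisk::r0 makes the DRBD device primary before the filesystem is mounted and the service is started.
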
David Lang