On Fri, Jun 27, 2008 at 06:48, Junko IKEDA <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> When I checked the following bug using the latest heartbeat-dev and
> pacemaker-dev,
> http://developerbugs.linux-foundation.org/show_bug.cgi?id=1924
>
> I found some weird behavior.
>
> There are these five resources.
>
> ============
> Last updated: Fri Jun 27 13:07:11 2008
> Current DC: x3650b (db1e4cef-d242-419e-9393-bf5113384744)
> 2 Nodes configured.
> 1 Resources configured.
> ============
>
> Node: x3650a (ce2caf3f-c150-4394-916d-3b4b635394d7): online
> Node: x3650b (db1e4cef-d242-419e-9393-bf5113384744): online
>
> Resource Group: grpPostgreSQLDB
>    prmFsPostgreSQLDB1  (ocf::heartbeat:Filesystem):    Started x3650a
>    prmFsPostgreSQLDB2  (ocf::heartbeat:Filesystem):    Started x3650a
>    prmFsPostgreSQLDB3  (ocf::heartbeat:Filesystem):    Started x3650a
>    prmIpPostgreSQLDB   (ocf::heartbeat:IPaddr):        Started x3650a
>    prmApPostgreSQLDB   (ocf::heartbeat:pgsql): Started x3650a
>
>
> When "lrmd" is killed, crmd does not notice the event, possibly because of
> a glib problem.
>
> hb_report-10/x3650a:line 616
> heartbeat[24311]: 2008/06/27_12:57:55 WARN: Managed
> /usr/lib64/heartbeat/lrmd -r process 24327 killed by signal 9 [SIGKILL -
> Kill, unblockable].
>
> but if I stop pgsql like this,
>
> # su - postgres
> $ pg_ctl stop
> waiting for server to shut down.... done
> server stopped
>
> the frozen process is resumed.
>
> hb_report-10/x3650a:line 657
> crmd[24330]: 2008/06/27_13:09:36 CRIT: lrm_connection_destroy: LRM
> Connection failed
>
> Heartbeat 2.1.3 did the same.
> I wonder why the status of Postgres affects this.

This is seriously messed up.
I wonder if it could be caused by the fact that a process spawned by
the lrmd is still active.

It might be worth seeing if you can repeat the result with a resource
based on a simple daemon process ( while(1) { sleep(1); } ).
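
To make that suspicion concrete, here is a rough, self-contained sketch.
It is not taken from the lrmd or crmd sources. It assumes (and this is
only a guess at the mechanism) that the spawned process inherits the
lrmd's end of the IPC socket, so the peer sees no hangup until that
process exits too. The names fake lrmd, fake crmd, eof_visible() and
demo() are hypothetical stand-ins.

```python
import os
import select
import signal
import socket
import time


def eof_visible(sock, timeout):
    """Return True if the peer end of sock is fully closed within timeout seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable) and sock.recv(1, socket.MSG_PEEK) == b""


def sleep_loop(seconds=60):
    """Stand-in for while(1) { sleep(1); }, bounded so nothing is left running."""
    for _ in range(seconds):
        time.sleep(1)
    os._exit(0)


def demo():
    # crmd_end plays the role of crmd, lrmd_end the role of lrmd (assumption).
    crmd_end, lrmd_end = socket.socketpair()
    pid_r, pid_w = os.pipe()  # lets the fake lrmd report the daemon's pid

    lrmd_pid = os.fork()
    if lrmd_pid == 0:
        # Fake lrmd: spawns a simple daemon, then idles.
        crmd_end.close()
        os.close(pid_r)
        daemon_pid = os.fork()
        if daemon_pid == 0:
            # Simple daemon: inherits lrmd_end and never closes it.
            os.close(pid_w)
            sleep_loop()
        os.write(pid_w, str(daemon_pid).encode())
        os.close(pid_w)
        sleep_loop()

    # Fake crmd: keeps only its own end of the socket.
    lrmd_end.close()
    os.close(pid_w)
    daemon_pid = int(os.read(pid_r, 32).decode())
    os.close(pid_r)

    # Kill the fake lrmd, as in the report.
    os.kill(lrmd_pid, signal.SIGKILL)
    os.waitpid(lrmd_pid, 0)
    seen_while_daemon_alive = eof_visible(crmd_end, 1.0)

    # Stop the daemon (the pg_ctl stop step) and look again.
    os.kill(daemon_pid, signal.SIGKILL)
    seen_after_daemon_exit = eof_visible(crmd_end, 5.0)

    crmd_end.close()
    return seen_while_daemon_alive, seen_after_daemon_exit
```

Calling demo() on Linux should show that the hangup is not visible while
the daemon is alive and becomes visible once it exits. If a resource
built on such a trivial daemon reproduces the hang, the state of
Postgres itself is probably not the cause.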
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
