Bjorn Oglefjorn wrote:
> Oh my. I feel embarrassed. I owe you a drink, Dejan. Stonith seems to be
> working now. I'll go hang my head in shame now.
Many people owe Dejan a drink!
Thanks for finishing this off. I just didn't seem to have the
concentration to follow through on the details.
--
On Fri, Apr 20, 2007 at 02:18:48PM -0400, Bjorn Oglefjorn wrote:
> Oh my. I feel embarrassed. I owe you a drink, Dejan. Stonith seems to be
> working now. I'll go hang my head in shame now.
No need to be embarrassed. Worse things happen :) I'm glad that we
finally managed to find the problem.
Oh my. I feel embarrassed. I owe you a drink, Dejan. Stonith seems to be
working now. I'll go hang my head in shame now.
--BO
On 4/20/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> Once again:
>
> [EMAIL PROTECTED] ~]# stonith -t external/drac4
> DRAC_ADDR=test-2.drac.domain DRAC_LOGIN=r
On Fri, Apr 20, 2007 at 10:25:04AM -0400, Bjorn Oglefjorn wrote:
> If it seems counter intuitive, think of it like this:
>* test-1_DRAC is the DRAC installed in the chassis of
> test-1.domain which has an address of
> test-1.drac.domain
No, actually it's not counter intuitive.
> Then look here
Thirty seconds _should_ be enough time, but I'm curious as to why my five
minute timeout isn't in effect here.
--BO
On 4/19/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
On Tue, Apr 17, 2007 at 03:53:41PM -0400, Bjorn Oglefjorn wrote:
> Here they are again.
It looks like this
Apr 4 1
If it seems counter intuitive, think of it like this:
* test-1_DRAC is the DRAC installed in the chassis of
test-1.domain which has an address of
test-1.drac.domain
Then look here:
In other words, test-1_DRAC
On 4/19/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
Anyway, in CIB I found only this (crm_verify doesn't complain) I
find these two timeouts:
...
1. transition_timeout is not in the annotated CIB.
2. Should user specify this timeout in the crm_config section and
calculate the maximum v
On Tue, Apr 17, 2007 at 03:55:07PM -0400, Bjorn Oglefjorn wrote:
> Alan, what is the list operation? The node names are always FQDNs and
> always match.
If you apply this patch:
http://hg.linux-ha.org/dev/rev/944d240b728a
and recompile, we should see a list of hosts as reported by your
stonith ag
On Tue, Apr 17, 2007 at 03:55:07PM -0400, Bjorn Oglefjorn wrote:
> Alan, what is the list operation? The node names are always FQDNs and
> always match.
Do they?
From your CIB:
Hate replying to myself...
There's more and somewhere here is the real problem:
Apr 4 11:27:50 test-2 tengine: [13668]: info: te_fence_node:actions.c
Executing reboot fencing operation (16) on test-1.domain (timeout=3)
Apr 4 11:27:50 test-2 stonithd: [13658]: info: Broadcasting the message
On Tue, Apr 17, 2007 at 03:53:41PM -0400, Bjorn Oglefjorn wrote:
> Here they are again.
It looks like this
Apr 4 11:28:20 test-2 stonithd: [13658]: info: Failed to STONITH the node
test-1.domain: optype=1, op_result=2
means that the stonith operation timed out. I'll fix the code to
raise
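When scanning long logs for this kind of failure, it can help to pull the node name and result codes out mechanically. The following is only a minimal sketch: the regular expression and the function name parse_stonith_failure are assumptions built from the single log line quoted above, not from any stonithd documentation, and the numeric codes are returned as-is without interpretation.

```python
import re

# Assumption: the pattern is derived only from the one stonithd line quoted
# in this thread; other heartbeat versions may word the message differently.
STONITH_FAIL = re.compile(
    r"Failed to STONITH the node\s+(?P<node>\S+?):\s*"
    r"optype=(?P<optype>\d+),\s*op_result=(?P<op_result>\d+)"
)


def parse_stonith_failure(line):
    """Return (node, optype, op_result) for a matching line, else None."""
    m = STONITH_FAIL.search(line)
    if m is None:
        return None
    return m.group("node"), int(m.group("optype")), int(m.group("op_result"))


# The sample is the log line from the message above, joined onto one line.
sample = (
    "Apr 4 11:28:20 test-2 stonithd: [13658]: info: Failed to STONITH the node "
    "test-1.domain: optype=1, op_result=2"
)
print(parse_stonith_failure(sample))
```

The \s+ after "node" also tolerates the line wrap that mail clients introduce between "node" and the host name.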
Alan, what is the list operation? The node names are always FQDNs and
always match.
--BO
On 4/17/07, Alan Robertson <[EMAIL PROTECTED]> wrote:
Andrew Beekhof wrote:
> On 4/17/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
>> I know that my plugin is getting called because of the logging that t
Andrew Beekhof wrote:
> On 4/17/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
>> I know that my plugin is getting called because of the logging that the
>> plugin does.
>
> do we get to see that logging at all? preferably in the context of
> the other log messages
>
>> That said, I also know my
On 4/17/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
I know that my plugin is getting called because of the logging that the
plugin does.
do we get to see that logging at all? preferably in the context of
the other log messages
That said, I also know my plugin is not receiving any 'reset'
I know that my plugin is getting called because of the logging that the
plugin does. That said, I also know my plugin is not receiving any 'reset'
operation request from heartbeat. If you see below, request actions are
logged. The only actions logged when node failure is simulated are:
getconfi
On 4/17/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
Yes, I most certainly have. The stonith command-line tool has no problem at
all with the plugin. The following was run from test-1.domain. The
indented log entries are from the debug log of the stonith plugin:
I'm no stonith expert, but
Yes, I most certainly have. The stonith command-line tool has no problem at
all with the plugin. The following was run from test-1.domain. The
indented log entries are from the debug log of the stonith plugin:
root:~ # stonith -t external/drac4
DRAC_ADDR=test-2.drac.domain DRAC_LOGIN=root DRAC_
On 4/16/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
No ideas?
none at all - have you tried calling it manually using the stonith
command-line tool to make sure it works?
On 4/9/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
>
> I quickly put together a STONITH plugin for testing this. It
I quickly put together a STONITH plugin for testing this. It conforms to
the heartbeat spec and always lies to heartbeat returning success no matter
what. With this plugin in place I'm still getting this error:
Apr 9 15:40:47 test-2 stonithd: [8791]: info: Failed to STONITH the node
test-1.dom
Any ideas?
--BO
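For readers following along, a test plugin that "always lies" might look roughly like the sketch below. This is not the plugin from the message above, which was not posted in this snippet. The action names (gethosts, on, off, reset, status, getconfignames) are recalled from heartbeat's bundled external plugins and should be treated as assumptions to check against the installed version; the handle function and the hostlist parameter are hypothetical names used for illustration, and the getinfo-* actions are left out.

```python
# Assumption: action names follow heartbeat's bundled external plugins;
# verify them against the version you actually run.
POWER_ACTIONS = {"on", "off", "reset"}


def handle(action, target=None, hostlist=()):
    """Return (exit_code, stdout_text) for one plugin invocation.

    `target` is the node to fence and `hostlist` the nodes this device
    claims to control; both names are hypothetical, for illustration only.
    """
    if action == "gethosts":
        # One host name per line.
        return 0, "\n".join(hostlist)
    if action in POWER_ACTIONS:
        # The "lie": report success without touching any hardware.
        return 0, ""
    if action == "status":
        return 0, ""
    if action == "getconfignames":
        return 0, "hostlist"
    # Anything else, including the omitted getinfo-* actions, is an error here.
    return 1, ""


print(handle("reset", "test-1.domain"))
print(handle("gethosts", hostlist=["test-1.domain", "test-2.domain"]))
```

A real plugin would wrap this in a script that reads the action and target from its command-line arguments, takes its parameters from the environment (as the DRAC_ADDR and DRAC_LOGIN settings elsewhere in this thread suggest), and exits with the returned code.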
On 4/4/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
I do not know what op_result=2 means. I can only say that the drac4 RA
will never have exit code 2. I am sure that the drac4 RA works as expected
in all use cases and also when called via the stonith command from the
comman
On Tue, Apr 03, 2007 at 03:52:37PM -0400, Bjorn Oglefjorn wrote:
> Sorry Alan, I realize that this post is getting quite long. Here is a run
> down of where I am currently.
>
> STONITH is failing and I'm still not sure why.
Me neither. There's nothing in the logs apart from:
Mar 30 09:38:20 tes
On Tue, Apr 03, 2007 at 03:38:44PM -0400, Bjorn Oglefjorn wrote:
> Maybe I have too much logging now. Here is the log from test-1 with the
> debugging information removed. I've also trimmed it from where heartbeat
> notices the node is dead until STONITH fails the second time. I hope this
> is a
Bjorn Oglefjorn wrote:
> Anyone? Help?
> --BO
>
> On 4/2/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
>>
>> Any ideas as to what's going wrong here?
there is so much send/reply/try/fail/fix stuff in the email that I had
trouble following what was going on.
Could you try reposting this cleanly
Anyone? Help?
--BO
On 4/2/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
Any ideas as to what's going wrong here?
--BO
On 3/30/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
>
> I've made the OCF apache RA work by editing the script's parameters for
> now. This is just testing anyway. Attach
Any ideas as to what's going wrong here?
--BO
On 3/30/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
I've made the OCF apache RA work by editing the script's parameters for
now. This is just testing anyway. Attached are my configs and a tar ball
of the logs from the two nodes in question. Th
Thanks Alan. That makes more sense now.
--BO
On 3/30/07, Alan Robertson <[EMAIL PROTECTED]> wrote:
Bjorn Oglefjorn wrote:
> I took a look at the apache RA, but it makes a lot of assumptions about the
> environment which are mostly untrue in Red Hat. How can I configure
> this RA
> short of ma
Bjorn Oglefjorn wrote:
> I took a look at the apache RA, but it makes a lot of assumptions about the
> environment which are mostly untrue in Red Hat. How can I configure
> this RA
> short of making changes to the script? Can I set environmental variables?
> I tried setting what's shown in the 'm
Correct. I am running CentOS 4.4 and httpd-2.0.52-28.ent.centos4. The
default location for httpd.conf is /etc/httpd/conf/httpd.conf for this
package. The script looks at /etc/httpd/httpd.conf and fails when it
doesn't find it. Also, the default LISTEN directive in the apache config
does NOT
On Fri, Mar 30, 2007 at 09:22:44AM -0400, Bjorn Oglefjorn wrote:
> I took a look at the apache RA, but it makes a lot of assumptions about the
> environment which are mostly untrue in Red Hat. How can I configure this RA
> short of making changes to the script? Can I set environmental variables?
I took a look at the apache RA, but it makes a lot of assumptions about the
environment which are mostly untrue in Red Hat. How can I configure this RA
short of making changes to the script? Can I set environmental variables?
I tried setting what's shown in the 'meta-data' output, but with no lu
Bjorn Oglefjorn wrote:
> Thanks for the reply Dejan. My responses are inline.
> --BO
>
> On 3/28/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
>>
>> On Wed, Mar 28, 2007 at 11:29:35AM -0400, Bjorn Oglefjorn wrote:
>> > I believe I've corrected some issues, but now I'm getting more of this:
>>
Dejan Muhamedagic wrote:
> On Wed, Mar 28, 2007 at 02:33:28PM -0400, Bjorn Oglefjorn wrote:
>> Thanks for the reply Dejan. My responses are inline.
>> --BO
>>
>> On 3/28/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
>>> On Wed, Mar 28, 2007 at 11:29:35AM -0400, Bjorn Oglefjorn wrote:
I bel
On Wed, Mar 28, 2007 at 02:33:28PM -0400, Bjorn Oglefjorn wrote:
> Thanks for the reply Dejan. My responses are inline.
> --BO
>
> On 3/28/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
> >
> >On Wed, Mar 28, 2007 at 11:29:35AM -0400, Bjorn Oglefjorn wrote:
> >> I believe I've corrected some is
On Wed, Mar 28, 2007 at 11:29:35AM -0400, Bjorn Oglefjorn wrote:
> I believe I've corrected some issues, but now I'm getting more of this:
> Mar 28 11:02:37 test-1 lrmd: [22008]: ERROR: RA lsb:httpd:monitor (process
> 24472) failed to redirect stdout for its background child (daemon)
> processes. T
Here is the script I forgot to attach.
On 3/28/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
Thanks for the reply Dejan. My responses are inline.
--BO
On 3/28/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
>
> On Wed, Mar 28, 2007 at 11:29:35AM -0400, Bjorn Oglefjorn wrote:
> > I believe I
Thanks for the reply Dejan. My responses are inline.
--BO
On 3/28/07, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote:
On Wed, Mar 28, 2007 at 11:29:35AM -0400, Bjorn Oglefjorn wrote:
> I believe I've corrected some issues, but now I'm getting more of this:
> Mar 28 11:02:37 test-1 lrmd: [22008]:
I believe I've corrected some issues, but now I'm getting more of this:
Mar 28 11:02:37 test-1 lrmd: [22008]: ERROR: RA lsb:httpd:monitor (process
24472) failed to redirect stdout for its background child (daemon)
processes. This will likely cause those processes to die mysteriously at
some later
Does this make more sense? I've changed the constraints to the way you've
suggested and I've also changed the scores to INFINITY. I have since added
debug logging and made some changes to my STONITH RA. It kind of works at
this point, but eventually both nodes get shot in the head if I shut dow
Dejan Muhamedagic wrote:
> On Wed, Mar 21, 2007 at 05:45:14AM -0600, Alan Robertson wrote:
>> Dejan Muhamedagic wrote:
>>> On Tue, Mar 20, 2007 at 10:59:06PM -0600, Alan Robertson wrote:
Dejan Muhamedagic wrote:
> On Tue, Mar 20, 2007 at 01:11:21PM -0400, Bjorn Oglefjorn wrote:
>> Odd.
On Wed, Mar 21, 2007 at 05:45:14AM -0600, Alan Robertson wrote:
> Dejan Muhamedagic wrote:
> > On Tue, Mar 20, 2007 at 10:59:06PM -0600, Alan Robertson wrote:
> >> Dejan Muhamedagic wrote:
> >>> On Tue, Mar 20, 2007 at 01:11:21PM -0400, Bjorn Oglefjorn wrote:
> Odd. I've changed that op to be
CentOS extras to be more specific.
On 3/21/07, Bjorn Oglefjorn <[EMAIL PROTECTED]> wrote:
I am running:
Version: 2.0.7
Release: 1.c4
Thanks,
--BO
On 3/21/07, Alan Robertson <[EMAIL PROTECTED]> wrote:
>
> Dejan Muhamedagic wrote:
> > On Tue, Mar 20, 2007 at 10:59:06PM -0600, Alan Robertson wro