Of the two cases where I have seen these error messages, I solved one.
On one cluster these dedicated interfaces were connected to a switch
instead of being connected directly.
Though I still don't know what caused these errors on another system
(the logs in the previous email).
The nodes are called node-0 and node-1.
It does not happen regularly; rather, it happens occasionally.
Of the roughly 50 two-node clusters we have in house, I've seen this issue
in the journals of 2 clusters.
I looked at the logs, and the pattern I see is this: stop Pacemaker and
Corosync on node-1, and then
Hi folks,
We have a lot of our two-node systems running in our server room.
I noticed that some of them occasionally have these entries in the syslog:
Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ] Digest does
not match
Mar 15 12:54:45 A5-E4-151-bottom corosync[13766]: [TOTEM ]
Nice logo!
http://wiki.clusterlabgs.org/ doesn't load for me.
I also have a question that has bothered me for a long time. Not a significant
one, but anyway ...
I have seen the "Linux-HA" name around a lot. But it seems that the name is
not used anymore for this particular stack of HA software.
So I
> attrd_updater --update-delay or --update-both.
>
> > Moreover, when you delete this attribute the actual remove will be
> > delayed by that "--delay" which was used when the attribute was set.
> >
> >
> > Thank you,
> > Kostia
> >
Thank you,
Kostia
On Wed, Nov 30, 2016 at 7:31 PM, Kostiantyn Ponomarenko <
konstantin.ponomare...@gmail.com> wrote:
> Hi Ken,
>
> I didn't look into the logs, but I experimented with it for a while.
> Here is what I found.
>
> It worked for you because this attr
, 2016 at 1:08 AM, Ken Gaillot <kgail...@redhat.com> wrote:
> On 11/24/2016 05:24 AM, Kostiantyn Ponomarenko wrote:
> > Attribute dampening doesn't work for me also.
> > To test that I have a script:
> >
> > attrd_updater -N node-0 -n my-attr --update false --dela
hing wrong?
Or maybe my understanding of attribute dampening is not correct?
My Pacemaker version is 1.1.13. (heh, not the latest one, but it is what it
is ...)
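For reference, a minimal sketch of the kind of test I mean (this assumes a live Pacemaker cluster; "my-attr", the node names, and the 30-second delay are just example values):

```shell
# Skip gracefully when not run on a cluster node.
command -v attrd_updater >/dev/null 2>&1 || { echo "attrd_updater not found; run on a cluster node"; exit 0; }

# Set a transient node attribute with a 30-second dampening window:
# attrd should coalesce further changes and delay the CIB write by that interval.
attrd_updater -N node-0 -n my-attr --update false --delay 30

# Query the value attrd currently holds; the CIB may lag behind it
# until the dampening interval expires.
attrd_updater -N node-0 -n my-attr --query
```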
Thank you,
Kostia
On Wed, Nov 23, 2016 at 7:27 PM, Kostiantyn Ponomarenko <
konstantin.ponomare...@gmail.com> wrote:
> Maybe I
ime=reboot
And this attribute is set to the live cluster configuration immediately.
What am I doing wrong?
Thank you,
Kostia
On Tue, Nov 22, 2016 at 11:33 PM, Kostiantyn Ponomarenko <
konstantin.ponomare...@gmail.com> wrote:
> Ken,
> Thank you for the explanation.
> I will try this lo
<mailto:ulrich.wi...@rz.uni-regensburg.de>> wrote:
> >
> > >>> Ken Gaillot <kgail...@redhat.com <mailto:kgail...@redhat.com>>
> > schrieb am 18.11.2016 um 16:17 in Nachricht
> > <d6f449da-64f8-12ad-00be-e772d8e38...@redhat.
Hi folks,
I am looking for a good way of checking if a resource is in "starting"
state.
The thing is - I need to issue a command, and I don't want to issue it
while this particular resource is starting. Starting this resource can
take up to a few minutes.
As a note, I am OK with issuing that
ht
> <d6f449da-64f8-12ad-00be-e772d8e38...@redhat.com>:
> > On 11/18/2016 08:55 AM, Kostiantyn Ponomarenko wrote:
> >> Hi folks,
> >>
> >> Is there a way to set a node attribute in the "status" section for a few
> >> nodes at the same time?
Hi folks,
Is there a way to set a node attribute in the "status" section for a few
nodes at the same time?
In my case there is a node attribute which allows some resources to start
in the cluster if it is set.
If I set this node attribute for, say, two nodes in a way - one and then
another, then
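A sketch of what I do now ("my-attr" and the node names are placeholders): one call per node, since as far as I know there is no single command that sets an attribute on several nodes at once:

```shell
# Skip gracefully when not run on a cluster node.
command -v crm_attribute >/dev/null 2>&1 || { echo "crm_attribute not found; run on a cluster node"; exit 0; }

# --lifetime reboot places the attribute in the transient "status" section.
for node in node-0 node-1; do
    crm_attribute -N "$node" -n my-attr -v true -l reboot
done
```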
Phanidhar,
If you don't have any location rules in your cluster, you can try setting
"resource-stickiness=1" or "resource-stickiness=100".
That will do the same job as INFINITY if there are no other location rules
in the cluster.
Also, there is a way to see the current state of scores in the cluster,
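Something along these lines (a sketch; the stickiness value 100 is arbitrary, and I am assuming crm_simulate for the score display):

```shell
# Skip gracefully when not run on a cluster node.
command -v crm_attribute >/dev/null 2>&1 || { echo "pacemaker tools not found; run on a cluster node"; exit 0; }

# Set a cluster-wide default stickiness for all resources:
crm_attribute --type rsc_defaults --name resource-stickiness --update 100

# Show the allocation scores the policy engine computed for the live cluster:
crm_simulate -sL
```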
> in
> > Nachricht <80c65564-b299-e504-4c6c-afd0ff86e...@redhat.com>:
> >> On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote:
> >>> When one problem seems to be solved, another one appears.
> >>> Now my script looks this way:
> >>>
maybe it can be a feature request =)
Thank you,
Kostia
On Wed, Nov 9, 2016 at 6:42 PM, Klaus Wenninger <kwenn...@redhat.com> wrote:
> On 11/09/2016 05:30 PM, Kostiantyn Ponomarenko wrote:
> > When one problem seems to be solved, another one appears.
> > Now my script looks t
Thank you for the answer, Kristoffer.
Thank you,
Kostia
On Sat, Nov 5, 2016 at 10:55 PM, Kristoffer Grönlund <kgronl...@suse.com>
wrote:
> Kostiantyn Ponomarenko <konstantin.ponomare...@gmail.com> writes:
>
> > Hi,
> >
> > I was reading about changing defau
u,
Kostia
On Tue, Nov 8, 2016 at 10:19 PM, Dejan Muhamedagic <deja...@fastmail.fm>
wrote:
> On Tue, Nov 08, 2016 at 12:54:10PM +0100, Klaus Wenninger wrote:
> > On 11/08/2016 11:40 AM, Kostiantyn Ponomarenko wrote:
> > > Hi,
> > >
> > > I need a
Hi,
I need a way to do a manual fail-back on demand.
To be clear, I don't want it to be ON/OFF; I want it to be more like "one
shot".
So far I found that the most reasonable way to do it - is to set "resource
stickiness" to a different value, and then set it back to what it was.
To do that I
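Roughly like this (a sketch; 0 and 100 are placeholder values, and the sleep interval is a guess):

```shell
# Skip gracefully when not run on a cluster node.
command -v crm_attribute >/dev/null 2>&1 || { echo "crm_attribute not found; run on a cluster node"; exit 0; }

# One-shot fail-back: drop the default stickiness so resources move back
# to their preferred nodes ...
crm_attribute --type rsc_defaults --name resource-stickiness --update 0

# ... give the cluster time to rebalance ...
sleep 30

# ... then restore the previous value.
crm_attribute --type rsc_defaults --name resource-stickiness --update 100
```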
I faced the same problem a few years ago - we needed to make a
two-node cluster work in a "split-brain" situation. We were looking at a
resource agent called SFEX, which is disk based -
http://www.linux-ha.org/wiki/Sfex_(resource_agent) . In the end we rejected
SFEX because, if I am not
Yes, DBus would be one of the ways.
Thank you,
Kostia
On Mon, Sep 26, 2016 at 3:33 PM, Klaus Wenninger <kwenn...@redhat.com>
wrote:
> On 09/26/2016 02:29 PM, Kostiantyn Ponomarenko wrote:
>
> Correcting a typo.
> * the same -> I also was hoping to hear that I can do the
Correcting a typo.
* the same -> I also was hoping to hear that I can do the same from c++
code.
Thank you,
Kostia
On Mon, Sep 26, 2016 at 3:28 PM, Kostiantyn Ponomarenko <
konstantin.ponomare...@gmail.com> wrote:
> Thanks for the answer.
>
> I also was hoping to hear that
Thanks for the answer.
I also was hoping to hear that I can do the case from c++ code.
Thank you,
Kostia
On Mon, Sep 26, 2016 at 1:59 PM, Klaus Wenninger <kwenn...@redhat.com>
wrote:
> On 09/26/2016 12:29 PM, Kostiantyn Ponomarenko wrote:
> > Hi,
> >
> > I am
2, 2016 at 11:33 AM, Kristoffer Grönlund <kgronl...@suse.com>
wrote:
> Kostiantyn Ponomarenko <konstantin.ponomare...@gmail.com> writes:
>
> > Hi,
> >
> > >>> If "scripts: no-quorum-policy=ignore" is becoming deprecated
> > Are there any
Hi guys,
My understanding is that sometimes an unused command and/or option to a
command can be removed. I don't know how many people use the "crmadmin
--dc_lookup" command, but I do use it =) .
That is why I ask not to remove it in future releases of Pacemaker,
because it is a vital command
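For the record, this is the usage I mean (it requires a running cluster):

```shell
# Skip gracefully when not run on a cluster node.
command -v crmadmin >/dev/null 2>&1 || { echo "crmadmin not found; run on a cluster node"; exit 0; }

# Print the node currently acting as the Designated Controller (DC):
crmadmin --dc_lookup
```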
Thank you, Ken.
This helps a lot.
Now I am sure that my current approach fits best for me =)
Thank you,
Kostia
On Wed, Mar 30, 2016 at 11:10 PM, Ken Gaillot <kgail...@redhat.com> wrote:
> On 03/29/2016 08:22 AM, Kostiantyn Ponomarenko wrote:
> > Ken, thank you for the answer.
>
2015 at 12:17 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>
> > On 27 May 2015, at 10:09 pm, Kostiantyn Ponomarenko <
> konstantin.ponomare...@gmail.com> wrote:
> >
> > I think I wasn't precise in my questions.
> > So I will try to ask more precise questions.
One of the resources in my cluster is not actually running, but "crm_mon" shows
it with the "Started" status.
Its resource agent's monitor function returns "$OCF_NOT_RUNNING", but
Pacemaker doesn't react to this at all - crm_mon shows the resource as
Started.
I couldn't find an explanation for this
[monitor] : got rc=$rc"
return $OCF_NOT_RUNNING
}
Thank you,
Kostia
On Tue, Jan 19, 2016 at 6:30 PM, Kostiantyn Ponomarenko <
konstantin.ponomare...@gmail.com> wrote:
> The resource that wasn't running, but was reported as running, is
> "adminServer".
>
> Here are a b
Hi,
What is the difference between the "OCF_ERR_GENERIC" and "OCF_NOT_RUNNING"
return codes of the "monitor" action from Pacemaker's point of view?
I was looking here
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-ocf-return-codes.html
, but I still don't see the
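To make the question concrete, here is a minimal monitor sketch (not my actual agent; the pid-file layout is hypothetical, though the numeric values are the standard OCF ones). OCF_NOT_RUNNING says "the service is cleanly down", while OCF_ERR_GENERIC says "something failed that I can't classify":

```shell
# Standard OCF return-code values.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical monitor: decide the state from a pid file.
my_service_monitor() {
    pidfile="$1"
    # No pid file at all: the service was stopped -> "not running".
    [ -f "$pidfile" ] || return $OCF_NOT_RUNNING
    pid=$(cat "$pidfile" 2>/dev/null)
    # Unreadable or empty state file: an unclassifiable failure -> generic error.
    [ -n "$pid" ] || return $OCF_ERR_GENERIC
    # Process alive -> success; pid file present but process gone -> not running.
    kill -0 "$pid" 2>/dev/null && return $OCF_SUCCESS
    return $OCF_NOT_RUNNING
}
```

As I understand it, both are treated as failures when the resource is expected to be running, but OCF_NOT_RUNNING from a probe is a clean "stopped" while OCF_ERR_GENERIC always counts as an error.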
The issue doesn't hurt me right now, as we use a workaround for it.
But a workaround is not a fix for the problem.
Thank you,
Kostya
On Fri, Aug 28, 2015 at 12:25 PM, Kostiantyn Ponomarenko
konstantin.ponomare...@gmail.com wrote:
In my case the final solution will be shipped to different
I agree that the possibility of this to happen is really really small =)
But the consequences can be huge =(
Thank you,
Kostya
On Fri, Aug 21, 2015 at 4:06 PM, Kostiantyn Ponomarenko
konstantin.ponomare...@gmail.com wrote:
As I wrote in the previous email, it could happen when NTP servers
.
And in that case the bug will manifest itself.
Thank you,
Kostya
On Mon, Aug 17, 2015 at 3:01 AM, Andrew Beekhof and...@beekhof.net wrote:
On 8 Aug 2015, at 12:43 am, Kostiantyn Ponomarenko
konstantin.ponomare...@gmail.com wrote:
Hi Andrew,
So the issue is:
Having one node up and running
Hi,
Brief description of the STONITH problem:
I see two different behaviors with two different STONITH configurations. If
Pacemaker cannot find a device that can STONITH a problematic node, the
node remains up and running. Which is bad, because it must be STONITHed.
In contrast, if
Digimer,
Thank you. I will try this out.
One more question. What about directories for those agents - what are the
rules here?
Thank you,
Kostya
On Tue, Aug 11, 2015 at 6:21 PM, Digimer li...@alteeve.ca wrote:
On 11/08/15 11:17 AM, Kostiantyn Ponomarenko wrote:
Hi guys,
Is there any
Then make sure it can be STONITHed. Add an additional STONITH agent using an
independent communication channel.
Not possible. Only one node is up and running in the cluster, and I am
wondering - can it STONITH itself? Because most likely, after a reboot, the
problem will be gone.
I have no idea what
Hi,
I noticed that after moving to the new mailing list there are no more
updates here:
http://www.gossamer-threads.com/lists/linuxha/users/
Can it be fixed, or am I missing something? It was a convenient way of
searching/reading/tracking issues.
Thank you,
Kostya
places where I can also put my
agent and have it visible to the cluster?
Thank you,
Kostya
On Thu, Aug 13, 2015 at 5:34 PM, Digimer li...@alteeve.ca wrote:
On 13/08/15 07:54 AM, Kostiantyn Ponomarenko wrote:
Digimer,
Thank you. I will try this out.
One more question. What about directories
Thank you for the help :-)
On Aug 13, 2015 20:19, Digimer li...@alteeve.ca wrote:
Ah, yes. If it's a RHEL/CentOS machine, put it in /usr/sbin/. If it's
another OS, locate fence_ipmilan and put your agent in the same directory.
digimer
On 13/08/15 01:03 PM, Kostiantyn Ponomarenko wrote
Hi Marek,
The agent I wrote is too specific to my setup.
There is no use for it elsewhere.
And it is basically as simple as a resource agent.
Thank you,
Kostya
On Wed, Mar 18, 2015 at 5:45 PM, Marek marx Grac mg...@redhat.com wrote:
Hi,
On 03/11/2015 10:39 AM, Kostiantyn Ponomarenko wrote:
Hi
).
So, then, after NTP becomes reachable, the bug appears.
Thank you,
Kostya
On Mon, Aug 10, 2015 at 9:13 AM, Ulrich Windl
ulrich.wi...@rz.uni-regensburg.de wrote:
Kostiantyn Ponomarenko konstantin.ponomare...@gmail.com schrieb am
07.08.2015
um 16:43 in Nachricht
On Tue, Aug 4, 2015 at 3:57 AM, Andrew Beekhof and...@beekhof.net wrote:
Github might be another.
I am not able to open an issue/bug here
https://github.com/ClusterLabs/pacemaker
Thank you,
Kostya
___
Users mailing list: Users@clusterlabs.org
if the apache server has gone down. Do I need to change
any of my scripts? I want to make sure that a single command to start an
apache service in one node should also start the apache servers running on
other nodes.
On Wed, Jul 29, 2015 at 7:12 PM, Kostiantyn Ponomarenko
konstantin.ponomare
Hi,
If you do:
# date --set="1990-01-01 01:00:00"
when only one node is present in the cluster and while the cluster is
running, and then stop a resource (any resource), the cluster fails the
resource once and shows it as Started, but the resource is actually still
stopped.
Is it the expected