On 2015-07-07T14:15:14, Muhammad Sharfuddin m.sharfud...@nds.com.pk wrote:
now msgwait timeout is set to 10s and a delay/inaccessibility of 15 seconds
was observed. If a service (App, DB, file server) is installed and running
from the ocfs2 file system via the surviving/online node, then
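The numbers in this snippet (a 10s msgwait against an observed 15s storage outage) imply the on-disk SBD timeouts are too tight: msgwait must be longer than any I/O delay the cluster should survive, and a common rule of thumb is msgwait of roughly twice the watchdog timeout. A sketch of re-creating the metadata with roomier values (the device path and the exact numbers are illustrative assumptions, not taken from the thread):

```shell
# Overwrite the SBD metadata with a 15s watchdog timeout (-1) and a
# 30s msgwait (-4); the device path below is an example placeholder.
sbd -d /dev/disk/by-id/example-sbd-disk -1 15 -4 30 create

# Read back what is actually stored on the device.
sbd -d /dev/disk/by-id/example-sbd-disk dump
```

Since `create` rewrites the on-disk header, it should only be run while the cluster stack is down on all nodes.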
On 2015-07-07T12:23:44, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
The advantage depends on the alternatives: If two nodes both want to access
the same filesystem, you can use OCFS2, NFS, or CIFS (list not complete). If
only one node can access a filesystem, you could try any
On 2015-02-16T09:20:22, Kristoffer Grönlund kgronl...@suse.com wrote:
Actually, I decided that it does make sense to return 0 as the error
code even if the resource to delete doesn't exist, so I pushed a commit
to change this. The error message is still printed, though.
I'm not sure I agree,
On 2015-01-30T14:57:29, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
# grep -i high /etc/corosync/corosync.conf
clear_node_high_bit:new
Could this cause our problem?
This is an option that didn't exist prior to SP3.
With "there was no change" I meant: no administrator
On 2015-01-30T08:23:14, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Two of the three nodes were actually updated from SP1 via SP2 to SP3, and the
third node was installed with SP3. AFAIR there was no configuration change
since SP1.
That must be incorrect, because:
Was the
On 2015-01-28T16:21:23, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Kind of answering my own question:
Node id 84939948 in hex is 051014AC, which is 5.16.20.172 where the IP
address actually is 172.20.16.5.
But I see another node ID of 739512325 (hex 2C141005) which is 44.20.16.5.
On 2015-01-28T16:44:34, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
address actually is 172.20.16.5.
But I see another node ID of 739512325 (hex 2C141005) which is
44.20.16.5.
That seems reversed compared to the above, and the 0x2c doesn't fit anywhere.
It does. The
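The two quoted node IDs are consistent with one address, 172.20.16.5, encoded in two different byte orders; the corosync clear_node_high_bit option mentioned in the 2015-01-30 message would account for the second form. The arithmetic (my reconstruction, not spelled out in the truncated reply):

```shell
# Decode corosync node IDs back to the ring0 IPv4 address.
# Split 172.20.16.5 into its four octets (POSIX sh).
set -- $(echo "172.20.16.5" | tr '.' ' ')
a=$1 b=$2 c=$3 d=$4

# Little-endian host byte order: the dotted quad reads reversed
# (5.16.20.172), i.e. hex 0x051014AC = 84939948.
le=$(( (d << 24) | (c << 16) | (b << 8) | a ))

# Network byte order with the high bit cleared (clear_node_high_bit):
# 0xAC (172) loses its top bit and becomes 0x2C (44).
be=$(( ((a << 24) | (b << 16) | (c << 8) | d) & 0x7FFFFFFF ))

printf '%d 0x%08X\n' "$le" "$le"   # 84939948 0x051014AC
printf '%d 0x%08X\n' "$be" "$be"   # 739512325 0x2C141005
```

Both IDs thus decode to the same ring0 address; the puzzling 0x2C is simply 0xAC (172) with its top bit stripped.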
On 2015-01-16T16:25:15, EXTERNAL Konold Martin (erfrakon, RtP2/TEF72)
external.martin.kon...@de.bosch.com wrote:
I am glad to hear that SLE HA has no plans to drop support for DRBD.
Unfortunately I currently cannot disclose who is spreading this false
information.
Too bad. Do let them
On 2015-01-16T08:11:48, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Hi!
MHO: The correct time to wait is in an interval bounded by these two values:
1: An I/O delay that may occur during normal operation that is never allowed
to trigger fencing
2: The maximum value to are
On 2015-01-16T12:22:57, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Unfortunately the SBD syntax is a real mess, and there is no manual page
(AFAIK) for SBD.
... because man sbd isn't obvious enough, I guess. ;-)
OK, I haven't re-checked recently: You added one!
Yes, we
On 2015-01-16T11:56:04, EXTERNAL Konold Martin (erfrakon, RtP2/TEF72)
external.martin.kon...@de.bosch.com wrote:
I have been told that support for DRBD is supposed to be phased out from both
SLES and RHEL in the near future.
This is massively incorrect for SLE HA. (drbd is part of the HA
On 2014-12-04T08:12:28, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Of course HP's software isn't quite flexible here, but maybe a symlink from
the old location to the new one wouldn't be bad (for the lifetime of SLES11,
maybe)...
A symlink might not work, depending on what kind
On 2014-11-27T10:10:47, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Hi!
I had thought ordering of clones would work, but it looks like it does not in
current SLES11 SP3 (1.1.11-3ca8c3b):
I have rules like:
order ord_DLM_O2CB inf: cln_DLM cln_O2CB
order ord_DLM_cLVMd inf:
On 2014-11-25T16:46:01, David Vossel dvos...@redhat.com wrote:
Okay, okay, apparently we have got enough topics to discuss. I'll
grumble a bit more about Brno, but let's get the organisation of that
thing on track ... Sigh. Always so much work!
I'm assuming arrival on the 3rd and departure on
On 2014-11-24T16:16:05, Fabio M. Di Nitto fdini...@redhat.com wrote:
Yeah, well, devconf.cz is not such an interesting event for those who do
not wear the fedora ;-)
That would be the perfect opportunity for you to convert users to Suse ;)
I'd prefer, at least for this round, to keep
On 2014-11-11T09:17:56, Fabio M. Di Nitto fdini...@redhat.com wrote:
Hey,
I know I'm a bit late to the game, but: I'd be happy to meet, yet Brno
is not all that easy to reach. There don't appear to be regular flights
to BRQ, and it's also quite far by train.
Am I missing something obvious
On 2014-11-24T06:59:39, Digimer li...@alteeve.ca wrote:
The LINBIT folks suggested to land in Vienna and then it's two hours by
road, but I've not looked too closely at it just yet.
I'd be happy to meet in Vienna. I'm not keen on first flying to VIE and
then spending 2+h on the road/bus.
On 2014-09-08T12:30:23, Fabio M. Di Nitto fdini...@redhat.com wrote:
Folks, Fabio,
thanks for organizing this and getting the ball rolling. And again sorry
for being late to said game; I was busy elsewhere.
However, it seems that the idea for such a HA Summit in Brno/Feb 2015
hasn't exactly
On 2014-11-24T15:54:33, Fabio M. Di Nitto fdini...@redhat.com wrote:
dates and location were chosen to piggy-back with devconf.cz and allow
people to travel for more than just HA Summit.
Yeah, well, devconf.cz is not such an interesting event for those who do
not wear the fedora ;-)
I'd
On 2014-10-23T20:36:38, Lars Ellenberg lars.ellenb...@linbit.com wrote:
If we want to require presence of start-stop-daemon,
we could make all this somebody else's problem.
I need to find some time to browse through the code
to see if it can be improved further.
But in any case, using (a tool
On 2014-09-09T16:03:04, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I modified the ping RA to meet my needs, and then I used ocf-tester to check
it with the settings desired. I'm wondering about the output; shouldn't
ocf-tester query the metadata _before_ trying to use the methods,
On 2014-06-07T16:13:05, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote:
So I'd appreciate it if you'd not make those claims; I admit to feeling
slighted.
The claim that prompted this was that the level of support a CentOS user
gets is for pacemaker: 50% chance that the Lars over there will ask
On 2014-05-31T11:15:20, Dmitri Maziuk dmaz...@bmrb.wisc.edu wrote:
Is there a reason you keep spouting nonsense?
Yes: I have a memory and it remembers. For example, this:
http://www.gossamer-threads.com/lists/linuxha/users/81573?do=post_view_threaded#81573
I don't remember that being an
On 2014-06-05T07:41:18, Teerapatr Kittiratanachai maillist...@gmail.com wrote:
My /usr/lib/ocf/resource.d/heartbeat/ directory doesn't have the `IPv6addr`
agent.
But I found that the `IPaddr2` agent also supports IPv6, from the source
code in GitHub (IPaddr2
On 2014-06-02T20:37:59, Venkata G Thota venkt...@in.ibm.com wrote:
Hello,
In our project we had the heartbeat cluster with version
heartbeat-2.1.4-0.24.9.
Is it the supported version ?
Kindly assist how to get support for heartbeat cluster issues.
Regards
That looks like a fairly
On 2014-06-02T12:04:23, Digimer li...@alteeve.ca wrote:
You should email Linbit (http://linbit.com) as they're the company that
still supports the heartbeat package.
For completeness, I doubt Linbit will support this version, since 2.1.4
from SLES 10 contains a number of backports from the
On 2014-04-22T14:21:33, Tom Parker tpar...@cbnco.com wrote:
Hi Tom,
Has anyone seen this? Do you know what might be causing the flapping?
No, I've never seen this.
Apr 21 22:03:04 qaxen6 sbd: [12974]: info: Waiting to sign in with
cluster ...
So it connected fine. This is the process
On 2014-04-17T08:05:43, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I think Xen live migration and correct time has a lot to do with HA; maybe
not with the product you have in mind, but with the concept in general.
Sure. But Xen and kernel time keeping developers aren't subscribed to
On 2014-04-16T09:18:21, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
As it turns out, the time in the VMs is actually wrong after migration:
# ntpq -pn
remote refid st t when poll reach delay offset jitter
On 2014-03-18T02:24:51, Liuhua Wang lw...@suse.com wrote:
Hi Liuhua,
thanks for pushing again!
I've taken some time to provide some code review. Overall, I think it
looks good, mostly cosmetic and coding style.
I'd welcome more insight from others on this list; especially those with
maintainer
On 2014-03-14T15:50:18, David Vossel dvos...@redhat.com wrote:
in-flight operations always have to complete before we can process a new
transition. The only way we can transition earlier is by killing the
in-flight process, which results in failure recovery and possibly fencing
depending
On 2014-03-11T12:37:39, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I'm wondering: Does unlock mean that all file locks are invalidated? If so,
I think it's a bad idea, because for a migration of the NFS server the exports
will be stopped/started, thus losing all locks. That's not
On 2014-02-27T11:05:21, Digimer li...@alteeve.ca wrote:
So regardless of quorum, fencing is required. It is the only way to
reliably avoid split-brains. Unfortunately, fencing doesn't work on stretch
clusters.
For a two node stretch cluster, sbd can also be used reliably as a
fencing
On 2014-02-28T13:16:33, Digimer li...@alteeve.ca wrote:
Assuming a SAN in each location (otherwise you have a single point of
failure), then isn't it still possible to end up with a split-brain if/when
the WAN link fails?
As I suggested a 3rd tie-breaker site (which, in the case of SBD, can
On 2014-02-22T12:35:42, ml ml mliebher...@googlemail.com wrote:
Hello List,
i have a two node Cluster with Debian 7 and this configuation:
node proxy01-example.net
node proxy02-example.net
primitive login.example.net ocf:heartbeat:Xen \
params xmfile=/etc/xen/login.example.net.cfg \
On 2014-02-22T13:49:40, JR botem...@gmail.com wrote:
I've been told by folks on the linux-ha IRC that fencing is my answer
and I've put in place the null fence client. I understand that this is
not what I'd want in production, but for my testing it seems to be the
correct way to test a
On 2014-02-19T10:31:45, Andrew Beekhof and...@beekhof.net wrote:
Unifying this might be difficult, as far as I know pcs doesn't have an
interactive mode or anything similar to the configure interface of
crmsh..
It does have bash completion for the command line.
FWIW, so does the crm shell
On 2014-02-05T12:24:00, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I had a problem where O2CB stop fenced the node that was shut down:
I had updated the kernel, and then rebooted. As part of shutdown, the cluster
stack was stopped. In turn, the O2CB resource was stopped.
On 2014-02-05T15:06:47, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I guess the kernel update is more common than just the ocfs2-kmp update
Well, some customers do apply updates in the recommended way, and thus
don't encounter this ;-) In any case, since at this time the cluster
On 2014-01-30T12:19:27, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
root@vm-nas1:~# crm ra info fencing_vcenter stonith:external/vcenter
ERROR: stonith:external/vcenter:fencing_vcenter: could not parse meta-data:
I guess your RA may be LSB (which is kind of obsolete).
Hm? How can
On 2014-01-31T07:59:57, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
root@vm-nas1:~# crm ra info fencing_vcenter stonith:external/vcenter
ERROR: stonith:external/vcenter:fencing_vcenter: could not parse
meta-data:
I guess your RA may be LSB (which is kind of obsolete).
Hm?
On 2014-01-27T08:59:55, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Talking of node-action-limit: I think I read in the syslog (not the best way
to document changes) that the migration-limit parameter is obsoleted by
node-action-limit in the latest SLES. Is that correct?
No.
On 2014-01-24T10:52:56, Tom Parker tpar...@cbnco.com wrote:
Thanks Kristoffer.
How is tuning done for lrm now?
What do you want to tune?
The LRM_MAX_CHILDREN setting is still (okay: again ;-), that was broken
in one update) honored as before. Or you can use the node-action-limit
property in
On 2014-01-24T08:16:03, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
We have a server with a network traffic light on the front. With
corosync/pacemaker the light is constantly flickering, even if the cluster
does nothing.
So I guess it's normal.
Yes. Totem and other components have
On 2014-01-22T09:55:10, Thomas Schulte tho...@cupracer.de wrote:
Hi Thomas,
since those are very recent upstream versions, I think you'll have a
better chance to ask directly on the pacemaker mailing list, or directly
report via bugs.clusterlabs.org - at least for providing the
attachments,
On 2014-01-22T11:18:06, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
We are living in a very distributed world, even when using one Linux
distribution. Maybe those who know could post periodic reminders about
which problems to post where...
I thought that was what I just did? This is
On 2014-01-15T12:05:22, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Ulrich,
please either ask this question to support or at least on the ocfs2
mailing list.
We really can't provide enterprise-level support via a generic
mailing list. That is not a sustainable business model. And
On 2014-01-15T08:49:55, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I feel the current clusterstack for SLES11 SP3 has several problems. I'm
fighting for a day to get my test cluster up again after having installed the
latest updates. I still cannot find out what's going on, but I
On 2014-01-15T15:05:02, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
My man at Novell knows about the issue, too ;-)
%s/Novell/SUSE/g
I understand that Novell does not want to read about bugs in their products in
mailinglists, just as customers don't want to see bugs in the products
On 2014-01-09T22:38:17, erkin kabataş ekb...@gmail.com wrote:
I am using heartbeat-2.1.2-2.i386.rpm, heartbeat-pils-2.1.2-2.i386.
rpm, heartbeat-stonith-2.1.2-2.i386.rpm packages on RHEL 5.5.
2.1.2? Seriously, upgrade. You're running code from 2007.
I have 3 nodes and I only use cluster IP
On 2014-01-03T20:56:42, Digimer li...@alteeve.ca wrote:
causing a lot of reinvention of the wheel. In the last 5~6 years, both teams
have been working hard to unify under one common open-source HA stack.
Pacemaker + corosync v2+ is the result of all that hard work. :)
Yes. We know finally
On 2014-01-02T10:17:13, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Are you using the update that was pushed on Friday? The previous
As I installed updates on Monday, I guess so ;-)
Hm, we're not yet aware of any new or existing bugs in that update. Of
course, we'll learn
On 2013-12-13T10:16:41, Kristoffer Grönlund kgronl...@suse.com wrote:
Lars (lmb) suggested that we might switch to using the { } - brackets
around resource sets everywhere for consistency. My only concern with
that would be that it would be a breaking change to the previous crmsh
syntax.
On 2013-12-13T13:51:27, Andrey Groshev gre...@yandex.ru wrote:
Just thought that something like node=any was missing in location :)
Can you describe what this is supposed to achieve?
any is the default for symmetric clusters anyway.
Regards,
Lars
--
Architect Storage/HA
SUSE LINUX
On 2013-12-04T10:25:58, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
You thought it was working, but in fact it wasn't. ;-)
"working" meaning the resource started.
"not working" meaning the resource does not start.
You see I have minimal requirements ;-)
I'm sorry; we couldn't
On 2013-12-02T09:22:10, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
No!
Then it can't work. Exclusive activation only works for clustered volume
groups, since it uses the DLM to protect against the VG being activated
more than once in the cluster.
Hi!
Try it with
On 2013-11-29T12:05:28, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Hi!
A short notice: We had a problem after updating the resource agents in SLES11
SP2 HAE: A LVM VG would not start after updating the RAs. The primitive had
exclusive=true for years, but the current RA requires
On 2013-11-26T12:09:41, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I saw that I don't have an SBD device any more (it's stopped). Unfortunately I
could not start it (crm resource start prm_stonith_sbd).
I guess it's due to the fact that the cluster won't start resources until the
On 2013-11-29T13:46:17, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I just did s/true/false/...
Was that a clustered volume?
Clustered exclusive=true ??
No!
Then it can't work. Exclusive activation only works for clustered volume
groups, since it uses the DLM to protect
On 2013-11-29T13:48:33, Lars Marowsky-Bree l...@suse.com wrote:
Was that a clustered volume?
Clustered exclusive=true ??
No!
Then it can't work. Exclusive activation only works for clustered volume
groups, since it uses the DLM to protect against the VG being activated
more than once
On 2013-11-26T09:32:50, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Another thing I've noticed: One of our nodes has defective hardware and is
down. It was OK all the time with SLES11 SP2, but SP3 now tried to fence the
node and got a fencing timeout:
Hmmm.
Isn't the logic that
On 2013-11-25T17:48:25, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Hi Ulrich,
Probably reason:
cib: [12226]: ERROR: cib_perform_op: Discarding update with feature set
'3.0.7' greater than our own '3.0.6'
Is it required to update the whole cluster at once?
It shouldn't be, and
On 2013-11-15T09:05:53, Tom Parker tpar...@cbnco.com wrote:
The XL tools are much faster and lighter weight. I am not sure if they
report proper codes (I will have to test) but the XM stack has been
deprecated so at some point I assume it will go away completely.
The Xen RA already supports
On 2013-10-31T10:54:15, Chuck Smith cgasm...@comcast.net wrote:
I have been debugging ocf:heartbeat:anything, can someone point me to the
definitive standards for ra handshake, as it appears there are several
ported legacy methods that are inconsistent. Also, if you can point me to
the top
On 2013-10-15T14:15:50, Moullé Alain alain.mou...@bull.net wrote:
in fact, I would like to know if someone has configured gfs2 under Pacemaker
with the dlm-controld and gfs-controld from the cman-3.0.12 rpm (i.e. without
the dlm-controld.pcml and gfs-controld.pcml)?
And if it works
On 2013-10-15T16:25:37, Moullé Alain alain.mou...@bull.net wrote:
Hi Lars,
thanks a lot for information.
I'll try, but the documentation asks for the gfs2-cluster rpm installation, and
for now I can't find this rpm on RHEL6.4, and don't know
if it is still required ... but it is not in your
On 2013-10-02T09:36:14, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
In general I'm afraid you cannot handle this situation in a perfect way:
You have two types of problems:
1) A node, resource, or monitor is hanging, but a long timeout prevents
recognizing this in time
2) A
On 2013-10-02T13:40:16, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
There is one notable exception: If you have shared storage (SAN, NAS, NFS),
the cause of the slowness may be external to the systems being monitored,
thus fencing those will not improve the situation, most likely.
On 2013-10-01T00:53:15, Tom Parker tpar...@cbnco.com wrote:
Thanks for paying attention to this issue (not really a bug) as I am
sure I am not the only one with this issue. For now I have set all my
VMs to destroy so that the cluster is the only thing managing them but
this is not super
On 2013-09-24T20:55:40, AvatarSmith cgasm...@comcast.net wrote:
I'm having a bit of an issue under CentOS 6.4 x64. I have two duplicate
hardware systems (RAID arrays, 10G NICs, etc.) configured identically and
drbd replication is working fine in the cluster between the two. When I
started doing
On 2013-09-25T11:00:17, Chuck Smith cgasm...@comcast.net wrote:
do act accordingly, for instance, I have raw primitives, add them to a
group, then decide to move them to a different group (subject to load order)
I can't just remove it from the group and put it in a different one, I have
to
On 2013-09-16T16:36:38, Tom Parker tpar...@cbnco.com wrote:
Can you kindly file a bug report here so it doesn't get lost
https://github.com/ClusterLabs/resource-agents/issues ?
Submitted (Issue #308)
Thanks.
It definitely leads to data corruption and I think has to do with the
way that
On 2013-09-17T11:38:34, Ferenc Wagner wf...@niif.hu wrote:
On the other hand, doesn't the recover action after a monitor failure
consist of a stop action on the original host before the new start, just
to make sure? Or maybe I'm confusing things...
Yes, it would - but it seems there's a
On 2013-09-13T17:48:40, Tom Parker tpar...@cbnco.com wrote:
Hi Feri
I agree that it should be necessary but for some reason it works well
the way it is and everything starts in the correct order. Maybe
someone on the dev list can explain a little bit better why this is
working. It may
On 2013-09-14T00:28:30, Tom Parker tpar...@cbnco.com wrote:
Does anyone know of a good way to prevent pacemaker from declaring a vm
dead if it's rebooted from inside the vm. It seems to be detecting the
vm as stopped for the brief moment between shutting down and starting
up.
Hrm. Good
On 2013-09-12T18:14:04, marcy.d.cor...@wellsfargo.com wrote:
Hello list,
Using SUSE SLES 11 SP2.
I have 4 servers in a cluster running cLVM + OCFS2.
If I try to shut down the one that is the DC using openais stop, strange
things happen resulting in a really messed up cluster.
One
On 2013-09-09T17:22:14, Dejan Muhamedagic deja...@fastmail.fm wrote:
a) When pacemaker and all other commandline tool can live
nicely with multiple meta-attributes sections (it seems to
be allowed by the xml definition) and address all nvpairs
just by name beneath this tag, then crm
On 2013-09-04T08:26:14, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
In my experience network traffic grows somewhat linearly with the size
of the CIB. At some point you probably have to change communication
parameters to keep the cluster in a happy communication state.
Yes, I wish
On 2013-09-03T10:25:58, Digimer li...@alteeve.ca wrote:
I've run only 2-node clusters and I've not seen this problem. That said,
I've long-ago moved off of openais in favour of corosync. Given that
membership is handled there, I would look at openais as the source of your
trouble.
This is,
On 2013-09-03T13:04:52, Digimer li...@alteeve.ca wrote:
My mistake then. I had assumed that corosync was just a stripped down
openais, so I figured openais provided the same functions. My personal
experience with openais is limited to my early days of learning HA
clustering on EL5.
Yes and
On 2013-09-03T21:14:02, Vladislav Bogdanov bub...@hoster-ok.com wrote:
To solve problem 2, simply disable corosync/pacemaker from starting on
boot. This way, the fenced node will be (hopefully) back up and running,
so you can ssh into it and look at what happened. It won't try to rejoin
On 2013-08-29T15:49:30, Tom Parker tpar...@cbnco.com wrote:
Hello. Last night I updated my SLES 11 servers to HAE-SP3 which contains
the following versions of software:
Could you kindly file a report via NTS? That's the way to get official
and timely support for SLE HA. (I don't mean to cut
On 2013-08-28T20:13:43, Dejan Muhamedagic de...@suse.de wrote:
A new RC has been released today. It contains both fixes. It
doesn't do atomic updates anymore, because cibadmin or something
cannot stomach comments.
Couldn't find the upstream bug report :-( Can you give me the pacemaker
bugid,
On 2013-08-13T20:53:13, Andrew Beekhof and...@beekhof.net wrote:
I'd:
- Rename the provider to core
- Rework our own documentation and as we find it
- Transparently support references to ocf:heartbeat forever:
- Re-map s/heartbeat/core/ in the LRM (silently, or it'd get really
On 2013-07-12T11:05:32, Wengatz Herbert herbert.weng...@baaderbank.de wrote:
Seeing the high drop rate... (just compare this to the other NIC) - have
you tried a new cable? Maybe it's a cheap hardware problem...
The drop rate is normal. A slave NIC in a bonded active/passive
On 2013-07-12T12:19:40, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
BTW: The way resource restart is implemented (i.e.: stop, wait, then
start) has a major problem: If the stop causes the node where the crm
command is running to be fenced, the resource will remain stopped even after the
On 2013-07-12T12:26:18, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
(Another way to trigger a restart is to modify the instance parameters.
Set __manual_restart=1 and it'll restart.)
once? ;-)
Keep increasing it. ;-)
On 2013-07-11T08:41:33, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
For a really silly idea, but can you swap the network cards for a test?
Say, with Intel NICs, or even another Broadcom model?
Unfortunately no: The 4-way NIC is onboard, and all slots are full.
Too bad.
But then
On 2013-07-10T13:26:32, John M john332...@gmail.com wrote:
Current application supports only the Master/Slave configuration and
there can be one master and one slave process in a group.
A cluster can host multiple groups. You could, indeed, group your
systems into 3 or 5 node clusters, and
On 2013-07-10T08:31:17, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
I had reported about terrible performance of cLVM (maybe related to using
OCFS2 also) when used in SLES11 SP2. I guessed cLVM (or OCFS2) is
communicating itself to death on activity. Now I have some interesting news:
No,
On 2013-07-10T14:33:12, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
Network problems in hypervisors though also have a tendency to be, well,
due to the hypervisor, or some network cards (broadcom?).
Yes:
driver: bnx2
version: 2.1.11
firmware-version: bc 5.2.3 NCSI 2.0.12
For
On 2013-07-08T22:35:31, Digimer li...@alteeve.ca wrote:
As for multi-DC support, watch the booth project. It's supposed to bring
stretch clustering to corosync + pacemaker.
Stretch clustering is already possible and supported (depending on whom
you ask; it is on SLE HA) with corosync.
booth
On 2013-07-09T20:06:45, John M john332...@gmail.com wrote:
Now I want to know
1. Can I use a node which is part of another cluster as the quorum node?
2. Can I configure a standalone quorum node that can manage 25 clusters?
No. Using the quorum node approach, a node can only ever be part of
On 2013-07-09T23:11:01, John M john332...@gmail.com wrote:
So STONITH/fencing is the only option?
A quorum node is no alternative to fencing, anyway.
Regards,
Lars
On 2013-07-05T19:06:54, Vladislav Bogdanov bub...@hoster-ok.com wrote:
params #merge param1=value1 param2=value2
meta #replace ...
utilization #keep
and so on. With default to #replace?
Even more.
If we allow such meta lexems anywhere (not only at the very
beginning), then
On 2013-07-03T00:20:19, Vladislav Bogdanov bub...@hoster-ok.com wrote:
I do not edit them. In my setup I generate the full crm config with
template-based framework.
And then you do a load/replace? Tough; yes, that'll clearly overwrite
what is already there and added by scripts that more dynamically
On 2013-07-03T10:26:09, Dejan Muhamedagic deja...@fastmail.fm wrote:
Not sure that is expected by most people.
How do you then delete attributes?
Tough call :) Ideas welcome.
Set them to an empty string, or a magic #undef value.
It's not only for the nodes. Attributes of resources should be
On 2013-07-01T16:31:13, William Seligman selig...@nevis.columbia.edu wrote:
a) people can exclaim "You fool!" and point out all the stupid things I did
wrong;
b) sysadmins who are contemplating the switch to HA have additional points to
add to the pros and cons.
I think you bring up an
On 2013-07-02T11:05:01, Vladislav Bogdanov bub...@hoster-ok.com wrote:
One thing I see immediately, is that node utilization attributes are
deleted after I do 'load update' with empty node utilization sections.
That is probably not specific to this patch.
Yes, that isn't specific to that.
I
On 2013-07-02T13:14:48, Vladislav Bogdanov bub...@hoster-ok.com wrote:
Yes, that's exactly what you need here.
I know, but I do not expect that to be implemented soon.
crm_attribute -l reboot -z doesn't strike me as an unlikely request.
You could file an enhancement request for that.
But