ter for
mailing lists).
> Could you explain whether *heartbeat can monitor a virtual IP being alive or not*,
> please? Thanks a lot.
Pacemaker can do this just fine. It's one of the initial examples in
"Clusters from Scratch";
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/
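As a rough illustration (the resource name and address below are invented,
following the IPaddr2 example in that guide), a floating IP with a monitor
operation looks something like:

    # hypothetical floating IP; the monitor op is what keeps checking it is alive
    pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
        ip=192.168.122.120 cidr_netmask=24 \
        op monitor interval=30s

If the monitor finds the address missing, pacemaker restarts or relocates the
resource.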
thing else I should be looking for?
I *think* it fell behind (fence_ec2, iirc). It might need to be picked
up, updated/tested and then it can be re-added to the official list.
I'm not 100% on this though, so if someone contradicts me, ignore me.
--
Digimer
Papers and Projects: https://alteev
ve come up frustratingly blank. If
anyone can give me a clue, I would be very grateful. :)
digimer
1. https://bugzilla.redhat.com/show_bug.cgi?id=1349755
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
I say go for it. A key to good HA is
simplicity, and maintaining two branches (or getting stuck on a dead-end
branch) seems to go against that ethos.
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
On 04/08/16 11:44 PM, Andrei Borzenkov wrote:
> 05.08.2016 02:33, Digimer writes:
>> On 04/08/16 07:21 PM, Dan Swartzendruber wrote:
>>> On 2016-08-04 19:03, Digimer wrote:
>>>> On 04/08/16 06:56 PM, Dan Swartzendruber wrote:
>>>>> I'm setting up an
of the hosts.
>> All it is for is quorum. So, looking at fencing next. The primary
>
> I wonder what happens if the machine where the VM runs crashes (2 of 3 nodes
> down).
2 of 3 dead is loss of quorum. Surviving node stops offering cluster
services when it could have otherwise su
On 06/08/16 08:22 PM, Dan Swartzendruber wrote:
> On 2016-08-06 19:46, Digimer wrote:
>> On 06/08/16 07:33 PM, Dan Swartzendruber wrote:
>>>
>>> Okay, I almost have this all working. fence_ipmilan for the supermicro
>>> host. Had to specify lanplus for i
On 20/06/16 05:58 PM, Dimitri Maziuk wrote:
> On 06/20/2016 03:58 PM, Digimer wrote:
>
>> Then wouldn't it be a lot better to just run your services on both nodes
>> all the time and take HA out of the picture? Availability is predicated
>> on building the simplest syst
On 31/01/17 03:19 AM, Kristoffer Grönlund wrote:
> Digimer <li...@alteeve.ca> writes:
>
>> On 30/01/17 09:23 AM, Kristoffer Grönlund wrote:
>>> Hi everyone!
>>>
>>> The last time we had an HA summit was in 2015, and the intention then
>>> was
lier (change to Wed/Thu instead of Thu/Fri) to
>> make it easier for people traveling to/from the conference.
>
> Hi Chris,
>
> Sounds great! Happy to move it to September 6-7 if that works out
> better.
>
> Cheers,
> Kristoffer
I've updated the wiki to set the
informal sightseeing outing for the following
weekend, whether it be held Wed/Thu or Thu/Fri. The last few times I've
been to Europe, I afforded myself little to no time to see any sights. I
don't plan to rush out this time, and would love to have some friendly
company. :)
--
Digimer
Papers an
pensuse.org/repositories/network:/ha-clustering:/Stable/
>
> Archives of the tagged release:
>
> * https://github.com/ClusterLabs/crmsh/archive/3.0.0.tar.gz
> * https://github.com/ClusterLabs/crmsh/archive/3.0.0.zip
>
> As usual, a huge thank you to all contributors and users o
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty
On 31/01/17 03:19 AM, Kristoffer Grönlund wrote:
> Digimer <li...@alteeve.ca> writes:
>
>> On 30/01/17 09:23 AM, Kristoffer Grönlund wrote:
>>> Hi everyone!
>>>
>>> The last time we had an HA summit was in 2015, and the intention then
>>> was
there are, though, is here:
http://plan.alteeve.ca/index.php/HA_Cluster_Summit_2015
Please feel free to comment/edit as you wish. I can set up an account on
the wiki if you don't have one from last time (I only close it normally
to keep the spammers out).
digimer
On 07/02/17 12:47 AM, Gang He wrote:
>
/clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html#_ensure_resources_run_on_the_same_host
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certai
running. If you can pass both of these tests,
you will have simulated most all possible node failure modes (I say
'most' because it is impossible to think of everything :) ).
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convoluti
system
and applications are in a clean state when you take the snapshot, so
using the image is like recovering from sudden power loss. If data was
in cache but not flushed out, you could have corruption.
If you can't stop your VMs, I'd recommend using a backup application
inside the VM that kn
On 24/02/17 06:27 PM, Lentes, Bernd wrote:
>
>
> - On Feb 24, 2017, at 7:20 PM, Digimer li...@alteeve.ca wrote:
>
>
>>>
>>> I read
>>> https://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/cluster_activation.html
>>> .
>>
I curse in a mailing
> list, but ipmitool really frustrates me.
> Why can't I set access to this channel? I'm running the commands as root.
> It's ipmitool 1.8.15.
>
> Can someone help me configure IPMI so that I can use it from the other
> node to fence this node?
>
> Bi
Depends on your OS, but generally /var/log/messages. Also, please share
your full pacemaker config. Please only obfuscate passwords.
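If it helps, something like this (assuming a pcs-based install; crmsh has
equivalents, and the log path varies by distro) captures what is usually needed:

    # dump the full cluster configuration (obfuscate passwords before posting)
    pcs config > cluster-config.txt
    # or the raw CIB XML
    cibadmin --query > cib.xml
    # plus recent system logs from each node
    tail -n 1000 /var/log/messages > node1-messages.txt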
digimer
On 05/09/16 07:53 PM, Nurit Vilosny wrote:
> Hi Kristoffer,
> Thanks for the prompt answer.
> Result of kill -9 is a dead process. Restart is
oot@node1 ~]#
>
> Any help would be appreciated, I think there is something dumb that I'm
> missing.
>
> Thank you.
>
h PDUs
are called to open the circuits feeding the lost node, thus ensuring it
is off.
If for some reason both methods fail, pacemaker goes back to IPMI and
tries that again, then on to PDUs, ... and will loop until one of the
methods succeeds, leaving the cluster (intentionally) hung in the meantime.
e method with a
pair of switched PDUs as a backup fence method. This provides full
coverage and is generally a lot faster.
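A sketch of that layered setup (device names, addresses and credentials are all
invented, and fence_ipmilan option names vary slightly between fence-agents
versions):

    # level 1: IPMI, fast but powered by the node it is meant to kill
    pcs stonith create ipmi_node1 fence_ipmilan \
        ip=10.20.0.1 username=admin password=secret pcmk_host_list=node1
    # level 2: switched PDU, proves the outlets are actually open
    pcs stonith create pdu_node1 fence_apc_snmp \
        ip=10.20.0.10 port=1 pcmk_host_list=node1
    # try IPMI first, fall back to the PDU, loop until one succeeds
    pcs stonith level add 1 node1 ipmi_node1
    pcs stonith level add 2 node1 pdu_node1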
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
The starved node should be declared lost by corosync, the remaining
nodes reform and, if they're still quorate, the hung node should be
fenced. Recovery occurs and life goes on.
Unless you don't have fencing, then may $deity have mercy. ;)
--
Digimer
Papers and Projects: https://alteeve.ca/w
a lot of "tutorials" make when the author
doesn't understand the role of fencing.
In your case, pcs set up cman to use the fence_pcmk "passthrough" fence
agent, as it should. So when something went wrong, corosync detected it,
informed cman which then requested pacemaker to fe
in production, and many are using the
.102 drivers. So I have a feeling that it wasn't so much the upgrade
that made the difference, but instead the reinstall of the drivers.
I have no idea why this bug happened, but hopefully this might save
someone some grief in the future if they hit the same.
Can you share your current full configuration please?
If you're hitting errors, please also share the relevant log entries
from the nodes.
digimer
On 07/10/16 09:06 PM, Dayvidson Bezerra wrote:
> The company only uses Ubuntu, and do not want another distro in your
> environment.
At the very best, you lose your services. At worst, you corrupt your data. Why
risk that at all when fencing solves the problem perfectly fine?
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
On 04/10/16 07:09 PM, Israel Brewster wrote:
> On Oct 4, 2016, at 3:03 PM, Digimer <li...@alteeve.ca> wrote:
>>
>> On 04/10/16 06:50 PM, Israel Brewster wrote:
>>> On Oct 4, 2016, at 2:26 PM, Ken Gaillot <kgail...@redhat.com
>>> <mailto:kgail...@redhat.
On 04/10/16 07:50 PM, Israel Brewster wrote:
> On Oct 4, 2016, at 3:38 PM, Digimer <li...@alteeve.ca> wrote:
>>
>> On 04/10/16 07:09 PM, Israel Brewster wrote:
>>> On Oct 4, 2016, at 3:03 PM, Digimer <li...@alteeve.ca> wrote:
>>>>
>>>>
cts that the server is running fine and simply marks the server as
'started'. Is there no way to do something similar to go 'failed' ->
'started' without the 'disable' step?
I tried freezing the service, no luck. I also tried coalescing via
'-c', but that didn't help either.
Thanks!
--
Digimer
On 19/09/16 03:07 PM, Digimer wrote:
> On 19/09/16 02:39 PM, Digimer wrote:
>> On 19/09/16 02:30 PM, Jan Pokorný wrote:
>>> On 18/09/16 15:37 -0400, Digimer wrote:
>>>> If, for example, a server's definition file is corrupted while the
>>>> server
On 19/09/16 03:13 PM, Digimer wrote:
> On 19/09/16 03:07 PM, Digimer wrote:
>> On 19/09/16 02:39 PM, Digimer wrote:
>>> On 19/09/16 02:30 PM, Jan Pokorný wrote:
>>>> On 18/09/16 15:37 -0400, Digimer wrote:
>>>>> If, for example, a server's defini
On 19/09/16 02:30 PM, Jan Pokorný wrote:
> On 18/09/16 15:37 -0400, Digimer wrote:
>> If, for example, a server's definition file is corrupted while the
>> server is running, rgmanager will put the server into a 'failed' state.
>> That's fine and fair.
>
> Please,
reason to shutdown a node. What is your opinion on
> this? Can i just set the primitive monitor operation to disabled?
Monitoring is how you will detect that, for example, the IPMI cable
failed or was unplugged. I do not believe the node will get fenced when
a fence agent's monitor operation fails... At least not b
> Do you know if the latest version is stable?
>
> And which companies are using it?
>
>
>
> Thanks in advance,
>
> Ron
Short answer is "Corosync v2 + pacemaker 1.1.10+" (1.1.14+, ideally)
Long answer is here: https://alteeve.ca/w/History_of_HA_Clustering
--
On 09/10/16 11:58 PM, Andrei Borzenkov wrote:
> 10.10.2016 00:42, Eric Robinson writes:
>> Digimer, thanks for your thoughts. Booth is one of the solutions I
>> looked at, but I don't like it because it is complex and difficult to
>> implement
>
> HA is comp
The condition stopped as
> soon as the Linux server in question became reachable again.
>
> --
> Eric Robinson
A properly built mode=1 bond will only use one interface or the other,
not both, so it shouldn't cause a storm.
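For reference, a minimal active-backup bond on a RHEL-style system looks roughly
like this (interface names and addresses are placeholders):

    # /etc/sysconfig/network-scripts/ifcfg-bond0 (illustrative)
    DEVICE=bond0
    TYPE=Bond
    ONBOOT=yes
    BOOTPROTO=none
    IPADDR=10.10.1.1
    PREFIX=24
    # mode=1 is active-backup; only the primary carries traffic until it fails
    BONDING_OPTS="mode=1 miimon=100 primary=eth0"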
--
Digimer
Papers and Projects: https://alteeve.ca/w/
What
>   LV                                          VG             Attr   LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
>   volume-1b0ea468-37c8-4b47-a6fa-6cce65b068b5 cinder-volumes -wi--- 1.00g
>
>
> thank you very much!
On 05/12/16 10:32 PM, su liu wrote:
> Digimer, thank you very much!
>
> I do not need to have the data accessible on both nodes at once. I want
> to use the clvm+pacemaker+corosync in OpenStack Cinder.
I'm not sure what "cinder" is, so I don't know what it needs to work.
r 'lvscan'? You should see it on both nodes at the
same time as soon as it is created, *if* things are working properly. It
is possible, without stonith, that they are not.
Please configure and test stonith, and see if the problem remains. If it
does, tail the system logs on both nodes, creat
> Nice, and congratulations, Krig, for the logo escalation :)
>
> (Still looking forward to seeing the animated version...)
I couldn't get the designer to send me an updated version, but it doesn't
matter because I really do like this one. Thanks, Ken!
--
Digimer
Papers and Projects: https://al
16:35:31 b015 stonith-ng[2251]: warning: fence_scsi[20020]
> stderr: [ ]
> Mar 24 16:35:31 b015 stonith-ng[2251]: warning: fence_scsi[20020]
> stderr: [ Failed: keys cannot be same. You can not fence yourself. ]
> Mar 24 16:35:31 b015 stonith-ng[2251]: warning: fence_scsi[20020]
> stderr: [ ]
> Mar 24 16:35:31 b015 stonith-ng[2251]:
> The question is: how to avoid such behaviour?
>
> Thank you!
Please share your config along with the logs from the nodes that were
affected.
cheers,
digimer
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einste
is deprecated, so
please switch over to the new list
(http://lists.clusterlabs.org/mailman/listinfo/users).
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal tal
I think it is reasonable to expect corosync to handle this
properly. How hard would it be to make corosync resilient to this fault
case?
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einst
contraindications for using broadcast teams (or teamd
at all) under corosync v2?
Thanks!
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have lived and
configure everything for you. Do NOT
configure corosync directly; you need to configure it inside cman
itself. Reset corosync.conf back to defaults.
Reference:
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html
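To illustrate what "configure it inside cman" means, a cluster.conf fragment
along these lines (cluster and node names invented) hands all fencing decisions
off to pacemaker via fence_pcmk:

    <!-- illustrative /etc/cluster/cluster.conf fragment -->
    <cluster name="mycluster" config_version="1">
      <clusternodes>
        <clusternode name="node1" nodeid="1">
          <fence>
            <method name="pcmk-redirect">
              <device name="pcmk" port="node1"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="node2" nodeid="2">
          <fence>
            <method name="pcmk-redirect">
              <device name="pcmk" port="node2"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <fencedevices>
        <fencedevice name="pcmk" agent="fence_pcmk"/>
      </fencedevices>
    </cluster>

cman generates the corosync configuration from cluster.conf, which is why a
hand-edited corosync.conf is simply ignored.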
--
Digimer
Papers and Projects: ht
On 16/04/17 04:04 PM, Eric Robinson wrote:
>> -Original Message-
>> From: Digimer [mailto:li...@alteeve.ca]
>> Sent: Sunday, April 16, 2017 11:17 AM
>> To: Cluster Labs - All topics related to open-source clustering welcomed
>> <users@clusterlab
that's not his problem.
>
> Dima
Can you elaborate? I'm not following your point/concern here...
Availability is all about making sure customers/users can access their
services.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and conv
On 19/04/17 02:38 AM, Ulrich Windl wrote:
>>>> Digimer <li...@alteeve.ca> wrote on 18.04.2017 at 19:08 in message
> <26e49390-b384-b46e-4965-eba5bfe59...@alteeve.ca>:
>> On 18/04/17 11:07 AM, Lentes, Bernd wrote:
>>> Hi,
>>>
>>> i'm
it, and you can
pretend cman doesn't exist for all intents and purposes.
If that's not good enough, switch to EL7 where it's pure pacemaker and
corosync v2.
digimer
On 13/04/17 06:18 PM, neeraj ch wrote:
> I have dreaded that answer. Maybe I can fix vote quorum on corosync 1.4.
> Or maybe I can
Nothing should ever be more important than
availability, and availability is a product of simplicity. So in my
view, a 3-node cluster adds complexity that is avoidable, and so is
sub-optimal.
I'm happy to answer any questions you have on my comments/point of view
on this.
--
Digimer
Paper
On 18/04/17 03:47 AM, Ulrich Windl wrote:
>>>> Digimer <li...@alteeve.ca> wrote on 16.04.2017 at 20:17 in message
> <12cde13f-8bad-a2f1-6834-960ff3afc...@alteeve.ca>:
>> On 16/04/17 01:53 PM, Eric Robinson wrote:
>>> I was reading in "Clusters
On 18/04/17 10:00 AM, Digimer wrote:
> On 18/04/17 03:47 AM, Ulrich Windl wrote:
>>>>> Digimer <li...@alteeve.ca> wrote on 16.04.2017 at 20:17 in message
>> <12cde13f-8bad-a2f1-6834-960ff3afc...@alteeve.ca>:
>>> On 16/04/17 01:53 PM, Eric Robinso
VM to the letter. I don't know what could be
> the problem.
>
> would you suggest ways to troubleshoot it? Is it faulty/failing hardware?
>
> many thanks,
> L.
LVM or clustered LVM?
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in
, as one example. It's slow, and needs shared
storage, but a small box somewhere running a small tgtd or iscsid should
do the trick (note that I have never used SBD myself...).
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions
ut I am still lost...
> On Mon, Apr 17, 2017 at 1:45 PM, Dimitri Maziuk <dmaz...@bmrb.wisc.edu> wrote:
>
> On 04/17/2017 11:58 AM, Digimer wrote:
>
> > ... Unless I am misunderstanding, your comment is related to
> > s
ng code figure this one out?
That sounds like a use-case where a full HA cluster is overkill already.
In any case, it would be a tiny fraction of installs and would be
tangential to the 2v3+ node debate that this thread started with.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I a
it's just the cluster network interconnect that has
> failed.
>
> IMO SCSI fencing should never be used on a 2 node cluster for reasons
> you have already described very clearly.
>
> Chrissie
I was fairly generic on that term because I've seen (and even wrote
one!) where snmp was u
On 23/04/17 12:51 AM, Andrei Borzenkov wrote:
> 22.04.2017 23:33, Dmitri Maziuk writes:
>> On 4/22/2017 12:02 PM, Digimer wrote:
>>
>>> Having SBD properly configured is *massively* safer than no fencing at
>>> all. So for people where other fence methods are not
or me. That said, I have the same
reservation with IPMI itself. So to me, "proper" fencing requires a
backup, totally external, option like a pair of switched PDUs. Of
course, I'm more paranoid than most.
Having SBD properly configured is *massively* safer than no fencing at
all. So for
mmences.
>
> Is this delay a feature of the cpg_mcast_joined function? If I
> understand correctly (unlikely), it looks like cpg_mcast_joined is not
> completing because one of the nodes in the group is missing, but I
> haven't looked at that code closely yet. Is it advisable to have
' document and I will have
anyone interested comment before making it an official update.
Comments?
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have
On 07/03/17 05:09 AM, Jan Pokorný wrote:
> On 06/03/17 17:12 -0500, Digimer wrote:
>> The old FenceAgentAPI document on fedorahosted is gone now that fedora
>> hosted is closed. So I created a copy on the clusterlabs wiki:
>>
>> http://wiki.clusterlabs.org/wiki/FenceA
t someone *might* build unimportant clusters doesn't change anything,
really. One could ask "if the service isn't important, why go to the
hassle of building a cluster at all? It's just avoidable complication".
So, in closing, I still argue that if you need HA, you always need fencing.
--
Dig
On 18/04/17 08:50 PM, Dimitri Maziuk wrote:
> On 04/18/2017 07:05 PM, Digimer wrote:
>
>> Certainly, the people creating the software have to assume that a
>> split-brain is devastating. Same for people who teach others and people
>> who write documentation.
>
> s
1
> node2"? But how the fence device will combine the hostname with port
> (or plug)? I presume that node1 must somehow know that node2's plug is
> Centos2, otherwise It could reboot itself (?)
> Thank you.
The "plug" should match the name used by the hypervisor, not the
ecommended, but it is possible with creative use of filter =
[] in lvm.conf. I've not done it myself, mind you. As far as clvmd on
DRBD, to LVM, it's no different if the block device is a SAN LUN or
DRBD... It only cares that a changed block/inode on one side is the same
on the other.
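A sketch of the lvm.conf side of that, assuming the PV sits on /dev/drbd0 and
its backing disk is /dev/sdb (both device names invented):

    # /etc/lvm/lvm.conf (illustrative)
    devices {
        # accept the DRBD device, reject everything else, so LVM never sees
        # the same PV signature twice (once via drbd0, once via the raw sdb)
        filter = [ "a|^/dev/drbd0$|", "r|.*|" ]
    }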
--
Digime
election say 10 - 15 seconds before
> considering quorum loss?
>
> For reference, I am using pacemaker 1.14 with corosync.
>
>
> Thank you
You should be able to increase corosync's token timeout to do this.
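For example (the value is illustrative), in corosync.conf's totem section:

    totem {
        version: 2
        # milliseconds corosync waits for the token before declaring a node
        # lost; ~15s here rides out stalls of that length, at the cost of
        # slower detection of a genuinely dead node
        token: 15000
    }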
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, l
ice that can be taken offline by a bad firmware update, user error,
etc. DRBD gives you full mechanical and electrical replication of the
data and has survived some serious in-the-field faults in our Anvil!
system (including a case where three drives were ejected at the same
time from the node hostin
ion system, it
> is probably better to use that (or at least use it at one level).
You are absolutely correct. However, OP asked about DRBD vs SAN, not
DRBD/SAN versus backup.
Proper continuity planning requires redundancy (DRBD + clustering),
backup and DR as three separate components.
--
Digimer
Pape
er.sh'
script (and crm-unfence-peer.sh for after resync).
This is the only way to avoid split-brains.
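In the DRBD resource file that typically looks like the following (the paths
assume the helper scripts shipped with the DRBD utilities):

    resource r0 {
        disk {
            fencing resource-and-stonith;
        }
        handlers {
            # freeze I/O and fence the peer through pacemaker on connection loss
            fence-peer          "/usr/lib/drbd/crm-fence-peer.sh";
            # drop the constraint once the peer has finished resyncing
            after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
    }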
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the near certainty that people of equal talent
have li
on). With DRBD 9, you can set it up to momentarily do
dual-primary to support live migration, though I have not used this
myself yet.
With dual-primary, you need to be sure a few things are in place (ie:
proper fencing, but you need that anyway, a cluster resource manager, etc).
--
Digimer
Pap
out warning.
* Failed backplanes causing multiple disks to be lost.
* User error destroying RAID arrays.
* Bad components used during upgrades causing a node to be offline until
a new part is delivered
Etc.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interest
understand how that would work... The goal of clvmd is to ensure
changes to the VG (on a shared PV, like a LUN or DRBD) happen on all
nodes at the same time.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein
d LVM to manage the DRBD space, creating per-VM LVs, and
then use the resource manager to manage the servers. This keeps the LVM
data in sync and avoids the cost of cluster locking.
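Roughly speaking (the VG and LV names below are invented), each VM gets its own
LV carved out of the clustered VG that sits on DRBD:

    # one logical volume per virtual machine on the shared volume group
    lvcreate -L 50G -n srv01_disk0 drbd_vg0
    # the VM definition then points at /dev/drbd_vg0/srv01_disk0 as its disk,
    # and the resource manager decides which node runs the VM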
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convoluti
>
>> If you want to have a shared FS, yes. If you want to back VMs though, we
>> use clustered LVM to manage the DRBD space, creating per-VM LVs, and
>> then use the resource manager to manage the servers. This keeps the LVM
>> data in sync and avoids the cost of cluster locking.
&g
to 'fencing resource-and-stonith'? If so, then the only way
to get a split-brain is if something is configured wrong in pacemaker or
if something caused crm-fence-peer.sh to report success when it didn't
actually succeed...
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, l
respond to the list, not
developers-ow...@clusterlabs.org.
digimer
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of Einstein’s brain than in the near certainty that people of equal talent have lived and
>
>
> Thanks in advance
>
> Cristiano
So to confirm, they have locking_type = 3, but vgdisplay does not show
clustered? I've not tried this myself in ages, but yes, 'vgchange -cy
...' should do the trick. It's possible to have
fallback_to_local_locking = 1 and a mix of clustered
ogin and change your password if you
> have an account on one of these sites.
>
> [1] https://letsencrypt.org/
> [2] https://wiki.clusterlabs.org/
> [3] https://bugs.clusterlabs.org/
More security is more better!
Thanks, Ken!
--
Digimer
Papers and Projects: https://al
d
argue. If you want to keep the services in VMs, that's fine, get a pair
of nodes and make them an HA cluster to protect the VMs as the services
(we do this all the time).
With that, then you pair IPMI and switched PDUs for complete coverage
(IPMI alone isn't enough, because if the host is destroye
, of
course, to all of you for the years of advice, banter and debate. I
still have very much to learn!
Now, time to start working full time on version 3!
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain th
der and harder to keep things stable as the number of nodes
grows. There is a lot of coordination that has to happen between the
nodes and it gets ever more complex.
Generally speaking, you don't want large clusters. It is always advised
to break things up into separate smaller clusters whenever possible.
--
large.
Again, there is no hard-coded limit here, just what is practical. Can I
ask how large of a cluster you are planning to build, and what it will
be used for?
Note also: this is not related to Pacemaker Remote. You can have very
large counts of remote nodes.
digimer
On 2017-07-05 11:27 PM
global {
    locking_type = 3
    fallback_to_clustered_locking = 1
    fallback_to_local_locking = 0
}
This assumes you are not trying to use LVM and clustered LVM at the
same time. If you are, you probably don't want to. If you do anyway,
don't set the fallback variables.
With this, you then sta
IPMI is common on most servers, so fence_ipmilan is
quite common. Switched PDUs from APC are also popular, and they use
fence_apc_snmp, etc.
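It is worth testing an agent by hand before wiring it into the cluster.
Something like this (address and credentials are placeholders, and option
letters can differ between fence-agents versions):

    # from one node, query the power state of the peer's IPMI/BMC
    fence_ipmilan -a 10.20.0.2 -l admin -p secret -P -o status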
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein’s brain than in the
Hi all,
Pardon the off-topic post. I'll be attending Open Source Summit in
Tokyo
(http://events.linuxfoundation.org/events/open-source-summit-japan) in a
couple of weeks.
If anyone else from the cluster world will be attending, maybe we can
meet up for beer/sake/coffee. :)
--
Digimer
On 19/06/17 11:40 PM, Andrei Borzenkov wrote:
> 20.06.2017 02:15, Digimer writes:
>> On 19/06/17 06:59 PM, Ferenc Wágner wrote:
>>> Digimer <li...@alteeve.ca> writes:
>>>
>>>> So we have a tool that watches for changes to clvmd by running
>>>
So we have a tool that watches for changes to clvmd by running
pvscan/vgscan/lvscan, but this seems to be expensive and occasionally
causes trouble. Is there any other way to be notified or to check when
something changes?
cheers
--
Digimer
Papers and Projects: https://alteeve.com/w/
"
On 19/06/17 06:59 PM, Ferenc Wágner wrote:
> Digimer <li...@alteeve.ca> writes:
>
>> So we have a tool that watches for changes to clvmd by running
>> pvscan/vgscan/lvscan, but this seems to be expensive and occasionally
>> causes trouble.
>
> Wha
On 26/05/17 02:05 AM, Ulrich Windl wrote:
> PLEASE learn how to use the subject in E-mail messages!
Christopher explained that the email was sent early by accident.
digimer
>>>> Christopher Pax <ops...@gmail.com> wrote on 24.05.2017 at 22:36 in
>>>> message
>
ats not really an option.
>
> --
> Andrew W. Kerber
fence_virsh -a <hypervisor address> -l root -p <root password> \
    -n <domain name> -o status
That should show the status. To reboot, change 'status' to 'reboot'.
If this doesn't work, make sure you can ssh from the nodes to the
hypervisor as the root user.
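A quick way to check that (the hostname is a placeholder):

    # from each cluster node, confirm passwordless root ssh to the hypervisor works
    ssh root@hypervisor.example.com 'virsh list --all'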
--
Digimer
Papers and Proje
ni-regensburg.de]
>> Sent: Thursday, June 1, 2017 8:34 AM
>> To: users@clusterlabs.org
>> Subject: [ClusterLabs] Antw: Re: Antw: clearing failed actions
>>
>>>>> Digimer <li...@alteeve.ca> wrote on 01.06.2017 at 00:03 in message
>> <50aad2
learning
HA on SUSE/RHEL and then, after you know what config works for you,
migrate to the target OS. That way you have only one set of variables at
a time.
Also, use fencing. Seriously, just do it.
--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested