Re: [ClusterLabs] Fwd: Multi cluster
[addendum inline]

On 04/08/17 18:35 +0200, Jan Pokorný wrote:
> On 03/08/17 20:37 +0530, sharafraz khan wrote:
>> I am new to clustering, so please ignore it if my question sounds
>> silly. I have a requirement wherein I need to create a cluster for an
>> ERP application with apache and VIP components. Below is the scenario.
>>
>> We have 5 sites:
>> 1. DC
>> 2. Site A
>> 3. Site B
>> 4. Site C
>> 5. Site D
>>
>> We need to configure HA such that the DC is the primary node hosting
>> the application, accessed by all the users at each site. In case of
>> failure of the DC node, users at a site should automatically be
>> switched to their local ERP server, and not to the nodes at other
>> sites, so communication would be as below:
>>
>> DC <--> Site A
>> DC <--> Site B
>> DC <--> Site C
>> DC <--> Site D
>>
>> Now the challenge is:
>>
>> 1. If I create a cluster between, say, DC <--> Site A, it won't allow
>> me to create another cluster on DC with the other sites.
>>
>> 2. If I set up all the nodes in a single cluster, how can I ensure
>> that, in case of node failure or loss of connectivity to the DC node
>> from any site, users from that site are switched to the local ERP node
>> and not to nodes at other sites?
>>
>> An urgent response and help would be much appreciated.
>
> From your description, I suppose you are limited to just a single
> machine per site/DC (making the overall picture prone to double
> fault: first the DC goes down, then one of the sites goes down, and
> then at least the clients of that very site encounter the downtime).
> Otherwise I'd suggest looking at the booth project, which facilitates
> inter-cluster (back to your "multi cluster") decisions, extending
> upon pacemaker performing the intra-cluster ones.
>
> Using a single-cluster approach, you should certainly be able to
> model your fallback scenario, something like:
>
> - define a group A (VIP, apache, app), infinity-located with DC
> - define a different group B with the same content, set up as clone
>   B_clone being (-infinity)-located with DC
> - set up ordering "B_clone starts when A stops", of "Mandatory" kind
>
> Further tweaks may be needed.

Hmm, actually the VIP would not help much here, even if its "ip" were
adapted per host ("#uname"), as there are two conflicting principles
("globality" of the network when serving from the DC vs. locality when
serving from particular sites _in parallel_). Something more
sophisticated would likely be needed.

-- 
Poki
Re: [ClusterLabs] Fwd: Multi cluster
On Fri, 2017-08-04 at 18:35 +0200, Jan Pokorný wrote:
> On 03/08/17 20:37 +0530, sharafraz khan wrote:
> > I am new to clustering, so please ignore it if my question sounds
> > silly. I have a requirement wherein I need to create a cluster for an
> > ERP application with apache and VIP components. Below is the scenario.
> >
> > We have 5 sites:
> > 1. DC
> > 2. Site A
> > 3. Site B
> > 4. Site C
> > 5. Site D
> >
> > We need to configure HA such that the DC is the primary node hosting
> > the application, accessed by all the users at each site. In case of
> > failure of the DC node, users at a site should automatically be
> > switched to their local ERP server, and not to the nodes at other
> > sites, so communication would be as below:
> >
> > DC <--> Site A
> > DC <--> Site B
> > DC <--> Site C
> > DC <--> Site D
> >
> > Now the challenge is:
> >
> > 1. If I create a cluster between, say, DC <--> Site A, it won't allow
> > me to create another cluster on DC with the other sites.

Right, your choices (when using corosync+pacemaker) are one big cluster
with all sites (including the data center), or an independent cluster at
each site connected by booth.

It sounds like your secondary sites don't have any communication between
each other, only to the DC, so that suggests that the "one big cluster"
approach won't work. For more details on pacemaker+booth, see:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm139900093104976

> > 2. If I set up all the nodes in a single cluster, how can I ensure
> > that, in case of node failure or loss of connectivity to the DC node
> > from any site, users from that site are switched to the local ERP
> > node and not to nodes at other sites?

The details depend on the particular service. Unfortunately I don't have
any experience with ERP; maybe someone else can jump in with tips.

How do users contact the ERP node? Via an IP address, or a list of IP
addresses that will be tried in order, or some other way?

Is the ERP service itself managed by the cluster? If so, what resource
agent are you using? Does the agent support cloning or master/slave
operation?

> > An urgent response and help would be much appreciated.
>
> From your description, I suppose you are limited to just a single
> machine per site/DC (making the overall picture prone to double
> fault: first the DC goes down, then one of the sites goes down, and
> then at least the clients of that very site encounter the downtime).
> Otherwise I'd suggest looking at the booth project, which facilitates
> inter-cluster (back to your "multi cluster") decisions, extending
> upon pacemaker performing the intra-cluster ones.
>
> Using a single-cluster approach, you should certainly be able to
> model your fallback scenario, something like:
>
> - define a group A (VIP, apache, app), infinity-located with DC
> - define a different group B with the same content, set up as clone
>   B_clone being (-infinity)-located with DC
> - set up ordering "B_clone starts when A stops", of "Mandatory" kind
>
> Further tweaks may be needed.

-- 
Ken Gaillot
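For readers curious what the booth route looks like in practice, here is
a minimal sketch of a booth configuration with one ticket shared between
two sites plus an arbitrator; the IP addresses, ticket name, and resource
name are placeholders invented for illustration, not taken from this
thread:

    # /etc/booth/booth.conf (same file on both sites and the arbitrator)
    transport  = UDP
    port       = 9929
    arbitrator = 192.168.100.99
    site       = 192.168.101.1
    site       = 192.168.102.1
    ticket     = "ticket-erp"

    # On each pacemaker cluster, tie the resource (or group) to the
    # ticket, e.g. with crm shell:
    crm configure rsc_ticket erp-with-ticket ticket-erp: grp_erp loss-policy=stop

The cluster currently holding "ticket-erp" is the only one allowed to run
grp_erp; if that site fails, booth grants the ticket to another site
(subject to the arbitrator's vote) and the resource starts there.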
Re: [ClusterLabs] big trouble with a DRBD resource
On Fri, 2017-08-04 at 18:20 +0200, Lentes, Bernd wrote:
> Hi,
>
> First: is there a tutorial or something else which helps in
> understanding what pacemaker logs in syslog and
> /var/log/cluster/corosync.log?
> I try hard to find out what's going wrong, but the logs are difficult
> to understand, also because of the amount of information.
> Or should I deal more with "crm history" or hb_report?

Unfortunately no -- logging, and troubleshooting in general, is an area
we are continually striving to improve, but there are more to-do's than
time to do them.

> What happened:
> I tried to configure a simple drbd resource following
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html#idm140457860751296
> I used this simple snippet from the doc:
>
> configure primitive WebData ocf:linbit:drbd params drbd_resource=wwwdata \
>     op monitor interval=60s
>
> I did it on the live cluster, which is in testing currently. I will
> never do this again. The shadow CIB will be my friend.

lol, yep

> The cluster reacted promptly:
>
> crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params drbd_resource=idcc-devel \
>    > op monitor interval=60
> WARNING: prim_drbd_idcc_devel: default timeout 20s for start is smaller than the advised 240
> WARNING: prim_drbd_idcc_devel: default timeout 20s for stop is smaller than the advised 100
> WARNING: prim_drbd_idcc_devel: action monitor not advertised in meta-data, it may not be supported by the RA
>
> From what I understand so far, I didn't configure start/stop
> operations, so the cluster chooses the default from
> default-action-timeout. It didn't configure the monitor operation,
> because this is not in the meta-data.
>
> I checked it:
>
> crm(live)# ra info ocf:linbit:drbd
> Manages a DRBD device as a Master/Slave resource (ocf:linbit:drbd)
>
> Operations' defaults (advisory minimum):
>
>     start          timeout=240
>     promote        timeout=90
>     demote         timeout=90
>     notify         timeout=90
>     stop           timeout=100
>     monitor_Slave  timeout=20 interval=20
>     monitor_Master timeout=20 interval=10
>
> OK. I have to configure monitor_Slave and monitor_Master.
>
> The log says:
>
> Aug 1 14:19:33 ha-idg-1 drbd(prim_drbd_idcc_devel)[11325]: ERROR: meta parameter misconfigured, expected clone-max -le 2, but found unset.
> Aug 1 14:19:33 ha-idg-1 crmd[4692]: notice: process_lrm_event: Operation prim_drbd_idcc_devel_monitor_0: not configured (node=ha-idg-1, call=73, rc=6, cib-update=37, confirmed=true)
> Aug 1 14:19:33 ha-idg-1 crmd[4692]: notice: process_lrm_event: Operation prim_drbd_idcc_devel_stop_0: not configured (node=ha-idg-1, call=74, rc=6, cib-update=38, confirmed=true)
>
> Why is it complaining about missing clone-max? This is a meta attribute
> for a clone, but not for a simple resource!? This message is constantly
> repeated; it still appears although the cluster has been in standby for
> three days.

The "ERROR" message is coming from the DRBD resource agent itself, not
pacemaker. Between that message and the two separate monitor operations,
it looks like the agent will only run as a master/slave clone.

> And why does it complain that stop is not configured?

A confusing error message. It's not complaining that the operations are
not configured, it's saying the operations failed because the resource is
not properly configured. What "properly configured" means is up to the
individual resource agent.

> Isn't that configured with the default of 20 sec.? That's what crm
> said. See above.
> This message is also repeated nearly 7000 times in 9 minutes.
> If the stop op is not configured and the cluster complains about it,
> why does it not complain about an unconfigured start op?
> That the missing monitor is complained about is clear.
>
> The DC says:
>
> Aug 1 14:19:33 ha-idg-2 pengine[27043]: warning: unpack_rsc_op_failure: Processing failed op stop for prim_drbd_idcc_devel on ha-idg-1: not configured (6)
> Aug 1 14:19:33 ha-idg-2 pengine[27043]: error: unpack_rsc_op: Preventing prim_drbd_idcc_devel from re-starting anywhere: operation stop failed 'not configured' (6)
>
> Again complaining about a failed stop, saying it's not configured. Or
> does it complain that the failure of a stop op is not configured?

Again, it's confusing, but you have various logs of the same event coming
from three different places. First, DRBD logged that there is a "meta
parameter misconfigured". It then reported that error value back to the
crmd cluster daemon that called it, so the crmd logged the error as well,
that the result of the operation was "not configured". Then (above), when
the policy engine reads the current status of the cluster, it sees that
there is a failed operation, so it decides what to do about the failure.
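To illustrate the master/slave requirement Ken describes, the DRBD
primitive is normally wrapped in an ms resource. A minimal crm-shell
sketch follows; the timeouts are taken from the advisory minimums quoted
above, but the rest is an untested illustration rather than a verified
configuration:

    crm configure primitive prim_drbd_idcc_devel ocf:linbit:drbd \
        params drbd_resource=idcc-devel \
        op start timeout=240 \
        op stop timeout=100 \
        op monitor role=Master interval=10 timeout=20 \
        op monitor role=Slave interval=20 timeout=20
    crm configure ms ms_drbd_idcc_devel prim_drbd_idcc_devel \
        meta master-max=1 master-node-max=1 clone-max=2 \
        clone-node-max=1 notify=true

With clone-max=2 set on the ms resource, the agent's "expected clone-max
-le 2" check is satisfied, and the two monitor operations cover both the
Master and Slave roles.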
Re: [ClusterLabs] Notification agent and Notification recipients
On Thu, 2017-08-03 at 12:31 +0530, Sriram wrote:
> Hi Team,
>
> We have a four node cluster (1 active : 3 standby) in our lab for a
> particular service. If the active node goes down, one of the three
> standby nodes becomes active. Now there will be (1 active : 2 standby :
> 1 offline).
>
> Is there any way the newly elected node can send a notification to
> the remaining 2 standby nodes about its new status?

Hi Sriram,

This depends on how your service is configured in the cluster.

If you have a clone or master/slave resource, then clone notifications
are probably what you want (not alerts, which is the path you were going
down -- alerts are designed to e.g. email a system administrator after an
important event).

For details about clone notifications, see:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_clone_resource_agent_requirements

The RA must support the "notify" action, which will be called when a
clone instance is started or stopped. See the similar section later for
master/slave resources for additional information. See the mysql or pgsql
resource agents for examples of notify implementations.

> I was exploring the "notification agent" and "notification recipient"
> features, but that doesn't seem to work. /etc/sysconfig/notify.sh
> doesn't get invoked even on the newly elected active node.

Yep, that's something different altogether -- it's only enabled on RHEL
systems, and solely for backward compatibility with an early
implementation of the alerts interface. The new alerts interface is more
flexible, but it's not designed to send information between cluster
nodes -- it's designed to send information to something external to the
cluster, such as a human, or an SNMP server, or a monitoring system.

> Cluster Properties:
>  cluster-infrastructure: corosync
>  dc-version: 1.1.17-e2e6cdce80
>  default-action-timeout: 240
>  have-watchdog: false
>  no-quorum-policy: ignore
>  notification-agent: /etc/sysconfig/notify.sh
>  notification-recipient: /var/log/notify.log
>  placement-strategy: balanced
>  stonith-enabled: false
>  symmetric-cluster: false
>
> I am using the following versions of pacemaker and corosync:
>
> /usr/sbin # ./pacemakerd --version
> Pacemaker 1.1.17
> Written by Andrew Beekhof
> /usr/sbin # ./corosync -v
> Corosync Cluster Engine, version '2.3.5'
> Copyright (c) 2006-2009 Red Hat, Inc.
>
> Can you please suggest if I am doing anything wrong, or if there are
> any other mechanisms to achieve this?
>
> Regards,
> Sriram.

-- 
Ken Gaillot
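As a rough idea of what a notify implementation looks like inside a
resource agent, the fragment below reacts to a post-promote
notification. The environment variable names follow the clone
notification variables documented in Pacemaker Explained, but the
surrounding agent and the peer-update helper are hypothetical
placeholders, not code from any shipped agent:

    notify() {
        # Assumes the OCF shell functions have been sourced by the agent.
        local type_op="${OCF_RESKEY_CRM_meta_notify_type}-${OCF_RESKEY_CRM_meta_notify_operation}"
        case "$type_op" in
            post-promote)
                # A peer instance was just promoted; its node name is in
                # the promote_uname list. Tell the local service about it.
                new_master="$OCF_RESKEY_CRM_meta_notify_promote_uname"
                /usr/local/bin/update-standby-config --master "$new_master"  # hypothetical helper
                ;;
        esac
        return $OCF_SUCCESS
    }

The agent must also advertise the notify action in its meta-data, and the
clone must be configured with notify=true for these calls to happen.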
Re: [ClusterLabs] Fwd: Multi cluster
On 03/08/17 20:37 +0530, sharafraz khan wrote:
> I am new to clustering, so please ignore it if my question sounds
> silly. I have a requirement wherein I need to create a cluster for an
> ERP application with apache and VIP components. Below is the scenario.
>
> We have 5 sites:
> 1. DC
> 2. Site A
> 3. Site B
> 4. Site C
> 5. Site D
>
> We need to configure HA such that the DC is the primary node hosting
> the application, accessed by all the users at each site. In case of
> failure of the DC node, users at a site should automatically be
> switched to their local ERP server, and not to the nodes at other
> sites, so communication would be as below:
>
> DC <--> Site A
> DC <--> Site B
> DC <--> Site C
> DC <--> Site D
>
> Now the challenge is:
>
> 1. If I create a cluster between, say, DC <--> Site A, it won't allow
> me to create another cluster on DC with the other sites.
>
> 2. If I set up all the nodes in a single cluster, how can I ensure
> that, in case of node failure or loss of connectivity to the DC node
> from any site, users from that site are switched to the local ERP node
> and not to nodes at other sites?
>
> An urgent response and help would be much appreciated.

From your description, I suppose you are limited to just a single
machine per site/DC (making the overall picture prone to double
fault: first the DC goes down, then one of the sites goes down, and then
at least the clients of that very site encounter the downtime).
Otherwise I'd suggest looking at the booth project, which facilitates
inter-cluster (back to your "multi cluster") decisions, extending
upon pacemaker performing the intra-cluster ones.

Using a single-cluster approach, you should certainly be able to
model your fallback scenario, something like:

- define a group A (VIP, apache, app), infinity-located with DC
- define a different group B with the same content, set up as clone
  B_clone being (-infinity)-located with DC
- set up ordering "B_clone starts when A stops", of "Mandatory" kind

Further tweaks may be needed.

-- 
Poki
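As a concrete illustration of that sketch, the crm-shell configuration
below models group A pinned to the DC and a clone B_clone kept off the
DC, ordered so the clone only starts once A has stopped. All node,
resource, and score names are placeholders, and the exact ordering syntax
should be verified against the installed crmsh version:

    crm configure group grp_A vip_dc apache_dc erp_dc
    crm configure location loc_A_on_dc grp_A inf: dc-node
    crm configure group grp_B apache_local erp_local
    crm configure clone B_clone grp_B
    crm configure location loc_B_not_dc B_clone -inf: dc-node
    crm configure order ord_B_after_A Mandatory: grp_A:stop B_clone:start

As Jan notes in his follow-up, the VIP piece is the weak point of this
model, because a single global address cannot simultaneously serve
per-site local instances.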
[ClusterLabs] big trouble with a DRBD resource
Hi,

First: is there a tutorial or something else which helps in understanding
what pacemaker logs in syslog and /var/log/cluster/corosync.log?
I try hard to find out what's going wrong, but the logs are difficult to
understand, also because of the amount of information.
Or should I deal more with "crm history" or hb_report?

What happened:
I tried to configure a simple drbd resource following
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html-single/Clusters_from_Scratch/index.html#idm140457860751296

I used this simple snippet from the doc:

configure primitive WebData ocf:linbit:drbd params drbd_resource=wwwdata \
    op monitor interval=60s

I did it on the live cluster, which is in testing currently. I will never
do this again. The shadow CIB will be my friend.

The cluster reacted promptly:

crm(live)# configure primitive prim_drbd_idcc_devel ocf:linbit:drbd params drbd_resource=idcc-devel \
   > op monitor interval=60
WARNING: prim_drbd_idcc_devel: default timeout 20s for start is smaller than the advised 240
WARNING: prim_drbd_idcc_devel: default timeout 20s for stop is smaller than the advised 100
WARNING: prim_drbd_idcc_devel: action monitor not advertised in meta-data, it may not be supported by the RA

From what I understand so far, I didn't configure start/stop operations,
so the cluster chooses the default from default-action-timeout. It didn't
configure the monitor operation, because this is not in the meta-data.

I checked it:

crm(live)# ra info ocf:linbit:drbd
Manages a DRBD device as a Master/Slave resource (ocf:linbit:drbd)

Operations' defaults (advisory minimum):

    start          timeout=240
    promote        timeout=90
    demote         timeout=90
    notify         timeout=90
    stop           timeout=100
    monitor_Slave  timeout=20 interval=20
    monitor_Master timeout=20 interval=10

OK. I have to configure monitor_Slave and monitor_Master.

The log says:

Aug 1 14:19:33 ha-idg-1 drbd(prim_drbd_idcc_devel)[11325]: ERROR: meta parameter misconfigured, expected clone-max -le 2, but found unset.
Aug 1 14:19:33 ha-idg-1 crmd[4692]: notice: process_lrm_event: Operation prim_drbd_idcc_devel_monitor_0: not configured (node=ha-idg-1, call=73, rc=6, cib-update=37, confirmed=true)
Aug 1 14:19:33 ha-idg-1 crmd[4692]: notice: process_lrm_event: Operation prim_drbd_idcc_devel_stop_0: not configured (node=ha-idg-1, call=74, rc=6, cib-update=38, confirmed=true)

Why is it complaining about missing clone-max? This is a meta attribute
for a clone, but not for a simple resource!? This message is constantly
repeated; it still appears although the cluster has been in standby for
three days.

And why does it complain that stop is not configured? Isn't that
configured with the default of 20 sec.? That's what crm said. See above.
This message is also repeated nearly 7000 times in 9 minutes.
If the stop op is not configured and the cluster complains about it, why
does it not complain about an unconfigured start op?
That the missing monitor is complained about is clear.

The DC says:

Aug 1 14:19:33 ha-idg-2 pengine[27043]: warning: unpack_rsc_op_failure: Processing failed op stop for prim_drbd_idcc_devel on ha-idg-1: not configured (6)
Aug 1 14:19:33 ha-idg-2 pengine[27043]: error: unpack_rsc_op: Preventing prim_drbd_idcc_devel from re-starting anywhere: operation stop failed 'not configured' (6)

Again complaining about a failed stop, saying it's not configured. Or
does it complain that the failure of a stop op is not configured?

The doc says: "Some operations are generated by the cluster itself, for
example, stopping and starting resources as needed."
http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_resource_operations.html

Is the doc wrong? What happens when I DON'T configure start/stop
operations? Are they configured automatically? I have several primitives
without a configured start/stop operation, but never had any problems
with them.

The failcount goes straight to INFINITY:

Aug 1 14:19:33 ha-idg-1 attrd[4690]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-prim_drbd_idcc_devel (INFINITY)
Aug 1 14:19:33 ha-idg-1 attrd[4690]: notice: attrd_perform_update: Sent update 8: fail-count-prim_drbd_idcc_devel=INFINITY

After exactly 9 minutes the complaints about the not-configured stop
operation stopped; the complaints about missing clone-max still appear,
although both nodes are in standby now. The fail-count is 1 million:

Aug 1 14:28:33 ha-idg-1 attrd[4690]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-prim_drbd_idcc_devel (1000000)
Aug 1 14:28:33 ha-idg-1 attrd[4690]: notice: attrd_perform_update: Sent update 7076: fail-count-prim_drbd_idcc_devel=1000000

and a complaint about the monitor operation appeared again:

Aug 1 14:28:33 ha-idg-1 crmd[4692]: notice: process_lrm_event: O
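Since messages like the above keep recurring until the failure record is
cleared, it may help to know how to inspect and reset it. A minimal
sketch using crm shell, with the resource and node names taken from the
log excerpts above (exact subcommand spelling may vary between crmsh
versions):

    # Show the accumulated fail count for the resource on one node
    crm resource failcount prim_drbd_idcc_devel show ha-idg-1

    # Clear the failure history (and the resulting ban) once the
    # resource definition has been fixed or removed
    crm resource cleanup prim_drbd_idcc_devel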
Re: [ClusterLabs] Notification agent and Notification recipients
On 04/08/17 11:06 +0530, Sriram wrote:
> Any idea what could have gone wrong, or if there are other ways to
> achieve the same?

Sriram,

I have just answered in the original thread. Note that it's that part of
the year where vacations are quite common, so even if you are eager to
know the answer, the reasonable wait time should be a bit longer (and
moreover, please do not start a new thread but rather respond to the
existing one next time around).

-- 
Poki
Re: [ClusterLabs] Notification agent and Notification recipients
On 03/08/17 12:31 +0530, Sriram wrote:
> We have a four node cluster (1 active : 3 standby) in our lab for a
> particular service. If the active node goes down, one of the three
> standby nodes becomes active. Now there will be (1 active : 2 standby :
> 1 offline).
>
> Is there any way the newly elected node can send a notification to the
> remaining 2 standby nodes about its new status?
>
> I was exploring the "notification agent" and "notification recipient"
> features, but that doesn't seem to work. /etc/sysconfig/notify.sh
> doesn't get invoked even on the newly elected active node.
>
> Cluster Properties:
>  cluster-infrastructure: corosync
>  dc-version: 1.1.17-e2e6cdce80
>  default-action-timeout: 240
>  have-watchdog: false
>  no-quorum-policy: ignore
>  *notification-agent: /etc/sysconfig/notify.sh*
>  *notification-recipient: /var/log/notify.log*

This ^ legacy approach to configuring notifications ...

>  placement-strategy: balanced
>  stonith-enabled: false
>  symmetric-cluster: false
>
> I am using the following versions of pacemaker and corosync:
>
> /usr/sbin # ./pacemakerd --version
> Pacemaker 1.1.17

... is not expected to be used with this ^ new pacemaker (or any version
1.1.15+, for that matter, unless explicitly enabled):

https://github.com/ClusterLabs/pacemaker/commit/a8d8c0c2d4cad571f0746c879de4f6d0c55dd5d6

[...]

> Can you please suggest if I am doing anything wrong, or if there are
> any other mechanisms to achieve this?

Please have a look at the respective chapter of Pacemaker Explained,

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_alert_agents

with the details on how to use the blessed (and usually the only, which
is very likely the case here) approach to configuring notification
scripts.

-- 
Poki
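For reference, wiring the existing notify.sh script into the alerts
interface that chapter describes could look roughly like this with pcs;
the alert id is made up, notify.sh would need to be adapted to the alert
environment variables rather than the old positional arguments, and older
pcs versions take the recipient value positionally instead of as value=:

    pcs alert create id=notify_standby path=/etc/sysconfig/notify.sh
    pcs alert recipient add notify_standby value=/var/log/notify.log

The same can be configured with crm shell or by editing the alerts
section of the CIB directly; but as noted above, alerts notify something
external to the cluster, not the other cluster nodes.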
Re: [ClusterLabs] Antw: verify status starts at 100% and stays there?
Yeah, UpToDate was not of concern to me. The part that threw me off was
"done:100.00". It did eventually finish, though, and that was shown in
the dmesg output. However, 'drbdadm status' said "done:100.00" the whole
time, from start to finish, which seems weird.

-- 
Eric Robinson

> -----Original Message-----
> From: Ulrich Windl [mailto:ulrich.wi...@rz.uni-regensburg.de]
> Sent: Thursday, August 03, 2017 11:25 PM
> To: users@clusterlabs.org
> Subject: [ClusterLabs] Antw: verify status starts at 100% and stays there?
>
> >>> Eric Robinson schrieb am 04.08.2017 um 06:53 in Nachricht:
> > I have drbd 9.0.8. I started an online verify, and immediately
> > checked status, and I see...
> >
> > ha11a:/ha01_mysql/trimtester # drbdadm status
> > ha01_mysql role:Primary
> >   disk:UpToDate
> >   ha11b role:Secondary
> >     replication:VerifyT peer-disk:UpToDate done:100.00
> >
> > ...which looks like it is finished, but the tail of dmesg says...
> >
> > [336704.851209] drbd ha01_mysql/0 drbd0 ha11b: repl( Established -> VerifyT )
> > [336704.851244] drbd ha01_mysql/0 drbd0: Online Verify start sector: 0
> >
> > ...which looks like the verify is still in progress.
> >
> > So is it done, or is it still in progress? Is this a drbd bug?
>
> Not deep into DRBD, but I guess "disk:UpToDate" just indicates that up
> to the present moment DRBD thinks the disks are up to date (unless
> verify would detect otherwise). Maybe there should be an additional
> status like "syncing, verifying, etc."
>
> Regards,
> Ulrich
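If anyone wants a clearer progress view than the done: field of drbdadm
status, the lower-level status command can print per-peer statistics. A
sketch, assuming DRBD 9 userland (option names should be double-checked
against the installed drbd-utils):

    # more verbose status, including per-peer counters such as out-of-sync
    drbdsetup status ha01_mysql --verbose --statistics

    # kernel messages also record when the verify actually finishes
    dmesg | grep -i 'online verify'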
Re: [ClusterLabs] Antw: LVM resource and DAS - would two resources off one DAS...
On 07/27/2017 09:20 PM, Ulrich Windl wrote:
> Hi!
>
> I think it will work, because the cluster does not monitor the PVs or
> partitions or LUNs. It just checks whether you can activate the LVs
> (i.e.: the VG). That's what I know...
>
> Regards,
> Ulrich
>
>>>> lejeczek schrieb am 27.07.2017 um 15:05 in Nachricht
>>>> <636398a2-e8ea-644b-046b-ff12358de...@yahoo.co.uk>:
>> hi fellas
>>
>> I realise this might be quite a specialized topic, as it regards
>> hardware DAS (SAS2), LVM, and the cluster itself, but I'm hoping that
>> with some luck an expert peeps over here and I'll get some or all of
>> the answers.
>>
>> Question: can the cluster manage two (or more) LVM resources which
>> would be on/in the same single DAS storage, and have these resources
>> (e.g. one LVM runs on LUNs 1&2, the other LVM runs on LUNs 3&4) run on
>> different nodes (which naturally all connect to that single DAS)?

Yes, it works in production environments for users. Still, it could
depend on your detailed scenario. You should evaluate further whether you
need to protect the LVM VG metadata, e.g. when resizing your LVs on
multiple nodes simultaneously. If so, involving clvm or the coming
lvmlockd is necessary.

--Roger

>> Now, I guess this might be something many do already and many will
>> say: trivial. In which case a few firm "yes" confirmations will mean -
>> typical, just do it. Or could it be something unusual and untested but
>> that might/should work when done with care and special "preparation"?
>>
>> I understand that a lot depends on what/how hardware+kernel do things,
>> but if possible I'd leave that out for now and ask only about the
>> cluster itself - do you do it?
>>
>> many thanks.
>> L.
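To make the scenario concrete, the fragment below shows how two VGs from
the same DAS could be defined as independent LVM resources and pinned to
different nodes. The VG, resource, and node names are invented for
illustration, and exclusive activation (or clvm/lvmlockd, as Roger notes)
should be considered before relying on anything like this:

    pcs resource create vg_a ocf:heartbeat:LVM volgrpname=vg_das_a exclusive=true
    pcs resource create vg_b ocf:heartbeat:LVM volgrpname=vg_das_b exclusive=true
    pcs constraint location vg_a prefers node1=100
    pcs constraint location vg_b prefers node2=100

Each resource only activates its own VG, so the two can run on different
nodes even though both VGs live on the same shared enclosure.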