[Linux-HA] Re: Q: groups of groups
>>> I wrote on 22.08.2013 at 11:31 in message <5215DA7A.3B0:161:60728>: [...] > So what you really want is > "group grp_A ((R1 VG1 F1) (R2 VG2 F2)) A" [...] Of course that's nonsense (apart from the fact that the syntax does not exist), or at least non-obvious. I meant "group grp_A ([R1 VG1 F1] [R2 VG2 F2]) A" where [...] denotes an ordered "sub-group" or "sequence", and both of those sequences should be executed in parallel. I hope it's less confusing now. Regards, Ulrich ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
[Linux-HA] Re: Backing out of HA
>>> Ferenc Wagner wrote on 22.08.2013 at 14:00 in message <87siy2ndf8@lant.ki.iif.hu>: > Lars Marowsky-Bree writes: > >> "Poisoned resources" indeed should just fail to start and that should be >> that. What instead can happen is that the resource agent notices it >> can't start, reports back to the cluster, and the cluster manager goes >> "Oh no, I couldn't start the resource successfully! It's now possibly in >> a weird state and I better stop it!" >> >> ... And because of the misconfiguration, the *stop* also fails, and >> you're hit with the full power of node-level recovery. >> >> I think this is an issue with some resource agents (if the parameters >> are so bad that the resource couldn't possibly have started, why fail >> the stop?) and possibly also something where one could contemplate a >> better on-fail="" default for "stop in response to first-start failure". > > Check out http://www.linux-ha.org/doc/dev-guides/_execution_block.html, > especially the comment "anything other than meta-data and usage must > pass validation". So if the start action fails with some validation > error, the stop action will as well. Is this good practice after all? > Or is OCF_ERR_GENERIC treated differently from the other errors in this > regard and thus the validate action should never return OCF_ERR_GENERIC? Hi! Please note that "anything other than meta-data and usage must pass validation" isn't very helpful to the RA developer: the statement just says it's not allowed to use such methods (like start and stop) if the parameters don't pass validation, but it does not say _who_ is to ensure that. Requiring the RA to do parameter validation before any such method is called may be helpful only while debugging the RA. Normally the program that does configuration changes should validate the parameters (preferably on any node where the resource is to be activated). Regards, Ulrich > -- > Thanks, > Feri.
Re: [Linux-HA] cibadmin --delete disregarding attributes
On 22/08/2013, at 10:22 PM, Ferenc Wagner wrote: > Hi, > > man cibadmin says: "the tagname and all attributes must match in order > for the element to be deleted", "for the element to be deleted" <--- not the children of the element to be deleted > but experience says otherwise: the > primitive is deleted even if it was created with different attributes > than those provided to the --delete call, cf. 'foo' vs 'bar' in the > example below. Do I misinterpret the documentation or is this a bug in > Pacemaker 1.1.7? > > # cibadmin --create --obj_type resources --xml-text "<primitive class='ocf' provider='heartbeat' type='Dummy' id='test_dummy'> [...] <nvpair [...] value='foo'/> [...] <nvpair [...] value='false'/> [...] </primitive>" > > # cibadmin --delete --obj_type resources --xml-text "<primitive class='ocf' provider='heartbeat' type='Dummy' id='test_dummy'> tagname, class, provider, type and id all match here. so deletion is allowed to proceed [...] <nvpair [...] value='bar'/> [...] <nvpair [...] value='false'/> [...] </primitive>" > -- > Thanks, > Feri.
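Given that only the attributes of the element itself (not its children) are compared, and only those you actually supply, the least error-prone form of the delete should be to match on the tag and id alone. An untested sketch, using the same resource as in the example above:

```
# tagname plus the supplied attributes are matched; children are ignored,
# so there is no need to repeat the instance/meta attributes here
cibadmin --delete --obj_type resources --xml-text "<primitive id='test_dummy'/>"
```

This avoids the confusion entirely: nothing that can drift out of sync is part of the match.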
Re: [Linux-HA] Storing arbitrary metadata in the CIB
On 22/08/2013, at 10:08 PM, Ferenc Wagner wrote: > Hi, > > Our setup uses some cluster wide pieces of meta information. Think > access control lists for resource instances used by some utilities or > some common configuration data used by the resource agents. Currently > this info is stored in local files on the nodes or replicated in each > primitive as parameters. Are you aware that resources can - have multiple sets of parameters, and - share sets of parameters The combination might be useful here. > I find this suboptimal, as keeping them in > sync is a hassle. It is possible to store such stuff in the "fake" > parameter of unmanaged Dummy resources, but that clutters the status > output. Can somebody offer some advice in this direction? Or is this > idea a pure heresy? > -- > Thanks, > Feri.
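For reference, the parameter-set sharing mentioned above is done with id-ref in the CIB: one primitive defines the set, the others just reference it. A sketch (the ids, the 'acl' parameter name, and the second resource name are made up for illustration):

```
<primitive id="vm-elm" class="ocf" provider="heartbeat" type="Dummy">
  <instance_attributes id="shared-params">
    <nvpair id="shared-params-acl" name="acl" value="admins"/>
  </instance_attributes>
</primitive>
<primitive id="vm-oak" class="ocf" provider="heartbeat" type="Dummy">
  <!-- reuses the whole attribute set defined above; edit it once, all users see it -->
  <instance_attributes id-ref="shared-params"/>
</primitive>
```

That removes the keep-in-sync hassle for per-resource data, though not for truly cluster-global data.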
Re: [Linux-HA] Q: groups of groups
On 22/08/2013, at 7:31 PM, Ulrich Windl wrote: > Hi! > > Suppose you have an application A that needs two filesystems F1 and F2. The > filesystems are on separate LVM VGs VG1 and VG2 with LVs L1 and L2, > respectively. The RAIDs R1 and R2 provide the LVM PVs. > > (Actually we have one group with 58 primitives in it, with both > dimensions being wider than in this example) > > So you can configure > "group grp_A R1 R2 VG1 VG2 F1 F2 A" (assuming the elements are primitives > already configured) > > Now for example if R2 has a problem, the cluster will restart the whole group > of resources, even the sequence that is unaffected (R1 VG1 F1). This causes > extra operations and recovery time, which you don't like. So don't put them in a group? > > What you can do now is to have parallel execution like this > "group grp_A (R1 R2) (VG1 VG2) (F1 F2) A" You're saying this is currently possible? If so, crmsh must be re-writing this into something other than a group. > (Note that this is probably a bad idea as the RAIDs and VGs (and maybe mount > also) most likely use a common lock each that forces serialization) > > For the same failure scenario R2 wouldn't be restarted, so the gain is small. > A better approach seems to be > "group grp_A (R1 VG1 F1) (R2 VG2 F2) A" > > Now for the same failure R1, VG1, and F1 will survive; unfortunately if R1 > fails, then everything will be restarted, like in the beginning. > > So what you really want is > "group grp_A ((R1 VG1 F1) (R2 VG2 F2)) A" > > Now if R2 fails, then R1, VG1, and F1 will survive, and if R1 fails, then R2, > VG2 and F2 will survive. > > Unfortunately the syntax of the last example is not supported. I'm surprised the one before it is even supported. Groups of groups have never been supported. > This one isn't either: > > group grp_1 R1 VG1 F1 > group grp_2 R2 VG2 F2 > group grp_A (grp_1 grp_2) A > > So a group of groups would be nice to have. 
I thought about that a long time > ago, but only yesterday I learned about the syntax of "netgroups", which has > exactly that: a netgroup can contain another netgroup ;-) > > Regards, > Ulrich
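For completeness: the behaviour Ulrich is after can be approximated today without groups of groups, using two plain groups plus colocation and order constraints against A. An untested crm sketch with the resource names from the example (constraint ids are made up):

```
group grp_1 R1 VG1 F1
group grp_2 R2 VG2 F2
colocation col_A_with_1 inf: A grp_1
colocation col_A_with_2 inf: A grp_2
order ord_1_then_A inf: grp_1 A
order ord_2_then_A inf: grp_2 A
```

With this layout, if R2 fails, grp_2 and A restart while grp_1 (R1 VG1 F1) keeps running, and vice versa — which is the recovery behaviour the nested-group syntax was meant to express.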
Re: [Linux-HA] How to painlessly change depended upon resource groups?
Hi, On Thu, 22 Aug 2013 18:22:50 +0200 Ferenc Wagner wrote: > I built a Pacemaker cluster to manage virtual machines (VMs). Storage > is provided by cLVM volume groups, network access is provided by > software bridges. I wanted to avoid maintaining precise VG and bridge > dependencies, so I created two cloned resource groups: > > group storage dlm clvmd vg-vm vg-data > group network br150 br151 > > I cloned these groups and thus every VM resource uniformly got these > two dependencies only, which makes it easy to add new VM > resources: > > colocation cl-elm-network inf: vm-elm network-clone > colocation cl-elm-storage inf: vm-elm storage-clone > order o-elm-network inf: network-clone vm-elm > order o-elm-storage inf: storage-clone vm-elm > > Of course the network and storage groups do not even model their > internal dependencies correctly, as the different VGs and bridges are > independent and unordered, but this is not a serious limitation in my > case. > > The problem is, if I want to extend for example the network group by a > new bridge, the cluster wants to restart all running VM resources > while starting the new bridge. I get this info by changing a shadow > copy of the CIB and crm_simulate --run --live-check on it. This is > perfectly understandable due to the strict ordering and colocation > constraints above, but undesirable in these cases. > > The actual restarts are avoidable by putting the cluster in > maintenance mode beforehand, starting the bridge on each node > manually, changing the configuration and moving the cluster out of > maintenance mode, but this is quite a chore, and I did not find a way > to make sure everything would be fine, like seeing the planned > cluster actions after the probes for the new bridge resource are run > (when there should not be anything left to do). Is there a way to > regain my peace of mind during such operations? 
Or is there at least > a way to order the cluster to start the new bridge clones so that I > don't have to invoke the resource agent by hand on each node, thus > reducing possible human mistakes? > > The bridge configuration was moved into the cluster to avoid having to > maintain it in each node's OS separately. The network and storage > resource groups provide a great concise status output with only the VM > resources expanded. These are bonuses, but not requirements; if > sensible maintenance is not achievable with this setup, everything is > subject to change. Actually, I'm starting to feel that simplifying > the VM dependencies may not be viable in the end, but wanted to ask > for outsider ideas before overhauling the whole configuration. If I understand you correctly, the problem only arises when adding new bridges while the cluster is running. And your VMs will (rightfully) get restarted when you add a non-running bridge resource to the cloned dependency group. You might be able to circumvent this problem: define the bridge as a single cloned resource and start it. When it runs on all nodes, remove the clone definition and add the resource to your dependency group in one single edit. With the commit, the cluster should see that the new resource in the group is already running and thus not affect the VMs. On a side note: I have learned the hard way that it's easier to configure such stuff outside of pacemaker/corosync and use the cluster only for the truly HA things. Configuring several systems into a sane state is more a job for configuration management such as Chef, Puppet, or at least csync2 (to sync the configs). Have fun, Arnold
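Arnold's procedure, spelled out as a hedged crm sketch (br152 and the agent name are placeholders, not from the original configuration):

```
# 1. add the new bridge as a standalone clone and let the cluster start it
crm configure primitive br152 ocf:custom:bridge    # agent name is an assumption
crm configure clone br152-clone br152

# 2. wait until br152 runs on every node, then make ONE edit/commit that:
crm configure edit
#    - deletes the br152-clone definition
#    - changes "group network br150 br151" to "group network br150 br151 br152"

# On commit the cluster finds br152 already active inside the cloned group,
# so the dependent VM resources should not be restarted.
```

The key point is doing the clone removal and the group extension in a single transition, so the cluster never sees br152 as a stopped group member.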
Re: [Linux-HA] Storing arbitrary metadata in the CIB
Vladislav Bogdanov writes: > 22.08.2013 15:08, Ferenc Wagner wrote: > >> Our setup uses some cluster wide pieces of meta information. > > You may use meta attributes of any primitives for that. Although crmsh > does not like that very much, it can be switched to a "relaxed" mode. OK, but the point is that this data is not specific to any resource, but cluster-global, so that would be unnatural. -- Cheers, Feri.
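If the data really is cluster-global, one option that avoids both unmanaged Dummy resources and per-primitive meta attributes is an arbitrary cluster property, which lives in the CIB and is replicated like everything else. A sketch using crm_attribute (the property name and value are made up):

```
# store a cluster-wide value, not tied to any resource or node
crm_attribute --type crm_config --name acl-data --update "wagner:rw lmb:ro"

# read it back, e.g. from a resource agent or utility
crm_attribute --type crm_config --name acl-data --query --quiet
```

Unknown property names are tolerated by the CIB, though tools may warn about them; whether that is acceptable depends on taste.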
[Linux-HA] How to painlessly change depended upon resource groups?
Hi, I built a Pacemaker cluster to manage virtual machines (VMs). Storage is provided by cLVM volume groups, network access is provided by software bridges. I wanted to avoid maintaining precise VG and bridge dependencies, so I created two cloned resource groups: group storage dlm clvmd vg-vm vg-data group network br150 br151 I cloned these groups and thus every VM resource uniformly got these two dependencies only, which makes it easy to add new VM resources: colocation cl-elm-network inf: vm-elm network-clone colocation cl-elm-storage inf: vm-elm storage-clone order o-elm-network inf: network-clone vm-elm order o-elm-storage inf: storage-clone vm-elm Of course the network and storage groups do not even model their internal dependencies correctly, as the different VGs and bridges are independent and unordered, but this is not a serious limitation in my case. The problem is, if I want to extend for example the network group by a new bridge, the cluster wants to restart all running VM resources while starting the new bridge. I get this info by changing a shadow copy of the CIB and crm_simulate --run --live-check on it. This is perfectly understandable due to the strict ordering and colocation constraints above, but undesirable in these cases. The actual restarts are avoidable by putting the cluster in maintenance mode beforehand, starting the bridge on each node manually, changing the configuration and moving the cluster out of maintenance mode, but this is quite a chore, and I did not find a way to make sure everything would be fine, like seeing the planned cluster actions after the probes for the new bridge resource are run (when there should not be anything left to do). Is there a way to regain my peace of mind during such operations? Or is there at least a way to order the cluster to start the new bridge clones so that I don't have to invoke the resource agent by hand on each node, thus reducing possible human mistakes? 
The bridge configuration was moved into the cluster to avoid having to maintain it in each node's OS separately. The network and storage resource groups provide a great concise status output with only the VM resources expanded. These are bonuses, but not requirements; if sensible maintenance is not achievable with this setup, everything is subject to change. Actually, I'm starting to feel that simplifying the VM dependencies may not be viable in the end, but wanted to ask for outsider ideas before overhauling the whole configuration. -- Thanks in advance, Feri.
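The maintenance-mode procedure described above, written out as a sketch (br152 is a placeholder for the new bridge; command spellings assume crmsh and should be checked against the installed version):

```
crm configure property maintenance-mode=true
# bring the new bridge up by hand on every node (or invoke the RA directly)
crm configure edit                 # append br152 to "group network br150 br151"
crm resource reprobe               # let the cluster discover br152 as already running
crm_simulate --run --live-check    # verify there is nothing left for the cluster to do
crm configure property maintenance-mode=false
```

The reprobe-then-simulate step is what restores the "peace of mind": if the simulation shows pending actions, leaving maintenance mode can wait.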
Re: [Linux-HA] Storing arbitrary metadata in the CIB
22.08.2013 15:08, Ferenc Wagner wrote: > Hi, > > Our setup uses some cluster wide pieces of meta information. Think > access control lists for resource instances used by some utilities or > some common configuration data used by the resource agents. Currently > this info is stored in local files on the nodes or replicated in each > primitive as parameters. I find this suboptimal, as keeping them in > sync is a hassle. It is possible to store such stuff in the "fake" > parameter of unmanaged Dummy resources, but that clutters the status > output. Can somebody offer some advice in this direction? Or is this > idea a pure heresy? > You may use meta attributes of any primitives for that. Although crmsh does not like that very much, it can be switched to a "relaxed" mode.
[Linux-HA] cibadmin --delete disregarding attributes
Hi, man cibadmin says: "the tagname and all attributes must match in order for the element to be deleted", but experience says otherwise: the primitive is deleted even if it was created with different attributes than those provided to the --delete call, cf. 'foo' vs 'bar' in the example below. Do I misinterpret the documentation or is this a bug in Pacemaker 1.1.7? # cibadmin --create --obj_type resources --xml-text "<primitive class='ocf' provider='heartbeat' type='Dummy' id='test_dummy'> [...] <nvpair [...] value='foo'/> [...] </primitive>" # cibadmin --delete --obj_type resources --xml-text "<primitive class='ocf' provider='heartbeat' type='Dummy' id='test_dummy'> [...] <nvpair [...] value='bar'/> [...] </primitive>" -- Thanks, Feri.
[Linux-HA] Storing arbitrary metadata in the CIB
Hi, Our setup uses some cluster wide pieces of meta information. Think access control lists for resource instances used by some utilities or some common configuration data used by the resource agents. Currently this info is stored in local files on the nodes or replicated in each primitive as parameters. I find this suboptimal, as keeping them in sync is a hassle. It is possible to store such stuff in the "fake" parameter of unmanaged Dummy resources, but that clutters the status output. Can somebody offer some advice in this direction? Or is this idea a pure heresy? -- Thanks, Feri.
Re: [Linux-HA] Backing out of HA
Lars Marowsky-Bree writes: > "Poisoned resources" indeed should just fail to start and that should be > that. What instead can happen is that the resource agent notices it > can't start, reports back to the cluster, and the cluster manager goes > "Oh no, I couldn't start the resource successfully! It's now possibly in > a weird state and I better stop it!" > > ... And because of the misconfiguration, the *stop* also fails, and > you're hit with the full power of node-level recovery. > > I think this is an issue with some resource agents (if the parameters > are so bad that the resource couldn't possibly have started, why fail > the stop?) and possibly also something where one could contemplate a > better on-fail="" default for "stop in response to first-start failure". Check out http://www.linux-ha.org/doc/dev-guides/_execution_block.html, especially the comment "anything other than meta-data and usage must pass validation". So if the start action fails with some validation error, the stop action will as well. Is this good practice after all? Or is OCF_ERR_GENERIC treated differently from the other errors in this regard and thus the validate action should never return OCF_ERR_GENERIC? -- Thanks, Feri.
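For readers without the link handy, the execution block in question has roughly this shape (paraphrased from the dev guide; the foobar_* names are the guide's placeholders):

```
# make sure meta-data and usage always succeed
case $__OCF_ACTION in
meta-data)      foobar_meta_data
                exit $OCF_SUCCESS
                ;;
usage|help)     foobar_usage
                exit $OCF_SUCCESS
                ;;
esac

# anything other than meta-data and usage must pass validation
foobar_validate || exit $?

# translate each action into the appropriate function call
case $__OCF_ACTION in
start)          foobar_start;;
stop)           foobar_stop;;
status|monitor) foobar_monitor;;
*)              foobar_usage
                exit $OCF_ERR_UNIMPLEMENTED
                ;;
esac
```

Because foobar_validate runs unconditionally before stop, a stop issued after a failed start fails for the very same validation reason, which is exactly the escalation Feri describes.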
[Linux-HA] Q: groups of groups
Hi! Suppose you have an application A that needs two filesystems F1 and F2. The filesystems are on separate LVM VGs VG1 and VG2 with LVs L1 and L2, respectively. The RAIDs R1 and R2 provide the LVM PVs. (Actually we have one group with 58 primitives in it, with both dimensions being wider than in this example.) So you can configure "group grp_A R1 R2 VG1 VG2 F1 F2 A" (assuming the elements are primitives already configured). Now for example if R2 has a problem, the cluster will restart the whole group of resources, even the sequence that is unaffected (R1 VG1 F1). This causes extra operations and recovery time, which you don't like. What you can do now is to have parallel execution like this: "group grp_A (R1 R2) (VG1 VG2) (F1 F2) A" (Note that this is probably a bad idea, as the RAIDs and VGs (and maybe mount also) most likely use a common lock each that forces serialization.) For the same failure scenario R2 wouldn't be restarted, so the gain is small. A better approach seems to be "group grp_A (R1 VG1 F1) (R2 VG2 F2) A" Now for the same failure R1, VG1, and F1 will survive; unfortunately if R1 fails, then everything will be restarted, like in the beginning. So what you really want is "group grp_A ((R1 VG1 F1) (R2 VG2 F2)) A" Now if R2 fails, then R1, VG1, and F1 will survive, and if R1 fails, then R2, VG2 and F2 will survive. Unfortunately the syntax of the last example is not supported. This one isn't either: group grp_1 R1 VG1 F1 group grp_2 R2 VG2 F2 group grp_A (grp_1 grp_2) A So a group of groups would be nice to have. I thought about that a long time ago, but only yesterday I learned about the syntax of "netgroups", which has exactly that: a netgroup can contain another netgroup ;-) Regards, Ulrich