Re: [Pacemaker] chicken-egg-problem with libvirtd and a VM within cluster
On Fri, Oct 12, 2012 at 3:18 AM, Andrew Beekhof and...@beekhof.net wrote:
> This has been a topic that has popped up occasionally over the years.
> Unfortunately we still don't have a good answer for you.
>
> The least-worst practice has been to have the RA return OCF_STOPPED for
> non-recurring monitor operations (aka startup probes) IFF its
> prerequisites (i.e. binaries, or things that might be on a cluster file
> system) are not available.
>
> Possibly we need to begin using the ordering constraints (normally used
> for ordering start operations) for the startup probes too. I.e.
> order(A, B) == A.start before B.(monitor_0, start). I had been resisting
> that move, but perhaps it's time. (It would also help avoid slamming the
> cluster with a bazillion operations in parallel when several nodes start
> up together.) Lars? Florian? Comments?

Sure. As Tom correctly observes, the problem (as I know it) occurs when manually stopping Pacemaker services and then restarting them. As it shuts down, Pacemaker kills libvirtd (after migrating off or stopping all VMs), and then as you bring it back up, the probe runs into an error. The same, by the way, applies if you only send the node into standby mode.

For manual intervention, the workaround is simply this:

- Stop Pacemaker services, or put the node in standby (libvirtd stops in the process as the local clone instance shuts down).
- Do whatever you need to do on that box.
- Start libvirtd.
- Start Pacemaker services, or take the node back online.

For most people, this issue doesn't occur on system boot, as libvirtd would normally start before corosync, or corosync/pacemaker isn't part of the system bootup sequence at all (the latter is preferred for two-node clusters, to prevent fencing shootouts in case of a cluster split brain).

On that ha-kvm.pdf guide, I will add that I'm guessing this is not the only piece of information missing or outdated in it.
However, I have no rights to that document other than to be named as an original author and to use it under CC-NC-ND terms like anyone else, and I no longer have access to the sources, so there's no way for me to update it. Maybe the Linbit folks are willing/able to do that.

Back on the probe issue, we're in a bit of a catch-22, as libvirtd can be freely restarted and stopped while leaving domains (VMs) running. So the assumption "if libvirtd doesn't run, then the domain can't be running" simply doesn't hold up. In fact, it's outright dangerous, as a domain may well run _and have read/write access to shared resources_ while libvirtd isn't running. So doing the naive thing and bailing out of monitor if we can't detect a libvirtd PID -- that doesn't fly.

What would fly is to check for libvirtd on _every_ invocation of the RA (well, maybe all except validate and usage), and to restart it on the sole condition that we can't detect its PID. That, however, breaks the contract that a probe should be non-invasive and really shouldn't be touching any system services. Also, a running libvirtd is not needed, to the best of my knowledge, when the hypervisor in use is Xen rather than KVM. We could mitigate that by making it configurable, but the only sane default would be to have this enabled, which again breaks said contract.

When virsh is invoked with a qemu:///session URI it will actually start up a user-specific libvirtd by itself, but as far as I know there is no way to do that for qemu:///system, which most people will be using.

Andrew, your suggestion would fix that issue, but it would obviously make the config more convoluted. In effect, we'd need one order and one colocation constraint more than we already do. For a silly idea, how about being able to define a list of op types in a constraint, rather than a single op?
As in:

order libvirtd_before_virtdom inf: libvirtd:start virtdom_foo:monitor,start
colocation virtdom_on_libvirtd inf: virtdom_foo:Started,Probed libvirtd:Started

(Of course no such thing as a "Probed" role currently exists, so here we go down the rabbit hole...)

I hope this is useful. Thoughts are much appreciated.

Cheers,
Florian

--
Need help with High Availability? http://www.hastexo.com/now

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
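In the meantime, the workable (if verbose) configuration pattern is the one alluded to above: clone libvirtd across all nodes, and tie each VM to the clone with one order and one colocation constraint. A crm shell sketch, assuming a hypothetical VM resource named virtdom_foo and an lsb:libvirtd init script (adjust the resource class and names for your distribution):

```
# Manage libvirtd as a clone, so each node runs its own instance
primitive p_libvirtd lsb:libvirtd op monitor interval="30s"
clone cl_libvirtd p_libvirtd

primitive virtdom_foo ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/foo.xml" \
    op monitor interval="30s"

# The VM may only run where libvirtd runs, and only after it has started
colocation c_virtdom_on_libvirtd inf: virtdom_foo cl_libvirtd
order o_libvirtd_before_virtdom inf: cl_libvirtd virtdom_foo
```

Note that these constraints only order the start operation; as discussed above, they do not currently apply to the startup probe, which is the crux of this thread.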
Re: [Pacemaker] FYI/RFC: Name for 'system service' alias
On Mon, Jun 25, 2012 at 1:40 PM, Andrew Beekhof and...@beekhof.net wrote:
> I've added the concept of a 'system service' that expands to whatever
> standard the local machine supports. So you could say, in xml,
>
>   primitive id=Magic class=system type=mysql
>
> and the cluster would use 'lsb' on RHEL, 'upstart' on Ubuntu and
> 'systemd' on newer Fedora releases. Handy if you have a mixed cluster.
> My question is, what to call it? 'system', 'service', something else?

I think Red Hat Cluster has similar functionality named "service", so in the interest of continuity that would be my preference.

One thought though: what's supposed to happen on platforms that support several system service interfaces, such as Ubuntu which supports both Upstart and LSB? IOW: if I define a service as service:foobar, and there is no Upstart job named foobar, but /etc/init.d/foobar exists, would that be an OCF_ERR_INSTALLED?

> In other news, the next pacemaker release will support systemd, and
> both it and upstart will use a persistent connection to the DBus API
> (no more forking!).

Sweet!

Cheers,
Florian
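For reference, the crm shell equivalent of the XML primitive quoted above would look roughly like this (a sketch only; "service" is the alias name under discussion, and mysql is just a placeholder service name):

```
# Let the cluster pick lsb/upstart/systemd as appropriate per node
primitive Magic service:mysql \
    op monitor interval="30s"
```

Compare this with pinning the standard explicitly, e.g. lsb:mysql or systemd:mysql, which would break in a mixed cluster where nodes use different init systems.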
Re: [Pacemaker] How to write a master/slave resource-script
On 06/15/12 16:37, Andrew Beekhof wrote:
> On Fri, Jun 15, 2012 at 12:19 AM, Stallmann, Andreas astallm...@conet.de wrote:
>> Hi! Excuse my blindness; I found the "Stateful" script, which is
>> obviously the template / skeleton I was looking for. Unfortunately it
>> comes without explanation. Does anyone know where I'd find this?
> This would be a good place to start:
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch10s03s09.html

Or this: http://www.linux-ha.org/doc/dev-guides/ra-dev-guide.html

Cheers,
Florian
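To experiment with the Stateful template before writing your own agent, you can configure it as a master/slave resource. A minimal crm shell sketch (the resource names here are arbitrary):

```
# ocf:pacemaker:Stateful is a demo agent; don't use it in production
primitive p_stateful ocf:pacemaker:Stateful \
    op monitor interval="10s" role="Slave" \
    op monitor interval="11s" role="Master"
ms ms_stateful p_stateful \
    meta master-max="1" clone-max="2"
```

Note the two monitor operations with distinct intervals: Pacemaker requires different monitor intervals for the Master and Slave roles.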
Re: [Pacemaker] General question about pacemaker
On Sun, Jun 10, 2012 at 3:07 PM, Stefan Günther smguent...@web.de wrote:
> Hello, I have a general question about the features of pacemaker. We
> are planning to set up an HA solution with pacemaker, corosync and
> drbd. After a failure of the master and later its recovery, drbd will
> sync the data from the slave to the master. Is it now possible to
> configure pacemaker and/or corosync to perform a failback AFTER drbd
> has finished syncing?

Yes.

> And if yes, which component is responsible for waiting for the signal
> from drbd that syncing has finished?

The ocf:linbit:drbd resource agent (the Pacemaker resource agent that ships with DRBD) influences the resource's master score, which Pacemaker evaluates for the placement of the DRBD Master role among cluster nodes. You can combine this with a location constraint that sets a preference for one of your nodes as the DRBD Master (Primary). If you set your location constraint score correctly, you would get the behavior you want.

However, why do you want automatic failback? If your cluster nodes are interchangeable in terms of performance, you shouldn't need to care which node is the master. In other words, the concept of having a preferred master is normally moot in well-designed clusters.

Hope this is useful.

Cheers,
Florian
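If you do want that preference anyway, the constraint is a one-liner. A crm shell sketch, assuming a hypothetical master/slave resource named ms_drbd_r0 and a preferred node named alice (keep the score modest, well below the master scores the DRBD agent sets, or Pacemaker may try to promote a node that isn't ready):

```
# Mild preference for running the DRBD Master role on alice
location l_drbd_master_on_alice ms_drbd_r0 \
    rule $role="Master" 100: #uname eq alice
```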
Re: [Pacemaker] Problem with state: UNCLEAN (OFFLINE)
On Fri, Jun 8, 2012 at 1:01 PM, Juan M. Sierra jmsie...@cica.es wrote:
> Problem with state: UNCLEAN (OFFLINE)
>
> Hello, I'm trying to bring up an ldirectord service with pacemaker.
> But I found a problem with the unclean (offline) state. The initial
> state of my cluster was this:
>
> Online: [ node2 node1 ]
> node1-STONITH (stonith:external/ipmi): Started node2
> node2-STONITH (stonith:external/ipmi): Started node1
> Clone Set: Connected
>     Started: [ node2 node1 ]
> Clone Set: ldirector-activo-activo
>     Started: [ node2 node1 ]
> ftp-vip (ocf::heartbeat:IPaddr): Started node1
> web-vip (ocf::heartbeat:IPaddr): Started node2
>
> Migration summary:
> * Node node1: pingd=2000
> * Node node2: pingd=2000
>   node2-STONITH: migration-threshold=100 fail-count=100
>
> and then, I removed the electric connection of node1, and the state
> was this:
>
> Node node1 (8b2aede9-61bb-4a5a-aef6-25fbdefdddfd): UNCLEAN (offline)
> Online: [ node2 ]
> node1-STONITH (stonith:external/ipmi): Started node2 FAILED
> Clone Set: Connected
>     Started: [ node2 ]
>     Stopped: [ ping:1 ]
> Clone Set: ldirector-activo-activo
>     Started: [ node2 ]
>     Stopped: [ ldirectord:1 ]
> web-vip (ocf::heartbeat:IPaddr): Started node2
>
> Migration summary:
> * Node node2: pingd=2000
>   node2-STONITH: migration-threshold=100 fail-count=100
>   node1-STONITH: migration-threshold=100 fail-count=100
>
> Failed actions:
>   node2-STONITH_start_0 (node=node2, call=22, rc=2, status=complete): invalid parameter
>   node1-STONITH_monitor_6 (node=node2, call=11, rc=14, status=complete): status: unknown
>   node1-STONITH_start_0 (node=node2, call=34, rc=1, status=complete): unknown error
>
> I was hoping that node2 would take over the ftp-vip resource, but it
> didn't happen that way. node1 remained in an unclean state and node2
> didn't take over its resources. When I put back the electric
> connection of node1 and it recovered, then node2 took over the
> ftp-vip resource. I've seen some similar conversations here. Please,
> could you show me some idea about this subject or some thread where
> this is discussed?
Well, your healthy node failed to fence your offending node. So fix your STONITH device configuration, and as soon as that is able to fence, your failover should work fine.

Of course, if your IPMI BMC fails immediately after you remove power from the machine (i.e. it has no backup battery so it can at least report the power status), then you might have to fix your issue by switching to a different STONITH device altogether.

Cheers,
Florian
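Before relying on fencing in anger, it pays to verify it manually from the surviving node. A sketch using Pacemaker's stonith_admin utility (node names as in the thread above; note that the second command really does power-cycle the target, so only run it against a node that can safely go down):

```
# List the fencing devices the cluster considers capable of fencing node1
stonith_admin --list node1

# Actually fence (reboot) node1, proving the device works end to end
stonith_admin --reboot node1
```

If the manual fence fails the same way, the problem is in the external/ipmi device parameters (IPMI address, credentials, hostname mapping), not in Pacemaker's failover logic.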
Re: [Pacemaker] KVM DRBD and Pacemaker
On Tue, Jun 5, 2012 at 1:55 AM, Cliff Massey cliffm...@cliffmassey.com wrote:
> My config is: http://pastebin.com/5qYiHe56

Yep, you completely forgot your order and colocation constraints. You need those to tie your foo-kvm primitive to its corresponding ms-foo master/slave set.

http://www.drbd.org/users-guide-8.3/s-pacemaker-crm-drbd-backed-service.html

Take a look at where it says "order" and "colocation".

Cheers,
Florian
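Concretely, the missing pieces would look roughly like this in crm shell syntax (a sketch based on the resource names quoted above; the inf scores ensure the VM can never run without its DRBD Master on the same node):

```
# foo-kvm must run on the node where ms-foo holds the Master role...
colocation c_kvm_on_drbd inf: foo-kvm ms-foo:Master
# ...and may only start after the promotion has completed
order o_drbd_before_kvm inf: ms-foo:promote foo-kvm:start
```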
Re: [Pacemaker] Announce: pcs / pcs-gui (Pacemaker/Corosync Configuration System)
On Mon, Jun 4, 2012 at 3:21 AM, Andrew Beekhof and...@beekhof.net wrote:
> On Sat, Jun 2, 2012 at 12:56 AM, Florian Haas flor...@hastexo.com wrote:
>> On Fri, Jun 1, 2012 at 1:40 AM, Chris Feist cfe...@redhat.com wrote:
>>> I'd like to announce the existence of the Pacemaker/Corosync
>>> configuration system, PCS.
>> Be warned, I will surely catch flak for what I'm about to say.

Andrew, thanks for confirming. :)

>>> The emphasis in PCS differs somewhat from the existing shell:
>> Before you get into where it differs in emphasis, can you explain why
>> we need another shell?
> Uh, because the world isn't black and white and people find different
> things important? Like, perhaps, some of the things Chris listed.

I don't disagree with the importance of some of those, but none of them look like a compelling reason to write a new one from scratch.

>>> PCS will continue the tradition of having a regression test suite and
>>> discoverable 'ip'-like hierarchical menu structure, however unlike
>>> the shell we may end up not adding interactivity.
>> Strangely enough, if I were to name one feature as the most useful in
>> the existing shell, it's its interactivity.
> Personally I disagree. Mostly what I see people using is tab
> completion, which is not interactivity and even if considered crucial,
> doesn't need to be baked into the tool itself.

That is true, but having done a bash completion thingy myself before, I can tell you it's quite a bit of effort. Unless, that is, the tool has a generic hook that completion systems can tie into, like what Mercurial does (IIRC). Note that something taking a lot of effort doesn't disqualify it, but creating a lot of effort just to match functionality that something else already has -- that's questionable.

The crm shell is actually not just about simple tab completion, it's about tab completion with the added benefit of providing documentation interactively, and to the best of my knowledge that's something you can't do in bash completion. Other completion systems I don't know.
>> How do you envision people configuring, say, an IPaddr2 resource when
>> they don't remember the parameter names, or whether a specific
>> parameter is optional or required? Or even the resource agent name?
> Now you're just being silly.

Oh, am I?

> Are you seriously claiming interactivity is the only way to discover
> information about a program?

Yeah, we all know how attentively people read man pages.

> Quick, someone tell the iproute developers that no-one can add an IP
> address because 'ip help' and 'ip addr help' aren't interactive!

Remind me how _that_ comment isn't silly?

>>> Both projects are far from complete, but so far PCS can:
>>> - Create corosync/pacemaker clusters from scratch
>>> - Add simple resources and add constraints
>> If I were a new user, I'd probably be unable to create even a simple
>> resource with this, for the reason given above. But I will concede
>> that in its current state it's probably unfair to expect that new
>> users are able to use this. (The existing shell is actually usable for
>> newcomers, even though it's not perfect. Why do we need a new shell
>> again?)
> To see how many straw men you could construct.

See below on that comment.

>>> - Create/Remove resource groups
>> Why is it "resource create", but "resource group add"?
> I /think/ it's because you're adding a resource to an existing group.

Well, you add and create one in one fell swoop (which is OK -- it makes no sense to have an empty group), but it might still be a good idea in terms of POLA to add "create", even if all it does is check that the group doesn't already exist, and then hand off to "add".

>>> - Set most pacemaker configuration options
>> How do you enumerate which ones are available?
> Valid question

You'll hate me again for saying this, but by having this discussion we're already smack in the middle of duplicating effort. For something that's solved in an existing tool.
>>> - Start/Stop pacemaker/corosync
>>> - Get basic cluster status
>>>
>>> I'm currently working on getting PCS fully functional with Fedora 17
>>> (and it should work with other distributions based on corosync 2.0,
>>> pacemaker 1.1 and systemd). I'm hoping to have a fairly complete
>>> version of PCS for the Fedora 17 release (or very shortly thereafter)
>>> and a functioning version of pcs-gui (which includes the ability to
>>> remotely start/stop nodes and set corosync config) by the Fedora 18
>>> release. The code for both projects is currently hosted on github
>>> (https://github.com/feist/pcs and https://github.com/feist/pcs-gui).
>>> You can view a sample pcs session to get a preliminary view of how
>>> pcs will work - https://gist.github.com/2697640
>> Any reason why the gist doesn't use "pcs cluster sync", which as per
>> "pcs cluster --help" would sync the Corosync config across nodes?
>>> Comments and contributions are welcome.
>> I'm sorry, and I really don't mean this personally, but I just don't
>> get the point.
> Plenty of people didn't see the point of Pacemaker either. And I don't
> recall anyone saying they hate
Re: [Pacemaker] Announce: pcs / pcs-gui (Pacemaker/Corosync Configuration System)
On Mon, Jun 4, 2012 at 1:02 PM, Lars Marowsky-Bree l...@suse.com wrote:
> I am getting a slightly defensive-to-aggressive vibe from your response
> to Florian. Can we tune that down? I much prefer to do the shouting at
> each other in person, because then the gestures come across much more
> vividly and the food is better. Thank you ;-)

In that case I suggest you come to Canberra for next year's linux.conf.au, where the opportunity is likely to present itself. :)

>> Open source has a long and glorious history of people saying "I'm
>> going to try and do it this way" and Chris has every right to try
>> something different. Personally I'm hoping a little friendly
>> competition will result in both projects finding new ways to improve
>> usability.
> Of course. Still, people will ask "which one should I choose", and we
> need to be able to answer that.

And if the answer to that were "whatever your distro recommends", and everyone upstream would hence be leaving that decision to product managers or distro subsystem maintainers, then people should know about that too. I will add that I'd find that undesirable; we've been down that road before.

> And as a community, yes, I think we also should think about the cost
> of choice to users - as well as the benefits. Even developers will ask
> questions like "I want to do X; where do I contribute that?" I like
> things that make it easier for users to use our stuff, and still I
> need to understand how to advise them what to do when, and how the
> various toys in the playground relate ;-)

I'd like to add that documentation in the vein of "First do A. Then if you're on X do B, on Y do C and on Z do E. Then do F, unless you're on X, in which case you skip straight to G" just doesn't work.

Cheers,
Florian
Re: [Pacemaker] Announce: pcs / pcs-gui (Pacemaker/Corosync Configuration System)
On Tue, Jun 5, 2012 at 1:43 AM, Andrew Beekhof and...@beekhof.net wrote:
> On Mon, Jun 4, 2012 at 9:02 PM, Lars Marowsky-Bree l...@suse.com wrote:
>> On 2012-06-04T11:21:57, Andrew Beekhof and...@beekhof.net wrote:
>>
>> Hi Andrew,
>>
>> I am getting a slightly defensive-to-aggressive vibe from your
>> response to Florian. Can we tune that down? I much prefer to do the
>> shouting at each other in person, because then the gestures come
>> across much more vividly and the food is better. Thank you ;-)
>>
>>> Now you're just being silly. Are you seriously claiming interactivity
>>> is the only way to discover information about a program? Quick,
>>> someone tell the iproute developers that no-one can add an IP address
>>> because 'ip help' and 'ip addr help' aren't interactive!
>> I think the interactive tab completion is indeed cool. No, of course
>> it's not the only way, but it does make things easier. You are of
>> course right it doesn't need to be baked in; one can also dump the
>> syntax tree and have bash/zsh/emacs do the completion. That does make
>> dynamic completion a bit less efficient, though.
> True. But "less efficient" is a LONG way from sensationalist words
> like "impossible". It's this kind of tired hyperbole that tends to
> generate a defensive-to-aggressive vibe on my part.

Who said "impossible"? Looks to me like you're the first person in this thread to use that term.

>>> Plenty of people didn't see the point of Pacemaker either. And I
>>> don't recall anyone saying "they hate the existing [resource manager]
>>> and this effort solves all their problems" during the first few years
>>> of Pacemaker development.
>> I don't quite see this is a valid comparison, sorry. The crm was
>> developed because the existing resource manager that heartbeat
>> implemented was way too limited; the CRM was something radically
>> different. That was a huge effort that couldn't possibly have been
>> implemented in an incremental fashion.
> My point would be that despite the above, there /still/ wasn't the
> level of public outcry that Florian apparently deems necessary for new
> work.

Nonsense.

> And if Pacemaker couldn't generate it, it makes an unfair criteria to
> require of pcs.
>
>> (When we're talking about Pacemaker (versus the crm), it is obvious
>> that that wasn't really a technology-driven move.)
> With the implication being that technology-driven moves are bad?

Who made that implication?

> How do you explain HAWK then? Shouldn't Tim have written a patch to
> py-gui instead?

I think a UI that runs in a browser, as opposed to requiring a graphics library and rendering engine that is only ubiquitous on Linux and practically non-existent on other platforms, is a significant usability improvement. Of course, Tim could also have written a server-side library that translates GTK2 into HTML5 and would allow the pygui to run on a server unmodified, but that's a bit much to ask.

>>> Open source has a long and glorious history of people saying "I'm
>>> going to try and do it this way" and Chris has every right to try
>>> something different. Personally I'm hoping a little friendly
>>> competition will result in both projects finding new ways to improve
>>> usability.
>> Of course. Still, people will ask "which one should I choose", and we
>> need to be able to answer that.
> The same way the Linux community has answers for:
> - sh/bash/tcsh/zsh/dash...
> - gnome/kde/enlightenment/twm/fvwm...
> - fedora/opensuse/debian/ubuntu/leaf...
> - mysql/postgres/oracle/sybase
> - ext2,3,4/reiserfs/btrfs...
> - GFS2/OCFS2
> - dm_replicator/drbd
> - selinux/apparmor
> - iscsi clients
> - chat/irc/email clients
> - programming languages
> - editors
> - pacemaker GUIs
>
> Linux is hardly a bastion of "there can be only one", so I find the
> level of doom people are expressing over a new cli to be disingenuous.

Who expressed doom?

> Every argument made so far applies equally to HAWK and the Linbit GUI,
> yet there was no outcry when they were announced.
This is likely to be an irrelevant tangent, but the pygui (AFAIK) had two problems: it only ran on Linux (for all practical purposes), and it was unmaintained (for all practical purposes). Neither of the two is true for the shell.

> It seems duplication is only bad to those that aren't responsible for
> it.
>
>> And as a community, yes, I think we also should think about the cost
>> of choice to users - as well as the benefits. Even developers will ask
>> questions like "I want to do X; where do I contribute that?" I like
>> things that make it easier for users to use our stuff, and still I
>> need to understand how to advise them what to do when, and how the
>> various toys in the playground relate ;-)
> Presumably you'll continue to advise SLES customers to use whatever
> you ship there. Doesn't seem too complex to me.

Yep, that's what I referred to as leaving recommendations to distro maintainers and product managers. Not desirable, but if that's the case, then people at least have a right to know. I will add that this probably invalidates efforts to unify documentation, and it probably
Re: [Pacemaker] [Help] Pacemaker + Oracle Listener
On Wed, Jun 6, 2012 at 12:44 AM, Paul Damken zen.su...@gmail.com wrote:
> I'm facing issues with my cluster setup: N+1 Pacemaker hosting Oracle
> 11g instances. Node name: azteca.
>
> I cannot get oralsnr to start my DB listener; it refuses on both nodes.
> The Oracle RA starts first, after all file systems and the VIP start.
> But no way to get the listener up. When I do a manual start from
> /oracle/11.2.0/db_1/bin/lsnrctl start it works just fine (using the
> oracle user's shell prompt).
>
> CRM config, Oracle RA:
>
> primitive p_oracle1 ocf:heartbeat:oracle \
>     params sid=xib11 home=/oracle/11.2.0/db_1 user=oracle ipcrm=orauser \
>     op start interval=0 timeout=120s \
>     op stop interval=0 timeout=120s \
>     op monitor interval=15s
> primitive p_oralsnr ocf:heartbeat:oralsnr \
>     params sid=xib11 listener=LISTENER user=oracle home=/oracle/11.2.0/db_1 \
>     op start interval=0 timeout=30s \
>     op stop interval=0 timeout=30s \
>     op monitor interval=15s
> group oracle_grp p_oracle1 p_oralsnr \
>     meta target-role=Started
> order o_fs_before_listener inf: oracle_fs oracle_grp
> colocation ora_on_fs inf: oracle_grp oracle_fs
>
> ERROR LOG:
>
> azteca:/var/log # cat messages | grep p_oralsnr
> Jun 5 17:02:24 azteca crmd: [24262]: info: do_lrm_rsc_op: Performing key=20:900:7:8bf8ffb9-cc40-42c5-9dfa-cdb84ec20d97 op=p_oralsnr_monitor_0 )
> Jun 5 17:02:24 azteca lrmd: [24259]: info: rsc:p_oralsnr probe[401] (pid 9369)
> Jun 5 17:02:24 azteca lrmd: [24259]: info: operation monitor[401] on p_oralsnr for client 24262: pid 9369 exited with return code 7
> Jun 5 17:02:24 azteca crmd: [24262]: info: process_lrm_event: LRM operation p_oralsnr_monitor_0 (call=401, rc=7, cib-update=812, confirmed=true) not running
> Jun 5 17:02:34 azteca crmd: [24262]: info: do_lrm_rsc_op: Performing key=64:900:0:8bf8ffb9-cc40-42c5-9dfa-cdb84ec20d97 op=p_oralsnr_start_0 )
> Jun 5 17:02:34 azteca lrmd: [24259]: info: rsc:p_oralsnr start[404] (pid 11102)
> Jun 5 17:02:34 azteca lrmd: [24259]: info: operation start[404] on p_oralsnr for client 24262: pid 11102 exited with return code 1

This is just a generic error, so it could theoretically be anything, but often this is due to an incorrect listener.ora configuration, where the listener is attempting to bind to an IP address that doesn't exist on the node where the listener is about to start. Find that listener.ora file in your ORACLE_HOME, fix it up so the listener binds to the virtual IP, and you should hopefully be good to go.

Cheers,
Florian
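By way of illustration, the relevant part of listener.ora would look roughly like this (the VIP 192.168.1.100 is a made-up placeholder; the point is that HOST must be the cluster's virtual IP, not a node-specific address):

```
# $ORACLE_HOME/network/admin/listener.ora
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.100)(PORT = 1521))
    )
  )
```

Because the listener binds to the VIP, the IPaddr resource carrying that address must be up before the oralsnr resource starts, which the ordering in the configuration above should already guarantee.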
Re: [Pacemaker] KVM DRBD and Pacemaker
On Mon, Jun 4, 2012 at 9:51 PM, Cliff Massey cliffm...@cliffmassey.com wrote:
> I am trying to set up a cluster consisting of KVM, DRBD and pacemaker.
> Without pacemaker, DRBD and KVM are working. I can even stop everything
> on one node, promote the other to drbd primary and start the KVM
> machine on the other. However, when trying to start the resource with
> pacemaker I receive the error:
>
> lmrd error: unable to open disk path /dev/drbd0: Wrong medium type

Pacemaker config would indeed be helpful, but this sounds like a missing order and colocation constraint between your DRBD master/slave set and whatever should use that DRBD device -- probably your VirtualDomain resource.

Cheers,
Florian
Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)
On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote:
> Sorry, sent too early. That would not catch the case of cluster
> partitions joining, only the pacemaker startup with fully connected
> cluster communication already up. I thought about a dc-priority
> default of 100, and only triggering a re-election if I am DC, my
> dc-priority is 50, and I see a node joining.

Hardcoded arbitrary defaults aren't that much fun. "You can use any number, but 100 is the magic threshold" is something I wouldn't want to explain to people over and over again.

We actually discussed node defaults a while back. Those would be similar to the resource and op defaults which Pacemaker already has, and would set defaults for node attributes of newly joined nodes. At the time the idea was to support putting new joiners in standby mode by default, so when you added a node in a symmetric cluster, you wouldn't need to be afraid that Pacemaker would shuffle resources around.[1] This dc-priority would be another possibly useful use case for this.

Just my two cents.

Florian

[1] Yes, semi-doable by putting the cluster into maintenance mode before firing up the new node, setting that node into standby, and then unsetting maintenance mode. But that's just an additional step that users can easily forget about.
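The semi-workaround from footnote [1] looks like this in crm shell terms (a sketch; "node3" is a placeholder for the joining node):

```
# Freeze resource management cluster-wide before the new node joins
crm configure property maintenance-mode=true

# ... start corosync/pacemaker on node3 and wait for it to join ...

# Park the newcomer, then resume normal operation
crm node standby node3
crm configure property maintenance-mode=false
```

With node defaults, the middle steps would be unnecessary: a freshly joined node would simply come up in standby.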
Re: [Pacemaker] [RFC] [Patch] DC node preferences (dc-priority)
On Fri, May 25, 2012 at 11:38 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote:
> On Fri, May 25, 2012 at 11:15:32AM +0200, Florian Haas wrote:
>> On Fri, May 25, 2012 at 10:45 AM, Lars Ellenberg
>> lars.ellenb...@linbit.com wrote:
>>> Sorry, sent too early. That would not catch the case of cluster
>>> partitions joining, only the pacemaker startup with fully connected
>>> cluster communication already up. I thought about a dc-priority
>>> default of 100, and only triggering a re-election if I am DC, my
>>> dc-priority is 50, and I see a node joining.
>> Hardcoded arbitrary defaults aren't that much fun. "You can use any
>> number, but 100 is the magic threshold" is something I wouldn't want
>> to explain to people over and over again.
> Then don't ;-)

Not helping, and irrelevant to this case.

> Besides, that was an example. Easily possible: move the "I want to
> lose" vs "I want to win" magic number to be 0, and allow both positive
> and negative priorities. You get to decide whether positive or
> negative is the "I'd rather lose" side. Want to make that configurable
> as well?

Right. Nope, 0 is used as a threshold value in Pacemaker all over the place. So allowing both positive and negative priorities and making 0 the default sounds perfectly sane to me.

> I don't think this can be made part of the cib configuration; DC
> election takes place before cibs are resynced, so if you have
> diverging cibs, you possibly end up with a never-ending election?
> Then maybe the election is stable enough, even after this change to
> the algorithm. Andrew? But you'd need to add another trigger on
> "dc-priority in configuration changed", complicating this stuff for no
> reason.
>
>> We actually discussed node defaults a while back. Those would be
>> similar to the resource and op defaults which Pacemaker already has,
>> and would set defaults for node attributes of newly joined nodes.
>> At the time the idea was to support putting new joiners in standby
>> mode by default, so when you added a node in a symmetric cluster, you
>> wouldn't need to be afraid that Pacemaker would shuffle resources
>> around.[1] This dc-priority would be another possibly useful use case
>> for this.
> Not so sure about that.
>
>> [1] Yes, semi-doable by putting the cluster into maintenance mode
>> before firing up the new node, setting that node into standby, and
>> then unsetting maintenance mode. But that's just an additional step
>> that users can easily forget about.
> Why not simply add the node to the cib, and set it to standby, before
> it even joins for the first time?

Haha, good one. Wait, you weren't joking?

Florian
Re: [Pacemaker] DRBD LVM EXT4 NFS performance
On Sun, May 20, 2012 at 12:05 PM, Christoph Bartoschek po...@pontohonk.de wrote: Hi, we have a two-node setup with drbd below LVM and an Ext4 filesystem that is shared via NFS. The system shows low performance and lots of timeouts resulting in unnecessary failovers from pacemaker. The connection between both nodes is capable of 1 GByte/s as shown by iperf. The network between the clients and the nodes is capable of 110 MByte/s. The RAID can be filled with 450 MByte/s. No, it can't (most likely); see below. Thus I would expect to have a write performance of about 100 MByte/s. But dd gives me only 20 MByte/s. dd if=/dev/zero of=bigfile.10G bs=8192 count=1310720 1310720+0 records in 1310720+0 records out 10737418240 bytes (11 GB) copied, 498.26 s, 21.5 MB/s If you used that same dd invocation for your local test that allegedly produced 450 MB/s, you've probably been testing only your page cache. Add oflag=dsync or oflag=direct (the latter will only work locally, as NFS doesn't support O_DIRECT). If your RAID is made of reasonably contemporary SAS or SATA drives, then a sustained to-disk throughput of 450 MB/s would require about 7-9 stripes in a RAID-0 or RAID-10 configuration. Is that what you've got? Or are you writing to SSDs? While the slow dd runs there are timeouts on the server resulting in a restart of some resources. In the logfile I also see: [329014.592452] INFO: task nfsd:2252 blocked for more than 120 seconds. [329014.592820] echo 0 > /proc/sys/kernel/hung_task_timeout_secs disables this message. [329014.593273] nfsd D 0007 0 2252 2 0x [329014.593278] 88060a847c40 0046 88060a847bf8 00030001 [329014.593284] 88060a847fd8 88060a847fd8 88060a847fd8 00013780 [329014.593290] 8806091416f0 8806085bc4d0 88060a847c50 88061870c800 [329014.593295] Call Trace: [329014.593303] [8165a55f] schedule+0x3f/0x60 [329014.593309] [81265085] jbd2_log_wait_commit+0xb5/0x130 [329014.593315] [8108aec0] ?
add_wait_queue+0x60/0x60 [329014.593321] [812111b8] ext4_sync_file+0x208/0x2d0 [329014.593328] [811a62dd] vfs_fsync_range+0x1d/0x40 [329014.593339] [a0227e51] nfsd_commit+0xb1/0xd0 [nfsd] [329014.593349] [a022f28d] nfsd3_proc_commit+0x9d/0x100 [nfsd] [329014.593356] [a0222a4b] nfsd_dispatch+0xeb/0x230 [nfsd] [329014.593373] [a00e9d95] svc_process_common+0x345/0x690 [sunrpc] [329014.593379] [8105f990] ? try_to_wake_up+0x200/0x200 [329014.593391] [a00ea1e2] svc_process+0x102/0x150 [sunrpc] [329014.593397] [a02221ad] nfsd+0xbd/0x160 [nfsd] [329014.593403] [a02220f0] ? nfsd_startup+0xf0/0xf0 [nfsd] [329014.593407] [8108a42c] kthread+0x8c/0xa0 [329014.593412] [81666bf4] kernel_thread_helper+0x4/0x10 [329014.593416] [8108a3a0] ? flush_kthread_worker+0xa0/0xa0 [329014.593420] [81666bf0] ? gs_change+0x13/0x13 Does anyone have an idea what could cause such problems? I have no idea for further analysis. As a knee-jerk response, that might be the classic issue of NFS filling up the page cache until it hits the vm.dirty_ratio and then having a ton of stuff to write to disk, which the local I/O subsystem can't cope with. Cheers, Florian
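Florian's page-cache point can be made concrete with dd itself. A minimal sketch (file names and sizes are examples; oflag=dsync and oflag=direct are the standard GNU coreutils flags he mentions):

```shell
# Plain dd: data may land only in the page cache, so the reported
# throughput can be far higher than what the disks can sustain.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=16

# oflag=dsync: every write is committed to stable storage before dd
# continues, which approximates sustained to-disk throughput.
dd if=/dev/zero of=/tmp/ddtest bs=1M count=16 oflag=dsync

# oflag=direct: bypasses the page cache entirely. Works only on local
# filesystems; NFS does not support O_DIRECT (and neither does tmpfs).
dd if=/dev/zero of=/tmp/ddtest bs=1M count=16 oflag=direct 2>/dev/null \
  || echo "O_DIRECT not supported on this filesystem"

rm -f /tmp/ddtest
```

Comparing the throughput reported by the first and second invocations shows how large the page-cache effect is on a given box.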
Re: [Pacemaker] Is synchronizing rmtab needed?
On Mon, May 21, 2012 at 1:36 AM, Christoph Bartoschek po...@pontohonk.de wrote: Hi, we currently have the problem that when the NFS server is under heavy load the heartbeat:exportfs monitor script fails with a timeout because it cannot write the rmtab to the exported filesystem within the given time. So, how about increasing the timeout? My question now is: is it necessary to synchronize rmtab? Shouldn't the clients just reconnect after a timeout? Synchronizing the rmtab is meant to enable the clients to reconnect correctly after NFS _failover_, not just a brief network hiccup between the NFS client and server. Hope this helps. Cheers, Florian
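For reference, raising the timeout means something along these lines in the crm shell; the resource name and parameter values below are made-up examples, not taken from the original post:

```
primitive p_exportfs ocf:heartbeat:exportfs \
    params fsid="1" directory="/srv/nfs" \
        clientspec="10.0.0.0/24" options="rw" \
    op monitor interval="30" timeout="60"
```

The point is the timeout="60" on the monitor op: a heavily loaded NFS server gets more time to write the rmtab before the operation is flagged as failed.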
Re: [Pacemaker] question about stonith:external/libvirt
On Sun, May 20, 2012 at 6:40 AM, Matthew O'Connor m...@ecsorl.com wrote: After using the tutorial on the Hastexo site for setting up stonith via libvirt, I believe I have it working correctly...but...some strange things are happening. I have two nodes, with shared storage provided by a dual-primary DRBD resource and OCFS2. Here is one of my stonith primitives: primitive p_fence-l2 stonith:external/libvirt \ params hostlist=l2:l2.sandbox hypervisor_uri=qemu+ssh://matt@hv01/system stonith-timeout=30 pcmk_host_check=none \ op start interval=0 timeout=15 \ op stop interval=0 timeout=15 \ op monitor interval=60 \ meta target-role=Started This cluster has stonith-enabled=true in the cluster options, plus the necessary location statements in the cib. Does it have fencing resource-and-stonith in the DRBD configuration, and stonith_admin-fence-peer.sh as its fence-peer handler? To watch the DLM, I run dbench on the shared storage on the node I let live. While it's running, I creatively nuke the other node. If I just killall pacemakerd on l2 for instance, the DLM seems unaffected and the fence takes place, rebooting the now failed node l2. No real interruption of service on the surviving node, l3. Yet, if I halt -f -n on l2, the fence still takes place but the surviving node's (l3's) DLM hangs and won't come back until I bring the failed node back online. A hanging DLM is OK, and DLM recovery after the failed node comes back is OK too, but of course the DLM should also recover once it's satisfied that the offending node has been properly fenced. Any logs from stonith-ng on l3? Cheers, Florian
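For reference, the DRBD-side configuration Florian is asking about looks roughly like this (a sketch; the handler path may vary by distribution and DRBD version):

```
# fragment of a DRBD resource definition, e.g. /etc/drbd.d/r0.res
disk {
    fencing resource-and-stonith;
}
handlers {
    fence-peer "/usr/lib/drbd/stonith_admin-fence-peer.sh";
}
```

With fencing resource-and-stonith, DRBD suspends I/O on the surviving node until the fence-peer handler confirms the peer has been shot, which is exactly what a dual-primary OCFS2 setup needs.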
Re: [Pacemaker] question about stonith:external/libvirt
On Mon, May 21, 2012 at 8:14 PM, Matthew O'Connor m...@ecsorl.com wrote: On 05/21/2012 05:43 AM, Florian Haas wrote: Does it have fencing resource-and-stonith in the DRBD configuration, and stonith_admin-fence-peer.sh as its fence-peer handler? That was the problem. Totally forgot to update my DRBD configuration. I actually wasn't saying that that was the root cause of your problem. :) But it's worth looking into, anyhow. For sake of testing, I used the crm-fence-peer.sh script - it seemed to do the trick, although I strongly suspect this is the wrong script for the job. It is. No good for dual-Primary, really, as it doesn't prevent split brain in that sort of configuration. Do I need to write my own script to call stonith_admin? No, stonith_admin-fence-peer.sh ships with recent DRBD releases. Cheers, Florian
Re: [Pacemaker] Can Corosync bind to two networks
On Sat, May 12, 2012 at 2:49 AM, Steve Davidson steve.david...@pearl.com wrote: We want to run the Corosync heartbeat on the private net and, as a backup heartbeat, allow Corosync heartbeat on our public net as well. Thus in /etc/corosync/corosync.conf we need something like: bindaddr_primary: 192.168.57.0 bindaddr_secondary: 125.125.125.0 Our thinking is: if the private net connection fails but everything else is okay then we don't need to disrupt services since the private net failure won't affect our users. Is there any way to do this? Up to here the question makes sense, and Arnold already answered it. Use a redundant ring mode, and define two rings. man corosync.conf; look for redundant. Otherwise we need two interfaces connected to separate switches just for an (HA) heartbeat. This part doesn't make sense. Are you thinking that because you're using redundant rings, you _don't_ need to connect each of your nodes to two switches? Well, you do. Plugging all NICs in a redundant ring configuration into the same physical switch makes that switch a single point of failure. You can combine RRP with bonding, but regardless, one switch alone won't help. Cheers, Florian
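A sketch of what the two-ring setup looks like in /etc/corosync/corosync.conf, using the example networks from the post (the multicast addresses and ports are made up):

```
totem {
    version: 2
    rrp_mode: passive    # or "active"; see man corosync.conf
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.57.0
        mcastaddr: 239.255.1.1
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 125.125.125.0
        mcastaddr: 239.255.2.1
        mcastport: 5407
    }
}
```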
Re: [Pacemaker] pacemaker+ocfs2 +RAC
On Mon, Apr 2, 2012 at 7:00 AM, Ruwan Fernando ruwanm...@gmail.com wrote: Hi, I was required to build an Oracle cluster, so I configured pacemaker+corosync+drbd+ocfs2 and built an active-active cluster. Why? pacemaker+corosync+drbd+xfs+oracle works just fine and is fully integrated with Pacemaker. RAC is primarily for scaleout, not for HA. And as Lars said, RAC won't accept any cluster manager other than its own. Florian
Re: [Pacemaker] socket is incremented after running crm shell
On Tue, Apr 3, 2012 at 5:53 PM, David Vossel dvos...@redhat.com wrote: I see the same thing. I'm using the latest pacemaker source from the master branch, so this definitely still exists. For me the file leak occurs every time I issue a cibadmin --replace --xml-file command. The shell is doing the same command internally for adding and removing resources, so I see it there as well. I opened a bug report for this. http://bugs.clusterlabs.org/show_bug.cgi?id=5051 What version of glib is this? Florian
Re: [Pacemaker] Pacemaker 1.1.7 now available
On Mon, Apr 2, 2012 at 11:33 AM, Andrew Beekhof and...@beekhof.net wrote: On Fri, Mar 30, 2012 at 8:33 PM, Florian Haas flor...@hastexo.com wrote: On Fri, Mar 30, 2012 at 10:37 AM, Andrew Beekhof and...@beekhof.net wrote: I blogged about it, which automatically got sent to twitter, and I updated the IRC channel topic, but alas I forgot to mention it here :-) So in case you missed it, 1.1.7 is finally out. Special mention is due to David and Yan for the nifty features they've been writing lately. Thanks guys! Quick question: the blog post doesn't mention libqb specifically, the changelog says core: *Support* libqb for logging (as opposed to require) but the RPM spec file introduces a hard BuildRequires on libqb-devel. Is there such a thing as a soft BuildRequires? Nope. I was repeating myself redundantly. I apologetically apologize. Is this a hard dependency? Not yet, but IPC will likely be libqb-based for 1.1.8 which will make it a hard requirement. IOW does libqb have to be packaged on distros where it's not currently available, or can people build without libqb support and still be able to use 1.1.7? For 1.1.7 you can build without. Thanks. Cheers, Florian
Re: [Pacemaker] Corosync with puppet
On Mon, Apr 2, 2012 at 11:34 AM, Hugo Deprez hugo.dep...@gmail.com wrote: Dear community, I am using a puppet module to manage my cluster. I get a weird thing with the start/stop of the corosync daemon. When I modify the corosync.conf file, puppet is asked to restart / reload corosync, but it fails on the command: start-stop-daemon --stop --quiet --retry forever/QUIT/1 --pidfile /var/run/corosync.pid This command doesn't seem to work when corosync is running. I would think that your local Corosync doesn't like the fact that it's the only Corosync instance configured to run with parameters different from the other Corosync instances in the cluster. What I always do when I need to make changes to the Corosync config is enable Pacemaker maintenance mode, shut down pacemakerd and corosync on all nodes, make the change, fire corosync and pacemakerd back up, and disable maintenance mode. I don't know how you would duplicate this in puppet, or if that's even possible, but that would be my generally recommended approach. Cheers, Florian
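Spelled out as commands, the sequence described above looks roughly like this (init script names vary by distribution; the stop/edit/start steps run on every node):

```
crm configure property maintenance-mode=true
service pacemaker stop
service corosync stop
# ... edit /etc/corosync/corosync.conf on all nodes ...
service corosync start
service pacemaker start
crm configure property maintenance-mode=false
```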
Re: [Pacemaker] OCF_RESKEY_CRM_meta_{ordered,notify,interleave}
On Mon, Apr 2, 2012 at 11:54 AM, Andrew Beekhof and...@beekhof.net wrote: On Fri, Mar 30, 2012 at 7:34 PM, Florian Haas flor...@hastexo.com wrote: On Fri, Mar 30, 2012 at 1:12 AM, Andrew Beekhof and...@beekhof.net wrote: Because it was felt that RAs shouldn't need to know. Those options change pacemaker's behaviour, not the RAs. But subsequently, in lf#2391, you convinced us to add notify since it allowed the drbd agent to error out if they were not turned on. Yes, and for ordered the motivation is exactly the same. Let me give a bit of background info. I'm currently working on an RA for GlusterFS volumes (the server-side stuff, everything client side is already covered in ocf:heartbeat:Filesystem). GlusterFS volumes are composed of bricks, and for every brick there's a separate process to be managed on each cluster node. When these brick processes fail, GlusterFS has no built-in way to recover, and that's where Pacemaker can be helpful. Obviously, you would run that RA as a clone, on however many nodes constitute your GlusterFS storage cluster. Now, while brick daemons can be _monitored_ individually, they can only be _started_ as part of the volume, with the gluster volume start command. And if we start a volume simultaneously on multiple nodes, GlusterFS just produces an error on all but one of them, and that error is also a generic one and not discernible from other errors by exit code (yes, you may rant). So, whenever we need to start >1 clone instance, we run into this problem: 1. Check whether brick is already running. 2. No, it's not. Start volume (this leaves other bricks untouched, but fires up the brick daemons expected to run locally). 3. Grumble. A different node just did the same thing. 4. All but one fail on start.
Yes, all this isn't necessarily wonderful design (the start volume command could block until volume operations have completed on other servers, or it could error out with a try again error, or it could sleep randomly before retrying, or something else), but as it happens configuring the clone as ordered makes all of this evaporate. And it simply would be nice to be able to check whether clone ordering is enabled, during validate. I'd need more information. The RA shouldn't need to care I would have thought. The ordering happens in the PE/crmd, the RA should just do what it's told. Quite frankly, I don't quite get this segregation of meta attributes we expect to be relevant to the RA The number of which is supposed to be zero. I'm not sure cutting down on questions to the mailing list is a good enough reason for adding additional exceptions. Well, but you did read the technical reason I presented here? The one truly valid exception in my mind is globally-unique, since the monitor operation has to work quite differently. Why are we not supposed to check for things like notify, ordered, allow-migrate? My concern with providing them all to RAs is that someone will probably start abusing them. _Everything_ about an RA can be abused. Why is that any concern of yours? You can't possibly enforce, from Pacemaker, that an RA actually does what it's supposed to do. Florian
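For illustration, the drbd-style guard discussed above (lf#2391) boils down to a few lines in the RA's validate path. This is only a sketch; the function name is made up, and OCF_ERR_CONFIGURED is the standard OCF return code 6:

```shell
# Pacemaker exports clone meta attributes into the RA environment as
# OCF_RESKEY_CRM_meta_* variables; refuse to run without notify=true.
OCF_ERR_CONFIGURED=6

validate_clone_options() {
    if [ "${OCF_RESKEY_CRM_meta_notify}" != "true" ]; then
        echo "ERROR: this resource must be cloned with notify=true" >&2
        return $OCF_ERR_CONFIGURED
    fi
    return 0
}
```

The same pattern would apply to an ordered check, if Pacemaker exported that attribute too, which is precisely the point of the thread.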
Re: [Pacemaker] OCF_RESKEY_CRM_meta_{ordered,notify,interleave}
On Mon, Apr 2, 2012 at 12:32 PM, Andrew Beekhof and...@beekhof.net wrote: Well, but you did read the technical reason I presented here? Yes, and it boiled down to don't let the user hang themselves. Which is a noble goal, I just don't like the way we're achieving it. Why not advertise the requirements in the metadata somehow? The only way to do that is in the longdesc. There is nothing in the schema that would allow us to do this in a machine-readable way so the shell, HAWK, LCMC or anything else could warn the user by themselves. Why are we not supposed to check for things like notify, ordered, allow-migrate? My concern with providing them all to RAs is that someone will probably start abusing them. _Everything_ about an RA can be abused. Why is that any concern of yours? You can't possibly enforce, from Pacemaker, that an RA actually does what it's supposed to do. No, but I can take away the extra ammo :) You can count on there always being one round in the gun pointed at your foot. Florian
Re: [Pacemaker] Migration of lower resource causes dependent resources to restart
On Thu, Mar 29, 2012 at 8:35 AM, Andrew Beekhof and...@beekhof.net wrote: On Thu, Mar 29, 2012 at 5:28 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, Pacemaker restarts resources when a resource they depend on (ordering only, no colocation) is migrated. I mean that when I do crm resource migrate lustre, I get LogActions: Migrate lustre#011(Started lustre03-left - lustre04-left) LogActions: Restart mgs#011(Started lustre01-left) I only have one ordering constraint for these two resources: order mgs-after-lustre inf: lustre:start mgs:start This reminds me of what happened with reload in the past (dependent resources restarting when a lower resource was reloaded). Shouldn't this be changed? Migration usually means that the service is not interrupted... Is that strictly true? Always? No. Few things are always true. :) However, see below. My understanding was: although A thinks the migration happens instantaneously, it is in fact more likely to be pause+migrate+resume, and anyone trying to talk to A during that time is going to be disappointed. I tend to be with Vladislav on this one. The thing that most people would expect from a live migration is that it's interruption free. And what allow-migrate was first implemented for (iirc), live migrations for Xen, does fulfill that expectation. Same thing is true for live migrations in libvirt/KVM, and I think anyone would expect essentially the same thing from checkpoint/restore migrations where they're available. So I guess it's reasonable to assume that if one resource migrates, dependent resources need not be restarted. But since Pacemaker now does restart them, you might need to figure out a way to preserve the existing functionality for users who rely on that. Not sure if any do, though. Cheers, Florian -- Need help with High Availability?
http://www.hastexo.com/now
Re: [Pacemaker] OCF_RESKEY_CRM_meta_{ordered,notify,interleave}
On Fri, Mar 30, 2012 at 1:12 AM, Andrew Beekhof and...@beekhof.net wrote: Because it was felt that RAs shouldn't need to know. Those options change pacemaker's behaviour, not the RAs. But subsequently, in lf#2391, you convinced us to add notify since it allowed the drbd agent to error out if they were not turned on. Yes, and for ordered the motivation is exactly the same. Let me give a bit of background info. I'm currently working on an RA for GlusterFS volumes (the server-side stuff, everything client side is already covered in ocf:heartbeat:Filesystem). GlusterFS volumes are composed of bricks, and for every brick there's a separate process to be managed on each cluster node. When these brick processes fail, GlusterFS has no built-in way to recover, and that's where Pacemaker can be helpful. Obviously, you would run that RA as a clone, on however many nodes constitute your GlusterFS storage cluster. Now, while brick daemons can be _monitored_ individually, they can only be _started_ as part of the volume, with the gluster volume start command. And if we start a volume simultaneously on multiple nodes, GlusterFS just produces an error on all but one of them, and that error is also a generic one and not discernible from other errors by exit code (yes, you may rant). So, whenever we need to start >1 clone instance, we run into this problem: 1. Check whether brick is already running. 2. No, it's not. Start volume (this leaves other bricks untouched, but fires up the brick daemons expected to run locally). 3. Grumble. A different node just did the same thing. 4. All but one fail on start. Yes, all this isn't necessarily wonderful design (the start volume command could block until volume operations have completed on other servers, or it could error out with a try again error, or it could sleep randomly before retrying, or something else), but as it happens configuring the clone as ordered makes all of this evaporate.
And it simply would be nice to be able to check whether clone ordering is enabled, during validate. I'd need more information. The RA shouldn't need to care I would have thought. The ordering happens in the PE/crmd, the RA should just do what it's told. Quite frankly, I don't quite get this segregation of meta attributes we expect to be relevant to the RA and meta attributes the RA shouldn't care about. Can't we just have a rule that _all_ meta attributes, like parameters, are just always available in the RA environment with the OCF_RESKEY_CRM_meta_ prefix? Cheers, Florian
Re: [Pacemaker] Pacemaker 1.1.7 now available
On Fri, Mar 30, 2012 at 10:37 AM, Andrew Beekhof and...@beekhof.net wrote: I blogged about it, which automatically got sent to twitter, and I updated the IRC channel topic, but alas I forgot to mention it here :-) So in case you missed it, 1.1.7 is finally out. Special mention is due to David and Yan for the nifty features they've been writing lately. Thanks guys! Quick question: the blog post doesn't mention libqb specifically, the changelog says core: *Support* libqb for logging (as opposed to require) but the RPM spec file introduces a hard BuildRequires on libqb-devel. Is this a hard dependency? IOW does libqb have to be packaged on distros where it's not currently available, or can people build without libqb support and still be able to use 1.1.7? Cheers, Florian
Re: [Pacemaker] Nodes not rejoining cluster
On Fri, Mar 30, 2012 at 5:38 PM, Gregg Stock gr...@damagecontrolusa.com wrote: I took the last 200 lines of each. Can you check the health of the Corosync membership, as per this URL? http://www.hastexo.com/resources/hints-and-kinks/checking-corosync-cluster-membership Do _all_ nodes agree on the health of the rings, and on the cluster member list? Florian
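The checks behind that URL amount to running something like the following on every node and comparing the output (the objdb key is from corosync 1.x and may differ between versions):

```
# ring status; each ring should report "active with no faults"
corosync-cfgtool -s

# current member list
corosync-objctl runtime.totem.pg.mrp.srp.members
```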
Re: [Pacemaker] Nodes not rejoining cluster
On Fri, Mar 30, 2012 at 6:09 PM, Gregg Stock gr...@damagecontrolusa.com wrote: That looks good. They were all the same and had the correct ip addresses. So you've got both healthy rings, and all 5 nodes have 5 members in the membership list? Then this would make it a Pacemaker problem. IIUC the code causing Pacemaker to discard the update from a node that is not in our membership has actually been removed from 1.1.7[1] so an upgrade may not be a bad idea, but you'll probably have to wait for a few more days until packages become available. Still, out of curiosity, and since you're saying this is a test cluster: what happens if you shut down corosync and Pacemaker on *all* the nodes, and bring it back up? We've had a few people report these not in our membership issues on the list before, and they seem to appear in a very sporadic and transient fashion, so the root cause (which may well be totally trivial) hasn't really been found out -- as far as I can tell, at least. Hence, my question of whether the issue persists after a full cluster shutdown. Florian [1] https://github.com/ClusterLabs/pacemaker/commit/03f6105592281901cc10550b8ad19af4beb5f72f -- note Andrew will rightfully flame me to a crisp if I've misinterpreted that commit, so caveat lector. :)
Re: [Pacemaker] manually failing back resources when set sticky
On Fri, Mar 30, 2012 at 8:26 PM, Brian J. Murrell br...@interlinx.bc.ca wrote: In my cluster configuration, each resource can be run on one of two nodes, and I designate a primary and a secondary using location constraints such as: location FOO-primary FOO 20: bar1 location FOO-secondary FOO 10: bar2 And I also set a default stickiness to prevent auto-fail-back (i.e. to prevent flapping): rsc_defaults $id=rsc-options resource-stickiness=1000 This all works as I expect. Resources run where I expect them to while everything is operating normally and when a node fails the resource migrates to the secondary and stays there even when the primary node comes back. The question is, what is the proper administrative command(s) to move the resource back to its primary after I have manually determined that that node is OK after coming back from a failure? I figure I could just create a new resource constraint, wait for the migration and then remove it, but I just wonder if there is a more atomic move back to your preferred node command I can issue. crm configure rsc_defaults resource-stickiness=0 ... and then when resources have moved back, set it to 1000 again. It's really that simple. :) Cheers, Florian
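As a worked example, the fail-back procedure is just two commands (the value 1000 matches the rsc_defaults from the original post):

```
# drop default stickiness so the location scores win again
crm configure rsc_defaults resource-stickiness=0
# ... wait until resources have migrated back to their preferred nodes ...
crm configure rsc_defaults resource-stickiness=1000
```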
Re: [Pacemaker] Issue with ordering
On Thu, Mar 29, 2012 at 10:07 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I'm continuing experiments with lustre on stacked drbd, and see following problem: At the risk of going off topic, can you explain *why* you want to do this? If you need a distributed, replicated filesystem with asynchronous replication capability (the latter presumably for DR), why not use a Distributed-Replicated GlusterFS volume with geo-replication? Note that I know next to nothing about your actual detailed requirements, so GlusterFS may well be non-ideal for you and my suggestion may thus be moot, but it would be nice if you could explain why you're doing this. Cheers, Florian
Re: [Pacemaker] Issue with ordering
On Thu, Mar 29, 2012 at 11:40 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Florian, 29.03.2012 11:54, Florian Haas wrote: On Thu, Mar 29, 2012 at 10:07 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi Andrew, all, I'm continuing experiments with lustre on stacked drbd, and see following problem: At the risk of going off topic, can you explain *why* you want to do this? If you need a distributed, replicated filesystem with asynchronous replication capability (the latter presumably for DR), why not use a Distributed-Replicated GlusterFS volume with geo-replication? I need fast POSIX fs scalable to tens of petabytes with support for fallocate() and friends to prevent fragmentation. I generally agree with Linus about FUSE and userspace filesystems in general, so that is not an option. I generally agree with Linus and just about everyone else that filesystems shouldn't require invasive core kernel patches. But I digress. :) Using any API except what VFS provides via syscalls+glibc is not an option too because I need access to files from various scripted languages including shell and directly from a web server written in C. Having bindings for them all is a real overkill. And it all is in userspace again. So I generally have choice of CEPH, Lustre, GPFS and PVFS. CEPH is still very alpha, so I can't rely on it, although I keep my eye on it. GPFS is not an option because it is not free and produced by IBM (can't say which of these two is more important ;) ) Can't remember why exactly PVFS is a no-go, their site is down right now. Probably userspace server implementation (although some examples like nfs server discredit idea of in-kernel servers, I still believe this is a way to go). Ceph is 100% userspace server side, jftr. :) And it has no async replication capability at this point, which you seem to be after. Lustre is widely deployed, predictable and stable. It fully runs in kernel space. 
Although Oracle did its best to bury Lustre development, it is actively developed by whamcloud and company. They have builds for EL6, so I'm pretty happy with this. Lustre doesn't have any replication built-in so I need to add it on a lower layer (no rsync, no rsync, no rsync ;) ). DRBD suits my needs for a simple HA. But I also need datacenter-level HA, that's why I evaluate stacked DRBD and tickets with booth. So, frankly speaking, I decided to go with Lustre not because it is so cool (it has many-many niceties), but because all others I know do not suit my needs at all due to various reasons. Hope this clarifies my point, It does. Doesn't necessarily mean I agree, but the point you're making is fine. Cheers, Florian
Re: [Pacemaker] resources show as running on all nodes right after adding them
On Wed, Mar 28, 2012 at 4:26 PM, Brian J. Murrell br...@interlinx.bc.ca wrote: We seem to have occasion where we find crm_resource reporting that a resource is running on more (usually all!) nodes when we query right after adding it: # crm_resource --resource chalkfs-OST_3 --locate resource chalkfs-OST_3 is running on: chalk02 resource chalkfs-OST_3 is running on: chalk03 resource chalkfs-OST_3 is running on: chalk04 resource chalkfs-OST_3 is running on: chalk01 Further checking reveals: # crm status Last updated: Mon Dec 19 11:30:31 2011 Stack: openais Current DC: chalk01 - partition with quorum Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87 4 Nodes configured, 4 expected votes 3 Resources configured. Online: [ chalk01 chalk02 chalk03 chalk04 ] MGS_1 (ocf::hydra:Target): Started chalk01 chalkfs-OST_3 (ocf::hydra:Target) Started [ chalk02 chalk03 chalk04 chalk01 ] resource chalkfs-OST_3 is running on: chalk02 resource chalkfs-OST_3 is running on: chalk03 resource chalkfs-OST_3 is running on: chalk04 resource chalkfs-OST_3 is running on: chalk01 Clearly this resource is not running on all nodes, so why is it being reported as such? Probably because your resource agent reports OCF_SUCCESS on a probe operation when it ought to be returning OCF_NOT_RUNNING. Pastebin the source of ocf:hydra:Target and someone will be able to point you to the exact part of the RA that's causing the problem. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
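The distinction Florian is pointing at can be sketched in shell. This is a hypothetical, minimal monitor function (the actual ocf:hydra:Target code was never posted), using the OCF exit codes 0 (OCF_SUCCESS) and 7 (OCF_NOT_RUNNING):

```shell
# Hypothetical sketch of a probe-safe monitor action; pidfile path and
# function name are invented for illustration.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

target_monitor() {
    pidfile=$1
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    # On a probe where the service is absent, report "not running".
    # Returning OCF_SUCCESS here is exactly the bug described above:
    # Pacemaker then believes the resource is active on every node.
    return $OCF_NOT_RUNNING
}

target_monitor /tmp/no-such-target.pid
echo "monitor exit code: $?"
# prints: monitor exit code: 7
```

The point is simply that "I can't find my service" must map to OCF_NOT_RUNNING, never to OCF_SUCCESS, on the initial probe.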
Re: [Pacemaker] resources show as running on all nodes right after adding them
On Wed, Mar 28, 2012 at 5:07 PM, Brian J. Murrell br...@interlinx.bc.ca wrote: On 12-03-28 10:39 AM, Florian Haas wrote: Probably because your resource agent reports OCF_SUCCESS on a probe operation To be clear, is this the status $OP in the agent? Nope, monitor. Of course, in your implementation monitor may be just a wrapper around status -- no way to tell without knowing any details about the agent. That being said, if there's really an upstream supported resource agent as Bernd is suggesting, why not use that? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
[Pacemaker] High Performance High Availability Guide: new community documentation project
Hi everyone, for those interested in contributing to a community documentation project focusing on performance optimization in high availability clusters, please take a look at the following URLs: https://github.com/fghaas/hp-ha-guide (GitHub repo) http://www.hastexo.com/node/173 (blog post -- feel free to skip the Past and Present part; those are unimportant compared to Future) This is a fledgling project and not complete by any stretch of the imagination. Comments and feedback are much, much appreciated. Let's see if we can get this done. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Resource-level fencing without stonith?
On Fri, Mar 23, 2012 at 6:07 PM, Lajos Pajtek lajospaj...@yahoo.com wrote: Hi, I am building a two-node, active-standby cluster with shared storage. I think I got the basic primitives right, but fencing, implemented using SCSI persistent reservations, gives me some headache. First, I am unable to get stonith:fence_scsi work on RH/CentOS 6. (Using the sg_persist utility I am able to register keys, etc so that's not the problem.) Any specific reason for not using IPMI? That's practically ubiquitous, and pretty much always works. This made me think about the fact that conceptually SCSI fencing should be resource-level fencing, not node-level fencing. The other node is not powered down or rebooted so perhaps I shouldn't be using stonith at all. Currently I think about having stonith-enabled=false and I am writing a master-slave resource agent script to manage the SCSI persistent reservations in case of fail-over. The idea of such a resource agent is fine, but please don't write one from scratch. Instead, expand on this one: https://github.com/nif/ClusterLabs__resource-agents/blob/master/heartbeat/sg_persist That one's off to a good start, but the original author never had time to finish it. Mind you, you'll likely still want STONITH, even if you use the sg_persist RA. Hope this helps. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Resource Agent ethmonitor
On Tue, Mar 20, 2012 at 4:18 PM, Fiorenza Meini fme...@esseweb.eu wrote: Hi there, has anybody configured successfully the RA specified in the object of the message? I got this error: if_eth0_monitor_0 (node=fw1, call=2297, rc=-2, status=Timed Out): unknown exec error Your ethmonitor RA missed its 50-second timeout on the probe (that is, the initial monitor operation). You should be seeing Monitoring of if_eth0 failed, X retries left warnings in your logs. Grepping your syslog for ethmonitor will probably turn up some useful results. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] [Openstack] Howto Nova setup with HA?
Hi everyone, apologies for the cross-post; I believe this might be interesting to people on both the openstack and the pacemaker lists. Please see below. On Tue, Feb 14, 2012 at 9:07 AM, i3D.net - Tristan van Bokkem tristanvanbok...@i3d.nl wrote: Hi Stackers, It seems running Openstack components in High Availability hasn't been really a focus point lately, am I right? The general docs don't really mention HA except for nova-network. So I did some research on how to run Nova in High Availability and have some questions about it: The docs guide you on how to setup one cloud controller (running MySQL, nova-api, RabbitMQ etc.) and 2+n nodes for nova compute/network. But they do not mention how to make the cloud controller redundant. If the cloud controller breaks we have a serious problem! So, we can run MySQL in master-master mode on multiple hosts, we can run nova-api on several hosts and load balance those and RabbitMQ has a cluster ha setup as well but is this the way to go? I can't find a clear answer to this. I am hoping one can shine some light on this! Best regards, Tristan van Bokkem Datacenter Operations I've taken the liberty to put together a bit of a summary of the discussion we've had here,[1] roll it into a design summit brainstorm proposal, and also post it on my blog, here: http://www.hastexo.com/blogs/florian/2012/03/21/high-availability-openstack I hope it's not a violation of list etiquette to say that instead of cross-posting all replies to both lists, everyone's welcome to make comments on that blog post, too (use your Launchpad OpenID). Please feel free to flame me to a crisp or call me an idiot; as some of you are aware I'm quite firmly an HA guy getting into OpenStack, rather than the other way around. Even if the design summit proposal doesn't make it through, perhaps a few interested people (Monty? Jay? Adam? Major?) would like to sit down over beverages to discuss this in person. All comments and feedback much appreciated. Thanks!
Cheers, Florian [1] Pacemaker subscribers, for context the full thread is at http://www.mail-archive.com/openstack@lists.launchpad.net/msg07495.html
Re: [Pacemaker] How can I preview the shadow configuration?
On Tue, Mar 20, 2012 at 11:15 AM, Rasto Levrinc rasto.levr...@gmail.com wrote: 2012/3/20 Mars gu gukaicoros...@163.com: Hi, I want to execute the command, but this problem occurred: [root@h10_148 ~]# ptest -bash: ptest: command not found How can I preview the shadow configuration? ptest has been replaced by crm_simulate. I thought I recalled that ptest was kicked out of the RHEL/CentOS packages in 1.1.6, and that 1.1.5 still shipped with it. At any rate, crm_simulate should be in both, and it would be the preferred utility to use. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
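For reference, a typical crm_simulate invocation for previewing a shadow configuration might look like the following. This is a sketch only (the shadow name "test" is invented), relying on the fact that the CIB tools honour the CIB_shadow environment variable:

```shell
# Show the live cluster state plus allocation scores:
crm_simulate -L -s

# Preview what the policy engine would do with a shadow CIB instead
# (-S simulates the transition and shows the resulting status):
CIB_shadow=test crm_simulate -L -S
```

This replaces the old `ptest -L` style invocations mentioned in the thread.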
Re: [Pacemaker] Using shadow configurations noninteractively
On Mon, Mar 19, 2012 at 8:00 PM, Phil Frost p...@macprofessionals.com wrote: I'm attempting to automate my cluster configuration with Puppet. I'm already using Puppet to manage the configuration of my Xen domains. I'd like to instruct puppet to apply the configuration (via cibadmin) to a shadow config, but I can't find any sure way to do this. The issue is that running crm_shadow --create ... starts a subshell, but there's no easy way I can tell puppet to run a command, then run another command in the subshell it creates. Normally I'd expect some command-line option, but I can't find any. It does look like it sets the environment variable CIB_shadow. Is that all there is to it? Is it safe to rely on that behavior? I've never tried this specific use case, so bear with me while I go out on a limb, but the crm shell is fully scriptable. Thus you *should* be able to generate a full-blown crm script, with cib foo commands and whathaveyou, in a temporary file, and then just do crm /path/to/temp/file. Does that work for you? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
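Florian's suggestion, sketched out (the shadow CIB name and resource are invented for illustration; treat this as an untested example):

```shell
# Generate a crm script in a temporary file, then feed it to the crm
# shell non-interactively, exactly as a tool like Puppet could do.
cat > /tmp/cluster-config.crm <<'EOF'
cib new staging
configure primitive p_dummy ocf:pacemaker:Dummy op monitor interval=30s
cib commit staging
EOF
crm /tmp/cluster-config.crm
```

The `cib new` / `cib commit` commands inside the script take the place of wrapping the whole run in a `crm_shadow --create` subshell.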
Re: [Pacemaker] How to setup STONITH in a 2-node active/passive linux HA pacemaker cluster?
On Mon, Mar 19, 2012 at 8:14 PM, Mathias Nestler mathias.nest...@barzahlen.de wrote: Hi everyone, I am trying to setup an active/passive (2 nodes) Linux-HA cluster with corosync and pacemaker to hold a PostgreSQL-Database up and running. It works via DRBD and a service-ip. If node1 fails, node2 should take over. The same if PG runs on node2 and it fails. Everything works fine except the STONITH thing. Between the nodes is an dedicated HA-connection (10.10.10.X), so I have the following interface configuration: eth0 eth1 host 10.10.10.251 172.10.10.1 node1 10.10.10.252 172.10.10.2 node2 Stonith is enabled and I am testing with a ssh-agent to kill nodes. crm configure property stonith-enabled=true crm configure property stonith-action=poweroff crm configure rsc_defaults resource-stickiness=100 crm configure property no-quorum-policy=ignore crm configure primitive stonith_postgres stonith:external/ssh \ params hostlist=node1 node2 crm configure clone fencing_postgres stonith_postgres You're missing location constraints, and doing this with 2 primitives rather than 1 clone is usually cleaner. The example below is for external/libvirt rather than external/ssh, but you ought to be able to apply the concept anyhow: http://www.hastexo.com/resources/hints-and-kinks/fencing-virtual-cluster-nodes Hope this helps. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
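Applied back to the original external/ssh setup, the two-primitive layout Florian recommends might look roughly like this (an untested sketch; the resource and constraint names are invented). The key idea is that each fencing primitive is forbidden, via a location constraint, from running on the very node it is meant to fence:

```shell
crm configure primitive stonith_node1 stonith:external/ssh \
    params hostlist="node1"
crm configure primitive stonith_node2 stonith:external/ssh \
    params hostlist="node2"
# Never let a node host the device that is supposed to kill it:
crm configure location l_stonith_node1 stonith_node1 -inf: node1
crm configure location l_stonith_node2 stonith_node2 -inf: node2
```

With a clone of a single primitive covering both hosts, by contrast, a node could end up responsible for fencing itself.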
Re: [Pacemaker] Using shadow configurations noninteractively
On Mon, Mar 19, 2012 at 9:00 PM, Phil Frost p...@macprofessionals.com wrote: On Mar 19, 2012, at 15:22 , Florian Haas wrote: On Mon, Mar 19, 2012 at 8:00 PM, Phil Frost p...@macprofessionals.com wrote: I'm attempting to automate my cluster configuration with Puppet. I'm already using Puppet to manage the configuration of my Xen domains. I'd like to instruct puppet to apply the configuration (via cibadmin) to a shadow config, but I can't find any sure way to do this. The issue is that running crm_shadow --create ... starts a subshell, but there's no easy way I can tell puppet to run a command, then run another command in the subshell it creates. Normally I'd expect some command-line option, but I can't find any. It does look like it sets the environment variable CIB_shadow. Is that all there is to it? Is it safe to rely on that behavior? I've never tried this specific use case, so bear with me while I go out on a limb, but the crm shell is fully scriptable. Thus you *should* be able to generate a full-blown crm script, with cib foo commands and whathaveyou, in a temporary file, and then just do crm /path/to/temp/file. Does that work for you? I don't think so, because the crm shell, unlike cibadmin, has no idempotent method of configuration I've found. With cibadmin, I can generate the configuration for the primitive and associated location constraints for each Xen domain in one XML file, and feed it cibadmin -M as many times as I want without error. I know that by running that command, the resulting configuration is what I had in the file, regardless if the configuration already existed, did not exist, or existed but some parameters were different. To do this with crm, I'd have to also write code which checks if things are configured as I want them, then take different actions if it doesn't exist, already exists, or already exists but has the incorrect value.
That's not impossible, but it's far harder to develop and quite likely I'll make an error in all that logic that will automate the destruction of my cluster. Huh? What's wrong with crm configure load replace somefile? Anyhow, I think you haven't really stated what you are trying to achieve, in detail. So: what is it that you want to do exactly? Florian -- Need help with High Availability? http://www.hastexo.com/now
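For the idempotency Phil is after, the crm equivalent of repeatedly feeding cibadmin -M the same file would be (file path is an example):

```shell
# Replace the cluster configuration with the contents of the file.
# Running this again with an unchanged file is a no-op, which is
# exactly the idempotent behavior a tool like Puppet needs.
crm configure load replace /etc/cluster/cib-config.crm
```

The crm shell also offers `crm configure load update`, which merges the file into the existing configuration instead of replacing it wholesale.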
Re: [Pacemaker] offtopic scalable block-device
On Fri, Mar 16, 2012 at 10:13 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Hello I search a solution for scalable block device (dist that can extend if we add some machines to cluster). Only what i find accepten on my task is ceph + RDB, but ceph on my test i very unstable(regulary crash of all it daemons) + have poor integration with pacemaker. So does anybody recommend some solution??? Which Ceph version are you using? Both the Ceph daemons and RBD are fully integrated into Pacemaker in upstream git. https://github.com/ceph/ceph/tree/master/src/ocf You may want to look at http://www.hastexo.com/category/tags/ceph for upcoming updates on this (RSS feed icon at the bottom). Hope this helps. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] offtopic scalable block-device
On Fri, Mar 16, 2012 at 11:06 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 16.03.2012 12:13, ruslan usifov wrote: Hello I search a solution for scalable block device (dist that can extend if we add some machines to cluster). Only what i find accepten on my task is ceph + RDB, but ceph on my test i very unstable(regulary crash of all it daemons) + have poor integration with pacemaker. So does anybody recommend some solution??? I'm now investigating possibilities of using Lustre+DRBD+pacemaker. Lustre is now available for EL6 thanks whamcloud and others. That's an option for a scalable _filesystem_, but the OP's question was about a block device, and to the best of my knowledge Lustre doesn't offer that. Unless you want to use loop devices in Lustre, which sounds awkward to say the least. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] offtopic scalable block-device
On Fri, Mar 16, 2012 at 11:14 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-03-16T11:13:17, Florian Haas flor...@hastexo.com wrote: Which Ceph version are you using? Both the Ceph daemons and RBD are fully integrated into Pacemaker in upstream git. https://github.com/ceph/ceph/tree/master/src/ocf You may want to look at http://www.hastexo.com/category/tags/ceph for upcoming updates on this (RSS feed icon at the bottom). is there a reason for integrating ceph with pacemaker? ceph does internal monitoring of OSTs etc anyway, doesn't it? Assuming you're referring to OSDs, yes it does. It does automatic failover for MDSs (if you use them) and MONs too. But it currently has no means of recovering an osd/mds/mon daemon in place when it crashes, and that's what those RAs do. Really trivial. Clearly, and the ceph devs and I agree on this, this is a stop-gap until upstart or systemd jobs for the ceph daemons (with respawn capability, of course) become widely available. The ocf:ceph:rbd RA by contrast serves an entirely different purpose, and I currently don't see how _that_ would be replaced by upstart or systemd. Unless either of those becomes so powerful (and cluster-aware) that we don't need Pacemaker at all anymore, but I don't see that happen anytime soon. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] offtopic scalable block-device
On Fri, Mar 16, 2012 at 12:50 PM, ruslan usifov ruslan.usi...@gmail.com wrote: I crash i have follow stack trcae How about taking that to the ceph-devel list? Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] offtopic scalable block-device
On Fri, Mar 16, 2012 at 12:42 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-03-16T11:28:36, Florian Haas flor...@hastexo.com wrote: is there a reason for integrating ceph with pacemaker? ceph does internal monitoring of OSTs etc anyway, doesn't it? Assuming you're referring to OSDs, yes it does. It does automatic failover for MDSs (if you use them) and MONs too. But it currently has no means of recovering an osd/mds/mon daemon in place when it crashes, and that's what those RAs do. Really trivial. Yes, I need to stop calling them OSTs, but that's what object storage targets were called before ceph came along ;-) Sorry. Yes, of course, I mean OSDs. Would this not be more readily served by a simple while loop doing the monitoring, even if systemd/upstart aren't around? Pacemaker is kind of a heavy-weight here. If you prefer to suggest a self-hacked while loop to your customers I'm not stopping you. The ocf:ceph:rbd RA by contrast serves an entirely different purpose, and I currently don't see how _that_ would be replaced by upstart or systemd. Unless either of those becomes so powerful (and cluster-aware) that we don't need Pacemaker at all anymore, but I don't see that happen anytime soon. Agreed. I was mostly curious about the server-side. Thanks for the clarification. I forgot to add, if you actually want to use a ceph _filesystem_ as a cloned Pacemaker resource, ocf:heartbeat:Filesystem now has support for that too. But that was just a trivial three-line patch, so nothing new there. Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] offtopic scalable block-device
On Fri, Mar 16, 2012 at 12:24 PM, ruslan usifov ruslan.usi...@gmail.com wrote: Luster looks very cool and stability, but it doesn't provide scalable block device (Ceph allow it throw RDB), require patched kernel (i doesn't find more modern patched kernels for ubuntu lucid), so i think that it doesn't acceptable for my use case Warning, pet peeve here. Everybody, it's RBD. OK? RBD. Not totally unlike in naming to DRBD. Although they're completely different, the one thing they do share is that they're block devices, not databases. End of pet peeve. Thanks for putting up with me. :) Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] offtopic scalable block-device
On Fri, Mar 16, 2012 at 4:55 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-03-16T13:36:34, Florian Haas flor...@hastexo.com wrote: Would this not be more readily served by a simple while loop doing the monitoring, even if systemd/upstart aren't around? Pacemaker is kind of a heavy-weight here. If you prefer to suggest a self-hacked while loop to your customers I'm not stopping you. I didn't say self-hacked, this could be a wrapper officially included. Surely you've submitted one? Actually, better still, you could submit a systemd job; as the Ceph guys themselves seem to be focused more on upstart at this time. It just seems that pacemaker+corosync+... is overkill for watching the health of a single service on one node. Up to 3 services, really, but that's a technicality. (And no, I think I wouldn't want to run pacemaker on my OSD cluster, because that doesn't scale.) If *your* Ceph cluster needs to be 100 nodes plus, then you're right. Mine don't. And, anyway, at this point in time, I'd tell my customers to skip ceph/RADOS for the next 6-12 months still, but to contact us off-list if they're interested in PoCs ;-) Right. Feel free to point them to http://www.hastexo.com/blogs/florian/2012/03/08/ceph-tickling-my-geek-genes if they want a quick overview. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] getting started - crm hangs when adding resources, even crm ra classes hangs
On Wed, Mar 14, 2012 at 2:16 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Mar 13, 2012 at 05:59:35PM -0400, Phillip Frost wrote: On Mar 13, 2012, at 2:21 PM, Jake Smith wrote: From: Phillip Frost p...@macprofessionals.com Subject: [Pacemaker] getting started - crm hangs when adding resources, even crm ra classes hangs more interestingly, even crm ra classes never terminates, again with no output, and nothing appended to syslog. In Ubuntu 10.04 there is a bug in glib causing hanging on shutdown as well as hanging on some crm commands - there are patches out to fix it for Ubuntu specifically (https://bugs.launchpad.net/ubuntu/oneiric/+source/cluster-glue/+bug/821732). Not sure if they affect Debian too. Seems to be the same issue, somewhat. I noticed sometimes I'd get lrmadmin -C to work once, but the 2nd time it would deadlock. That behavior was described in the launchpad link you gave. It seems what's happened is the glib bug has been patched in debian unstable, and this raexecupstart patch is disabled in the cluster-glue package as described in launchpad. squeeze-backports took the package from unstable, but glib is not patched in squeeze, so raexecupstart.patch is still needed. Not re-enabled in squeeze-backports, however. So, I built cluster-glue from the debian source package after manually applying that patch, and now I can run lrmadmin -C all day. Now it's also leaking sockets, but I guess I can live with that. Do you have upstart at all? In that case, the debian package shouldn't have the upstart enabled when building cluster-glue. The current cluster-glue package in squeeze-backports, cluster-glue_1.0.9+hg2665-1~bpo60+2, has upstart disabled. Double-check that you're running that version. If you do, and the issue persists, please let us know. Cheers, Florian -- Need help with High Availability? 
http://www.hastexo.com/now
Re: [Pacemaker] getting started - crm hangs when adding resources, even crm ra classes hangs
On Wed, Mar 14, 2012 at 2:37 PM, Phillip Frost p...@macprofessionals.com wrote: On Mar 14, 2012, at 9:25 AM, Florian Haas wrote: Do you have upstart at all? In that case, the debian package shouldn't have the upstart enabled when building cluster-glue. The current cluster-glue package in squeeze-backports, cluster-glue_1.0.9+hg2665-1~bpo60+2, has upstart disabled. Double-check that you're running that version. If you do, and the issue persists, please let us know. Indeed, that's the version that hit the repo last night when I decided to quit. This morning, I tried that version and concluded I was experiencing the same issue. Are you absolutely certain? Can you confirm that you're running the ~bpo60+2 (note trailing 2) build, that you're actually running an lrmd binary from that version (meaning: that you properly killed your lrmd prior to installing that package), _and_ that lrmadmin -C does *not* list upstart? Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] getting started - crm hangs when adding resources, even crm ra classes hangs
On Wed, Mar 14, 2012 at 4:58 PM, Phillip Frost p...@macprofessionals.com wrote: Can you confirm that you're running the ~bpo60+2 (note trailing 2) build, that you're actually running an lrmd binary from that version (meaning: that you properly killed your lrmd prior to installing that package), _and_ that lrmadmin -C does *not* list upstart? Let's discard all of my previous conclusions. Apparently I was confused. Now, I'm sure I'm running +2 on all three nodes. And, I restarted pacemaker and corosync on all the nodes. I'm basing my knowledge of what versions I'm running on apt-cache policy, output copied below. dpkg -l package would also tell you what versions you have installed, in a more concise fashion. I can confirm that lrmadmin -C does not list upstart (also below). Nor does it leak sockets, as reported by lsof -f | grep lrm_callback_sock. Yep, no surprise here. However, sometimes pacemakerd will not stop cleanly. OK. Whether this is related to your original problem or not is a completely open question, jftr. I thought it might happen when stopping pacemaker on the current DC, but after successfully reproducing this failure twice, I couldn't do it again. Pacemakerd seems to exit, but fail to notify the other nodes of its shutdown. Syslog is flooded with Retransmit List messages (log attached). These persist until I stop corosync. Asked immediately after stopping pacemaker and corosync on one node, crm status on other nodes will report that node as still online. After a while, the stopped node switches to offline; I assume some timeout is expiring and they are assuming it crashed. You didn't give much other information, so I'm asking this on a hunch: does your pacemaker service configuration stanza for corosync (either in /etc/corosync/corosync.conf or in /etc/corosync/service.d/pacemaker) say ver: 0 or ver: 1? Cheers, Florian -- Need help with High Availability? 
http://www.hastexo.com/now
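For context, the stanza Florian is asking about looks like this (typically in /etc/corosync/service.d/pacemaker, or inline in corosync.conf). With ver: 0, corosync spawns the Pacemaker processes itself as a plugin; with ver: 1, corosync provides only membership and messaging, and pacemakerd must be started separately by its own init script:

```
service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 1
}
```

A mismatch here (e.g. ver: 0 while also starting pacemakerd from an init script) is a classic cause of the messy-shutdown symptoms described above.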
Re: [Pacemaker] 1.1.6 rpm build for RHEL5
On Sat, Mar 10, 2012 at 12:39 AM, Larry Brigman larry.brig...@gmail.com wrote: I have looked and cannot seem to find the pre-built 1.1.6 rpm set in the clusterlabs repo. It ships with RHEL/CentOS 6.2. On RHEL 5 however, 1.1.6 doesn't build. If you don't want to wait for 1.1.7, you'll either need to apply this post-1.1.6 patch: https://github.com/ClusterLabs/pacemaker/commit/eade0edee5605dcab96522eef779ccc041eddb21 ... or just use 1.1.5, which should be fine for most practical purposes. For RHEL 5, you'll probably also want to build Corosync 1.4.2. Or run on Heartbeat. Hope this helps. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] DRBD M/S Promotion
On Fri, Mar 9, 2012 at 11:24 PM, Scott Piazza scott.pia...@bespokess.com wrote: I have a two-node active/passive pacemaker cluster running with a single DRBD resource set up as master-slave. Today, we restarted both servers in the cluster, and when they came back up, both started pacemaker and corosync correctly, but the DRBD resource didn't promote. I manually promoted the DRBD resource and all of the child services were able to start up without issue. There were no error counts showing in crm_mon. The only error I noted in the /var/log/messages log referenced not being able to mount /dev/drbd0 because the device wasn't present. CIB is available at http://pastebin.com/4b6Fi87w. I'm trying to figure out what is wrong with my configuration. This, most probably: location cli-prefer-ms_drbd_exports ms_drbd_exports \ rule $id=cli-prefer-rule-ms_drbd_exports inf: #uname eq pawhsrv01.libertydistribution.com location cli-prefer-pawhsrv pawhsrv \ rule $id=cli-prefer-rule-pawhsrv inf: #uname eq pawhsrv01.libertydistribution.com Remove those. The first is completely wrong, the second is likely a leftover from when you did crm resource migrate pawhsrv pawhsrv01.libertydistribution.com and forgot to clear the constraints when you were done. Hope this helps. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
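The cli-prefer-* constraints Florian points at are exactly what crm resource migrate leaves behind. A sketch of the cleanup, using the resource names from Scott's CIB:

```shell
# Remove the leftover migration constraint on the group:
crm resource unmigrate pawhsrv
# The bogus constraint on the master/slave set can be deleted by ID:
crm configure delete cli-prefer-ms_drbd_exports
```

Making a habit of running unmigrate (also spelled unmove in some crm shell versions) after a manual migration avoids resources being pinned to one node across restarts.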
Re: [Pacemaker] Surprisingly fast start of resources on cluster failover.
On Tue, Mar 6, 2012 at 1:49 PM, Florian Crouzat gen...@floriancrouzat.net wrote: I have Florian's rsyslog config: https://github.com/fghaas/pacemaker/blob/syslog/extra/rsyslog/pacemaker.conf.in I should mention that that rsyslog configuration is no longer being considered for upstream inclusion. See the discussion on the pull request, here: https://github.com/ClusterLabs/pacemaker/pull/17 As I understand it, Andrew's plan is to reduce excessively verbose logging with the switch to libqb. But thanks for trying it out. :) Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[Pacemaker] What's the exact booth revision that ships in SLES 11 SP2?
Jiaju, would you mind pushing your git tags to your GitHub booth repo? Currently, as far as I can see, there are no tags in that repo at all. It would be nice to be able to find out exactly which git revision you guys ship in SP2. Thanks! Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] stonith in a virtual cluster
Jean-François, I realize I'm late to this discussion, however allow me to chime in here anyhow: On Mon, Feb 27, 2012 at 11:45 PM, Jean-Francois Malouin jean-francois.malo...@bic.mni.mcgill.ca wrote: Have you looked at fence_virt? http://www.clusterlabs.org/wiki/Guest_Fencing Yes I did. I had a quick go last week at compiling it on Debian/Squeeze with backports but with no luck. Seeing as you're on Debian, there really is no need to use fence_virt. Instead, you should just be able to use the external/libvirt STONITH plugin that ships with cluster-glue (in squeeze-backports). That plugin works like a charm and I've used it in testing many times. No need to compile anything. http://www.hastexo.com/resources/hints-and-kinks/fencing-virtual-cluster-nodes may be a helpful resource. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
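For reference, a minimal external/libvirt STONITH primitive might look like the following. The host names and hypervisor URI are assumptions you would adapt to your own virtualization host:

```
primitive p_fence_libvirt stonith:external/libvirt \
    params hostlist="node1,node2" \
        hypervisor_uri="qemu+ssh://virthost/system" \
    op monitor interval="60"
```

The plugin then fences a guest by asking libvirtd on the virtualization host (via the given URI) to destroy and restart the corresponding domain.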
[Pacemaker] OCFS2 in Pacemaker, post Corosync 2.0
Andrew, just a quick question out of curiosity: the ocf:pacemaker:o2cb resource and ocfs2_controld.pcmk require the OpenAIS CKPT service which is currently deprecated (as all of OpenAIS) and going away completely (IIUC) with Corosync 2.0. Does that mean that OCFS2 will be unsupported from Corosync 2.0 forward, as far as Pacemaker is concerned? Or has that CKPT dependency been removed, or will there be another supported way to run it? All insight much appreciated. Thanks! Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Upstart resources
2012/2/27 Ante Karamatić ante.karama...@canonical.com: On 27.02.2012 12:27, Florian Haas wrote: Alas, to the best of my knowledge the only way to change a specific job's respawn policy is by modifying its job definition. Likewise, that's the only way to enable or disable starting on system boot. So there is a way to overrule the package maintainer's default -- hacking the job definition. I've explained '(no)respawn' in the other mail. Manual starting/stopping is enabled by: echo 'manual' > /etc/init/${service}.override That's all you need to forbid automatic starting or stopping of the service. Oh, thanks! I didn't know that, much to my dismay. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Question about master/slave resource promotion
On Sat, Feb 25, 2012 at 12:31 AM, David Vossel dvos...@redhat.com wrote: Hey, I have a 2 node cluster with a multi-state master/slave resource. When the multi-state resources start up on each node they enter the Slave role. At that point I can't figure out how to promote the resource to activate the Master role on one of the nodes. Is there anything special I need to do to get an instance of my multi-state resource to promote to the Master role? Yeah, actually using a resource type that is capable of running in master/slave mode would be a good start. :) Use ocf:pacemaker:Stateful instead of ocf:pacemaker:Dummy in your test setup. Hope this helps. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
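A minimal master/slave test setup along those lines might look like this (names are illustrative; note that the per-role monitor operations must use distinct intervals):

```
primitive p_stateful ocf:pacemaker:Stateful \
    op monitor interval="10" role="Master" \
    op monitor interval="11" role="Slave"
ms ms_stateful p_stateful \
    meta master-max="1" clone-max="2" notify="false"
```

Unlike ocf:pacemaker:Dummy, the Stateful agent implements promote/demote and sets a master preference, so the cluster will actually promote one instance.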
Re: [Pacemaker] Last chance to object to the syntax for cluster tickets (multi-site clusters)
On 02/24/12 02:53, Andrew Beekhof wrote: We're about to lock in the syntax for cluster tickets (used for multi-site clusters). The syntax rules are at: https://github.com/gao-yan/pacemaker/commit/9e492f6231df2d8dd548f111a2490f02822b29ea And its use, along with some examples, can be found here: https://github.com/gao-yan/pacemaker/commit/5f75da8d99171cc100e87935c8c3fd2f83243f93 If there are any comments/concerns, now is the time to raise them. On naming, I must confess I find it a bit strange that while all other constraint types use ordinary English names (order, location, colocation), this one uses a rather odd-looking abbreviation. However, I'll also concede that the only alternative that currently comes to mind would be to rename the constraint type to ticket, but that obviously creates ambiguity between ticket the constraint and ticket the thing that booth manages, so it would probably be worse. Perhaps others have a better idea. About the documentation, I generally find it very useful; I have only one suggestion: it's not immediately clear from the existing docs that multiple resources can depend on the same ticket. It does mention resource sets (which, still, could use an additional sentence à la thus, multiple resources can depend on the same ticket as a courtesy to the novice reader), but it doesn't say whether it's OK to have multiple constraints referring to the same ticket. If I can spare some time in the next few weeks I might also prepare a die, passive voice, die patch for that documentation page, but that's just a pet peeve of mine. :) Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
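For illustration, the multiple-resources-per-ticket case could be expressed either with separate constraints or with a resource set. This is a sketch in crm shell syntax with made-up names, based on my reading of the proposed docs:

```
# Two separate constraints referring to the same ticket
rsc_ticket rsc1-req-ticketA ticketA: rsc1 loss-policy=stop
rsc_ticket rsc2-req-ticketA ticketA: rsc2 loss-policy=stop

# Roughly equivalent, using a resource set in a single constraint
rsc_ticket all-req-ticketA ticketA: rsc1 rsc2 loss-policy=stop
```

If both forms are indeed legal, saying so explicitly in the docs would resolve the ambiguity noted above.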
Re: [Pacemaker] Pacemaker will not mount ocfs2
On 02/24/12 08:50, Johan Rosing Bergkvist wrote: Hi Just an update. So I upgraded to pacemaker 1.1.6 and tried to configure it all again, without dlm. It didn't work, I still got the OCF_ERR_INSTALLED so I started looking through the setup and found that I didn't specify the drbd.conf path. When I added that meta and boom, it mounted like a dream. Huh? drbdconf in ocf:linbit:drbd is a regular param, not a meta attribute. It sounds like you're mixing something up, can you clarify please? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Pacemaker will not mount ocfs2
On 02/24/12 09:21, Johan Rosing Bergkvist wrote: Sorry, parameter, you're right. But still, it didn't mount until I added the drbdconf parameter. primitive clusterDRBD ocf:linbit:drbd \ params drbd_resource=cluster-ocfs drbdconf=/etc/drbd.conf # This is what I added \ op monitor interval=20 role=Master timeout=20 \ op monitor interval=30 role=Slave timeout=20 I was just wondering if this parameter is required and, if so, since I used the default path, shouldn't it be preconfigured? It is the default path, it is preconfigured, and you shouldn't need to add this. This isn't some in-place upgrade from an age-old DRBD version like 0.7, is it? (*shudder*) Also, what does ls /etc/drbd* say? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
[Pacemaker] DRBD, Fedora, and systemd (tangent off of Re: Upstart resources)
On 02/23/12 23:48, Andrew Beekhof wrote: On Thu, Feb 23, 2012 at 6:31 PM, Ante Karamatic iv...@ubuntu.com wrote: On 23.02.2012 00:10, Andrew Beekhof wrote: Do you still have LSB scripts on a machine thats using upstart? Yes, some LSB scripts can't be easily converted to upstart jobs. Or, let's rephrase that - can't be converted to upstart jobs without losing some of the functionality. On fedora they purged them all. All? Even the stuff like drbd? I have to take a look at that. I think any package that doesn't have a unit file is going to be blacklisted from F-17. That was the threat at least. In a Pacemaker cluster, nothing needs to touch DRBD during the system boot sequence. Nothing should, really. So the absence of any bootup script in a DRBD package should hardly be a reason to zap it from the distro. I'm CC'ing the Fedora DRBD package maintainer so he's at least informed of this thread, as I'm unsure if he follows the Pacemaker list on a regular basis. Hi Major. :) Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Pacemaker will not mount ocfs2
On 02/21/12 13:39, Johan wrote: I've been following this http://publications.jbfavre.org/virtualisation/cluster-xen-corosync-pacemaker-drbd-ocfs2.en tutorial on how to set up a Pacemaker Xen cluster. I'm all new to this, so please bear with me. The big problem is that when I get to the point where the filesystem should automagically mount, it doesn't. Here's my config: node cluster01 node cluster02 primitive Cluster-FS-DLM ocf:pacemaker:controld \ op monitor interval=15 \ meta target-role=Stopped primitive Cluster-FS-DRBD ocf:linbit:drbd \ params drbd_resource=cluster-ocfs \ operations $id=Cluster-FS-DRBD-ops \ op monitor interval=20 role=Master timeout=20 \ op monitor interval=30 role=Slave timeout=20 primitive Cluster-FS-Mount ocf:heartbeat:Filesystem \ params device=/dev/drbd/by-res/cluster-ocfs directory=/cluster fstype=ocfs2 ms Cluster-FS-DRBD-Master Cluster-FS-DRBD \ meta resource-stickines=100 master-max=2 notify=true interleave=true target-role=Stopped clone Cluster-FS-Mount-Clone Cluster-FS-Mount \ meta interleave=true ordered=true target-role=Stopped order Cluster-FS-After-DRBD inf: Cluster-FS-DRBD-Master:promote Cluster-FS-Mount-Clone:start order Cluster-FS-DLM-Order inf: Cluster-FS-DRBD-Master:promote Cluster-FS-Mount-Clone:start property $id=cib-bootstrap-options \ dc-version=1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b \ cluster-infrastructure=openais \ expected-quorum-votes=2 \ no-quorum-policy=ignore \ default-resource-stickiness=1000 \ stonith-enabled=false \ last-lrm-refresh=1329823386 I keep getting the: info: RA output: (Cluster-FS-Mount:1:start:stderr) FATAL: Module scsi_hostadapter not found. That's a red herring. Why the Filesystem RA is still trying to modprobe scsi_hostadapter, and is even logging any failure to do so with a FATAL priority, don't ask. :) However, with all those target-role=Stopped attributes in there, nothing of interest is really expected to start.
in the /var/log/syslog. I've been googling around for a solution but all of them seem to fail for me. Any help is much appreciated. That tutorial is wrong in several places. Specifically, One word about OCFS2. In a perfect world, we should manage OCFS2 with pacemaker. In this particular case, this won't be the case (I had issues with lock management which is mandatory for pacemaker). ... is just nonsense. You can (and should) put the DLM and O2CB under Pacemaker management. See the ocf:pacemaker:controld and ocf:pacemaker:o2cb resource agents for details. Also, you'll probably need to update your OCFS2 filesystem with tunefs.ocfs2 --update-cluster-stack before you can mount it. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
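Assuming the configuration posted above, clearing those stale target-role attributes and updating the on-disk cluster stack would look something like this (resource names and device path taken from the posted config; sketch only):

```
# Let the cluster actually start the resources
# ("crm resource start" sets target-role=Started)
crm resource start Cluster-FS-DLM
crm resource start Cluster-FS-DRBD-Master
crm resource start Cluster-FS-Mount-Clone

# Switch the on-disk OCFS2 cluster stack to the Pacemaker stack
# (run once, on one node, with the filesystem unmounted everywhere)
tunefs.ocfs2 --update-cluster-stack /dev/drbd/by-res/cluster-ocfs
```

Without the tunefs.ocfs2 step, a filesystem formatted under the o2cb stack will refuse to mount under the Pacemaker-managed stack.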
Re: [Pacemaker] Requesting re-sync
On Tue, Feb 21, 2012 at 3:57 PM, Pieter Baele pieter.ba...@gmail.com wrote: After upgrading a node (RHEL 6.1 to 6.2), my /var/log/messages grows really really fast because of this error, what can be wrong? So you upgraded just one node, and the other is still unchanged? Can you give the Pacemaker and Corosync version for both? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Upstart resources
Jake, sorry, I missed your original post due to travel; let me toss in one more thing here: On Tue, Feb 21, 2012 at 3:32 PM, Jake Smith jsm...@argotec.com wrote: Are upstart jobs expected to conform to the LSB spec with regards to exit codes, etc? Is there any reference documentation using upstart resources in Pacemaker? Or any good advice :-) Newer versions of pacemaker and lrmd are able to deal with upstart resources via dbus. Only if the LRM is compiled with --enable-upstart, of course. Which, to the best of my knowledge, is only set on the Ubuntu builds (and Ubuntu builds are currently the only ones for which this makes sense to set, obviously). This, however, requires that you run with an updated libglib2 package (again, only on Ubuntu). All of that should be available either in the upstream Ubuntu repos or, for the current LTS, in the ubuntu-ha-maintainers PPA.[1] Hope this helps. Cheers, Florian [1] Why do I need to use a PPA if this release is ostensibly on long-term support? Don't ask me, ask someone from Canonical. :) -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
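With an upstart-enabled build, referencing an upstart job from the CIB uses the upstart resource class. A sketch, with an illustrative job name:

```
primitive p_myservice upstart:myservice \
    op monitor interval="30" timeout="60"
```

The job name must match the definition in /etc/init/myservice.conf; lrmd then talks to upstart over D-Bus rather than invoking an init script.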
Re: [Pacemaker] Pacemaker will not mount ocfs2
On Tue, Feb 21, 2012 at 4:22 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Tue, Feb 21, 2012 at 02:26:31PM +0100, Florian Haas wrote: On 02/21/12 13:39, Johan wrote: I keep getting the: info: RA output: (Cluster-FS-Mount:1:start:stderr) FATAL: Module scsi_hostadapter not found. That's a red herring. Why the Filesystem RA is still trying to modprobe scsi_hostadapter, and is even logging any failure to do so with a FATAL priority, don't ask. :) Removed. Let's see who'll complain, then perhaps we'll know why it was there ;-) Could you zap that from the Raid1 RA too, please? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] problems with cman + corosync + pacemaker in debian
On 02/18/12 10:59, diego fanesi wrote: are you saying I can install drbd + gfs2 + pacemaker without using cman? It seems that gfs2 depends on cman... Only on RHEL/CentOS/Fedora. Not on Debian. I want to realize active/active cluster and I'm following the document cluster from scratch that you can found on this website. I don't know if there are other ways to realize it. Here's a reference config; we use this in classes we teach (where we run the Pacemaker stack on Debian because that's the only distro that supports all of Pacemaker, OCFS2, GFS2, GlusterFS and Ceph). This makes no claims at being perfect, but it works rather well. primitive p_dlm_controld ocf:pacemaker:controld \ params daemon=dlm_controld.pcmk \ op start interval=0 timeout=90 \ op stop interval=0 timeout=100 \ op monitor interval=10 primitive p_gfs_controld ocf:pacemaker:controld \ params daemon=gfs_controld.pcmk \ op start interval=0 timeout=90 \ op stop interval=0 timeout=100 \ op monitor interval=10 group g_gfs2 p_dlm_controld p_gfs_controld clone cl_gfs2 g_gfs2 \ meta interleave=true Here's the corresponding DRBD/Pacemaker configuration. primitive p_drbd_gfs2 ocf:linbit:drbd \ params drbd_resource=gfs2 \ op monitor interval=10 role=Master \ op monitor interval=30 role=Slave ms ms_drbd_gfs2 p_drbd_gfs2 \ meta notify=true master-max=2 \ interleave=true colocation c_gfs2_on_drbd inf: cl_gfs2 ms_drbd_gfs2:Master order o_drbd_before_gfs2 inf: ms_drbd_gfs2:promote cl_gfs2:start Of course, you'll have to add proper fencing, and there are several DRBD configuration options that you must remember to set. And, obviously, you need the actual Filesystem resources to manage your GFSs proper. That being said, it's entirely possible that a GlusterFS based solution would solve your issue as well, and be easier to set up. Or even something NFS based, backed by a single-Primary DRBD config for HA. You didn't give many details of your setup, however, so it's impossible to tell for certain. Hope this helps. 
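To complete the picture, the Filesystem resources mentioned above might look like this. Device, mount point, and names are assumptions to be adapted:

```
primitive p_fs_gfs2 ocf:heartbeat:Filesystem \
    params device="/dev/drbd/by-res/gfs2" directory="/srv/gfs2" fstype="gfs2" \
    op monitor interval="20"
clone cl_fs_gfs2 p_fs_gfs2 \
    meta interleave="true"
colocation c_fs_on_controld inf: cl_fs_gfs2 cl_gfs2
order o_controld_before_fs inf: cl_gfs2 cl_fs_gfs2
```

The colocation and order constraints ensure the filesystem only mounts where (and after) the DLM and gfs_controld daemons are running.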
Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Resource inter-dependency without being a 'group'
On Sat, Feb 18, 2012 at 7:19 PM, David Coulson da...@davidcoulson.net wrote: I have an active/active LVS cluster, which uses pacemaker for managing IP resources. Currently I have one environment running on it which utilizes ~30 IP addresses, so a group was created so all resources could be stopped/started together. Downside of that is that all resources have to run on the same node. [...] Is there a recommendation or best practice for this type of configuration? Is there something similar to 'group', which allows all the resources to be referenced as a single 'parent' resource without requiring them all to run on the same node? Is setting meta collocated=false not working for your group? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
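For reference, a group whose members may run on different nodes while still being referenced and managed as one unit could look like this (sketch; resource names are made up):

```
group g_vips p_ip1 p_ip2 p_ip3 \
    meta collocated="false"
```

With collocated=false the group keeps its single handle for start/stop and constraints, but the placement restriction is lifted; add ordered="false" as well if the members need not start sequentially.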
Re: [Pacemaker] problems with cman + corosync + pacemaker in debian
On Sun, Feb 12, 2012 at 10:01 PM, diego fanesi diego.fan...@gmail.com wrote: Hi, I'm trying to install corosync with pacemaker using drbd + gfs2 with cman support. Why? GFS2 with dual-Primary DRBD with Pacemaker 1.1.6 is working very well in squeeze-backports with the dlm_controld.pcmk and gfs_controld.pcmk daemons. No need to run on cman. Just install dlm-pcmk and gfs-pcmk and configure appropriately. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Percona Replication Manager
On Fri, Feb 10, 2012 at 1:38 PM, Nick Khamis sym...@gmail.com wrote: May I ask where the original blog resides? The one with the berserk blog comments. http://www.lmgtfy.com/?q=percona+replication+manager&l=1 SCNR. :) Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] LVM Setup
On Wed, Jan 25, 2012 at 6:49 PM, Gregg Stock gr...@damagecontrolusa.com wrote: Hi, I'm trying to set up a 5-node cluster with the same topology as described in Roll Your Own Cloud: Enterprise Virtualization with KVM, DRBD, iSCSI and Pacemaker http://blip.tv/linuxconfau/roll-your-own-cloud-enterprise-virtualization-with-kvm-drbd-iscsi-and-pacemaker-4738148 I'm stuck creating the Pacemaker LVM resource. I'm not sure if pacemaker doesn't see the volume group or something else is wrong. Here is the basic setup on CentOS 5.7: 1. Six disks set up as a RAID 0 array. 2. The RAID device md0 is a physical volume with a volume group vg_cluster and logical volume lv_iscsi0 on top. 3. A DRBD resource r0 that uses the logical volume lv_iscsi0 as its disk - the device is /dev/drbd1 4. A physical volume that uses /dev/drbd1 5. A volume group iscsivg0 that uses /dev/drbd1 All of this seems to work fine, but when I try to create the LVM primitive with iscsivg0, it is not able to start. I've tried different filtering schemes in the lvm.conf file, but no luck. I'm not sure if pacemaker is not able to see the volume group or there is some fundamental problem with what I'm trying to do. Please pastebin your lvm.conf and a screen dump of vgscan -vvv, taken on a node where DRBD is Primary. Thanks, Florian -- Need help with High Availability? http://www.hastexo.com/now
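In stacked setups like this, the usual culprit is LVM scanning the backing device rather than the DRBD device, so the iscsivg0 PV signature is found on the wrong block device and activation misbehaves. An lvm.conf filter along these lines (illustrative only; adapt the device names, and note it must exclude the DRBD backing device) is the typical fix:

```
# /etc/lvm/lvm.conf (sketch)
# Accept DRBD devices, reject the DRBD backing device, accept the rest
filter = [ "a|^/dev/drbd.*|", "r|^/dev/md0$|", "a|.*|" ]
# Avoid stale results from the persistent filter cache
write_cache_state = 0
```

After changing the filter, remove /etc/lvm/cache/.cache (if present) and re-run vgscan so the new filter actually takes effect.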
Re: [Pacemaker] MySQL Master-Master replication with Corosync and Pacemaker
On Thu, Jan 26, 2012 at 12:43 AM, Peter Scott pe...@psdt.com wrote: Hello. Our problem is that a Corosync restart on the idle machine in a 2-node cluster shuts down the mysqld process there, and we need it to stay up for replication. Well, if you just want to restart Corosync by administrative intervention (i.e. in a planned, controlled fashion), then why not put the cluster in maintenance mode before you restart Corosync? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
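A controlled restart under maintenance mode might look like this (the service invocation is an assumption for a RHEL-style init system):

```
# Tell Pacemaker to stop managing resources; mysqld et al. keep running
crm configure property maintenance-mode="true"

# Restart the cluster stack on the node in question
service corosync restart

# Hand control back to the cluster once everything is up again
crm configure property maintenance-mode="false"
```

While maintenance-mode is true, Pacemaker neither stops, starts, nor monitors resources, so the restart does not touch mysqld.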
Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets
On Sun, Jan 15, 2012 at 9:27 PM, Andrew Beekhof and...@beekhof.net wrote: On Thu, Jan 12, 2012 at 11:01 PM, Florian Haas flor...@hastexo.com wrote: On Thu, Jan 5, 2012 at 10:15 PM, Florian Haas flor...@hastexo.com wrote: Florian Haas (2): extra: add rsyslog configuration snippet extra: add logrotate configuration snippet configure.ac | 4 +++ extra/Makefile.am | 2 +- extra/logrotate/Makefile.am | 5 extra/logrotate/pacemaker.conf.in | 7 ++ extra/rsyslog/Makefile.am | 5 extra/rsyslog/pacemaker.conf.in | 39 + 6 files changed, 61 insertions(+), 1 deletions(-) create mode 100644 extra/logrotate/Makefile.am create mode 100644 extra/logrotate/pacemaker.conf.in create mode 100644 extra/rsyslog/Makefile.am create mode 100644 extra/rsyslog/pacemaker.conf.in Any takers on these? Sorry, I was off working on the new fencing logic and then corosync 2.0 support (when cman and all the plugins, including ours, go away). So a couple of comments... I fully agree that the state of our logging needs work and I can understand people wanting to keep the vast majority of our logs out of syslog. I'm less thrilled about one-file-per-subsystem, the cluster will often do a lot within a single second and splitting everything up really hurts the ability to correlate messages. I'd also suggest that /some/ information not coming directly from the RAs is still appropriate for syslog (such as I'm going to move A from B to C or I'm about to turn of node D), so the nuclear option isn't really thrilling me. So everything that is logged by the RAs with ocf_log, as I wrote in the original post, _is_ still going to whatever the default syslog destination may be. The rsyslog config doesn't change that at all. (Stuff that the RAs simply barf out to stdout/err would go to the lrmd log.) I maintain that this is the stuff that is also most useful to people. 
And with just that information in the syslog, you usually get a pretty clear idea of what the heck the cluster is doing on a node, and in what order, in about 20 lines of logs close together -- rather than intermingled with potentially hundreds of lines of other cluster-related log output. And disabling the nuclear option is a simple means of adding a # before ~ in the config file. You can ship it that way by default if you think that's more appropriate. That way, people would get the split-out logs _plus_ everything in one file, which IMHO is sometimes very useful for pengine or lrmd troubleshooting/debugging. I, personally, just don't want Pacemaker to flood my /var/log/messages, so I'd definitely leave the ~ in there, but that may be personal preference. I wonder what others think. In addition to the above distractions, I've been coming up to speed on libqb's logging which is opening up a lot of new doors and should hopefully help solve the underlying log issues. For starters it lets syslog/stderr/logfile all log at different levels of verbosity (and formats), it also supports blackboxes of which a dump can be triggered in response to an error condition or manually by the admin. The plan is something along the lines of: syslog gets NOTICE and above, anything else (depending on debug level and trace options) goes to /var/log/(cluster/?)pacemaker or whatever was configured in corosync. However, before I can enact that there will need to be an audit of the messages currently going to INFO (674 entries) and NOTICE(160 entries) with some getting bumped up, others down (possibly even to debug). I'd certainly be interested in feedback as to which logs should and should not make it. Yes, even so, I (again, this is personal preference) would definitely not want pengine logging (which even if half its INFO messages get demoted to DEBUG, would still be pretty verbose) in my default messages file. 
If you want to get analytical about it, there is also an awk script that I use when looking at what we log. I'd be interested in some numbers from the field. Thanks; I can look at that after LCA. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets
On Mon, Jan 16, 2012 at 10:59 AM, Andrew Beekhof and...@beekhof.net wrote: By Nuclear, I meant nothing at all from Pacemaker. Which is not what it does. If thats what you want, there's a far easier way to achieve this and keep usable logs around for debugging, set facility to none and add a logfile. No, I don't want that. (Stuff that the RAs simply barf out to stdout/err would go to the lrmd log.) I maintain that this is the stuff that is also most useful to people. And with just that information in the syslog, you usually get a pretty clear idea of what the heck the cluster is doing on a node, and in what order, in about 20 lines of logs close together -- rather than intermingled with potentially hundreds of lines of other cluster-related log output. Did I not just finish agreeing that hundreds of lines of other cluster-related log[s] was a problem? What in my statement above indicates that I assumed otherwise? I just don't think your knee-jerk everything must go approach is the answer. That is not my approach. And disabling the nuclear option is a simple means of adding a # before ~ in the config file. You can ship it that way by default if you think that's more appropriate. That way, people would get the split-out logs _plus_ everything in one file, which IMHO is sometimes very useful for pengine or lrmd troubleshooting/debugging. I, personally, just don't want Pacemaker to flood my /var/log/messages, Did you see me arguing against that? No. What makes you think I did? so I'd definitely leave the ~ in there, but that may be personal preference. I wonder what others think. In addition to the above distractions, I've been coming up to speed on libqb's logging which is opening up a lot of new doors and should hopefully help solve the underlying log issues. 
For starters it lets syslog/stderr/logfile all log at different levels of verbosity (and formats), it also supports blackboxes of which a dump can be triggered in response to an error condition or manually by the admin. The plan is something along the lines of: syslog gets NOTICE and above, anything else (depending on debug level and trace options) goes to /var/log/(cluster/?)pacemaker or whatever was configured in corosync. However, before I can enact that there will need to be an audit of the messages currently going to INFO (674 entries) and NOTICE(160 entries) with some getting bumped up, others down (possibly even to debug). I'd certainly be interested in feedback as to which logs should and should not make it. Yes, even so, I (again, this is personal preference) would definitely not want pengine logging (which even if half its INFO messages get demoted to DEBUG, would still be pretty verbose) in my default messages file. Sigh, please take time out from preaching to actually read the replies. You might learn something. This is getting frustrating. Not this logging discussion, but pretty much any discussion the two of us have been having lately. (And no, this is not an assignment of guilt or responsibility -- it takes two to tango.) Let's try and sort this out in person on Thursday. Florian ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] need cluster-wide variables
On Tue, Jan 10, 2012 at 10:24 PM, Arnold Krille arn...@arnoldarts.de wrote: Is it possible for slaves to modify their score for promotion? I think that would be an interesting feature. Probably something like that could already be achieved with dependency-rules and variables. But I think a function for resource agents to increase or decrease the score would be more clean. http://www.linux-ha.org/doc/dev-guides/_specifying_a_master_preference.html crm_master has been around for as long as I can remember. Florian -- Need help with High Availability? http://www.hastexo.com/now
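To illustrate the mechanism behind the guide linked above: a resource agent can adjust the local node's promotion score from within its monitor action via crm_master. The fragment below is a sketch, not an excerpt from any shipped agent, and `is_replication_current` is a hypothetical health check:

```
# Sketch of an OCF resource agent's monitor action (fragment, not a full RA).
# crm_master wraps crm_attribute and sets this instance's master score on
# the local node; the slave with the highest score is preferred for promotion.
if is_replication_current; then    # hypothetical health check
    crm_master -l reboot -v 100    # strongly prefer promotion here
else
    crm_master -l reboot -v 10     # still promotable, but a worse candidate
fi
```

On demote or stop, an agent would typically clear the score again with `crm_master -l reboot -D`.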
Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets
On Thu, Jan 5, 2012 at 10:15 PM, Florian Haas flor...@hastexo.com wrote: Florian Haas (2): extra: add rsyslog configuration snippet extra: add logrotate configuration snippet configure.ac | 4 +++ extra/Makefile.am | 2 +- extra/logrotate/Makefile.am | 5 extra/logrotate/pacemaker.conf.in | 7 ++ extra/rsyslog/Makefile.am | 5 extra/rsyslog/pacemaker.conf.in | 39 + 6 files changed, 61 insertions(+), 1 deletions(-) create mode 100644 extra/logrotate/Makefile.am create mode 100644 extra/logrotate/pacemaker.conf.in create mode 100644 extra/rsyslog/Makefile.am create mode 100644 extra/rsyslog/pacemaker.conf.in Any takers on these?
Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets
On Thu, Jan 12, 2012 at 2:15 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: I marked that message as Important and will include into my builds even if it does not go upstream. One question - does it break default hb_report and crm_report behavior? Good point. I presume it would make sense to include anything in /var/log/pacemaker in hb_report/crm_report. In the meantime, you can of course use the -E option to include these files manually. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
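For example, an invocation along these lines would pull the split-out logs into a report manually. The paths assume the rsyslog snippet from this patch series is installed, and the exact option semantics should be checked against hb_report(8)/crm_report(8) on your version:

```
# Collect a cluster report and explicitly include the per-daemon
# Pacemaker logs via the -E (extra log file) option:
crm_report -f "2012-01-12 00:00" \
    -E /var/log/pacemaker/pengine.log \
    -E /var/log/pacemaker/crmd.log \
    /tmp/pcmk-report
```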
Re: [Pacemaker] Configuring 3rd Node as Quorum Node in 2 Node Cluster
On Wed, Jan 11, 2012 at 1:44 AM, Andrew Beekhof and...@beekhof.net wrote: On Wed, Jan 11, 2012 at 3:30 AM, Andrew Martin amar...@xes-inc.com wrote: 3. Limit the DRBD, nfs, and smbd resources to only node1 and node2 by adding a location rule for the g_nfs group (which includes p_fs_drbd0 p_lsb_nfsserver p_exportfs_drbd0 p_ip_nfs): # crm configure location ms-drbd0-placement ms-drbd0 rule -inf: uname ne node1 and uname ne node2 Right. Another option would be to permanently run the 3rd node in standby mode. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
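In crm shell syntax the two alternatives look roughly like this (node names taken from the thread; note that inside a rule the node-name attribute is usually spelled #uname):

```
# Option 1: pin the master/slave set to the two data-bearing nodes
location ms-drbd0-placement ms-drbd0 \
    rule -inf: #uname ne node1 and #uname ne node2

# Option 2: park the quorum-only third node in standby instead,
# run once from any node (it then never runs resources):
#   crm node standby node3
```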
Re: [Pacemaker] Cannot Create Primitive in CRM Shell
On Mon, Jan 9, 2012 at 11:42 AM, Dan Frincu df.clus...@gmail.com wrote: Hi, On Fri, Jan 6, 2012 at 11:24 PM, Andrew Martin amar...@xes-inc.com wrote: Hello, I am working with DRBD + Heartbeat + Pacemaker to create a 2-node highly-available cluster. I have been following this official guide on DRBD's website for configuring all of the components: http://www.linbit.com/fileadmin/tech-guides/ha-nfs.pdf However, once I go to configure the primitives in pacemaker's CRM shell (section 4.1 in the PDF above) I am unable to create the primitive. For example, I enter the following configuration for a DRBD device called drive: primitive p_drbd_drive \ ocf:linbit:drbd \ params drbd_resource=drive \ op monitor interval=15 role=Master \ op monitor interval=30 role=Slave After entering all of these lines I hit enter and nothing is returned - it appears frozen and I am never returned to the crm(live)configure# shell. An strace of the process does not reveal any obvious blocks. I have also tried entering the entire configuration on a single line with the same result. I would recommend going through this guide first http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/ That's a bit of a knee-jerk response if I may say so, and when I wrote those guides[1] the intention was specifically that people could peruse them _without_ first having to check the documentation that covers the configuration internals. At any rate, Andrew, if your crm shell is freezing up when you're simply trying to add a primitive, something must be seriously awry in your setup -- it's something that I've not run into personally, unless the cluster was already responding to an error state on one of the nodes. Are you sure your cluster is behaving OK otherwise? Are you getting meaningful output from crm_mon -1? Does your cluster report it has successfully elected a DC? 
Cheers, Florian [1] Which I did while employed by Linbit, which is no longer the case, as they have asked I point out. http://wp.me/p4XzQ-bN -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Resource ping fails on passive node after upgrading to second nic
Stefan, sorry, your report triggers a complete -EPARSE in my brain. On Mon, Jan 9, 2012 at 10:38 AM, Senftleben, Stefan (itsc) stefan.senftle...@itsc.de wrote: Hello everybody, last week I installed and configured in each cluster node a second network interface. After configuring the corosync.cfg the passive node stops the primitive ping (three ping targets). The Corosync config shouldn't affect the ping resource at all. Such errors are in the corosync.log: Jan 09 10:12:28 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed. Jan 09 10:12:28 corosync [MAIN ] Completed service synchronization, ready to provide service. Jan 09 10:12:30 corosync [TOTEM ] ring 1 active with no faults Jan 09 10:12:37 lxds05 crmd: [1347]: info: process_lrm_event: LRM operation pri_ping:1_start_0 (call=11, rc=0, cib-update=17, confirmed=true) ok Jan 09 10:12:42 lxds05 attrd: [1345]: info: attrd_trigger_update: Sending flush op to all hosts for: pingd (3000) Jan 09 10:13:37 lxds05 crmd: [1347]: WARN: cib_rsc_callback: Resource update 17 failed: (rc=-41) Remote node did not respond Jan 09 10:17:25 lxds05 attrd: [1345]: info: attrd_trigger_update: Sending flush op to all hosts for: master-pri_drbd_omd:0 (1) Jan 09 10:17:25 lxds05 attrd: [1345]: info: attrd_perform_update: Sent update 22: master-pri_drbd_omd:0=1 Jan 09 10:19:25 lxds05 attrd: [1345]: WARN: attrd_cib_callback: Update 22 for master-pri_drbd_omd:0=1 failed: Remote node did not respond Jan 09 10:22:08 lxds05 cib: [1343]: info: cib_stats: Processed 67 operations (1044.00us average, 0% utilization) in the last 10min Jan 09 10:22:25 lxds05 attrd: [1345]: info: attrd_trigger_update: Sending flush op to all hosts for: master-pri_drbd_omd:0 (1) Jan 09 10:22:25 lxds05 attrd: [1345]: info: attrd_perform_update: Sent update 24: master-pri_drbd_omd:0=1 Jan 09 10:24:25 lxds05 attrd: [1345]: WARN: attrd_cib_callback: Update 24 for master-pri_drbd_omd:0=1 failed: Remote node did not respond Jan 09
10:27:25 lxds05 attrd: [1345]: info: attrd_trigger_update: Sending flush op to all hosts for: master-pri_drbd_omd:0 (1) Jan 09 10:27:25 lxds05 attrd: [1345]: info: attrd_perform_update: Sent update 26: master-pri_drbd_omd:0=1 Jan 09 10:29:25 lxds05 attrd: [1345]: WARN: attrd_cib_callback: Update 26 for master-pri_drbd_omd:0=1 failed: Remote node did not respond Jan 09 10:32:08 lxds05 cib: [1343]: info: cib_stats: Processed 6 operations (1666.00us average, 0% utilization) in the last 10min Jan 09 10:32:25 lxds05 attrd: [1345]: info: attrd_trigger_update: Sending flush op to all hosts for: master-pri_drbd_omd:0 (1) Jan 09 10:32:25 lxds05 attrd: [1345]: info: attrd_perform_update: Sent update 28: master-pri_drbd_omd:0=1 Jan 09 10:34:25 lxds05 attrd: [1345]: WARN: attrd_cib_callback: Update 28 for master-pri_drbd_omd:0=1 failed: Remote node did not respond Not a single message from any ping resource here. The check with corosync-cfg -s runs without errors on both nodes. Does corosync-objctl | grep member yield two members or one? I do not know what is wrong, because the targets used in the crm config can be pinged successfully. Can someone help me, please? Thanks in advance. Unlikely, you didn't give an awful lot of useful information, even your resource config is missing. cibadmin -Q dump posted to pastebin, and the URL shared here, might help. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Resource ping fails on passive node after upgrading to second nic
On Mon, Jan 9, 2012 at 2:01 PM, Senftleben, Stefan (itsc) stefan.senftle...@itsc.de wrote: This is the cibadmin dump of the active one: http://pastebin.com/Yg4Jsaxy You would see this in a crm_mon -rf: Failed actions: pri_ping:1_start_0 (node=lxds05, call=-1, rc=1, status=Timed Out): unknown error Timed out should be pretty self explanatory. However: corosync-objctl | grep member brings no output on the nodes combined with root@lxds05:~# cibadmin -Q Call cib_query failed (-41): Remote node did not respond combined with Online: [ lxds05 lxds07 ] ... in other words, the totem member list being empty plus one node saying it can't talk to the DC plus the DC listing both nodes as healthy, looks positively odd. I'm afraid I wouldn't be able to help a lot more without being able to actually look at the box though; please see the link in my sig block if interested. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/services/remote
Re: [Pacemaker] syslog full of redundand link messages
On Mon, Jan 9, 2012 at 3:15 PM, Attila Megyeri amegy...@minerva-soft.com wrote: Hi, I might be taking something wrong, but, bindnetaddr: 10.100.1.255 does not mean it will listen on this address, but will listen on every interface where this mask matches. This is just to make the config file simpler and common for all nodes in the same subnet. Or am I taking something terribly wrong? As Dan states, what you configured looks more like a broadcast address, not a network address. Assuming your boxes have IP addresses of 10.100.1.x in a /24 subnet, the correct network address would be 10.100.1.0. ipcalc is your friend, btw. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
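If ipcalc isn't handy, the network address for bindnetaddr can be derived in plain POSIX shell by masking the host address with the prefix. The host address 10.100.1.37 below is an assumed example of a 10.100.1.x host in that /24:

```shell
# Derive the corosync bindnetaddr (network address) from a host IP
# and prefix length, using only shell arithmetic.
ip=10.100.1.37      # example host address in the subnet
prefix=24

# Split the dotted quad into four octets.
oldIFS=$IFS; IFS=.
set -- $ip
IFS=$oldIFS

# Pack into a 32-bit integer, build the netmask, and apply it.
addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
net=$(( addr & mask ))

# Unpack back into dotted-quad form.
bindnetaddr="$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
echo "$bindnetaddr"   # 10.100.1.0 for any 10.100.1.x/24 host
```

The same arithmetic shows why 10.100.1.255 is the broadcast address of that subnet, not its network address.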
[Pacemaker] [PATCH 1/2] extra: add rsyslog configuration snippet
--- configure.ac|4 extra/Makefile.am |2 +- extra/rsyslog/Makefile.am |5 + extra/rsyslog/pacemaker.conf.in | 39 +++ 4 files changed, 49 insertions(+), 1 deletions(-) create mode 100644 extra/rsyslog/Makefile.am create mode 100644 extra/rsyslog/pacemaker.conf.in diff --git a/configure.ac b/configure.ac index ecae986..ec81938 100644 --- a/configure.ac +++ b/configure.ac @@ -1714,6 +1714,10 @@ fencing/Makefile \ extra/Makefile \ extra/resources/Makefile\ extra/rgmanager/Makefile\ + extra/rsyslog/Makefile \ + extra/rsyslog/pacemaker.conf\ + extra/logrotate/Makefile\ + extra/logrotate/pacemaker.conf \ tools/Makefile \ tools/crm_report\ tools/coverage.sh \ diff --git a/extra/Makefile.am b/extra/Makefile.am index 5ad7dc7..d9e3360 100644 --- a/extra/Makefile.am +++ b/extra/Makefile.am @@ -18,7 +18,7 @@ MAINTAINERCLEANFILES= Makefile.in -SUBDIRS = resources rgmanager +SUBDIRS = resources rgmanager rsyslog mibdir = $(datadir)/snmp/mibs mib_DATA = PCMK-MIB.txt diff --git a/extra/rsyslog/Makefile.am b/extra/rsyslog/Makefile.am new file mode 100644 index 000..dbde43c --- /dev/null +++ b/extra/rsyslog/Makefile.am @@ -0,0 +1,5 @@ +MAINTAINERCLEANFILES = Makefile.in + +rsyslogdir = $(sysconfdir)/rsyslog.d + +rsyslog_DATA = pacemaker.conf diff --git a/extra/rsyslog/pacemaker.conf.in b/extra/rsyslog/pacemaker.conf.in new file mode 100644 index 000..4c52698 --- /dev/null +++ b/extra/rsyslog/pacemaker.conf.in @@ -0,0 +1,39 @@ +# rsyslog configuration snippet for Pacemaker daemons +# +# Include this file in your rsyslog configuration file, +# _before_ your default log processing rules. +# +# If you want Pacemaker log entries in individual log +# files _and_ your catch-all syslog file, remove the +# "& ~" lines. + +$template PacemakerDaemonLog,@localstatedir@/log/@PACKAGE_TARNAME@/%programname%.log + +# Entries from the crm_attribute binary and attrd go +# to one log file.
+:programname,isequal,crm_attribute @localstatedir@/log/@PACKAGE_TARNAME@/attrd.log +& ~ +:programname,isequal,attrd ?PacemakerDaemonLog +& ~ + +# CIB status messages +:programname,isequal,cib ?PacemakerDaemonLog +& ~ + +# Messages from crmd +:programname,isequal,crmd ?PacemakerDaemonLog +& ~ + +# Messages from lrmd, including stdout and stderr +# from poorly-written resource agents that don't +# use ocf_log and/or ocf_run +:programname,isequal,lrmd ?PacemakerDaemonLog +& ~ + +# Policy Engine messages +:programname,isequal,pengine ?PacemakerDaemonLog +& ~ + +# Messages from the fencing daemons +:programname,startswith,stonith ?PacemakerDaemonLog +& ~ -- 1.7.5.4
[Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets
Hi everyone, apologies for sending patches to the user list -- my subscription request is still pending on pcmk-devel, so here goes. One of the most commonly voiced criticisms against Pacemaker is that it floods people's logs. And while I contend that the rather verbose logging that Pacemaker offers is a good thing, and that it shouldn't necessarily have to tune down the amount of messages it emits, we should offer users a facility where these log messages don't interfere with more critical logging info. So the patches contain a simple rsyslog configuration snippet which, when included in the rsyslog configuration, will log Pacemaker logging output to files named /var/log/pacemaker/daemon.log, where daemon can be attrd, cib, crmd, lrmd, pengine, and stonith. (Output from the crm_attribute binary also goes to the attrd log.) So what remains in the default system log (/var/log/messages, /var/log/syslog)? The stuff that you're most likely to care about, namely the log messages from the resource agents -- i.e. stuff that's actually relevant to the health of your application, rather than the health of your cluster infrastructure. I find this makes issues _much_ easier to troubleshoot (but then of course, that may be my personal preference). What's also included is a simple logrotate configuration snippet that makes sure these log files are compressed and rotated once a week. 
These changes, since commit d35d6f96daa04d9a2c3c54a0c60a3ff5db5fc293: High: Core: Rempove stray character from qb_ipc_response_header definition (2012-01-03 11:38:46 +1100) are also available in my git repository at: git://github.com/fghaas/pacemaker syslog Florian Haas (2): extra: add rsyslog configuration snippet extra: add logrotate configuration snippet configure.ac |4 +++ extra/Makefile.am |2 +- extra/logrotate/Makefile.am |5 extra/logrotate/pacemaker.conf.in |7 ++ extra/rsyslog/Makefile.am |5 extra/rsyslog/pacemaker.conf.in | 39 + 6 files changed, 61 insertions(+), 1 deletions(-) create mode 100644 extra/logrotate/Makefile.am create mode 100644 extra/logrotate/pacemaker.conf.in create mode 100644 extra/rsyslog/Makefile.am create mode 100644 extra/rsyslog/pacemaker.conf.in Hope this is useful. Cheers, Florian
[Pacemaker] [PATCH 2/2] extra: add logrotate configuration snippet
--- extra/Makefile.am |2 +- extra/logrotate/Makefile.am |5 + extra/logrotate/pacemaker.conf.in |7 +++ 3 files changed, 13 insertions(+), 1 deletions(-) create mode 100644 extra/logrotate/Makefile.am create mode 100644 extra/logrotate/pacemaker.conf.in diff --git a/extra/Makefile.am b/extra/Makefile.am index d9e3360..6fe0a28 100644 --- a/extra/Makefile.am +++ b/extra/Makefile.am @@ -18,7 +18,7 @@ MAINTAINERCLEANFILES= Makefile.in -SUBDIRS = resources rgmanager rsyslog +SUBDIRS = resources rgmanager rsyslog logrotate mibdir = $(datadir)/snmp/mibs mib_DATA = PCMK-MIB.txt diff --git a/extra/logrotate/Makefile.am b/extra/logrotate/Makefile.am new file mode 100644 index 000..8e400a4 --- /dev/null +++ b/extra/logrotate/Makefile.am @@ -0,0 +1,5 @@ +MAINTAINERCLEANFILES = Makefile.in + +logrotatedir = $(sysconfdir)/logrotate.d + +logrotate_DATA = pacemaker.conf diff --git a/extra/logrotate/pacemaker.conf.in b/extra/logrotate/pacemaker.conf.in new file mode 100644 index 000..3edd17e --- /dev/null +++ b/extra/logrotate/pacemaker.conf.in @@ -0,0 +1,7 @@ +@localstatedir@/log/@PACKAGE_TARNAME@/*.log { + rotate 4 + weekly + compress + missingok + notifempty +} -- 1.7.5.4
Re: [Pacemaker] Patch: use NFSv4 with RA nfsserver
On Tue, Dec 27, 2011 at 12:05 PM, Vogt Josef josef.v...@telecom.li wrote: Hi all, I wrote a patch to the resource agent nfsserver which deals with NFSv4 (see attachment). It's now possible to use either NFSv3 or NFSv4 with this resource agent. Any specific reason for not using exportfs? http://linux-ha.org/doc/man-pages/re-ra-exportfs.html It looks to me that your patch largely reimplements what the wait_for_leasetime_on_stop exportfs parameter already does. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Patch: use NFSv4 with RA nfsserver
On Tue, Dec 27, 2011 at 3:30 PM, Vogt Josef josef.v...@telecom.li wrote: Just a question here: I couldn't get it to work without setting the gracetime - which isn't set in the exportfs RA. Are you sure this works as expected? Thanks, good input. I'd be happy to add that (as in, wait_for_gracetime_on_start or similar). However, can you do me a favor please? Take a look at the discussion archived at http://www.spinics.net/lists/linux-nfs/msg22670.html and let me know if nlm_grace_period (as mentioned in http://www.spinics.net/lists/linux-nfs/msg22737.html) made any difference? Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] OCFS2 problems when connectivity lost
2011/12/21 Ivan Savčić | Epix ivan.sav...@epix.rs: Hello, We are having a problem with a 3-node cluster based on Pacemaker/Corosync with 2 primary DRBD+OCFS2 nodes and a quorum node. Nodes run on Debian Squeeze, all packages are from the stable branch except for Corosync (which is from backports for udpu functionality). Each node has a single network card. Strongly suggest to also use pacemaker and resource-agents from squeeze-backports. When the network is up, everything works without any problems, graceful shutdown of resources on any node works as intended and doesn't reflect on the remaining cluster partition. When the network is down on one OCFS2 node, Pacemaker (no-quorum-policy=stop) tries to shut the resources down on that node, but fails to stop the OCFS2 filesystem resource stating that it is in use. Are you sure you have fencing configured correctly? Normally the remaining nodes should attempt to fence the misbehaving node. *Both* OCFS2 nodes (ie. the one with the network down and the one which is still up in the partition with quorum) hang with dmesg reporting that events, ocfs2rec and ocfs2_wq are blocked for more than 120 seconds. That, again, would be an expected side effect if your fencing malfunctioned: I/O on the device has to freeze until those nodes that are scheduled for fencing, are in fact fenced. If that fencing operation never succeeds, then I/O on the remaining nodes freezes indefinitely. When the network is operational, umount by hand works without any problems, because for the testing scenario there are no services running which are keeping the mountpoint busy. Configuration we used is pretty much from ClusterStack/LucidTesting document [1], with clone-max=2 added where needed because of the additional quorum node in comparison to that document. 
From that document: property $id=cib-bootstrap-options \ dc-version=1.0.7-54d7869bfe3691eb723b1d47810e5585d8246b58 \ cluster-infrastructure=openais \ stonith-enabled=false \ no-quorum-policy=ignore stonith-enabled=false in an OCFS2 cluster with dual-Primary DRBD. I just don't think so. Hope this helps. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] More then one stonith-resource on one node
On Tue, Dec 20, 2011 at 3:42 PM, Marc K. marcus.k...@stuttgart.de wrote: Hello together, I found an older posting from September this year, with the same problem: - a two node cluster - every node has two power supplies - power supply one is connected to wti-powerswitch 1 - power supply two is connected to wti-powerswitch 2 - wti-powerswitch 1 is connected to datacenter-ups 1 - wti-powerswitch 2 is connected to datacenter-ups 2 Problem: I need two stonith-resources for each node. Only one is working; the second is ignored. (On the command line both work fine.) Google found an older post from September this year with the same problem. Are there new solutions in the meantime? (In this post there was no real solution:-( ) Two STONITH devices for one host, _both_ of which you expect to trigger, is nothing I remember as ever having been supported, up to this point. What you can do is to run staggered fencing, that is, a higher-priority fencing device fires first, and then _if that fails_, another lower-priority one does. However, in Pacemaker 1.1 this fallback to secondary STONITH devices (with staggered priorities) simply hasn't yet been implemented in stonith-ng. It's currently on the list for Fedora 17, and if I understood Andrew correctly that release should also cover your "both A and B must succeed for fencing to be considered successful" scenario. So perhaps you can be patient until then and settle for IPMI in the meantime? Cheers, Florian -- Need help with Pacemaker? http://www.hastexo.com/knowledge/pacemaker
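Once that support lands, a configuration along these lines would express the topology in crm shell's fencing_topology syntax. Device and node names here are placeholders; devices listed together at one level must all succeed, and the next level is tried only if the previous one fails:

```
# Level 1: both WTI switches must fire (both power feeds cut);
# level 2: fall back to hypothetical IPMI devices if level 1 fails.
fencing_topology \
    node1: stonith-wti1,stonith-wti2 stonith-ipmi-node1 \
    node2: stonith-wti1,stonith-wti2 stonith-ipmi-node2
```

This is a sketch of the eventual syntax, not something the Pacemaker 1.1 release discussed above will accept.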
Re: [Pacemaker] Doc: Resource templates
On Mon, Dec 12, 2011 at 10:04 AM, Gao,Yan y...@suse.com wrote: On 12/12/11 15:55, Gao,Yan wrote: Hi, As some people have noticed, we've provided a new feature Resource templates since pacemaker-1.1.6. I made a document about it which is meant to be included into Pacemaker_Explained. I borrowed the materials from Tanja Roth, Thomas Schraitle (-- the documentation specialists from SUSE) and Dejan Muhamedagic. Thanks to them! Attaching it here first. If you are interested, please help review it. And if anyone would like to help convert it into DocBook and make a patch, I would much appreciate it. :-) I can tell people would like to see a crm shell version of it as well. I'll sort it out and post it here soon. Attached the crm shell version of the document. As much as I appreciate the new feature, was it really necessary that you re-used a term that already has a defined meaning in the shell? http://www.clusterlabs.org/doc/crm_cli.html#_templates Couldn't you have called them resource prototypes instead? We've already confused users enough in the past. Florian -- Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Doc: Resource templates
On Mon, Dec 12, 2011 at 11:20 AM, Gao,Yan y...@suse.com wrote: On 12/12/11 17:52, Florian Haas wrote: On Mon, Dec 12, 2011 at 10:36 AM, Gao,Yan y...@suse.com wrote: On 12/12/11 17:16, Florian Haas wrote: On Mon, Dec 12, 2011 at 10:04 AM, Gao,Yan y...@suse.com wrote: On 12/12/11 15:55, Gao,Yan wrote: Hi, As some people have noticed, we've provided a new feature Resource templates since pacemaker-1.1.6. I made a document about it which is meant to be included into Pacemaker_Explained. I borrowed the materials from Tanja Roth, Thomas Schraitle (-- the documentation specialists from SUSE) and Dejan Muhamedagic. Thanks to them! Attaching it here first. If you are interested, please help review it. And if anyone would like to help convert it into DocBook and make a patch, I would much appreciate it. :-) I can tell people would like to see a crm shell version of it as well. I'll sort it out and post it here soon. Attached the crm shell version of the document. As much as I appreciate the new feature, was it really necessary that you re-used a term that already has a defined meaning in the shell? http://www.clusterlabs.org/doc/crm_cli.html#_templates Couldn't you have called them resource prototypes instead? We've already confused users enough in the past. Since Dejan adopted the object name rsc_template in crm shell and calls it Resource template in the help, I'm not inclined to use another term in the document. Opinion, Dejan? I didn't mean to suggest to use a term in the documentation that's different from the one the shell uses. I am suggesting to rename the feature altogether. Granted, it may be a bit late to have a naming discussion now, but I haven't seen this feature discussed on the list at all, so there wasn't really a chance to voice these concerns sooner. Actually there were discussions in the pcmk-devel mailing list.
Given that it has been included in the pacemaker-1.2 schema and released with pacemaker-1.1.6, it seems too late for us to change it from the CIB side now. Unless Dejan would like to rename it from crm shell... From http://oss.clusterlabs.org/mailman/listinfo/pcmk-devel: The current archive is only available to the list members. Seriously? And that's supposedly the list to discuss issues like 'last commit broke the build' (paraphrasing Andrew, from earlier this year), not feature additions. When did this change? Florian
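For readers who haven't seen the posted document, the feature under discussion boils down to defining shared settings once in a template and deriving primitives from it with the @ reference. In crm shell, roughly (resource names and parameters invented for illustration):

```
# Define common settings once in a template...
rsc_template vm-base ocf:heartbeat:VirtualDomain \
    params hypervisor="qemu:///system" \
    op monitor interval="30s" timeout="60s"

# ...then derive primitives that only add what differs:
primitive vm1 @vm-base params config="/etc/libvirt/qemu/vm1.xml"
primitive vm2 @vm-base params config="/etc/libvirt/qemu/vm2.xml"
```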
Re: [Pacemaker] right way to update resource configuration on a live cluster?
On Fri, Dec 9, 2011 at 10:25 PM, MA Martin Andrews (5542) mandr...@ag.com wrote: I have several heartbeat clusters running Centos 5 and heartbeat 2.1.4. Argll. Please: http://www.linux-ha.org/doc/users-guide/_upgrading_from_crm_enabled_heartbeat_2_1_clusters.html Is this procedure correct? I was surprised I couldn't find any discussion of this process online. If I was using a newer pacemaker would the process be simpler? Proper answer is expletive yeah. (Please don't put this on your greeting cards.) Please, by all means, follow the upgrade process. Cheers, Florian -- Need help with Pacemaker? http://www.hastexo.com/knowledge/Pacemaker
Re: [Pacemaker] Fw: Unable to start pacemaker due to WARN: do_cib_control: Couldn't complete CIB registration [In reply to]
Hi Graham, On Tue, Dec 6, 2011 at 8:06 AM, Graham Rawolle rawol...@daintreesystems.com wrote: I too am having all sorts of dramas getting pacemaker to start. Andrew you mentioned the new way “ver:1” to start the pacemaker daemons. The problem is that the two packaged versions of pacemaker that I can find for openSUSE 11.4 do not have an /etc/init.d/pacemaker script or even a pacemakerd executable – so how can pacemaker be started? The versions of pacemaker I have tried are 1.1.5-3.2-x86_64 from OpenSUSE-11.4-Oss repository and 1.0.12-1-x86_64 from Cluster Labs repository for openSUSE-11.4. 1.0.12 (as any of the 1.0.x releases) did not ship pacemakerd. On those systems, configuring the Pacemaker service with ver: 1 is not supported. One would not expect a pacemaker init script there. However, 1.1.5 would support it. But OpenSUSE doesn't ship it (from the OpenSUSE 11.4 spec file): # Don't want to ship this just yet: rm $RPM_BUILD_ROOT/etc/init.d/pacemaker || true rm $RPM_BUILD_ROOT/usr/sbin/pacemaker{d,} || true This is unchanged in the spec file for 12.1, which ships Pacemaker 1.1.6. Tim Serong would be the best person to explain the reasoning behind this (or correct me if my observation is wrong, always a possibility). But IIUC Tim is currently traveling back home from Europe, so please give him a day or two to respond. Thanks! Cheers, Florian -- Need help with Pacemaker? http://www.hastexo.com/knowledge/pacemaker
Re: [Pacemaker] CMAN - Pacemaker - Porftpd setup
Hello,

On Tue, Dec 6, 2011 at 2:36 PM, Bensch, Kobus kobus.ben...@bauerservices.co.uk wrote:

> colocation ftpsite-with-webip inf: ActiveFTPSite WebIP
> colocation website-with-ip inf: ActiveFTPSite WebIP
> order apache-after-ip inf: WebIP ActiveFTPSite
> order propftpd-after-webip inf: WebIP ActiveFTPSite

Any specific reasons for these double colo and order constraints? Also, does crm_mon -rf yield any failcounts?

Cheers,
Florian

--
Need help with Pacemaker? http://www.hastexo.com/knowledge/pacemaker
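(For context: the quoted constraint pairs are exact duplicates -- both colocations tie ActiveFTPSite to WebIP, and both orderings start WebIP before ActiveFTPSite. A minimal sketch of the deduplicated configuration in crm shell syntax, keeping the resource names from the quoted config; the constraint IDs chosen here are illustrative, not a verified fix for the poster's cluster:)

```
# One colocation: run ActiveFTPSite on the node holding WebIP
colocation ftpsite-with-webip inf: ActiveFTPSite WebIP
# One ordering: bring up the IP before the FTP site
order ftpsite-after-webip inf: WebIP ActiveFTPSite
```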
Re: [Pacemaker] CMAN - Pacemaker - Porftpd setup
On Tue, Dec 6, 2011 at 3:16 PM, Bensch, Kobus kobus.ben...@bauerservices.co.uk wrote:

> Hi Florian
> Thanks for the reply.
> 1.) No reason. I can get rid of one of each

Did you, and if so, has it changed the situation?

> 2.) The result of crm_mon -rf

OK, no failcounts. Can you create a CIB dump with "cibadmin -Q > /tmp/cib.xml", upload that _unchanged_ to pastebin or whatever similar service is your favorite, and share the link here?

Cheers,
Florian

--
Need help with High Availability? http://www.hastexo.com/now
Re: [Pacemaker] Where to install applications
On Fri, Dec 2, 2011 at 5:35 PM, Charles DeVoe scarecrow...@yahoo.com wrote:

> We are building a 4 node active/active cluster, which I believe is the same as High Performance.

Not quite. That's still an HA cluster with some scale-out capability; HPC is a slightly different ballgame.

> The Cluster has a SAN formatted with GFS2. The discussion is whether to install the applications on the shared drive and point each machine to that install point, or install the applications locally.

Your call, really.

Putting all applications onto the shared storage means that every time you update that piece of software, you essentially have to restart everything at once -- but only once. So for updates you'll normally have downtime: if everything goes nicely, you're back up very quickly; if something breaks, you're down for some time.

Putting just the data on shared storage, and the applications on the individual nodes, makes you capable of rolling upgrades, where you update your software node by node -- but then again, you have to do it on every node. If everything works on the first try, this will normally take a bit longer than the approach explained above. If something breaks on the upgrade of your first node, well, you shut it down, go back to square one, and find and fix the root cause, while the three others continue to hum along.

I for one much prefer the second approach.

Cheers,
Florian

--
Want to know how we've helped others? http://www.hastexo.com/shoutbox
Re: [Pacemaker] managing config files as resources
Hi Larry,

On Thu, Dec 1, 2011 at 6:59 PM, Larry Brigman larry.brig...@gmail.com wrote:

> Is there a method to manage individual files as resources? Which RA would be used, and any pointer as to how to configure it, would be great. Specifically we need to sync some files between nodes that have configuration data for our applications, like which IP addresses are assigned to each node and what the virtual IP of the cluster is.

So these files change, dynamically, and _all_ cluster nodes need to know about it? Or are the files just expected to move along with the resources?

If the former, one possible approach is to put all files on central storage (say, an NFS mount point), and then use ocf:heartbeat:symlink to manage symlinks where your services expect to find the config files.

If the latter, you can slap everything on DRBD, and mount the DRBD-backed filesystem wherever your resource is active. You may, of course, combine this with managed symlinks.

Cheers,
Florian

--
Need help with Pacemaker? http://www.hastexo.com/knowledge/pacemaker
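(To illustrate the symlink approach in the first option, here is a sketch in crm shell syntax. The link/target/backup_suffix parameters are those of the ocf:heartbeat:symlink agent; all resource names and paths are made up for illustration, and p_myapp stands in for whatever primitive runs the actual service:)

```
# The shared NFS mount holds the real config file; the symlink RA
# points /etc/myapp/app.conf at it on the node running the service.
primitive p_app_conf ocf:heartbeat:symlink \
    params link="/etc/myapp/app.conf" \
           target="/mnt/nfs/config/app.conf" \
           backup_suffix=".orig"
# Keep the symlink on the same node as the application,
# and create it before the application starts.
colocation c_conf_with_app inf: p_myapp p_app_conf
order o_conf_before_app inf: p_app_conf p_myapp
```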
Re: [Pacemaker] managing config files as resources
On Thu, Dec 1, 2011 at 10:45 PM, Larry Brigman larry.brig...@gmail.com wrote:

> On Thu, Dec 1, 2011 at 1:42 PM, Florian Haas flor...@hastexo.com wrote:
>> On Thu, Dec 1, 2011 at 10:35 PM, Larry Brigman larry.brig...@gmail.com wrote:
>>> Yes, the files can be changed dynamically - mostly by a user doing a configuration change.
>> Is a "user" someone with shell access to the box, or a visitor on your web site (or whatever the service is)?
> Normally shell access, but we also have an external service that pushes a config file into place, also via sftp.

Use csync2, then.

Cheers,
Florian

--
Need help with High Availability? http://www.hastexo.com/now
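(For anyone following along: a csync2 setup for this case amounts to a small config file plus a trigger to run the sync. A minimal sketch of /etc/csync2.cfg, with hostnames, group name, and paths invented for illustration:)

```
# Sync the application's config directory between both nodes.
group myapp {
    host node1 node2;
    key /etc/csync2.key_myapp;
    include /etc/myapp/;
    # On conflict, the more recently changed copy wins.
    auto younger;
}
```

After generating the shared key with "csync2 -k" and copying it to all nodes, running "csync2 -xv" (from cron, or hooked into whatever pushes the config change) propagates modified files to the other hosts.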