Re: [Pacemaker] Concurrent runs of 'crm configure primitive' interfering
On Wed, Sep 28, 2011 at 10:52:16AM -0400, Brian J. Murrell wrote:
> On 11-09-28 10:20 AM, Dejan Muhamedagic wrote:
> > Hi,
>
> Hi,
>
> > I'm really not sure. Need to investigate this area more.
>
> Well, I am experimenting with cibadmin. It's certainly not as nice and shiny as crm shell though. :-)
>
> > cibadmin talks to the cib (the process) and cib should allow only one writer at a time.
>
> Good. That's needed of course. But what does it do with other attempting writers? Do they block until the CIB is available to write or are they turned away with an error?

Hmm, don't know.

> > The shell keeps the changes in its memory until the user says commit (or if it's a single-shot configure command). Just before doing the commit, it checks (using cibadmin) if the CIB changed in the meantime (i.e. since it was last loaded or refreshed in crm) and if so it refuses to commit changes.
>
> A.

Hope that everything is OK over there :)

> > That is, _unless_ it is forced to do so. So, if you use the -F option, one crm instance is likely to override changes of another crm instance or, for that matter, of anybody else.
>
> But is crm writing (i.e. replacing) entire CIBs or just updating fragments of it, like the resources and constraints, etc. it's being asked to operate on by the user?

Entire CIB. It used to do only changed elements, but then everybody agreed that it is too complex to keep dependencies satisfied at all times.

> If the latter, then two crm instances that are forced to write non-overlapping fragments should result in both being successful, if the cib is locking out concurrent cibadmin writers the way it should be, yes?

Yes, but there could be a time frame when crm thinks that it can write the configuration. However, if the epoch changed in the meantime, then that write should fail.

> > In short, having more than one crm instance trying to modify the configuration simultaneously probably won't give good results.
>
> As long as they are making non-colliding changes, shouldn't they both be successful?

crm writes the whole CIB.

> > And the matter is simple: If the cluster CIB changed since the crm itself accepted configuration modifications, there's no way to say which changes should take precedence and there's no obvious way to merge the changes coming from two different sources.
>
> Indeed, assuming they conflict. But if they don't, there shouldn't be any problem with two crms working on independent resources and constraints, yes?

If you give me a patch which makes sure that the CIB change in the meantime doesn't affect the change done by the user in crm, perhaps I'll consider applying it. I guess that that is possible since the shell keeps track of which elements changed, though not in which way they changed. Then we'd need to switch back again to applying smaller changes to the CIB. If that is possible at all. At any rate, it's quite an undertaking.

Now, this may be getting too far... CIB was not meant to be a real distributed database.

> > What's your use case?
>
> We're using tools to drive HA configuration where those tools go out to the various nodes in the cluster and perform configuration tasks, possibly and probably in parallel, one of which is to issue the crm commands to configure the resources and constraints that that node will primarily be responsible for.

Well, that may be a good use case, but we may not be that well equipped for such a scenario. Why not do all the changes on one node? Shouldn't it have all the information it needs? Or are the configuration commands issued based on some other local state?

Thanks,

Dejan

> b.
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
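The check-before-commit behaviour described in this message can be illustrated with a small simulation. It is only a sketch under stated assumptions: `ToyCIB`, `StaleCIBError`, and the single integer epoch are hypothetical stand-ins invented for illustration, not Pacemaker code or API, and the `force` flag merely plays the role that -F plays in the shell.

```python
import copy


class StaleCIBError(Exception):
    """Raised when the stored configuration changed after the client loaded it."""


class ToyCIB:
    """Hypothetical in-memory stand-in for the CIB: a dict plus an epoch counter."""

    def __init__(self):
        self.epoch = 0
        self.resources = {}

    def load(self):
        # A client takes a private snapshot, as the shell does on load/refresh.
        return self.epoch, copy.deepcopy(self.resources)

    def replace(self, seen_epoch, resources, force=False):
        # Whole-configuration write: refuse if someone else committed since
        # the snapshot was taken, unless the caller forces the write.
        if seen_epoch != self.epoch and not force:
            raise StaleCIBError(f"loaded epoch {seen_epoch}, now {self.epoch}")
        self.resources = copy.deepcopy(resources)
        self.epoch += 1


cib = ToyCIB()

# Two clients load the same starting point and each adds one resource.
epoch_a, conf_a = cib.load()
epoch_b, conf_b = cib.load()
conf_a["A"] = "ocf:heartbeat:Dummy"
conf_b["B"] = "ocf:heartbeat:Dummy"

cib.replace(epoch_a, conf_a)      # the first writer commits
try:
    cib.replace(epoch_b, conf_b)  # the second writer holds a stale snapshot
    refused = False
except StaleCIBError:
    refused = True
after_refusal = sorted(cib.resources)

# Forcing the stale write replaces everything, so resource A is lost.
cib.replace(epoch_b, conf_b, force=True)
after_force = sorted(cib.resources)
```

In this toy model the refused writer would have to reload and reapply its change to end up with both A and B; forcing instead reproduces the "sometimes just A, sometimes just B" outcome reported at the start of the thread.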
Re: [Pacemaker] Concurrent runs of 'crm configure primitive' interfering
Hi,

On Wed, Sep 28, 2011 at 09:12:57AM -0400, Brian J. Murrell wrote:
> On 11-09-16 11:14 AM, Dejan Muhamedagic wrote:
> > On Thu, Sep 08, 2011 at 03:41:42PM +0100, John Spray wrote:
> > > * Is there another way of adding resources which would be safe when run concurrently?
> >
> > cibadmin.
>
> But doesn't crm use cibadmin itself and if so, shouldn't whatever benefits of using cibadmin directly filter up to crm shell? Put another way, if crm shell is just using cibadmin, isn't it likely that cibadmin will exhibit the same concurrency issue?

I'm really not sure. Need to investigate this area more. cibadmin talks to the cib (the process) and cib should allow only one writer at a time. If that's not the case, then we have a bug.

> > This sounds to me like a deficiency in the crm shell.
>
> Of course I'm still willing to believe that. I just wonder what crm shell could be doing that is failing with concurrent commands that cibadmin will not fail with also.

The shell keeps the changes in its memory until the user says commit (or if it's a single-shot configure command). Just before doing the commit, it checks (using cibadmin) if the CIB changed in the meantime (i.e. since it was last loaded or refreshed in crm) and if so it refuses to commit changes.

That is, _unless_ it is forced to do so. So, if you use the -F option, one crm instance is likely to override changes of another crm instance or, for that matter, of anybody else.

In short, having more than one crm instance trying to modify the configuration simultaneously probably won't give good results.

And the matter is simple: If the cluster CIB changed since the crm itself accepted configuration modifications, there's no way to say which changes should take precedence and there's no obvious way to merge the changes coming from two different sources.

What's your use case?

Thanks,

Dejan

> Cheers,
> b.
Re: [Pacemaker] Concurrent runs of 'crm configure primitive' interfering
On 11-09-28 10:20 AM, Dejan Muhamedagic wrote:
> Hi,

Hi,

> I'm really not sure. Need to investigate this area more.

Well, I am experimenting with cibadmin. It's certainly not as nice and shiny as crm shell though. :-)

> cibadmin talks to the cib (the process) and cib should allow only one writer at a time.

Good. That's needed of course. But what does it do with other attempting writers? Do they block until the CIB is available to write or are they turned away with an error?

> The shell keeps the changes in its memory until the user says commit (or if it's a single-shot configure command). Just before doing the commit, it checks (using cibadmin) if the CIB changed in the meantime (i.e. since it was last loaded or refreshed in crm) and if so it refuses to commit changes.

A.

> That is, _unless_ it is forced to do so. So, if you use the -F option, one crm instance is likely to override changes of another crm instance or, for that matter, of anybody else.

But is crm writing (i.e. replacing) entire CIBs or just updating fragments of it, like the resources and constraints, etc. it's being asked to operate on by the user? If the latter, then two crm instances that are forced to write non-overlapping fragments should result in both being successful, if the cib is locking out concurrent cibadmin writers the way it should be, yes?

> In short, having more than one crm instance trying to modify the configuration simultaneously probably won't give good results.

As long as they are making non-colliding changes, shouldn't they both be successful?

> And the matter is simple: If the cluster CIB changed since the crm itself accepted configuration modifications, there's no way to say which changes should take precedence and there's no obvious way to merge the changes coming from two different sources.

Indeed, assuming they conflict. But if they don't, there shouldn't be any problem with two crms working on independent resources and constraints, yes?

> What's your use case?

We're using tools to drive HA configuration where those tools go out to the various nodes in the cluster and perform configuration tasks, possibly and probably in parallel, one of which is to issue the crm commands to configure the resources and constraints that that node will primarily be responsible for.

b.
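The expectation voiced here, that non-colliding changes should both succeed, holds in a model where each writer submits only its own fragment and the store admits one writer at a time. The following is a minimal sketch of that model only; `ToyFragmentCIB` and `configure_node` are hypothetical names, and the code describes neither how crm nor how the cib process actually behaves.

```python
import threading


class ToyFragmentCIB:
    """Hypothetical store that accepts per-resource creates, one writer at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.resources = {}

    def create(self, rsc_id, definition):
        # Only the named element is touched, so writers working on
        # different resource ids cannot overwrite each other.
        with self._lock:
            if rsc_id in self.resources:
                raise KeyError(f"{rsc_id} already exists")
            self.resources[rsc_id] = definition


def configure_node(cib, node):
    # Each node's tool adds only the resource that node is responsible for.
    cib.create(f"rsc_{node}", {"class": "ocf", "type": "Dummy", "node": node})


cib = ToyFragmentCIB()
threads = [
    threading.Thread(target=configure_node, args=(cib, node))
    for node in ("node1", "node2", "node3")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

configured = sorted(cib.resources)
```

All three resources end up configured regardless of thread scheduling, because no writer replaces anything it did not create. A writer that replaces the whole configuration from an older snapshot does not have this property.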
Re: [Pacemaker] Concurrent runs of 'crm configure primitive' interfering
On Wed, Sep 28, 2011 at 11:12 PM, Brian J. Murrell br...@interlinx.bc.ca wrote:
> On 11-09-16 11:14 AM, Dejan Muhamedagic wrote:
> > On Thu, Sep 08, 2011 at 03:41:42PM +0100, John Spray wrote:
> > > * Is there another way of adding resources which would be safe when run concurrently?
> >
> > cibadmin.
>
> But doesn't crm use cibadmin itself and if so, shouldn't whatever benefits of using cibadmin directly filter up to crm shell? Put another way, if crm shell is just using cibadmin, isn't it likely that cibadmin will exhibit the same concurrency issue?

Presumably the shell is reading and writing the whole configuration section instead of just the part you changed.

> > This sounds to me like a deficiency in the crm shell.
>
> Of course I'm still willing to believe that. I just wonder what crm shell could be doing that is failing with concurrent commands that cibadmin will not fail with also.
>
> Cheers,
> b.
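For comparison with a whole-section write, a fragment-level update through cibadmin might be assembled roughly as below. This is a hedged sketch: the option names (`--create`, `-o resources`, `-X`) are given as I recall them from the cibadmin man page and should be checked against the installed version, the helper names are hypothetical, and the `ocf:heartbeat:Dummy` resource is purely illustrative. The code only builds the argument list and does not run anything.

```python
import xml.etree.ElementTree as ET


def primitive_xml(rsc_id, provider="heartbeat", agent="Dummy"):
    """Build the XML for one primitive; only this element would be sent."""
    elem = ET.Element(
        "primitive",
        {"id": rsc_id, "class": "ocf", "provider": provider, "type": agent},
    )
    return ET.tostring(elem, encoding="unicode")


def cibadmin_create_cmd(rsc_id):
    # Assumed cibadmin options: --create adds a new object, -o selects the
    # section of the CIB, -X passes the XML as text on the command line.
    return ["cibadmin", "--create", "-o", "resources", "-X", primitive_xml(rsc_id)]


cmd = cibadmin_create_cmd("rsc_A")
```

The resulting list could be handed to `subprocess.run(cmd)` on a cluster node. The point of the sketch is that the request names a single element rather than carrying the whole configuration section.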
Re: [Pacemaker] Concurrent runs of 'crm configure primitive' interfering
Hi,

On Thu, Sep 08, 2011 at 03:41:42PM +0100, John Spray wrote:
> Hi,
>
> I have some scripts which configure resources across a number of nodes in a cluster. I'm finding that when more than one crm configure primitive invocation is run at the same time, they sometimes interfere with each other: e.g. when adding resource A and B concurrently, I sometimes end up with just A configured, sometimes just B, sometimes A and B. I see this when there are two concurrent runs on the same host, and I'm guessing that the same thing will happen with concurrent runs on multiple hosts.

Probably not. But if so, then my theory below is partly wrong.

> Questions:
> * Is this expected behaviour?

No.

> * Is there another way of adding resources which would be safe when run concurrently?

cibadmin.

This sounds to me like a deficiency in the crm shell. You may open a bugzilla (once linuxfoundation site is back).

Thanks,

Dejan

> Regards,
> John
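For the same-host case described in this report, one workaround a script could try is to serialize its own configure invocations behind an advisory file lock. The sketch below assumes a Unix host; `run_serialized` and the lock file name are hypothetical, and the lock only coordinates processes on one machine that agree to use it. It does nothing for concurrent runs on different nodes.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Hypothetical lock file shared by all cooperating scripts on this host.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "crm-configure.lock")


def run_serialized(cmd, lock_path=LOCK_PATH):
    """Run cmd while holding an exclusive advisory lock on lock_path."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            return subprocess.run(cmd, capture_output=True, text=True)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Stand-in command so the sketch runs anywhere; a real script would pass
# its own crm command line here instead.
result = run_serialized([sys.executable, "-c", "print('configured')"])
```

Because each invocation finishes before the next one starts, no invocation works from a snapshot that another one is about to invalidate. Runs issued from several nodes at once would still need a cluster-wide answer, such as issuing all changes from a single node.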