Re: [Linux-ha-dev] Patch: VirtualDomain - fix probe if config is not on shared storage
> Hi Dominik,
>
> On Fri, Jun 24, 2011 at 03:50:40PM +0200, Dominik Klein wrote:
>> Hi Dejan,
>>
>> this way, the cluster never learns that it can't start a resource on
>> that node.
>
> This resource depends on shared storage. So, the cluster won't try to
> start it unless the shared storage resource is already running. This is
> something that needs to be specified using either a negative preference
> location constraint or an asymmetric cluster. There's no need for yet
> another mechanism (the extra parameter) built into the resource agent.
> It's really overkill.
>
> Does that make sense?
>
> Cheers,
>
> Dejan

I don't consider this a solution.

Regards
Dominik
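For reference, the two alternatives Dejan suggests would look roughly like this in the crm shell. This is only a sketch; vm1, node1 and node2 are placeholder names, not taken from the setup discussed in this thread:

    # Keep the VM away from a node it can never run on, using a
    # negative (-INFINITY) location constraint:
    crm configure location loc-vm1-not-node2 vm1 -inf: node2

    # Or run an asymmetric ("opt-in") cluster, where a resource may
    # only run where a positive location constraint allows it:
    crm configure property symmetric-cluster=false
    crm configure location loc-vm1-on-node1 vm1 100: node1

With symmetric-cluster=false, resources run nowhere by default, which is why the asymmetric variant needs the second, positive constraint.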
Re: [Linux-ha-dev] Patch: VirtualDomain - fix probe if config is not on shared storage
On 06/27/2011 11:09 AM, Dejan Muhamedagic wrote:
> Hi Dominik,
>
> On Fri, Jun 24, 2011 at 03:50:40PM +0200, Dominik Klein wrote:
>> Hi Dejan,
>>
>> this way, the cluster never learns that it can't start a resource on
>> that node.
>
> This resource depends on shared storage. So, the cluster won't try to
> start it unless the shared storage resource is already running. This is
> something that needs to be specified using either a negative preference
> location constraint or an asymmetric cluster. There's no need for yet
> another mechanism (the extra parameter) built into the resource agent.
> It's really overkill.

As requested on IRC, I'll describe my setup and explain why I think this is a regression.

It is a 2-node cluster with a bunch of DRBD devices. Each /dev/drbdXX is used as the block device of a VM. The VMs' configuration files are not on shared storage but have to be copied manually.

So it happened that, during configuration of a VM, the admin forgot to copy the configuration file to node2. The machine's DRBD was configured, though. So the cluster decided to promote the VM's DRBD on node2 and then start the master-colocated and ordered VM.

With the agent before the mentioned patch, during the probe of a newly configured resource, the cluster would have learned that the VM is not available on one of the nodes (ERR_INSTALLED), so it would never have started the resource there. Now it sees NOT_RUNNING on all nodes during the probe and may decide to start the VM on a node where it cannot run. With the current version of the agent, that leads to a failed start, a failed stop during recovery and therefore an unnecessary stonith operation.

With Dejan's patch, it would still see NOT_RUNNING during the probe, but at least the stop would succeed. So the difference to the old version would be an unnecessary failed start on the node that does not have the VM, but it would not harm the node, and I'd be fine with applying that patch. There is one case, though, that might keep the VM from running for some time: if start-failure-is-fatal is false, we would go through $migration-threshold failed-start/successful-stop iterations while the VM's service is not running.

Of course I do realize that the initial fault is a human one, but the cluster used to protect against this and no longer does, and that's why I think this is a regression.

I think the correct way to fix this is to still return ERR_INSTALLED during the probe unless the cluster admin configures that the VM's config is on shared storage. Finding out about resource states on different nodes is what the probe was designed to do, was it not? And we work around that in this resource agent just to support certain setups.

Regards
Dominik
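The behavioural difference described above comes down to how the agent's monitor reports a missing configuration file during the initial probe. The following is only a rough sketch of that logic, using the standard OCF conventions (ocf_is_probe, OCF_RESKEY_config and the OCF_* exit codes); the function name and comments are illustrative, not the actual VirtualDomain code:

    #!/bin/sh
    # Illustrative sketch only -- not the actual VirtualDomain agent.
    . ${OCF_ROOT:-/usr/lib/ocf}/resource.d/heartbeat/.ocf-shellfuncs

    check_config() {
            if [ ! -r "$OCF_RESKEY_config" ]; then
                    if ocf_is_probe; then
                            # Behaviour after the disputed change: report
                            # "not running", so the node is never banned,
                            # but the cluster may later try to start here.
                            return $OCF_NOT_RUNNING
                    else
                            # Behaviour before the change (and outside
                            # probes): a hard error that keeps the
                            # resource off this node.
                            return $OCF_ERR_INSTALLED
                    fi
            fi
            return $OCF_SUCCESS
    }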
Re: [Linux-ha-dev] Patch: VirtualDomain - fix probe if config is not on shared storage
On Mon, Jun 27, 2011 at 12:00:28PM +0200, Dominik Klein wrote:
> On 06/27/2011 11:09 AM, Dejan Muhamedagic wrote:
>> Hi Dominik,
>>
>> On Fri, Jun 24, 2011 at 03:50:40PM +0200, Dominik Klein wrote:
>>> Hi Dejan,
>>>
>>> this way, the cluster never learns that it can't start a resource on
>>> that node.
>>
>> This resource depends on shared storage. So, the cluster won't try to
>> start it unless the shared storage resource is already running. This is
>> something that needs to be specified using either a negative preference
>> location constraint or an asymmetric cluster. There's no need for yet
>> another mechanism (the extra parameter) built into the resource agent.
>> It's really overkill.
>
> As requested on IRC, I'll describe my setup and explain why I think this
> is a regression.
>
> It is a 2-node cluster with a bunch of DRBD devices. Each /dev/drbdXX is
> used as the block device of a VM. The VMs' configuration files are not on
> shared storage but have to be copied manually.
>
> So it happened that, during configuration of a VM, the admin forgot to
> copy the configuration file to node2. The machine's DRBD was configured,
> though. So the cluster decided to promote the VM's DRBD on node2 and then
> start the master-colocated and ordered VM.
>
> With the agent before the mentioned patch, during the probe of a newly
> configured resource, the cluster would have learned that the VM is not
> available on one of the nodes (ERR_INSTALLED), so it would never have
> started the resource there.

This is exactly the problem with shared storage setups, where such an exit code can prevent a resource from ever being started on a node which is otherwise perfectly capable of running that resource.

> Now it sees NOT_RUNNING on all nodes during the probe and may decide to
> start the VM on a node where it cannot run.

But really, if a resource can _never_ run on a node, then there should be a negative location constraint or the cluster should be set up as asymmetric. Now, I understand that in your case it is actually due to the administrator's fault.

> With the current version of the agent, that leads to a failed start, a
> failed stop during recovery and therefore an unnecessary stonith
> operation.
>
> With Dejan's patch, it would still see NOT_RUNNING during the probe, but
> at least the stop would succeed. So the difference to the old version
> would be an unnecessary failed start on the node that does not have the
> VM, but it would not harm the node, and I'd be fine with applying that
> patch. There is one case, though, that might keep the VM from running for
> some time: if start-failure-is-fatal is false, we would go through
> $migration-threshold failed-start/successful-stop iterations while the
> VM's service is not running.
>
> Of course I do realize that the initial fault is a human one, but the
> cluster used to protect against this and no longer does, and that's why I
> think this is a regression.
>
> I think the correct way to fix this is to still return ERR_INSTALLED
> during the probe unless the cluster admin configures that the VM's config
> is on shared storage. Finding out about resource states on different
> nodes is what the probe was designed to do, was it not? And we work
> around that in this resource agent just to support certain setups.

This particular setup is a special case of shared storage. The images are on shared storage, but the configurations are local. I think that you really need to make sure that the configurations are present where they need to be. Best would be for the configuration to be kept on the storage along with the corresponding VM image. Since you're using a raw device as the image, that's obviously not possible. Otherwise, use csync2 or similar to keep the files in sync.

Cheers,

Dejan

> Regards
> Dominik
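As a sketch of the csync2 approach mentioned above, assuming the libvirt domain definitions live under /etc/libvirt/qemu and a csync2 key has already been distributed to both nodes; the group name, key path and host names are examples only:

    # /etc/csync2.cfg fragment, identical on node1 and node2:
    cat >>/etc/csync2.cfg <<'EOF'
    group vmconfigs {
            host node1;
            host node2;
            key /etc/csync2.key_vmconfigs;
            include /etc/libvirt/qemu;
    }
    EOF

    # Run on the node where a VM definition was added or changed,
    # to push the files to the peer:
    csync2 -xv

A cron job or a hook in the VM provisioning procedure can run the sync automatically, so a forgotten manual copy no longer leads to the situation described in this thread.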
Re: [Linux-ha-dev] Patch: VirtualDomain - fix probe if config is not on shared storage
>> With the agent before the mentioned patch, during the probe of a newly
>> configured resource, the cluster would have learned that the VM is not
>> available on one of the nodes (ERR_INSTALLED), so it would never have
>> started the resource there.
>
> This is exactly the problem with shared storage setups, where such an
> exit code can prevent a resource from ever being started on a node which
> is otherwise perfectly capable of running that resource.

I see and understand that that, too, is a valid setup and concern.

> But really, if a resource can _never_ run on a node, then there should be
> a negative location constraint or the cluster should be set up as
> asymmetric.

There did not have to be a negative location constraint up to now, because the cluster took care of that.

> Now, I understand that in your case it is actually due to the
> administrator's fault.

Yes, that's how I noticed the problem with the agent.

> This particular setup is a special case of shared storage. The images are
> on shared storage, but the configurations are local. I think that you
> really need to make sure that the configurations are present where they
> need to be. Best would be for the configuration to be kept on the storage
> along with the corresponding VM image. Since you're using a raw device as
> the image, that's obviously not possible. Otherwise, use csync2 or
> similar to keep the files in sync.

Actually, this is an intentional setup. It has happened that a VM's config was changed in a way that left the VM unable to start. For that case, they wanted to be able to start the old config on the other node.

I agree that the cases that led me to finding this change in the agent are cases that could have been solved with better configuration, and that your suggestions make sense. Still, I feel that the change introduces a new way of doing things that might affect running and working setups in unintended ways. I refuse to believe that I am the only one doing HA VMs like this (although of course I might be wrong on that, too ...).

Regards
Dominik
Re: [Linux-ha-dev] Patch: VirtualDomain - fix probe if config is not on shared storage
On 2011-06-27T12:00:28, Dominik Klein <dominik.kl...@googlemail.com> wrote:

> Now it sees NOT_RUNNING on all nodes during the probe and may decide to
> start the VM on a node where it cannot run. With the current version of
> the agent, that leads to a failed start, a failed stop during recovery
> and therefore an unnecessary stonith operation.

Yes, the 'stop' in the old agent is/was broken. The probe, alas, can't explicitly check all prerequisites, since they may not be online yet.

With 20:20 hindsight, it was perhaps a mistake to use a monitor as the probe. It seemed an improvement at the time, but nowadays I'm no longer so sure; it requires the ocf_is_probe special case that I'm not so fond of, and it leads to discussions like this. ;-)

Dejan is correct: unless the monitor op during a probe has more evidence than a missing file, it probably shouldn't return ERR_INSTALLED (nor _CONFIGURED); that will block the resource from the node completely. It _is_ a valid return code, of course, but inappropriate for bits that could be on shared storage and simply missing. Actually, all we _must_ know for monitor_0 is whether the resource is active in any capacity. Any further requirements are probably best checked at start time.

> I think the correct way to fix this is to still return ERR_INSTALLED
> during the probe unless the cluster admin configures that the VM's config
> is on shared storage. Finding out about resource states on different
> nodes is what the probe was designed to do, was it not? And we work
> around that in this resource agent just to support certain setups.

Yeah, and that is a pretty depressing result. But I definitely dislike a special switch for telling the cluster that the config is on shared storage like that; that would be a scenario that no admin would test.

So defining a specific probe operation would appear to be a good idea going forward; it could, in fact, do exactly the same thing as a monitor (if it has enough definite evidence), but it would make it more obvious that the emphasis is different. And hopefully it would be less confusing.

Regards,
    Lars

--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
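To make the idea of a dedicated probe a bit more concrete, such an operation could be dispatched separately from monitor at the end of the agent. This is purely hypothetical: "probe" is not a defined OCF action today, and the function names below are placeholders rather than the real VirtualDomain helpers:

    # Hypothetical action dispatch in a VirtualDomain-like agent.
    case $1 in
            start)   VirtualDomain_start ;;
            stop)    VirtualDomain_stop ;;
            monitor) VirtualDomain_monitor ;;
            probe)
                    # Only answer "is the domain active right now?" and
                    # leave installation/configuration checks to start.
                    VirtualDomain_status
                    ;;
            *)       exit $OCF_ERR_UNIMPLEMENTED ;;
    esac
    exit $?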
[Linux-ha-dev] [PATCH] [glue] Fix compilation with GCC 4.6
The attached patch fixes compilation -Werrors with GCC 4.6.

cheers,
Pádraig.

Fix compilation with GCC 4.6
avoid -Werror=unused-but-set-variable issues
remove -Wcast-qual which caused issues with copyHostList()

diff -r 856ae1408ff9 configure.ac
--- a/configure.ac	Sun Jun 19 21:03:24 2011 +0200
+++ b/configure.ac	Tue Jun 28 01:23:28 2011 +0100
@@ -1186,12 +1186,12 @@
 	CFLAGS="$CFLAGS -ggdb3 -O0"
 	# We had to eliminate -Wnested-externs because of libtool changes
+	# -Wcast-qual gives errors with GCC 4.6
 	EXTRA_FLAGS="-fgnu89-inline
 		-fstack-protector-all
 		-Wall
 		-Waggregate-return
 		-Wbad-function-cast
-		-Wcast-qual
 		-Wcast-align
 		-Wdeclaration-after-statement
 		-Wendif-labels
diff -r 856ae1408ff9 lib/clplumbing/base64_md5_test.c
--- a/lib/clplumbing/base64_md5_test.c	Sun Jun 19 21:03:24 2011 +0200
+++ b/lib/clplumbing/base64_md5_test.c	Tue Jun 28 01:23:28 2011 +0100
@@ -108,5 +108,6 @@
 		error_count++;
 	}
 
+	(void) rc; /* Suppress -Werror=unused-but-set-variable */
 	return error_count;
 }
diff -r 856ae1408ff9 lib/clplumbing/cl_msg.c
--- a/lib/clplumbing/cl_msg.c	Sun Jun 19 21:03:24 2011 +0200
+++ b/lib/clplumbing/cl_msg.c	Tue Jun 28 01:23:28 2011 +0100
@@ -598,7 +598,6 @@
 {
 	size_t	startlen = sizeof(MSG_START)-1;
-	int	internal_type;
 	int	(*addfield) (struct ha_msg* msg, char* name, size_t namelen,
@@ -633,8 +632,6 @@
 		return(HA_FAIL);
 	}
 
-	internal_type = type;
-
 	HA_MSG_ASSERT(type < DIMOF(fieldtypefuncs));
 
 	addfield = fieldtypefuncs[type].addfield;
diff -r 856ae1408ff9 lib/clplumbing/cl_msg_types.c
--- a/lib/clplumbing/cl_msg_types.c	Sun Jun 19 21:03:24 2011 +0200
+++ b/lib/clplumbing/cl_msg_types.c	Tue Jun 28 01:23:28 2011 +0100
@@ -931,7 +931,6 @@
 	     void* value, size_t vallen, int depth)
 {
 	int next;
-	struct ha_msg* childmsg;
 
 	if ( !msg || !name || !value
 	     || depth < 0){
@@ -940,8 +939,6 @@
 		return HA_FAIL;
 	}
 
-	childmsg = (struct ha_msg*)value;
-
 	next = msg->nfields;
 	msg->names[next] = name;
 	msg->nlens[next] = namelen;
@@ -964,7 +961,6 @@
 	int next;
 	int j;
 	GList* list = NULL;
-	int stringlen_add;
 
 	if ( !msg || !name || !value
 	     || namelen <= 0
@@ -983,13 +979,8 @@
 	}
 
 	if ( j >= msg->nfields){
-		int listlen;
 		list = (GList*)value;
 
-		listlen = string_list_pack_length(list);
-
-		stringlen_add = list_stringlen(namelen, listlen, value);
-
 		next = msg->nfields;
 		msg->names[next] = name;
 		msg->nlens[next] = namelen;
@@ -1001,8 +992,7 @@
 
 	} else if( msg->types[j] == FT_LIST ){
 		GList* oldlist = (GList*) msg->values[j];
-		int oldlistlen = string_list_pack_length(oldlist);
-		int newlistlen;
+		int listlen;
 		size_t i;
 
 		for ( i = 0; i < g_list_length((GList*)value); i++){
@@ -1014,12 +1004,10 @@
 			return HA_FAIL;
 		}
 
-		newlistlen = string_list_pack_length(list);
+		listlen = string_list_pack_length(list);
 
-		stringlen_add = newlistlen - oldlistlen;
-
 		msg->values[j] = list;
-		msg->vlens[j] = string_list_pack_length(list);
+		msg->vlens[j] = listlen;
 		g_list_free((GList*)value); /*we don't free each element because
					       they are used in new list*/
 		free(name); /* this name is no longer necessary
@@ -1069,7 +1057,6 @@
 	     void* value, size_t vallen, int depth)
 {
 	int next;
-	struct ha_msg* childmsg;
 
 	if ( !msg || !name || !value
 	     || depth < 0){
@@ -1078,8 +1065,6 @@
 		return HA_FAIL;
 	}
 
-	childmsg = (struct ha_msg*)value;
-
 	next = msg->nfields;
 	msg->names[next] = name;
 	msg->nlens[next] = namelen;
diff -r 856ae1408ff9 lib/clplumbing/cl_poll.c
--- a/lib/clplumbing/cl_poll.c	Sun Jun 19 21:03:24 2011 +0200
+++ b/lib/clplumbing/cl_poll.c	Tue Jun 28 01:23:28 2011 +0100
@@ -497,7 +497,6 @@
 int
 cl_poll_ignore(int fd)
 {
-	short		nsig;
 	int		flags;
 
 	if (debug) {
@@ -511,7 +510,6 @@
 	if (!is_monitored[fd]) {
 		return 0;
 	}
-	nsig = monitorinfo[fd].nsig;
 	is_monitored[fd] = FALSE;
 	memset(monitorinfo+fd, 0, sizeof(monitorinfo[0]));
diff -r 856ae1408ff9 lib/clplumbing/cl_syslog.c
--- a/lib/clplumbing/cl_syslog.c	Sun Jun 19 21:03:24 2011 +0200
+++ b/lib/clplumbing/cl_syslog.c	Tue Jun 28 01:23:28 2011 +0100
@@ -120,14 +120,12 @@
 int
 cl_syslogfac_str2int(const char *fname)
 {
-	struct _syslog_code *fnames;
 	int i;
 
	if(fname == NULL || strcmp("none", fname) == 0) {
 		return 0;
 	}
 
-	fnames = (struct _syslog_code *) facilitynames;
 	for (i = 0; facilitynames[i].c_name != NULL; i++) {
 		if (strcmp(fname, facilitynames[i].c_name) == 0) {
 			return facilitynames[i].c_val;
@@ -140,10 +138,8 @@
 const char *
 cl_syslogfac_int2str(int fnum)
 {
-	struct _syslog_code *fnames;
 	int i;
 
-	fnames = (struct _syslog_code *) facilitynames;
 	for (i = 0; facilitynames[i].c_name != NULL; i++) {
 		if (facilitynames[i].c_val == fnum) {
 			return facilitynames[i].c_name;
diff -r 856ae1408ff9 lib/clplumbing/ipcsocket.c
--- a/lib/clplumbing/ipcsocket.c	Sun Jun 19 21:03:24 2011 +0200
+++ b/lib/clplumbing/ipcsocket.c	Tue Jun 28 01:23:28 2011 +0100
Re: [Linux-ha-dev] regressions in resource-agents 3.9.1
Hi,

Is there any backlog left for the 3.9.2 release? I'm really looking forward to seeing it soon, since 3.9.1 was not really usable for me...

Thanks,

2011/6/22 Dejan Muhamedagic <deja...@fastmail.fm>:
> Hi all,
>
> On Wed, Jun 22, 2011 at 11:22:48PM +0900, Keisuke MORI wrote:
>> 2011/6/22 Florian Haas <florian.h...@linbit.com>:
>>> On 2011-06-22 11:48, Dejan Muhamedagic wrote:
>>>> Hello all,
>>>>
>>>> Unfortunately, it turned out that there were two regressions in the
>>>> 3.9.1 release:
>>>>
>>>> - iscsi on platforms which run open-iscsi 2.0-872 (see
>>>>   http://developerbugs.linux-foundation.org/show_bug.cgi?id=2562)
>>>> - pgsql probes with shared storage (iirc), see
>>>>   http://marc.info/?l=linux-ha&m=130858569405820&w=2
>>>>
>>>> Thanks to Vadym Chepkov for finding and reporting them.
>>>>
>>>> I'd suggest making a quick fix release, 3.9.2. Opinions?
>>>
>>> Agree.
>>
>> +1
>
> OK. Let's do that on Friday morning. Tomorrow is a holiday here.
>
> Cheers,
>
> Dejan

--
Keisuke MORI