Re: [Linux-ha-dev] MAXMSG too small
Alan Robertson wrote:
Guochun Shi wrote:
Andrew Beekhof wrote:
On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
Andrew Beekhof wrote:
> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE cannot
> send its transition graph and the cluster stalls indefinitely.

So, that means the CIB is > 256K compressed?  Or is it > 256K
uncompressed?

It's being added with ha_msg_addstruct_compress(msg, field, xml); and
sent via IPC to the crmd (from the pengine). Whether it's actually been
compressed or not, I don't know.

It should be compressed if you have specified a compression method in
ha.cf. However, it would be good to have some proof that it is
compressed. Having a message > 256K after compression means the
uncompressed one is probably 1M to 2M. Another approach that might be
interesting is to provide an API with a much higher bound, suited for
local usage only.

> We could increase the value but looking through the code this seems to
> be an artificial limitation to various degrees...
>
> * In some cases it's used as a substitute for get_netstringlen(msg) - I
>   believe these should be fixed
>
> * In some cases it's used to pre-empt checks by "child" functions - I
>   believe these should be removed.
>
> The two cases that seem to legitimately use MAXMSG are the HBcomm
> plugins and the decompression code (though even that could retry a
> "couple" of times with larger buffers).
>
> Alan, can you please take a look at the use of MAXMSG in the IPC
> layer, which is really not my area of expertise (especially the HBcomm
> plugins), and verify that my assessment is correct (and possibly get
> someone to look at fixing it).

Unfortunately, this means various buffers get locked into memory at this
size. Our processes are already pretty huge. get_netstringlen() is an
expensive call.

That's basically the tradeoff... either we increase MAXMSG and take a
hit on the process size, or we do more dynamically and take a runtime
hit. Not being a guru in the IPC layer, I don't know which is worse.
However, my suspicion was that get_(net)stringlen was not too bad for
flat messages and would therefore be preferred.

Why do you think that predicting that child buffers will be too large is
a bad idea? How do you understand that removing it will help?

For low values of MAXMSG I think it's fine to do that. But we keep
upping the value, and allocating 256k for regular heartbeat packets
seems like a real waste.

Is your concern related to compressed/uncompressed sizes?

As above. I'm doing my part and indicating that it can/should be
compressed, but I don't know the internals well enough to say for sure.

Andrew, if you can send a log/debug file to me, I may (or may not) find
some clue.

I think that MAXMSG is inappropriately used for the size of IPC messages
- which would prevent messages from being sent in some cases.

Are you saying that there should be a higher limit, or no limit, for
IPC-only messages? I think the message layer can provide another API for
that.

-Guochun
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
Re: [Linux-ha-dev] MAXMSG too small
Guochun Shi wrote:
Andrew Beekhof wrote:
On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
Andrew Beekhof wrote:
> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE cannot
> send its transition graph and the cluster stalls indefinitely.

So, that means the CIB is > 256K compressed?  Or is it > 256K
uncompressed?

It's being added with ha_msg_addstruct_compress(msg, field, xml); and
sent via IPC to the crmd (from the pengine). Whether it's actually been
compressed or not, I don't know.

It should be compressed if you have specified a compression method in
ha.cf. However, it would be good to have some proof that it is
compressed. Having a message > 256K after compression means the
uncompressed one is probably 1M to 2M. Another approach that might be
interesting is to provide an API with a much higher bound, suited for
local usage only.

> We could increase the value but looking through the code this seems to
> be an artificial limitation to various degrees...
>
> * In some cases it's used as a substitute for get_netstringlen(msg) - I
>   believe these should be fixed
>
> * In some cases it's used to pre-empt checks by "child" functions - I
>   believe these should be removed.
>
> The two cases that seem to legitimately use MAXMSG are the HBcomm
> plugins and the decompression code (though even that could retry a
> "couple" of times with larger buffers).
>
> Alan, can you please take a look at the use of MAXMSG in the IPC
> layer, which is really not my area of expertise (especially the HBcomm
> plugins), and verify that my assessment is correct (and possibly get
> someone to look at fixing it).

Unfortunately, this means various buffers get locked into memory at this
size. Our processes are already pretty huge. get_netstringlen() is an
expensive call.

That's basically the tradeoff... either we increase MAXMSG and take a
hit on the process size, or we do more dynamically and take a runtime
hit. Not being a guru in the IPC layer, I don't know which is worse.
However, my suspicion was that get_(net)stringlen was not too bad for
flat messages and would therefore be preferred.

Why do you think that predicting that child buffers will be too large is
a bad idea? How do you understand that removing it will help?

For low values of MAXMSG I think it's fine to do that. But we keep
upping the value, and allocating 256k for regular heartbeat packets
seems like a real waste.

Is your concern related to compressed/uncompressed sizes?

As above. I'm doing my part and indicating that it can/should be
compressed, but I don't know the internals well enough to say for sure.

Andrew, if you can send a log/debug file to me, I may (or may not) find
some clue.

I think that MAXMSG is inappropriately used for the size of IPC messages
- which would prevent messages from being sent in some cases.

--
Alan Robertson <[EMAIL PROTECTED]>

"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions."
	- William Wilberforce
Re: [Linux-ha-dev] NetBSD / HeartBeat Question ...
KlinT wrote:

Hi there,

Well, I'm very new to heartbeat technology ... and I have a quite simple
question ... Has anybody successfully compiled / installed / run
heartbeat on NetBSD? I have to compile it from sources, but it fails
... :(

From heartbeat-2.0.5.tar.gz:

# ./configure --prefix=/usr/local --localstatedir=/var/lib/heartbeat
# make
.../...
gcc -DHAVE_CONFIG_H -I. -I. -I../../linux-ha -I../../include -I../../include -I../../include -I../../linux-ha -I../../linux-ha -I../../libltdl -I../../libltdl -I/usr/local/include -DLIBNET_BSDISH_OS -DLIBNET_BSD_BYTE_SWAP -DLIBNET_LIL_ENDIAN -I/usr/pkg/include -I/usr/pkg/include/glib/glib-2.0 -I/usr/pkg/lib/glib-2.0/include -I/usr/pkg/include/libxml2 -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wpointer-arith -Wwrite-strings -Wcast-qual -Wcast-align -Wbad-function-cast -Winline -Wmissing-format-attribute -Wformat=2 -Wformat-security -Wformat-nonliteral -Wno-long-long -Wno-strict-aliasing -Werror -ggdb3 -funsigned-char -DALIGNFUNC=ALIGN -DMSGHDR_TYPE=msghdr -DHALIB=\"/usr/local/lib/heartbeat\" -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wpointer-arith -Wwrite-strings -Wcast-qual -Wcast-align -Wbad-function-cast -Winline -Wmissing-format-attribute -Wformat=2 -Wformat-security -Wformat-nonliteral -Wno-long-long -Wno-strict-aliasing -Werror -ggdb3 -funsigned-char -MT ipcsocket.lo -MD -MP -MF .deps/ipcsocket.Tpo -c ipcsocket.c -fPIC -DPIC -o .libs/ipcsocket.o
ipcsocket.c: In function `socket_verify_auth':
ipcsocket.c:2489: error: `ngroups' undeclared (first use in this function)
ipcsocket.c:2489: error: (Each undeclared identifier is reported only once
ipcsocket.c:2489: error: for each function it appears in.)
gmake[1]: *** [ipcsocket.lo] Error 1
gmake[1]: Leaving directory `/root/heartbeat-2.0.5/lib/clplumbing'
gmake: *** [all-recursive] Error 1
*** Error code 1
Stop.
make: stopped in /root/heartbeat-2.0.5

Here is the version of ipcsocket.c:

/* $Id: ipcsocket.c,v 1.173 2006/02/02 15:58:00 alan Exp $ */

I've verified that the little patch from Alan is applied ...

Well... I don't think anyone has finished the NetBSD port yet (as you
can see). Probably best to compile from CVS, if that's OK with you. But
it won't fix this problem :-).

Doesn't NetBSD have the getpeereid() call? If so, this code shouldn't
even get compiled... I'm pretty sure that Matt Soffen put this
particular piece of code in... And it's pretty obviously broken... The
only place 'ngroups' appears is here:

#	define EXTRASPACE	SOCKCREDSIZE(ngroups)

And EXTRASPACE is only used here:

#define	CMSGSIZE \
	(sizeof(struct cmsghdr)+(sizeof(Cred))+EXTRASPACE)

So, this could never have compiled without some kind of definition of
'ngroups'. If NetBSD _does_ have getpeereid(), then we should figure out
why it's not getting used, and fix that. Then this problem will go away
:-)

--
Alan Robertson <[EMAIL PROTECTED]>

"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions."
	- William Wilberforce
[Linux-ha-dev] NetBSD / HeartBeat Question ...
Hi there,

Well, I'm very new to heartbeat technology ... and I have a quite simple
question ... Has anybody successfully compiled / installed / run
heartbeat on NetBSD? I have to compile it from sources, but it fails
... :(

From heartbeat-2.0.5.tar.gz:

# ./configure --prefix=/usr/local --localstatedir=/var/lib/heartbeat
# make
.../...
gcc -DHAVE_CONFIG_H -I. -I. -I../../linux-ha -I../../include -I../../include -I../../include -I../../linux-ha -I../../linux-ha -I../../libltdl -I../../libltdl -I/usr/local/include -DLIBNET_BSDISH_OS -DLIBNET_BSD_BYTE_SWAP -DLIBNET_LIL_ENDIAN -I/usr/pkg/include -I/usr/pkg/include/glib/glib-2.0 -I/usr/pkg/lib/glib-2.0/include -I/usr/pkg/include/libxml2 -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wpointer-arith -Wwrite-strings -Wcast-qual -Wcast-align -Wbad-function-cast -Winline -Wmissing-format-attribute -Wformat=2 -Wformat-security -Wformat-nonliteral -Wno-long-long -Wno-strict-aliasing -Werror -ggdb3 -funsigned-char -DALIGNFUNC=ALIGN -DMSGHDR_TYPE=msghdr -DHALIB=\"/usr/local/lib/heartbeat\" -g -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -Wstrict-prototypes -Wpointer-arith -Wwrite-strings -Wcast-qual -Wcast-align -Wbad-function-cast -Winline -Wmissing-format-attribute -Wformat=2 -Wformat-security -Wformat-nonliteral -Wno-long-long -Wno-strict-aliasing -Werror -ggdb3 -funsigned-char -MT ipcsocket.lo -MD -MP -MF .deps/ipcsocket.Tpo -c ipcsocket.c -fPIC -DPIC -o .libs/ipcsocket.o
ipcsocket.c: In function `socket_verify_auth':
ipcsocket.c:2489: error: `ngroups' undeclared (first use in this function)
ipcsocket.c:2489: error: (Each undeclared identifier is reported only once
ipcsocket.c:2489: error: for each function it appears in.)
gmake[1]: *** [ipcsocket.lo] Error 1
gmake[1]: Leaving directory `/root/heartbeat-2.0.5/lib/clplumbing'
gmake: *** [all-recursive] Error 1
*** Error code 1
Stop.
make: stopped in /root/heartbeat-2.0.5

Here is the version of ipcsocket.c:

/* $Id: ipcsocket.c,v 1.173 2006/02/02 15:58:00 alan Exp $ */

I've verified that the little patch from Alan is applied ...

Any ideas? Many thanks.
Re: [Linux-ha-dev] MAXMSG too small
Andrew Beekhof wrote:
On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
Andrew Beekhof wrote:
> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE cannot
> send its transition graph and the cluster stalls indefinitely.

So, that means the CIB is > 256K compressed?  Or is it > 256K
uncompressed?

It's being added with ha_msg_addstruct_compress(msg, field, xml); and
sent via IPC to the crmd (from the pengine). Whether it's actually been
compressed or not, I don't know.

It should be compressed if you have specified a compression method in
ha.cf. However, it would be good to have some proof that it is
compressed. Having a message > 256K after compression means the
uncompressed one is probably 1M to 2M. Another approach that might be
interesting is to provide an API with a much higher bound, suited for
local usage only.

> We could increase the value but looking through the code this seems to
> be an artificial limitation to various degrees...
>
> * In some cases it's used as a substitute for get_netstringlen(msg) - I
>   believe these should be fixed
>
> * In some cases it's used to pre-empt checks by "child" functions - I
>   believe these should be removed.
>
> The two cases that seem to legitimately use MAXMSG are the HBcomm
> plugins and the decompression code (though even that could retry a
> "couple" of times with larger buffers).
>
> Alan, can you please take a look at the use of MAXMSG in the IPC
> layer, which is really not my area of expertise (especially the HBcomm
> plugins), and verify that my assessment is correct (and possibly get
> someone to look at fixing it).

Unfortunately, this means various buffers get locked into memory at this
size. Our processes are already pretty huge. get_netstringlen() is an
expensive call.

That's basically the tradeoff... either we increase MAXMSG and take a
hit on the process size, or we do more dynamically and take a runtime
hit. Not being a guru in the IPC layer, I don't know which is worse.
However, my suspicion was that get_(net)stringlen was not too bad for
flat messages and would therefore be preferred.

Why do you think that predicting that child buffers will be too large is
a bad idea? How do you understand that removing it will help?

For low values of MAXMSG I think it's fine to do that. But we keep
upping the value, and allocating 256k for regular heartbeat packets
seems like a real waste.

Is your concern related to compressed/uncompressed sizes?

As above. I'm doing my part and indicating that it can/should be
compressed, but I don't know the internals well enough to say for sure.

Andrew, if you can send a log/debug file to me, I may (or may not) find
some clue.

-Guochun
Re: [Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: crm by lars from
On 2006-05-30T17:33:48, Andrew Beekhof <[EMAIL PROTECTED]> wrote:

> lmb, i think this change should be in a separate patch, not in cvs.

I can do that, yes. Good point. I'll revert them in CVS.

Sincerely,
Lars Marowsky-Brée

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

"Ignorance more frequently begets confidence than does knowledge"
	-- Charles Darwin
[Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: crm by lars from
lmb, I think this change should be in a separate patch, not in cvs.

On 5/30/06, linux-ha-cvs@lists.linux-ha.org wrote:

linux-ha CVS committal

Author : lars
Project : linux-ha
Module : crm
Dir : linux-ha/crm

Modified Files:
	crm-1.0.dtd

Log Message:
Temporary work-around for #1278: Allow rsc_location w/o rules into the
CIB, as the GUI creates them.

===
RCS file: /home/cvs/linux-ha/linux-ha/crm/crm-1.0.dtd,v
retrieving revision 1.71
retrieving revision 1.72
diff -u -3 -r1.71 -r1.72
--- crm-1.0.dtd	29 May 2006 15:59:34 -	1.71
+++ crm-1.0.dtd	30 May 2006 15:12:23 -	1.72
@@ -159,7 +159,7 @@
 score (INFINITY|-INFINITY) #REQUIRED>
-
+

http://lists.community.tummy.com/mailman/listinfo/linux-ha-cvs
Re: [Linux-ha-dev] Proposal for function calls in CIB rsc_location rules
On 2006-05-30T06:49:43, Alan Robertson <[EMAIL PROTECTED]> wrote:

> This might be implemented by two separate types of plugins. One type
> could require them to return a single integer for 'remaining capacity'
> on a given node, and a different type which would compute the 'load'
> for the resource in question (how much capacity it consumes). [For
> this latter type, one might pass the resource definition to the plugin
> as an ha_msg or something]. Or one could simply add an integer for
> load/capacity consumed, and make it a constant... Several options
> suggest themselves...

Yes, that would be a good approach. (If one considers mapping all the
different aspects of the load vector into a single scalar metric
acceptable - as long as they don't get mixed, that might do.)

Yet the complex part remains what to do with this that is better than
what will happen right now already (ie, trying to start, noticing it
doesn't, and retrying somewhere else).

> > Not that it will never happen, but I wouldn't be looking to go down
> > this path before we at least have the colocation changes stabilized.
>
> I was just floating this as a sort of trial balloon to see what people
> thought about it...
>
> It is something I would expect to consider for 2007 - if at all...

Adding load balancing is certainly something worthwhile, no doubt about
it. It's just that I think Andrew's and my bandwidth is currently
completely used up by getting what we already have into a useable state,
which makes us go "oh my god NO!" whenever someone suggests a new
feature somewhere ;-)

> This is not to try and solve the bin packing (or knapsack) problems.
> But, there are simple heuristic approaches which do a credible job for
> cases where sufficient capacity is available...

Well, of course we ain't going to solve it, but it is a variant of it,
and exactly where we'd steal the heuristic algorithm from, and then
would need to figure out how to allocate it. ;-)

Actually, it wouldn't be too bad. We could do this at the stage where we
assign nodes to colors in the PE. (Each color would have a "metric"
computed for it, and then we could try to schedule the colors better
than simple round-robin.)

Alas, I think before we can consider going near that, we need weighted
co-location. It's something on the radar, but it's most definitely 2007
material, I think.

Sincerely,
Lars Marowsky-Brée

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

"Ignorance more frequently begets confidence than does knowledge"
	-- Charles Darwin
Re: [Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: linux-ha by zhenh from
On 2006-05-30T06:56:13, Alan Robertson <[EMAIL PROTECTED]> wrote:

> But, if your default for locally compiled packages is /local/usr or
> whatever, then as I understand it, you won't be able to build RPMs that
> conform to your local conventions.

I don't understand what you're saying, actually? You can even "relocate"
rpms on install, if you so choose. (And if the package gets it right,
which I've never bothered with ;-) Or override the macros on the rpm
command line, or give it an additional rc file, or have a site-wide rpm
rc file (/etc/rpmrc), per-user defaults (~/.rpmrc), et cetera...

Sincerely,
Lars Marowsky-Brée

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

"Ignorance more frequently begets confidence than does knowledge"
	-- Charles Darwin
Re: [Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: linux-ha by zhenh from
On 5/30/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
Andrew Beekhof wrote:
> On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
>> Lars Marowsky-Bree wrote:
>>> On 2006-05-29T12:11:07, Alan Robertson <[EMAIL PROTECTED]> wrote:
>>>
>>>> I didn't know until just a day or two ago that it broke anything.
>>>>
>>>> Having it in ConfigureMe did indeed break things. There were
>>>> numerous complaints about the source RPMs being useless on
>>>> platforms they weren't designed for. Sigh...
>>>
>>> The fix for that would be to use the rpm macros within the specfile.
>>
>> Unless of course, someone has overridden them... Which puts us back
>> where we started
>
> i believe half the point is that by using the rpm macros, the .spec
> file would automatically pick up any overrides or defaults for that
> system (ie. the ones that show up with rpm --showrc, which is the
> proper place to set options for an rpm-based system)

But, if your default for locally compiled packages is /local/usr or
whatever, then as I understand it, you won't be able to build RPMs that
conform to your local conventions.

Surely the defaults describe your local conventions... otherwise why are
they your defaults? And "because that's what the OS shipped with" isn't
an answer either... because they can be changed. And by doing it this
way, the admin's defaults are automatically used for every package they
build on the machine.
Re: [Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: linux-ha by zhenh from
Andrew Beekhof wrote:
On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
Lars Marowsky-Bree wrote:
> On 2006-05-29T12:11:07, Alan Robertson <[EMAIL PROTECTED]> wrote:
>
>> I didn't know until just a day or two ago that it broke anything.
>>
>> Having it in ConfigureMe did indeed break things. There were numerous
>> complaints about the source RPMs being useless on platforms they
>> weren't designed for. Sigh...
>
> The fix for that would be to use the rpm macros within the specfile.

Unless of course, someone has overridden them... Which puts us back
where we started.

I believe half the point is that by using the rpm macros, the .spec file
would automatically pick up any overrides or defaults for that system
(ie. the ones that show up with rpm --showrc, which is the proper place
to set options for an rpm-based system).

But, if your default for locally compiled packages is /local/usr or
whatever, then as I understand it, you won't be able to build RPMs that
conform to your local conventions.

--
Alan Robertson <[EMAIL PROTECTED]>

"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions."
	- William Wilberforce
Re: [Linux-ha-dev] Proposal for function calls in CIB rsc_location rules
Andrew Beekhof wrote:
On May 29, 2006, at 11:40 PM, Alan Robertson wrote:

The CIB doesn't take very well to continual updates of attribute values.
It is a lot of overhead and expense to continually update them. For
certain kinds of resources, like Xen, it _REALLY_ matters which nodes
you run a particular resource on - and it depends on which nodes are
already running what resources, and how much in the way of real system
resources it already has.

So, I'm going to make a proposal which might help with this - at the
risk of saying something totally stupid, and getting various people
irritated at me...

So, it might go something like this... And the idea then would be to
load a plugin of type CRM_weight which would return an int. This int
would then be multiplied by the score to determine the final value for
the weight. It would be passed the name/value pairs given as arguments
of type struct ha_msg *. I suppose it would be nice to allow even more
complex ha_msg type parameters - but this will do for now ;-).

this part sounds exactly like attrd. why do we need plugins?

Lars pointed this out. Let me see if I can explain it better...

Let's say we were using ganglia... Then this plugin could ask ganglia
the amount of memory available to run a new virtual machine, to see if
it is "enough" for the VM in question. It could return -INFINITY for not
enough, 1 for barely enough, 10 for plenty, and 100 for lots and lots...
It could read it from a database, or get it from ganglia, or get it from
SMASH, or whatever... Any of those would work. And it wouldn't be your
problem to worry about what criteria the user thought was a good one...
One could have one policy function for virtual machines, and another for
databases, etc. And how these work is not your concern...

The idea is that the values are _only_ considered when something else
triggers a recomputation.

something else for that resource or just anything else in the system?

"Anything else in the system" can be done today.

The plugin functions would be able to do basically anything. They could,
for example, consult the data coming from ganglia, and get whatever they
wanted to :-).
http://ganglia.sourceforge.net/

The problem with this is that it looks at the values _before_ we
schedule resources onto the system. Scheduling resources onto a system
would naturally change the values that we would have used for scheduling
it. So, this either requires a smarter plugin which tries to schedule
things predictively, or something else smarter somewhere ;-). If the CRM
took resource "costs" into account, then it could do a more balanced
scheduling job. So, that's another, perhaps better alternative...

we take (very simplistic) costs into account now (each resource "costs"
the same). expanding that to take a *single* user-defined cost could be
a short-term option, but lmb is right... using multiple and possibly
arbitrary metrics to compute cost explodes the complexity.

If one were to implement such a system... This might be implemented by
two separate types of plugins. One type could require them to return a
single integer for 'remaining capacity' on a given node, and a different
type which would compute the 'load' for the resource in question (how
much capacity it consumes). [For this latter type, one might pass the
resource definition to the plugin as an ha_msg or something]. Or one
could simply add an integer for load/capacity consumed, and make it a
constant... Several options suggest themselves...

Not that it will never happen, but I wouldn't be looking to go down this
path before we at least have the colocation changes stabilized.

I was just floating this as a sort of trial balloon to see what people
thought about it... It is something I would expect to consider for 2007
- if at all... For most resources, the "by hand" scheduling we do now
with the rsc_location rules will work "well enough". But for things like
Xen, which have absolute limits on what they're going to be able to
accommodate, something more sophisticated seems worth considering.

Thinking some more about it, I think this second approach (capacity
available, and capacity consumed) is the better of the two general ideas
I outlined. This is not to try and solve the bin packing (or knapsack)
problems. But there are simple heuristic approaches which do a credible
job for cases where sufficient capacity is available...

--
Alan Robertson <[EMAIL PROTECTED]>

"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions."
	- William Wilberforce
Re: [Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: include by zhenh from
--- Alan Robertson <[EMAIL PROTECTED]> wrote:

> Huang Zhen: Have you run this with CTS?

Not yet. Will do soon.
Re: [Linux-ha-dev] Re: [Linux-ha-cvs] Linux-HA CVS: lib by davidlee from
On Sun, 28 May 2006, Andrew Beekhof wrote:

> David, you have to be really careful when you change things like this
> in loops.
>
> The effect of your change is that once one match is found,
> *everything* from then on is deleted.

Apologies.

> The best policy i think is to *copy* the declaration to the new
> location and change the existing one to an assignment. That way you
> can't go wrong.

Indeed. Thanks for correcting it.

--
:  David Lee                        I.T. Service       :
:  Senior Systems Programmer        Computer Centre    :
:                                   Durham University  :
:  http://www.dur.ac.uk/t.d.lee/    South Road         :
:                                   Durham DH1 3LE     :
:  Phone: +44 191 334 2752          U.K.               :