Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-23 Thread Lars Marowsky-Bree
On 2010-11-22T14:18:27, Alan Robertson  wrote:

> Any thoughts about this?

http://kronosnet.org/


-- 
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-24 Thread Lars Ellenberg
On Mon, Nov 22, 2010 at 02:18:27PM -0700, Alan Robertson wrote:
> Hi,
> 
> I've been thinking about a new unicast communications plugin that would 
> work slightly differently from the current ucast plugin.
> 
> It would take a filename giving the hostnames or ipv4 or ipv6 unicast 
> addresses that one wants to send heartbeats to.
> 
> When heartbeat receives a SIGHUP, this plugin would reread this file and 
> reconfigure the hosts to send heartbeats to.
> 
> This would mean that there would be no reason to restart heartbeat
> just to add or delete a host from the list of hosts being sent heartbeats.
> 
> Some environments (notably clouds) allow neither broadcasts nor
> multicasts.  This would let those environments add and delete cluster
> hosts without having to restart heartbeat - as is required now...
> [and I'd like to support ipv6 for heartbeats].

mcast6 is already there.
ucast6 would be a matter of an afternoon.
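
(For reference, the v6 send path itself is just plain sockets; a minimal
standalone sketch, without any of the heartbeat plugin scaffolding:)

    /* ucast6 core: resolve an IPv6 peer and send one datagram.
     * Standalone sketch only, not the heartbeat media plugin API. */
    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/socket.h>

    static int send_heartbeat6(const char *host, const char *port,
                               const void *buf, size_t len)
    {
            struct addrinfo hints, *res;
            int fd, rc;

            memset(&hints, 0, sizeof(hints));
            hints.ai_family   = AF_INET6;   /* AF_UNSPEC would do v4+v6 */
            hints.ai_socktype = SOCK_DGRAM;

            if (getaddrinfo(host, port, &hints, &res) != 0)
                    return -1;
            fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
            if (fd < 0) {
                    freeaddrinfo(res);
                    return -1;
            }
            /* fails with EMSGSIZE if the datagram exceeds the media
             * limit, see b) below */
            rc = sendto(fd, buf, len, 0, res->ai_addr, res->ai_addrlen);
            freeaddrinfo(res);
            close(fd);
            return rc < 0 ? -1 : 0;
    }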

> Any thoughts about this?
> 
> Would anyone else like such a plugin?

My direct answer to that question would be "Yes, I'd like that".
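
(The reload mechanism itself is simple enough; a rough sketch in plain C,
with made-up helper names and file path - the real plugin interface looks
different:)

    /* Re-read the peer list on SIGHUP.  The handler only sets a flag;
     * the main loop does the actual re-read, since getaddrinfo() and
     * friends are not async-signal-safe. */
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static volatile sig_atomic_t reload_wanted;

    static void on_sighup(int sig) { (void)sig; reload_wanted = 1; }

    static void reload_peers(const char *path)
    {
            char line[256];
            FILE *f = fopen(path, "r");

            if (!f)
                    return;
            /* drop_all_peers();  -- hypothetical plugin-side cleanup */
            while (fgets(line, sizeof(line), f)) {
                    line[strcspn(line, "\n")] = '\0';
                    if (*line && *line != '#')
                            ; /* add_peer(line); -- hypothetical:
                               * resolve hostname/ipv4/ipv6, remember it */
            }
            fclose(f);
    }

    int main(void)
    {
            struct sigaction sa;

            memset(&sa, 0, sizeof(sa));
            sa.sa_handler = on_sighup;
            sigaction(SIGHUP, &sa, NULL);

            for (;;) {    /* stand-in for the plugin's send loop */
                    if (reload_wanted) {
                            reload_wanted = 0;
                            reload_peers("/etc/ha.d/ucast.peers"); /* made up */
                    }
                    /* ... send heartbeats to each configured peer ... */
                    sleep(1);
            }
    }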

But it triggers a slightly longer answer, too:

There is much more interesting work to do in the heartbeat comm layer
than to reconfigure ucast on the fly.

Like
 a)  clearly separate control fields and payload fields
     - for example, always put payload in its own "FT_UNCOMPRESS";
       that way transparent compression could even compress
       very long FT_STRING payload fields,
       and we would no longer be confused by payload fields accidentally
       being named client_gen ...

 b)  support >= 64k media payload (hard udp limit) by sending multiple
     udp packets for one message (see the sketch after this list).
     This limit, btw, may be even lower, depending on the network setup
     and equipment involved, and is not mentioned anywhere in doc or
     code; the sender just gets an EMSGSIZE from sendto().

 c)  not sending node messages via every unicast link
     - Problem: the global per-node seq number space is currently
       shared between cluster-wide and directed-node messages,
       so the next cluster message would generate rexmit requests.
       Possible solutions:
       - separate these seq number spaces
       - or append a new control field to cluster messages that records
         the seq numbers used for node messages, so the receiving node
         of a cluster message knows which "missing" seq numbers not
         to re-request.

Pacemaker 1.1 currently won't work on heartbeat even with just a
normal-sized cib, because it sends down FT_STRING fields with
the full cib, up to about 128k.  A workaround would be to enable
"traditional" compression...  or to do it differently in pacemaker.
Or, see above -- I think it is actually a design bug in the heartbeat
comm layer, and could be fixed by a) above.

Once you aim for more than a handful of nodes, the heartbeat media
cluster communication will break horribly, because of the hard 64k udp
message size limit, and no way to have a msg fragmented to more than one
udp packet.

Even with compression enabled, with 32 nodes and a few clones you will
quickly get > 64k messages.

The rds plugin I wrote as a proof of concept could handle much bigger
messages, and would greatly benefit from both c) above and a method to
re-read a list of peers from some config file (what you proposed for
ucast). It would easily support multi-megabyte message sizes, and even
do away with re-ordering and rexmit requests on the receiving side.
Only it is just a proof of concept; it does not do anything useful once
things break, nodes vanish, or on congestion (no need for rexmit
requests on the receiving side is traded against the need to retry
sending on congestion on the sending side).
So much work to do there, too, if someone wants to pick that up.
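
(For the curious: the rds socket usage itself is about this simple.
Details here are from memory - verify against the kernel's RDS
documentation - and RDS needs a capable transport configured underneath:)

    /* RDS: reliable, in-order datagrams; the kernel handles rexmit. */
    #include <stdint.h>
    #include <string.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    #ifndef AF_RDS
    #define AF_RDS 21   /* not always exposed by libc headers */
    #endif

    static int rds_open(const char *local_ip, uint16_t port)
    {
            struct sockaddr_in sin;
            int fd = socket(AF_RDS, SOCK_SEQPACKET, 0);

            if (fd < 0)
                    return -1;
            memset(&sin, 0, sizeof(sin));
            sin.sin_family = AF_INET;
            sin.sin_port   = htons(port);
            inet_pton(AF_INET, local_ip, &sin.sin_addr);
            /* an RDS socket must bind to a local address before sending;
             * sendto() then addresses each datagram to a peer, and on
             * congestion the send fails and the sender must retry --
             * the trade-off mentioned above. */
            if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0)
                    return -1;
            return fd;
    }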

So the question of joining additional nodes is not a question of
conveniently configuring it. It's a question of whether the communication
layer can support the increased message size caused by one more node in
the cib, as full cib updates including the status section must still be
supported, even though they have become less frequent lately.

Currently, the answer to that question is
"No, one more node will break it", very quickly.

Once that basically works, then would be a time to think about
convenience of configuration, IMHO.

But that's obviously more work than re-reading a config file on a signal,
so it will likely not be done very soon. Unless someone has a specific,
pressing need, is not willing to try an alternative messaging layer,
but really wants it fixed in heartbeat.

Thanks for reading all of that.

Thoughts?

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-24 Thread Bob Schatz
I am curious.

What is driving the need for more than 32 nodes?  Are many people doing
that, or planning to?

In my experience, > 80% of the people just want 2 nodes to work reliably,
and more than 4 nodes is just a marketing requirement to put on a glossy
handout.

Is that still the case, or am I off base?


Thanks,

Bob


- Original Message 
From: Lars Ellenberg 
To: linux-ha-dev@lists.linux-ha.org
Sent: Wed, November 24, 2010 9:18:23 AM
Subject: Re: [Linux-ha-dev] Thinking about a new communications plugin

...

Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-24 Thread Lars Ellenberg
On Wed, Nov 24, 2010 at 10:10:33AM -0800, Bob Schatz wrote:
> I am curious.
> 
> What is driving the need for more than 32 nodes?  Are many people doing
> that, or planning to?
> 
> In my experience, > 80% of the people just want 2 nodes to work reliably,
> and more than 4 nodes is just a marketing requirement to put on a glossy
> handout.
> 
> Is that still the case, or am I off base?

You are probably right.
Still there are some that want more nodes within one cluster,
either because they have a real need for it,
or even just because they like to push limits.

And it does not need to be many hosts; the cib can grow just as well
from many resources, many constraints, or many attributes.

CTS runs with only three nodes, and the "standard" auto-generated set of
CTS resources in the cib already breaks the 64k plain-text limit.  So if
you want to use pacemaker-1.1.4 on heartbeat, even on two nodes, you
have to enable in ha.cf

    compression on
    compression_threshold 20 # or 30 or something
    traditional_compression on

for the reasons laid out already, which is, uhm, suboptimal.

And you may still hit the limit later,
depending on what exactly your cib looks like.

It is just much easier to produce a huge cib with
many nodes and a few clones ;-)

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-24 Thread Bob Schatz
Lars,

Please take my opinions with a grain of salt.  I am just trying to share my 
experiences.  I am not sure if they apply here.

I appreciate all of the hard work involved in LinuxHA and Pacemaker!

Just to tell you where I am coming from, while I count down the minutes
before a holiday here in the States...

In a previous life I worked at VERITAS, and I was one of the original
developers of a product called VERITAS Cluster Server.  From the start it
supported 32 nodes.  Later I developed a piece of technology called I/O
Fencing, which was used to support Oracle RAC, a parallel database.  Our
customers were generally high-end enterprise customers.

As developers, we used to obsess about how many nodes we could support,
how to speed things up, etc.  It was really interesting work; I got a
couple of patents out of it and my ego grew. :)  I loved it.

However, I don't believe our customers really went past four nodes for
many years.  I think that after 10 years, customers of the parallel file
system technology did go over 4 nodes.  They were the really high-end
customers who cared more about performance and support; cost was not
their top priority.

As developers, we would always obsess about new features, etc., and our
Technical Product Managers would tell us that our customers did not want
more features.  They cared more about the following:

1. Reliability.
2. Patching the existing product, since customers usually build a cluster
for application APP1 and do not want to upgrade to a new version until
they build a whole new cluster with new hardware, a new OS, a new version
of APP1, etc.  They hated when we told them that the fix for version 1.3
was in 2.0.  Once the cluster was built they did not want to touch it,
and when they did, they had to go through a Change Control Board to get
it approved.
3. Ease of use, since they wanted a less experienced system administrator
to be able to handle more clusters, to reduce cost.

My takeaway from it was the following (at least what I remember):

1. To increase reliability, add fewer features and rewrite areas prone to
bugs or user questions.  Besides, the new features would not be used
anyway and would only cause customer escalations the night before I was
trying to go on vacation. :(
2. Patch the existing code as opposed to coming out with more frequent
releases.
3. Come up with a couple of recipes for common system administration
tasks - adding a patch, migrating an application, etc. - regardless of
whether the cluster has two nodes or more than three.

I am not sure how this maps to LinuxHA/Pacemaker.  It may be a different
market.

I thought I should share my experiences to see how they map to what
others think.  I may be off base.

Thanks for listening, and for all the hard work on creating LinuxHA and
Pacemaker!


Thanks,

Bob

- Original Message 
From: Lars Ellenberg 
To: linux-ha-dev@lists.linux-ha.org
Sent: Wed, November 24, 2010 10:23:39 AM
Subject: Re: [Linux-ha-dev] Thinking about a new communications plugin

...

___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-24 Thread Lars Ellenberg
On Wed, Nov 24, 2010 at 11:43:05AM -0800, Bob Schatz wrote:
> Lars,
> 
> Please take my opinions with a grain of salt.  I am just trying to
> share my experiences.  I am not sure if they apply here.
> 
> I appreciate all of the hard work involved in LinuxHA and Pacemaker!
> 
> Just to tell you where I am coming from, while I count down the minutes
> before a holiday here in the States...

...

> My takeaway from it was the following (at least what I remember):
> 
> 1. To increase reliability, add fewer features and rewrite areas prone
> to bugs or

absolutely...

> 2. Patch the existing code as opposed to coming out with more frequent
> releases.

well, whether to count "patch level" or "micro release" is not a
technical difference, though it may be of huge importance on a
"political" level.

Unless, of course, you meant feature releases...
that may be a different thing.

> 3. Come up with a couple of recipes for common system administration
> tasks - adding a patch, migrating an application, etc. - regardless of
> whether the cluster has two nodes or more than three.
> 
> I am not sure how this maps to LinuxHA/Pacemaker.  It may be a
> different market.

Or it may not. We'll see.

> I thought I should share my experiences to see how they map to what
> others think.  I may be off base.

I just pointed out that adding another communications plugin to
heartbeat is one thing, but if the purpose of that new plugin is to
allow more nodes to join the cluster, then we should be aware of the
current limitations in the heartbeat messaging layer when used with
pacemaker and many nodes.

If I limit myself to a small number of nodes, then this plugin to allow
re-configuration of unicast peers is not really necessary anyway.

The heartbeat messaging layer currently is not fit for many nodes.
Whether corosync is, really, I cannot say.
How much is "many"? That depends on several things, but mostly on
the resulting size of the cib (if used with pacemaker).

Why many?
Because "everyone" wants to go "cloud", and (ab)using a cluster manager
to manage resources in a cloud seems an obvious thing to (at least) try.

None of this affects Pacemaker directly.

I'm not going to start any new features in heartbeat,
unless someone specifically pays linbit to do so ;-)

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-27 Thread Alan Robertson

On 11/24/2010 01:41 PM, Lars Ellenberg wrote:

...

> Why many?
> Because "everyone" wants to go "cloud", and (ab)using a cluster manager
> to manage resources in a cloud seems an obvious thing to (at least) try.


I was talking about a half-dozen or so nodes.  Not to /implement/ a 
cloud, but to /run in/ a cloud someone else implemented.



--
Alan Robertson

"Openness is the foundation and preservative of friendship...  Let me claim from you 
at all times your undisguised opinions." - William Wilberforce

___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-27 Thread Lars Ellenberg
On Sat, Nov 27, 2010 at 10:16:54AM -0700, Alan Robertson wrote:
> 
> I was talking about a half-dozen or so nodes.  Not to /implement/ a
> cloud, but to /run in/ a cloud someone else implemented.

I know. Of course.

Maybe we should have stuck closer to the original subject ;)
Yes, as my first (one-line) answer said: a feature like the one you
proposed (reconfiguring unicast peers at runtime) could come in handy
there.

But again:
We will have to address the message size limits with heartbeat,
if it is going to be used with pacemaker 1.1.4/1.2.

Even on a 3-node cluster, pacemaker 1.1.4 with the "standard"
CTS-generated cib breaks on current heartbeat with -EMSGSIZE.

So that's relevant to your half-dozen or so nodes case.

It is because pacemaker now, mainly for performance reasons, hands down
up to 128k (I think) FT_STRING fields before starting to bz2 the
payload itself (and handing it down as FT_BINARY).
Unless you configure heartbeat to use "traditional compression"
(compressing every packet > threshold, including control fields,
so every node has to uncompress a packet before it knows that it was a
node message to someone else and can be dropped), that will obviously
end up with broken communication.

Sure, the threshold at which pacemaker starts bz2'ing is only a define.
And pacemaker could also pass down FT_UNCOMPRESS or whatever it was
(it sometimes does).

But I think it should be addressed in the comm layer anyway.

The heartbeat ipc layer should know that, to pass data between various
processes on the same node, it does not need to compress/decompress
anything. (There is also too much memcpy'ing going on in there, but
let's not go there yet.)
Once a message actually hits a media type (leaves a node), it should,
depending on media type, be able to transparently compress messages --
without the ipc client knowing too much about the internals.  And it
should not mix ipc-layer-internal (control) fields and client payload
fields in the same client-visible and client-manipulable name space.

Thus my suggestion to, for starters, encapsulate all client payload
in a sub-message, which can then be compressed, either partially or
fully, even using the existing infrastructure.

Sure, "if it ain't broke, don't fix it".
But, just because we can "make it work", does not mean it could not do
with a little fixing here and there. Since, if we have to "make" it
work, that hints at things being at least slightly broken, some way.

Anyways.

Once the things I mentioned above cause so much pain that heartbeat
either gets fixed or replaced, I can just say "I told you so".

But until then, you could probably already have implemented your
original proposal in the cumulative man hours spent writing and reading
this thread, and I'm sure I will get used. So please, just go ahead.

Cheers,


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-27 Thread Lars Ellenberg
On Sun, Nov 28, 2010 at 12:03:23AM +0100, Lars Ellenberg wrote:
> But until then, you could probably already have implemented your
> original proposal in the cumulative man hours spent writing and reading
> this thread, and I'm sure I will get used. So please, just go ahead.

tztztz.
Though possibly I get used, too, sometimes, I obviously meant
..., and I'm sure _it_ will be used.
And I'm going to be one of those that use it, probably...

 ;)

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


Re: [Linux-ha-dev] Thinking about a new communications plugin

2010-11-28 Thread Alan Robertson
On 11/27/2010 04:19 PM, Lars Ellenberg wrote:
> On Sun, Nov 28, 2010 at 12:03:23AM +0100, Lars Ellenberg wrote:
>> But until then, you could probably already have implemented your
>> original proposal in the cumulative man hours spent writing and reading
>> this thread, and I'm sure I will get used. So please, just go ahead.
> tztztz.
> Though possibly I get used, too, sometimes, I obviously meant
> ..., and I'm sure _it_ will be used.
> And I'm going to be one of those that use it, probably...
It wasn't that bad to read it all.  I hadn't realized the messages had 
gotten so large.

We put in compression exactly to deal with this situation.  All that 
bulky XML is extremely compressible.
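
(A quick standalone check with libbz2 - the same library the bz2
compression method uses, if I remember right - shows just how
compressible a repetitive, cib-like buffer is:)

    /* Compress a synthetic, repetitive "cib-like" XML buffer with bzip2.
     * Build with: cc demo.c -lbz2 */
    #include <stdio.h>
    #include <bzlib.h>

    int main(void)
    {
            static char src[128 * 1024], dst[128 * 1024];
            unsigned int dlen = sizeof(dst);
            unsigned int slen = 0;
            int i;

            /* fake cib: the same element repeated, like many clone
             * instances in the status section */
            for (i = 0; slen + 64 < sizeof(src); i++)
                    slen += sprintf(src + slen,
                        "<lrm_resource id=\"dummy:%d\" type=\"Dummy\"/>\n", i);

            if (BZ2_bzBuffToBuffCompress(dst, &dlen, src, slen,
                                         9, 0, 0) != BZ_OK)
                    return 1;
            printf("%u -> %u bytes (%.1f%%)\n",
                   slen, dlen, 100.0 * dlen / slen);
            return 0;
    }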

I didn't write that part of the code, and hadn't noticed that it did all 
that excessive compression/decompression.  But you will note that this 
only really happens during a cluster transition.   Most of the time 
nothing happens - and nothing but heartbeats go over the network - or 
has that changed too?

On a completely different subject: I'm modernizing my home production
cluster.  Switching to Ubuntu, replacing the motherboard with one with
multi-core CPUs, replacing hard drives, adding striping.  I was planning
on putting the DRBD metadata on an SSD - but there seems to be some
incompatibility between the SSD I bought and Linux and/or my motherboard.
On the other hand, the SSD works nicely with non-Linux disk testing
utilities.  Sigh...


-- 
 Alan Robertson

"Openness is the foundation and preservative of friendship...  Let me claim 
from you at all times your undisguised opinions." - William Wilberforce

___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/