Re: [Gluster-users] client.c:1883 disconnected.

2011-04-26 Thread Mark "Naoki" Rogers
Aha, thank you very much. Checking back on the node in question, I find
'service gluster status' returning OK, although the process serving my
volume wasn't running (maybe status could be a little more thorough).
The error I found was:


[2011-04-26 11:10:03.461515] W [glusterfsd.c:700:cleanup_and_exit] 
(-->/lib64/libc.so.6(clone+0x6d) [0x7fa620587c2d] 
(-->/lib64/libpthread.so.0(+0x6ccb) [0x7fa620849ccb] 
(-->/opt/glusterfs/3.2.0/sbin/glusterfsd(glusterfs_sigwaiter+0xd5) 
[0x405bf5]))) 0-: received signum (15), shutting down


I've done a restart and now have a working node again.
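
For anyone hitting the same thing, roughly what I checked (the volfile
path and service name below are from a 3.1/3.2-era install and may
differ on your setup):

# find the remote-host defined for that client translator in the volfile
grep -A4 'volume distribute-client-1' /etc/glusterd/vols/distribute/distribute-fuse.vol
# on that host, confirm the brick process is actually up; restart if not
ps ax | grep glusterfsd
service glusterd restart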

Thanks again Amar,

2-distribute-client-1 -> the '2' here (i.e. the first number) is the
graph id. (The graph id changes whenever the volume file changes.)
'distribute' is your volume name, and if you go through the volume file
(which is also logged in the log file), you can see a protocol/client
volume named 'distribute-client-1'.


See its definition, get the remote-host IP, and check whether the brick
is running on that particular node.


Meanwhile, from 3.2.x onwards these logs will also print the IP:port of
the remote host, so one can get to the remote machine without much
grepping around.


Regards,
Amar




___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


[Gluster-users] client.c:1883 disconnected.

2011-04-25 Thread Mark "Naoki" Rogers

Hi,

I'm seeing these messages every few seconds and looking for some assistance:

Client 1
[2011-04-26 12:25:07.178381] I [client.c:1883:client_rpc_notify] 
2-distribute-client-1: disconnected
[2011-04-26 12:25:10.188056] I [client.c:1883:client_rpc_notify] 
2-distribute-client-1: disconnected


Client 2
[2011-04-26 12:24:57.749822] I [client.c:1883:client_rpc_notify] 
0-distribute-client-1: disconnected
[2011-04-26 12:25:01.753013] I [client.c:1883:client_rpc_notify] 
0-distribute-client-1: disconnected


It's a case of RPC_CLNT_DISCONNECT according to the code, but that alone
doesn't tell me much. What does the '2-distribute-client-1' actually map to?


Cheers.
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] No space left on device

2011-01-19 Thread Mark "Naoki" Rogers


I think you might want to look into rebalance:
http://europe.gluster.org/community/documentation/index.php?title=Gluster_3.1:_Rebalancing_Volumes&redirect=no

It's generally for adding/removing bricks, but it might redistribute data
in a way that solves your disk space issue.
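
Something like this, assuming the volume is the 'lemmy' one from your df
output (run from any server in the pool):

# start a rebalance and then keep an eye on its progress
gluster volume rebalance lemmy start
gluster volume rebalance lemmy status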



On 01/19/2011 04:43 PM, Daniel Zander wrote:

Hi!


Assuming you are doing a straight distribute there(?), if the user in
question is hashed onto the brick that is 100% full you'll get a space
error.

Yes, it's a distributed volume. Is there a way around this other than
moving files away from this one brick by hand?

Not sure I followed your migration details though: when you say
"user directories were moved into one of the above folders" do you mean
copied directly onto the individual storage bricks?

Yes, e.g. user_a had the following directories:
server5:/storage/5/user_a
server6:/storage/6/user_a

Then we performed a move:
ssh server5 "mv /storage/5/user_a/ /storage/5/cluster/user_a"
ssh server6 "mv /storage/6/user_a/ /storage/6/cluster/user_a"

This was done as it would not cause any network traffic. The volume was
then created like this:
Brick1: 192.168.101.249:/storage/4/cluster
Brick2: 192.168.101.248:/storage/5/cluster
Brick3: 192.168.101.250:/storage/6/cluster
Brick4: 192.168.101.247:/storage/7/cluster
Brick5: 192.168.101.246:/storage/8/cluster

Regards,
Daniel


On 01/19/2011 05:01 AM, zan...@ekp.uni-karlsruhe.de wrote:

mv: cannot create regular file `/storage/cluster/': No space
left on device

Doing df -h tells me, however:

glusterfs#192.168.101.247:/lemmy
104T 69T 36T 66% /storage/cluster

It may be of importance that one brick in the cluster is actually 100%
used. Others are almost completely empty. I am using GlusterFS 3.1.1,
the file servers are running debian lenny or ubuntu server 10.04,
clients are SLC4, SLC5, CentOS and ubuntu server 10.04.



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] No space left on device

2011-01-18 Thread Mark "Naoki" Rogers
Assuming you are doing a straight distribute there(?), if the user in
question is hashed onto the brick that is 100% full you'll get a space
error. Not sure I followed your migration details though: when you say
"user directories were moved into one of the above folders" do you mean
copied directly onto the individual storage bricks?
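
If you want to confirm it's a single full brick, something along these
lines on the file servers (hostnames and backend paths below are just
placeholders for your actual bricks):

# check backend usage on every brick server
for h in server4 server5 server6 server7 server8; do
    ssh $h 'df -h /storage/*/cluster'
done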


On 01/19/2011 05:01 AM, zan...@ekp.uni-karlsruhe.de wrote:

mv: cannot create regular file `/storage/cluster/': No space
left on device

Doing df -h tells me, however:

glusterfs#192.168.101.247:/lemmy
   104T   69T   36T  66% /storage/cluster

It may be of importance that one brick in the cluster is actually 100%
used. Others are almost completely empty. I am using GlusterFS 3.1.1,
the file servers are running debian lenny or ubuntu server 10.04,
clients are SLC4, SLC5, CentOS and ubuntu server 10.04.



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] adding to gluster a node with 24TB of disk and 16GB RAM

2011-01-03 Thread Mark "Naoki" Rogers

Hello again,

Ideally you could run a benchmark of your application and use
blktrace+seekwatcher <http://oss.oracle.com/%7Emason/seekwatcher/> to
capture and view some really accurate IO stats, and then tune
accordingly. Other than that it's complete guesswork: you have about
50G of potential FS cache there, which is roughly 0.2% of your physical
data capacity. It all depends on your cache hit rates (versus network
performance) and your ability to handle the cache misses. Just running
'iostat' on your backend nodes during a benchmark is good enough too.
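
For example, something along these lines during a benchmark run (device
names are placeholders, and double-check the seekwatcher options against
its documentation):

# per-device utilisation, queue depth and service times, every 5 seconds
iostat -xm 5
# capture a 60-second block trace on one backend data disk
blktrace -d /dev/sdb -o mytrace -w 60
# turn the trace into a seek/throughput graph
seekwatcher -t mytrace -o mytrace.png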


From your network stats it appears as though access is actually quite
low (64 Mbit/s in / 43 Mbit/s out), so most of it might be ending up in
your FS cache, and even if it isn't, there is no way it'll saturate your
drives, even if it's 100% small-block random operations.


By bonding <http://en.wikipedia.org/wiki/Channel_bonding> I mean
aggregation, trunking or teaming, depending on which networking school
you went to. My rough and totally inaccurate back-of-a-napkin numbers
are meant to indicate that 1Gbit probably won't be enough, and you might
need to consider two or more Gbit interfaces. Based on my testing with
six servers I can saturate a Gbit interface pretty easily (but that's
not with your app, of course).
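
On Fedora/RHEL-style boxes a bond is roughly the following (addresses,
device names and the bonding mode are placeholders - use whatever your
switch supports):

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.1.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=802.3ad miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for each slave NIC)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none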


Long story short, the answer to your original question, "Any guideline
we should follow for calculating the memory requirements", is no. It's
all about your specific application requirements (and the money you're
willing to spend).


The only advice I'd give, then, is:

   * Be sure to monitor your IO and know exactly what the numbers mean
 and what causes them.
   * Have a capacity plan with an eye on what you need to address any
 of the possible eventualities:

  1. Network throughput/latency - More/faster ports.
  2. Disk sequential read/write - More spindles or flash.
  3. Disk random read/write - More spindles or flash.
  4. File System cache misses - RAM increases on storage nodes.
  5. Single storage node overload - More nodes or striping that file.




On 12/30/2010 08:51 PM, admin iqtc wrote:

Hi,

Sorry Mark, but i don't understand what you exactly need. Could you give me
an example of information you're asking?

Regarding bonding, don't worry, all the current 5 machines are bonded(1gbit
each interface) to the switch, and the new machine would be installed the
same way.

That switch load is from the HPC clusters to the gluster. The info is from
the trunking interface in the switch. Our network topology is as follows:
each gluster server(and the new one) are connected with bonding to a L2
switch, then from that switch 4x1gbit cables goes to a L3 switch. Both
switches are configured for those 4 cables to be trunked. The traffic load i
told you is from the L3 switch.

We may expand that trunking some day, but for now we aren't having any
trouble..

Thanks

2010/12/28 Mark "Naoki" Rogers


Hi,

Your five machines should get you raw speeds of at least 300MB/s
sequential and 300-500 random IOPS; your file-system cache alters things
depending on access patterns. Without knowing about those patterns I can't
guess at the most beneficial disk/memory ratios for you. If possible run
some synthetic benchmarks for base-lining and then try to benchmark your
application - even if it's only a limited benchmark, that's OK, you can
still extrapolate from there.

The first thing you might hit though could be the 1Gbit interfaces so keep
an eye on those and perhaps have a plan to bond them, and get ready to think
about 10G on the larger one if needed.

Right now it seems the switch load is light, is that per port to the
storage bricks?



On 12/28/2010 05:38 PM, admin iqtc wrote:


Hi,

sorry for not giving more information on the first mail.

The setup would be straight distributed. The disks are SATA2 7200RPM. ATM
the 5 machines we're currently running have 5 disks of 1TB(4TB with RAID5)
each. The new machine would have 12 disks of 2TB with RAID5 as well, so 23TB
approx.

We're using gluster for storage of an HPC cluster. That means: Data gets
copied from gluster and to gluster all the times. For example looking at the
traffic on the switch, the average is 64Mbit/s IN(that is, writing) and
43Mbit/s OUT(that is, reading). That is among the 5 machines.

Is this enough?

Thanks!


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] adding to gluster a node with 24TB of disk and 16GB RAM

2010-12-28 Thread Mark "Naoki" Rogers

Hi,

Your five machines should get you raw speeds of at least 300MB/s
sequential and 300-500 random IOPS; your file-system cache alters
things depending on access patterns. Without knowing about those
patterns I can't guess at the most beneficial disk/memory ratios for
you. If possible run some synthetic benchmarks for base-lining and then
try to benchmark your application - even if it's only a limited
benchmark, that's OK, you can still extrapolate from there.
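
If nothing heavier is to hand, a crude sequential baseline through the
client mount is a start (the /mnt/gluster path and sizes are
placeholders):

# sequential write, flushed through to the bricks
dd if=/dev/zero of=/mnt/gluster/ddtest bs=1M count=10240 conv=fsync
# drop the client's page cache so the read isn't served from RAM
echo 3 > /proc/sys/vm/drop_caches
# sequential read back
dd if=/mnt/gluster/ddtest of=/dev/null bs=1M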


The first thing you might hit though could be the 1Gbit interfaces so 
keep an eye on those and perhaps have a plan to bond them, and get ready 
to think about 10G on the larger one if needed.


Right now it seems the switch load is light, is that per port to the 
storage bricks?



On 12/28/2010 05:38 PM, admin iqtc wrote:

Hi,

sorry for not giving more information on the first mail.

The setup would be straight distributed. The disks are SATA2 7200RPM. 
ATM the 5 machines we're currently running have 5 disks of 1TB(4TB 
with RAID5) each. The new machine would have 12 disks of 2TB with 
RAID5 as well, so 23TB approx.


We're using gluster for storage of an HPC cluster. That means: Data 
gets copied from gluster and to gluster all the times. For example 
looking at the traffic on the switch, the average is 64Mbit/s IN(that 
is, writing) and 43Mbit/s OUT(that is, reading). That is among the 5 
machines.


Is this enough?

Thanks!


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] adding to gluster a node with 24TB of disk and 16GB RAM

2010-12-26 Thread Mark &quot;Naoki&quot; Rogers

Hi Jordi,

To clarify - this new node is going to take you up to six nodes, with
the sixth of vastly different spec to the others. In what configuration
will it be, dist+rep or straight distributed?


A hard question either way without knowing a little more about the
hardware and a lot more about expected usage patterns. How random are
the accesses going to be, what's your expected read hit rate, and how
fast are the disks?


Memory acts as buffer (FS) cache, but more of it isn't necessarily
needed until your cache misses exceed what the drives can keep up with.
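
A rough way to watch that in practice on a storage node (nothing
gluster-specific here):

# how much RAM is currently being used as page cache
free -m
# 'bi' (blocks read in from disk) shows reads that missed the cache
vmstat 5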


Cheers.

On 12/23/2010 07:27 PM, admin iqtc wrote:

Dear list,

we are planning to add a new node to our Gluster cluster (5 nodes, each
with 5TB and 8GB RAM, on gigabit connections).

This new node will have 24TB of disk and 16GB of RAM. Do you think there
could be some performance problem with this proportion? It is a lot of
disk relative to memory capacity: we are multiplying disk capacity by 6
but memory capacity only by 2.

Any guideline we should follow for calculating the memory requirements
etc.?

thanks in advance,

best regards,


jordi


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] 3.1.x Setup: distributed replicated volumes across 6 servers, each with 24 drives

2010-12-21 Thread Mark "Naoki" Rogers
If I may ask, is there a reason you're not putting the 24 drives on each
server into RAID6/5 arrays and doing dist+rep over just:


clustr-01:/mnt/data clustr-02:/mnt/data clustr-03:/mnt/data
clustr-04:/mnt/data clustr-05:/mnt/data clustr-06:/mnt/data
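
Created with something roughly like this ('dist-datastore' and the
/mnt/data path just follow the naming above; consecutive bricks form the
replica pairs, so order them across servers):

gluster volume create dist-datastore replica 2 transport tcp \
    clustr-01:/mnt/data clustr-02:/mnt/data \
    clustr-03:/mnt/data clustr-04:/mnt/data \
    clustr-05:/mnt/data clustr-06:/mnt/data
gluster volume start dist-datastore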

But in answer to your question, I think your config looks good, and to
mount you would just issue:


mount -t glusterfs clustr-0?:/dist-datastore /mnt/cluster


On 12/22/2010 07:38 AM, phil cryer wrote:

I have 6 servers, each with 24 drives, and I'm upgrading to 3.1.x and
want to redo my configuration from scratch. Really interested in some
of the new options and configurations in 3.1, so now I want to get it
setup right from the start. From this
page: http://www.gluster.com/community/documentation/index.php/Gluster_3.1:_Configuring_Distributed_Replicated_Volumes
I see this distributed, replicated, 6 server example:
# gluster volume create test-volume replica 2 transport tcp
server1:/exp1 server2:/exp2 server3:/exp3 server4:/exp4 server5:/exp5
server6:/exp6

Then from there I see an example more in line with what I'm trying to
do here, but using 4 nodes:
http://gluster.org/pipermail/gluster-users/2010-December/006001.html
# #gluster volume create vol1 replica 2 transport tcp
server1:/mnt/array1 server2:/mnt/array1 server3:/mnt/array1
server4:/mnt/array1 server1:/mnt/array2 server2:/mnt/array2
server3:/mnt/array2 server4:/mnt/array2 server1:/mnt/array3
server2:/mnt/array3 server3:/mnt/array3 server4:/mnt/array3
server1:/mnt/array4

So, if I have the following servers:
clustr-01
clustr-02
clustr-03
clustr-04
clustr-05
clustr-06

and all of my drives mounted under:
/mnt/data01
/mnt/data02
/mnt/data03
/mnt/data04
/mnt/data05
[...]
/mnt/data24

Should I issue a command like this to set it up:

gluster volume create dist-datastore replica 2 transport tcp /
clustr-01:/mnt/data01 clustr-02:/mnt/data01 clustr-03:/mnt/data01
clustr-04:/mnt/data01 clustr-05:/mnt/data01 clustr-06:/mnt/data01 /
clustr-01:/mnt/data02 clustr-02:/mnt/data02 clustr-03:/mnt/data02
clustr-04:/mnt/data02 clustr-05:/mnt/data02 clustr-06:/mnt/data02 /
clustr-01:/mnt/data03 clustr-02:/mnt/data03 clustr-03:/mnt/data03
clustr-04:/mnt/data03 clustr-05:/mnt/data03 clustr-06:/mnt/data03 /
[...]
clustr-01:/mnt/data24 clustr-02:/mnt/data24 clustr-03:/mnt/data24
clustr-04:/mnt/data24 clustr-05:/mnt/data24 clustr-06:/mnt/data24

So that each /mnt/dataxx is replicated and distributed across all 6 nodes?

Then, once this is completed successfully, how do I map all
/mnt/data01-24 to one mount point, say /mnt/cluster for example?
Before I would have added this to /etc/fstab and done `mount -a`
/etc/glusterfs/glusterfs.vol  /mnt/cluster  glusterfs  defaults  0  0

Is there a better way in 3.1.x, should I use mount.glusterfs or ?

Thanks

P
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] different back-end filesystems

2010-12-21 Thread Mark "Naoki" Rogers
Yep, that's fine - I do it with ext4 and btrfs. The only requirement is
extended attribute support (which every mainstream filesystem has these
days). If in doubt you can test with:


# touch hello.txt
# setfattr -n user.foo -v bar hello.txt
# getfattr -n user.foo hello.txt
# file: hello.txt
user.foo="bar"
# rm hello.txt


On 12/20/2010 07:34 PM, David Lloyd wrote:

Can I have different backend filesystems eg some xfs, some ext4 on the
bricks in a glusterfs volume?

This would just be a temporary measure, I'm thinking of migrating our
replicated volume, and thought that doing one brick at a time, and
resyncing might be easier than backing up the whole lot and starting
from scratch.

David



___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Who's using Fedora in production on Glusterfs storage servers?

2010-12-03 Thread Mark "Naoki" Rogers

Hi James,

I'm using 3.1.1 on six bricks in dist+replicate all running F14+BTRFS, 
the clients are on fedora12/13/14. I build the RPMs from source on a F14 
machine. The cluster is running entirely on GbE (with some 10Gb lines 
going in shortly), no RDMA/infiniband so I can't help there.


It's gone through a series of looped benchmarks for a while now (from
3.1.0 through a few QA releases) and I have so far pushed/pulled over
110TB through it. I'm happy with the stability but not /entirely/ sure
of the performance just yet; I've just started more testing under 3.1.1.


But back to your main question: there really isn't enough difference
between the near-term releases of Fedora for it to matter much either
way. I do think you're better off using the latest Fedora release rather
than an older one that will be end-of-life soon (F12 tomorrow). Being
able to patch/maintain your system is more important than an often very
arbitrary vendor support list, which is usually just an outcome of what
people have had time to look into rather than any measured reason a
newer OS isn't supported. Besides, the only thing you ever have to
/really/ care about is the kernel and glibc major versions, so if it
compiles you're pretty much OK (ldd it, that's all it needs).
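
For example (the binary path is a guess for an RPM install - adjust to
wherever your build lands):

# confirm what the gluster daemon actually links against
ldd /usr/sbin/glusterfsd | grep libc
rpm -q glibc kernel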



On 12/02/2010 01:45 AM, Burnash, James wrote:

How many people on the list are using Fedora 12 (or 13) in production for 
Glusterfs storage servers? I know that Gluster Platform uses Fedora 12 as its 
OS - I was thinking of building my new glusterfs storage servers using Fedora, 
and was wondering whether Fedora 13 was tested by Gluster for v 3.1.1 and what 
other people's experiences were.

One of the reasons for my interest was so that I could use ext4 as the backend 
file store, instead of ext3.

Thanks,

James Burnash, Unix Engineering


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Who's using Fedora in production on Glusterfs storage servers?

2010-12-03 Thread Mark "Naoki" Rogers
I'm using 3.1.1 on six bricks running F14+BTRFS. Been running looped 
benchmarks for a while and pushed/pulled over 110TB so far. I'm 
convinced it's stable but not entirely sure of the performance just yet.


On 12/02/2010 01:45 AM, Burnash, James wrote:

How many people on the list are using Fedora 12 (or 13) in production for 
Glusterfs storage servers? I know that Gluster Platform uses Fedora 12 as its 
OS - I was thinking of building my new glusterfs storage servers using Fedora, 
and was wondering whether Fedora 13 was tested by Gluster for v 3.1.1 and what 
other people's experiences were.

One of the reasons for my interest was so that I could use ext4 as the backend 
file store, instead of ext3.


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] Unable to self-heal permissions/ownership of '/' (possible split-brain)

2010-11-26 Thread Mark "Naoki" Rogers

On 11/25/2010 07:55 PM, Dan Bretherton wrote:

The volume seems to be working normally, but I am seeing the following
messages repeated in my volume log file:

[2010-11-25 10:27:06.867368] I [afr-common.c:672:afr_lookup_done]
marine2-replicate-0: split brain detected during lookup of /.
[2010-11-25 10:27:06.867454] I [afr-common.c:716:afr_lookup_done]
marine2-replicate-0: background  meta-data data self-heal triggered. path: /
[2010-11-25 10:27:06.867614] I [afr-common.c:672:afr_lookup_done]
marine2-replicate-1: split brain detected during lookup of /.
[2010-11-25 10:27:06.867633] I [afr-common.c:716:afr_lookup_done]
marine2-replicate-1: background  meta-data data self-heal triggered. path: /
[2010-11-25 10:27:06.868174] E
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] marine2-replicate-0:
Unable to self-heal permissions/ownership of '/' (possible split-brain).
Please fix the file on all backend volumes


I've been seeing the exact same thing with all 3.1 releases (not tested 
past qa8):
[2010-11-25 03:49:17.920483] I [afr-common.c:672:afr_lookup_done] 
test-volume-replicate-0: split brain detected during lookup of /.
[2010-11-25 03:49:17.926982] I [afr-common.c:716:afr_lookup_done] 
test-volume-replicate-0: background  meta-data data self-heal triggered. 
path: /
[2010-11-25 03:49:17.955846] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 
test-volume-replicate-0: Unable to self-heal permissions/ownership of 
'/' (possible split-brain). Please fix the file on all backend volumes
[2010-11-25 03:49:17.956445] I 
[afr-self-heal-common.c:1526:afr_self_heal_completion_cbk] 
test-volume-replicate-0: background  meta-data data self-heal completed on /
[2010-11-26 04:21:20.549679] I [afr-common.c:672:afr_lookup_done] 
test-volume-replicate-0: split brain detected during lookup of /.
[2010-11-26 04:21:20.589245] I [afr-common.c:716:afr_lookup_done] 
test-volume-replicate-0: background  meta-data data self-heal triggered. 
path: /
[2010-11-26 04:21:20.591699] E 
[afr-self-heal-metadata.c:524:afr_sh_metadata_fix] 
test-volume-replicate-0: Unable to self-heal permissions/ownership of 
'/' (possible split-brain). Please fix the file on all backend volumes
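
In case it helps, the "fix the file on all backend volumes" advice
amounts to making the root directory's ownership/permissions identical
on every brick. A rough way to compare (the /data/brick path is a
placeholder for your actual backend directories):

# run on each server against the brick's backend directory
stat -c 'owner=%U group=%G mode=%a' /data/brick
getfattr -d -m trusted.afr -e hex /data/brick
# if owner/group/mode differ, chown/chmod the backends so they match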


___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users