[Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread nicolas prochazka
Hello guys, strange problem: with rc4, AFR synchronisation does not seem to work: - If I copy a file onto the Gluster mount, all is OK on all servers. - If I add a new server to Gluster, this server creates my files (10G size); each appears on XFS as a 10G file, but the file does not contain the original data, just s

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread Gordan Bobic
Are you sure this is rc4 specific? I've seen assorted weirdness when adding and removing servers in all versions up to and including rc2 (rc4 seems to lock up when starting udev on it, so I'm not using it). On Tue, 17 Mar 2009 11:15:30 +0100, nicolas prochazka wrote: > Hello guys, > > strange pr

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread nicolas prochazka
I do not know; my previous test was on the git tree from 01 March, and I did not notice this problem. This issue seems to affect only big files; the same test with an 8-byte text file has no problem. nicolas On Tue, Mar 17, 2009 at 11:20 AM, Gordan Bobic wrote: > Are you sure this is rc4 specific? I

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread Gordan Bobic
Can you check if it works correctly with 2.0rc2 and/or 2.0rc1? On Tue, 17 Mar 2009 12:04:33 +0100, nicolas prochazka wrote: > oups, > same problem in fact with simple 8 bytes text file, the file seems to > be corrupt. > > Regards, > Nicolas Prochazka > > On Tue, Mar 17, 2009 at 11:20 AM, Gordan

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread nicolas prochazka
Oops, same problem in fact with a simple 8-byte text file; the file seems to be corrupt. Regards, Nicolas Prochazka On Tue, Mar 17, 2009 at 11:20 AM, Gordan Bobic wrote: > Are you sure this is rc4 specific? I've seen assorted weirdness when adding > and removing servers in all versions up to and

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread nicolas prochazka
I've just tried with rc2; same bug as rc4. Regards, Nicolas On Tue, Mar 17, 2009 at 12:06 PM, Gordan Bobic wrote: > Can you check if it works correctly with 2.0rc2 and/or 2.0rc1? > > On Tue, 17 Mar 2009 12:04:33 +0100, nicolas prochazka > wrote: >> oups, >> same problem in fact with simple 8 b

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread nicolas prochazka
Hello again. So this bug does not occur with RC1. RC2 and RC4 contain the bug described below, but RC1 does not. Any idea? Nicolas On Tue, Mar 17, 2009 at 12:55 PM, nicolas prochazka wrote: > I 'm just trying with rc2 , same bug as rc4. > Regards, > Nicolas > > On Tue, Mar 17, 2009 at 12:06 PM, Gordan Bobic wr

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread Gordan Bobic
Have you tried the later versions (rc2/rc4) without the performance translators? Does the problem persist without them? Anything interesting looking in the logs? On Tue, 17 Mar 2009 14:46:41 +0100, nicolas prochazka wrote: > hello again, > So this bug does not occur with RC1 > > RC2,RC4 contains

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread nicolas prochazka
Yes, I tried without any translator, but the bug persists. I cannot see anything interesting in the logs; the file size always seems to be OK when it begins to synchronize. As I wrote before, if I cp files during normal operation (2 servers OK) all is OK; the problem appears only when I try to resynchroniz

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread Gordan Bobic
Ouch. Maybe this would be a good time to do md5sum comparison check on all of my servers and downgrade back to rc1... :-/ Thanks for reporting the problem. Gordan On Tue, 17 Mar 2009 15:22:32 +0100, nicolas prochazka wrote: > Yes i'm trying without any translator but bugs persists. > > Into l
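The md5sum comparison mentioned in this message could be sketched roughly as below. This is only an illustration, not a tool from the GlusterFS tree: the function names are made up for the example, and it assumes the checksums are computed against each server's backend export directory (not the Gluster mount) and that the resulting dictionaries are then gathered in one place for comparison.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading in chunks so large files are not loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_tree(root):
    """Map each file path (relative to root) to its MD5 digest."""
    checksums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            checksums[os.path.relpath(full_path, root)] = md5_of_file(full_path)
    return checksums


def compare_replicas(checksums_a, checksums_b):
    """Return the sorted relative paths that are missing on one side or differ in content."""
    all_paths = set(checksums_a) | set(checksums_b)
    return sorted(p for p in all_paths if checksums_a.get(p) != checksums_b.get(p))
```

Running `checksum_tree` once per server and passing two of the results to `compare_replicas` lists the files whose replicas disagree; an empty list means the two export directories hold identical file contents.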

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread Anand Avati
We are looking into this problem and will get back to you soon. On Tue, Mar 17, 2009 at 8:04 PM, Gordan Bobic wrote: > Ouch. Maybe this would be a good time to do md5sum comparison check on all > of my servers and downgrade back to rc1... :-/ > > Thanks for reporting the problem. > > Gordan > _

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread Amar Tumballi
Hi Nicolas, When you say you 'add' a server here, are you adding another server to the replicate subvolume (i.e., 2 to 3), or did you have one server down (of 2 servers) when copying data, and then bring the other server back up and trigger the AFR self-heal? Regards, Amar On Tue, Mar 17, 2009 at 7:22 AM,

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-17 Thread nicolas prochazka
My test is: set up two servers in AFR mode; copy a file to the mount point (/mnt/vdisk): OK, synchronisation is OK on both servers. Then delete (rm) all files from the storage on server 1 (/mnt/disks/export), then wait for synchronisation. With rc2 and rc4 => the file has the right size (ls -l) but nothing is there ( df -b s
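One way to spot the symptom described here (a healed file whose size looks right but which holds no data) is to compare the apparent size with the space actually allocated on the backend filesystem. The sketch below is an assumption about how one might check this on a Linux server, with hypothetical function names and an arbitrary 50% threshold; it is not part of GlusterFS. A "hollow" result is only a hint, since legitimately sparse files look the same, and a fully allocated file can still have wrong contents.

```python
import os


def looks_hollow(apparent_size, allocated, threshold=0.5):
    """True when far less space is allocated than the apparent size suggests.

    The 0.5 threshold is an arbitrary choice for this sketch.
    """
    if apparent_size == 0:
        return False
    return allocated < apparent_size * threshold


def check_file(path):
    """Stat a file and report whether it looks hollow.

    st_blocks counts 512-byte units (Unix only), independent of the
    filesystem's own block size.
    """
    info = os.stat(path)
    return looks_hollow(info.st_size, info.st_blocks * 512)
```

For example, `check_file` could be run over every file under the export directory on the freshly healed server once self-heal has finished.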

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-18 Thread nicolas prochazka
Hello, I see a fix for the AFR heal bug in the git tree. Can we test this version? Is it stable enough compared to the rc releases? nicolas On Tue, Mar 17, 2009 at 9:39 PM, nicolas prochazka wrote: > My test is : > Set two server in AFR mode > copy file to mount point ( /mnt/vdisk ) :  ok  , synchro is ok o

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-18 Thread Amar Tumballi
Hi Nicolas, Sure, we are in the process of internal testing. It should be out as a release soon. Meanwhile, you can pull from git and test it out. Regards, On Wed, Mar 18, 2009 at 1:30 AM, nicolas prochazka < prochazka.nico...@gmail.com> wrote: > Hello, > I see in git tree correction of afr heal

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread nicolas prochazka
I tried the latest Gluster from git; the bug is fixed, but there seems to be a lot of weird behaviour in AFR mode. If I take down one of the two servers, clients do not respond to an ls, or respond with not all of the files, just one. I tried with and without lock servers set to 2, 1 or 0; results are the s

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread nicolas prochazka
Moreover, I cannot say that self-healing works very well. I tried to cp some big files (10G). A few are OK, but all the others are partially copied, so on the first server my file is 10G (OK) but on the second it is 8G: not OK, and synchronisation does not occur. I tried to umount without success, i

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Gordan Bobic
On Thu, 19 Mar 2009 15:56:25 +0530, Vikas Gorur wrote: > 2009/3/19 Gordan Bobic : >> The subvolumes have to be listed in the same order on all nodes, because >> that is the order of precedence and failover for lock servers. Could it >> be >> that this was the reason why you were seeing problems? >

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Vikas Gorur
2009/3/19 Gordan Bobic : > How does this affect adding new servers into an existing cluster? Adding a new server will work --- as and when files are accessed, new extended attributes will be written. Vikas -- Engineer - Z Research http://gluster.com/ ___

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Gordan Bobic
On Thu, 19 Mar 2009 16:25:21 +0530, Vikas Gorur wrote: > 2009/3/19 Gordan Bobic : >> On Thu, 19 Mar 2009 16:14:18 +0530, Vikas Gorur >> wrote: >>> 2009/3/19 Gordan Bobic : How does this affect adding new servers into an existing cluster? >>> >>> Adding a new server will work --- as and when

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Vikas Gorur
2009/3/19 Gordan Bobic : > The subvolumes have to be listed in the same order on all nodes, because > that is the order of precedence and failover for lock servers. Could it be > that this was the reason why you were seeing problems? That is one reason. Another reason is that the extended attribute

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Gordan Bobic
The subvolumes have to be listed in the same order on all nodes, because that is the order of precedence and failover for lock servers. Could it be that this was the reason why you were seeing problems? You can set the preferred read subvolume with "option read-subvolume", but the order in which su
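The requirement that subvolumes be listed in the same order on all nodes could be checked with a small script along these lines. This is a sketch under assumptions: the volume and brick names in the usage note are hypothetical, the parser only understands the basic `volume ... end-volume` block layout with a `subvolumes` line and `#` comments, and the volfile text from each node is assumed to have been collected beforehand.

```python
def subvolume_order(volfile_text, volume_name):
    """Return the names listed after 'subvolumes' in the named volume block, or None."""
    current = None
    for raw_line in volfile_text.splitlines():
        # Drop anything after '#', then split the rest into words.
        words = raw_line.split("#", 1)[0].split()
        if not words:
            continue
        if words[0] == "volume" and len(words) > 1:
            current = words[1]
        elif words[0] == "end-volume":
            current = None
        elif words[0] == "subvolumes" and current == volume_name:
            return words[1:]
    return None


def nodes_with_different_order(volfiles_by_node, volume_name):
    """Compare each node's subvolume order with the first node's; return the nodes that differ."""
    orders = {
        node: subvolume_order(text, volume_name)
        for node, text in volfiles_by_node.items()
    }
    reference = next(iter(orders.values()), None)
    return sorted(node for node, order in orders.items() if order != reference)
```

Given a dictionary such as `{"node1": text1, "node2": text2}` and a replicate volume named, say, `afr0`, an empty result means every node lists the subvolumes in the same order as the first node.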

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Gordan Bobic
On Thu, 19 Mar 2009 16:14:18 +0530, Vikas Gorur wrote: > 2009/3/19 Gordan Bobic : >> How does this affect adding new servers into an existing cluster? > > Adding a new server will work --- as and when files are accessed, new > extended attributes will be written. And presumably, permanently remo

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Gordan Bobic
That's unavoidable to some extent, since the first server is the one that is authoritative for locking. That means that all reads have to make a hit on the 1st server, even if the data then gets retrieved from another server in the cluster. Whether that explains all of the disparity you are seeing,

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Vikas Gorur
2009/3/19 Gordan Bobic : > That's unavoidable to some extent, since the first server is the one that > is authoritative for locking. That means that all reads have to make a hit > on the 1st server, even if the data then gets retrieved from another server > in the cluster. Whether that explains all

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread nicolas prochazka
I've tested with: option read-subvolume. It does not work. I ran tcpdump on my two servers; if I run a cp command from a client, traffic always goes to the first server in the subvolumes parameter. Option read-subvolume does not work in my case. Nicolas On Thu, Mar 19, 2009 at 12:40 PM, nicolas pro

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Vikas Gorur
2009/3/19 Gordan Bobic : > On Thu, 19 Mar 2009 16:14:18 +0530, Vikas Gorur > wrote: >> 2009/3/19 Gordan Bobic : >>> How does this affect adding new servers into an existing cluster? >> >> Adding a new server will work --- as and when files are accessed, new >> extended attributes will be written.

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread nicolas prochazka
I understand that, but in this case I have another problem: it seems that load balancing between subvolumes does not work very well. The first server in the subvolumes list is used much more often than the other servers (for reads) => so I have heavy network resource usage on this first server, not in

Re: [Gluster-devel] AFR problem with 2.0rc4

2009-03-19 Thread Anand Avati
> I've test with :  option read-subvolume > it does not work. > I've have do tcpdump on my two server ,if i run a cp command from a client, > traffic is always to the first server in subvolume parameter, option > read-subvolume does not work in my case. There are some issues with read-subvolume op