Re: [Gluster-users] Deleted files reappearing

2013-11-13  Øystein Viggen
Lalatendu Mohanty lmoha...@redhat.com writes:

 I am just curious what gluster v heal volumeName info
 split-brain returns when you see this issue?

Number of entries: 0 every time.

Here's a test I did today, across the same four virtual machines with
replica 2 and a fifth virtual machine as a native glusterfs client (a
scripted version of the same sequence follows after the output below):

* shutdown -h now on server node 02

* On the client:
# rm -Rf linux-3.12

* wait 30 seconds, and boot up server node 02

* wait until this appears on the client:
rm: cannot remove `linux-3.12/arch/powerpc/platforms/52xx': Directory
not empty

* heal info split-brain and heal info on node 01, in full:

-
root@ovvm01:~# gluster v heal ovvmvol0 info split-brain
Gathering Heal info on volume ovvmvol0 has been successful

Brick ovvm01.itea.ntnu.no:/export/sdb1/brick
Number of entries: 0

Brick ovvm02.itea.ntnu.no:/export/sdb1/brick
Number of entries: 0

Brick ovvm03.itea.ntnu.no:/export/sdb1/brick
Number of entries: 0

Brick ovvm04.itea.ntnu.no:/export/sdb1/brick
Number of entries: 0
root@ovvm01:~# gluster v heal ovvmvol0 info 
Gathering Heal info on volume ovvmvol0 has been successful

Brick ovvm01.itea.ntnu.no:/export/sdb1/brick
Number of entries: 4
/linux-3.12
/linux-3.12/arch
/linux-3.12/arch/powerpc
/linux-3.12/arch/powerpc/platforms

Brick ovvm02.itea.ntnu.no:/export/sdb1/brick
Number of entries: 1
gfid:6ec5ceae-14fa-4d02-8f1e-d3c362860557/52xx/Kconfig

Brick ovvm03.itea.ntnu.no:/export/sdb1/brick
Number of entries: 0

Brick ovvm04.itea.ntnu.no:/export/sdb1/brick
Number of entries: 0
-

* Verify on the client, after rm has finished:
# find linux-3.12
linux-3.12
linux-3.12/arch
linux-3.12/arch/powerpc
linux-3.12/arch/powerpc/platforms
linux-3.12/arch/powerpc/platforms/52xx
linux-3.12/arch/powerpc/platforms/52xx/Kconfig
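
For completeness, the whole sequence above boils down to roughly the
following script (a sketch only; it assumes the kernel tree is already
unpacked on the client mount at /mnt, and that ovvm02 is powered back on
through the hypervisor rather than over ssh):

-
#!/bin/bash
# Rough, scripted version of the test above (sketch; adjust hostnames and paths).
ssh root@ovvm02.itea.ntnu.no 'shutdown -h now'   # take one replica down
( cd /mnt && rm -Rf linux-3.12 ) &               # start the delete on the client
sleep 30
# ...power ovvm02 back on via the hypervisor here, then watch the heal state:
ssh root@ovvm01.itea.ntnu.no 'gluster v heal ovvmvol0 info split-brain; gluster v heal ovvmvol0 info'
wait                                             # let rm run to completion
find /mnt/linux-3.12                             # anything listed here came back
-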


Øystein

Re: [Gluster-users] Deleted files reappearing

2013-11-12  Øystein Viggen
Amar Tumballi atumb...@redhat.com writes:

 On 11/12/2013 12:54 PM, Øystein Viggen wrote:
 Should I file a bug about this somewhere?  It seems easy enough to
 replicate.
 https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS

Thanks.  I've tried to describe it as best I can, and linked back to
this thread.

https://bugzilla.redhat.com/show_bug.cgi?id=1029337

Øystein

Re: [Gluster-users] Deleted files reappearing

2013-11-11  Øystein Viggen
Lalatendu Mohanty lmoha...@redhat.com writes:

 It sounds like a split-brain issue. The commands below will help
 you figure this out.

  gluster v heal volumeName info split-brain
  gluster v heal volumeName info heal-failed

 If you see any split-brain, then it is a bug. We can check with
 gluster-devel whether it is fixed in the master branch or whether there
 is a bug for it in Bugzilla.

Thank you for your reply.

I've repeated a similar test on my two node cluster like this:

1. shut down node 02
2. on the client run rm -Rf linux-3.12/
3. while the rm is running, boot up node 02

That has the following interesting results:

On the client:

rm: cannot remove `linux-3.12/arch/mips/netlogic/dts': Directory not
empty

On the servers:

gluster v heal ovvmvol0 info split-brain and gluster v heal ovvmvol0
info heal-failed both show 0 entries.

It also claims to have healed some files:
-
# gluster v heal ovvmvol0 info healed
Gathering Heal info on volume ovvmvol0 has been successful

Brick ovvm01.itea.ntnu.no:/export/sdb1/brick
Number of entries: 4
at                    path on brick
---
2013-11-11 13:49:32 /linux-3.12/arch/mips/netlogic
2013-11-11 13:49:32 /linux-3.12/arch/mips
2013-11-11 13:49:30 /linux-3.12/arch
2013-11-11 13:49:29 /linux-3.12

Brick ovvm02.itea.ntnu.no:/export/sdb1/brick
Number of entries: 3
at                    path on brick
---
2013-11-11 13:49:29 gfid:2febb3e3-b72f-47f0-a6f6-cbec70d8874c/dts/xlp_fvp.dts
2013-11-11 13:49:29 gfid:2febb3e3-b72f-47f0-a6f6-cbec70d8874c/dts/xlp_evp.dts
2013-11-11 13:49:29 gfid:2febb3e3-b72f-47f0-a6f6-cbec70d8874c/dts/Makefile
-

On the client, these three files in linux-3.12/arch/mips/netlogic/dts/
are indeed shown as present.
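
As an aside, the gfid: entries in the heal output can be mapped back to
paths directly on a brick, since each brick keeps a GFID index under
.glusterfs (directories appear there as symlinks).  A sketch, using the
brick path and the GFID from the output above:

-
# a directory GFID resolves via a symlink in the brick's .glusterfs index
ls -l /export/sdb1/brick/.glusterfs/2f/eb/2febb3e3-b72f-47f0-a6f6-cbec70d8874c

# going the other way: read the GFID stored on the directory the dts/ entries
# presumably hang off, and compare it with the one above
getfattr -n trusted.gfid -e hex /export/sdb1/brick/linux-3.12/arch/mips/netlogic
-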


Still curious whether this was somehow a quorum issue, I added two more
servers, for a total of four servers with one brick each.  Still
replica 2.  I set cluster.server-quorum-type=server and
cluster.server-quorum-ratio=51%.
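
For reference, the options were applied with the usual volume-set
commands, roughly like this (server-quorum-ratio is cluster-wide, so it
is set on all):

-
gluster volume set ovvmvol0 cluster.server-quorum-type server
gluster volume set all cluster.server-quorum-ratio 51%
-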

I repeated the experiment of shutting down node 02, starting an rm -Rf
on a client, and booting up node 02 again.  This time, it seemingly
healed half of the linux-3.12/arch/x86/include/asm/ directory.  As one
might expect, that directory is completely empty on bricks 03 and 04,
while bricks 01 and 02 share the same files.
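
Counting entries per brick shows the imbalance quickly; a rough sketch:

-
for h in ovvm01 ovvm02 ovvm03 ovvm04; do
    printf '%s: ' "$h"
    ssh root@"$h".itea.ntnu.no 'ls /export/sdb1/brick/linux-3.12/arch/x86/include/asm | wc -l'
done
-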


Should I file a bug about this somewhere?  It seems easy enough to
replicate.

Øystein

[Gluster-users] Deleted files reappearing

2013-11-07  Øystein Viggen
Hi,

I have a small test setup on Ubuntu 12.04, using the
3.4.1-ubuntu1~precise1 packages of glusterfs from the recommended PPA.
There are two gluster servers (replica 2) and one client.  The bricks
are 16 GB XFS filesystems created with -i size=512.  All servers run on
VMware.
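
The setup is roughly equivalent to this (a sketch; the device name is a
placeholder and the host and volume names are the ones that appear
elsewhere in this thread):

-
# on each server
mkfs.xfs -i size=512 /dev/sdb1
mkdir -p /export/sdb1 && mount /dev/sdb1 /export/sdb1
mkdir -p /export/sdb1/brick

# on one server
gluster peer probe ovvm02.itea.ntnu.no
gluster volume create ovvmvol0 replica 2 \
    ovvm01.itea.ntnu.no:/export/sdb1/brick ovvm02.itea.ntnu.no:/export/sdb1/brick
gluster volume start ovvmvol0

# on the client
mount -t glusterfs ovvm01.itea.ntnu.no:/ovvmvol0 /mnt
-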

I've been using the Linux kernel source for some simple performance and
stability tests with many small files.  When deleting the kernel tree
with rm -Rf while rebooting one glusterfs server, it seems that some
deletes are missed, or the deleted files are recreated.  Here's how it
goes:

root@client:/mnt# rm -Rf linux-3.12

At this point, I run shutdown -r now on one server.  The deletion
seems to keep running just fine, but just as the server comes back up, I
get something like this on the client:

rm: cannot remove `linux-3.12/arch/mips/ralink/dts': Directory not empty

After the rm has run to completion:

root@client:/mnt# find linux-3.12 -type f
linux-3.12/arch/mips/ralink/dts/Makefile

Sometimes it's more than one file, too.  gluster volume heal volname
info shows no outstanding entries. 

If I turn off one server before running rm, and turn it on during the rm
run, a similar thing happens, only it seems worse.  In one test, I had
9220 files left after rm had finished.

If both servers are up during the rm run, all files are deleted as
expected every time.


What is happening here, and can I do something to avoid it?

I was hoping that in a replica 2 cluster, you could safely reboot one
server at a time (with sync-up time in between) to, say, apply OS
patches without taking the gluster volume offline.
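
The sync-up part would presumably amount to waiting until heal info
reports zero outstanding entries on every brick before rebooting the
next server; something like this crude loop (same placeholder volume
name as above):

-
# block until "heal info" shows no outstanding entries on any brick
while gluster volume heal volname info | grep -q '^Number of entries: [1-9]'; do
    sleep 10
done
-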


I'm thankful for any help.

Øystein
___
Gluster-users mailing list
Gluster-users@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-users