Hi Xavi, now there are some files on nodes1-2-3 and others on nodes4-5, so I think I'm going to destroy and re-create the volume from scratch (I can afford it now).
In your opinion, with 5 nodes of 10x 4TB disks each, what's the best way to dimension the bricks?
Right now we have a disperse FS configured with 2 bricks per node per volume (2x 4TB RAID0 each); if I'm not wrong, we can afford to lose 2 bricks (= an entire node).
Would it be better to use a distributed FS with 1 brick per node (10x 4TB RAID5 each)?
Or do you have other suggestions?

Thanks
A.

-----Original Message-----
From: Xavier Hernandez [mailto:[email protected]]
Sent: Wednesday, 7 January 2015 18:14
To: RASTELLI Alessandro
Cc: [email protected]; CAZZANIGA Stefano; UBERTINI Gabriele; TECHNOLOGY - Supporto Sistemi OTT e Cloud; ORLANDO Luca
Subject: Re: [Gluster-users] Input/Output Error when deleting folder

If that file is missing only from gluster03-mi, and it has the same attributes in all remaining bricks, self-heal should recover it automatically.

Are there differences in the extended attributes of the file on the bricks that have it?

On 01/07/2015 05:22 PM, RASTELLI Alessandro wrote:
> It worked... partially :)
> Now I can access the folders again, but I can't delete them because
> there are a couple of files in them (which I don't need). The files
> exist only on node1,2,4,5, but not on node3:
>
> [root@gluster02-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218/Rec_218_1_part_14656.ts
> getfattr: Removing leading '/' from absolute path names
> # file: brick1/recorder/Rec218/Rec_218_1_part_14656.ts
> trusted.ec.config=0x0000080a02000200
> trusted.ec.size=0x0000000034400000
> trusted.ec.version=0x0000000000001a20
> trusted.gfid=0x8d5da5a1cd1949618a5b96657857ceb6
>
> [root@gluster03-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218/Rec_218_1_part_14656.ts
> getfattr: /brick1/recorder/Rec218/Rec_218_1_part_14656.ts: No such file or directory
>
> How do I proceed?
> Thanks
>
> -----Original Message-----
> From: Xavier Hernandez [mailto:[email protected]]
> Sent: Wednesday, 7 January 2015 16:45
> To: RASTELLI Alessandro
> Cc: [email protected]; CAZZANIGA Stefano; UBERTINI Gabriele;
> TECHNOLOGY - Supporto Sistemi OTT e Cloud; ORLANDO Luca
> Subject: Re: [Gluster-users] Input/Output Error when deleting folder
>
> Sorry, the command should be:
>
>     setfattr -n trusted.ec.version -v 0x0000000000000001 <brick path>/Rec218
>
> On 01/07/2015 04:34 PM, RASTELLI Alessandro wrote:
>> See my answers below:
>> 1.
>> [root@gluster03-mi ~]# ls -l /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
>> ls: cannot access /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4: No such file or directory
>> [root@gluster03-mi ~]# ls -l /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
>> lrwxrwxrwx 1 root root 55 Dec 17 17:37 /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd -> ../../00/00/00000000-0000-0000-0000-000000000001/Rec218
>> [root@gluster03-mi ~]# ls -l /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
>> ls: cannot access /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4: No such file or directory
>> [root@gluster03-mi ~]# ls -l /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
>> lrwxrwxrwx 1 root root 55 Dec 17 17:37 /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd -> ../../00/00/00000000-0000-0000-0000-000000000001/Rec218
>>
>> 2.
>> /Rec218 is supposed to be empty (or, I don't need to restore the files)
>> I stopped the volume, but when executing the command I get an error:
>> [root@gluster01-mi ~]# setfattr -n trusted.ec.version -v 0x1 /brick1/recorder/Rec218
>> bad input encoding
>>
>> Regards
>> A.
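
The failing command and the corrected one differ only in how the value is written: `0x1` versus the full 16 hex digits `0x0000000000000001`, which is the same width (8 bytes) as the trusted.ec.version values that getfattr prints in this thread. A likely reason for "bad input encoding" (an assumption, it is not confirmed anywhere in the thread) is that the hex value has to describe whole bytes, so an odd number of digits is rejected. A minimal shell sketch for producing a full-width value follows; `ec_version_hex` is a hypothetical helper name, not a Gluster tool.

```shell
#!/bin/sh
# Hypothetical helper: format an integer as a 16-digit (8-byte) hex literal,
# the width used by the trusted.ec.version values shown in this thread.
ec_version_hex() {
    printf '0x%016x\n' "$1"
}

# Value used in the corrected command from the thread.
ec_version_hex 1

# Sketch of use (placeholder brick path, intentionally left commented out):
#   setfattr -n trusted.ec.version -v "$(ec_version_hex 1)" <brick path>/Rec218
```

Padding to the full width also keeps the attribute the same size as the values already stored on the other bricks, rather than writing a shorter one.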
>>
>> -----Original Message-----
>> From: Xavier Hernandez [mailto:[email protected]]
>> Sent: Wednesday, 7 January 2015 16:08
>> To: RASTELLI Alessandro
>> Cc: [email protected]; CAZZANIGA Stefano; UBERTINI Gabriele;
>> TECHNOLOGY - Supporto Sistemi OTT e Cloud; ORLANDO Luca
>> Subject: Re: [Gluster-users] Input/Output Error when deleting folder
>>
>> I see two problems here:
>>
>> 1. Something very strange has happened on gluster03-mi. It contains the
>> directory, but it's not the same one that is on the other bricks
>> (8 bricks have gfid a9d904af-0d9e-4018-acb2-881bd8b3c2e4, while that
>> node has gfid bda849fc-a556-469e-ad84-ed074f2c1bcd).
>>
>> Whatever happened here has affected both bricks of that node in the
>> same way.
>>
>> What do these commands return on gluster03-mi?
>>
>>     ls -l /brick1/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
>>     ls -l /brick1/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
>>
>>     ls -l /brick2/recorder/.glusterfs/a9/d9/a9d904af-0d9e-4018-acb2-881bd8b3c2e4
>>     ls -l /brick2/recorder/.glusterfs/bd/a8/bda849fc-a556-469e-ad84-ed074f2c1bcd
>>
>> 2. It seems that node gluster04-mi was stopped (or rebooted, or failed)
>> while an operation that modifies the directory contents was being
>> executed, so it has lost an update and it's out of sync (both bricks on
>> the same server have missed one update, so it seems clear that it's not
>> a brick problem but a server problem).
>>
>> The global result of all this is that you have 4 failed bricks on a
>> configuration that only supports 2 failed bricks.
>>
>> BTW, having two or more bricks on the same server is not recommended
>> because a single server failure causes multiple bricks to be lost. In
>> this case a directory can be recovered, but if this happens to a file,
>> it won't be 100% recoverable.
>>
>> Are there any files inside /Rec218?
>>
>> If you are going to delete the directory and all its contents, and the
>> brick contents on gluster03-mi are the same as on the other servers,
>> the following commands should be safe (otherwise let me know before
>> doing anything):
>>
>> Before starting you must be sure that nothing is creating or deleting
>> entries inside /Rec218. It would be even better if this could be done
>> with the volume stopped.
>>
>> On each brick (including gluster03-mi):
>>     setfattr -n trusted.ec.version -v 0x1 <brick path>/Rec218
>>
>> On bricks in gluster03-mi:
>>     setfattr -n trusted.gfid -v 0xa9d904af0d9e4018acb2881bd8b3c2e4 <brick path>/Rec218
>>     setfattr -n trusted.glusterfs.dht -v 0x000000010000000000000000ffffffff <brick path>/Rec218
>>
>> On client:
>>     check that the directory is accessible and its contents seem ok. If so:
>>     rm -rf <mount point>/Rec218
>>
>> If you have a way to reproduce this situation, let me know.
>>
>> Xavi
>>
>> On 01/07/2015 03:31 PM, RASTELLI Alessandro wrote:
>>> [root@gluster01-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick1/recorder/Rec218
>>> trusted.ec.version=0x000000000000693a
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@gluster01-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick2/recorder/Rec218
>>> trusted.ec.version=0x000000000000693a
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@gluster02-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick1/recorder/Rec218
>>> trusted.ec.version=0x000000000000693a
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@gluster02-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick2/recorder/Rec218
>>> trusted.ec.version=0x000000000000693a
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@gluster03-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick1/recorder/Rec218
>>> trusted.gfid=0xbda849fca556469ead84ed074f2c1bcd
>>>
>>> [root@gluster03-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick2/recorder/Rec218
>>> trusted.gfid=0xbda849fca556469ead84ed074f2c1bcd
>>>
>>> [root@gluster04-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick1/recorder/Rec218
>>> trusted.ec.version=0x0000000000006939
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@gluster04-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick2/recorder/Rec218
>>> trusted.ec.version=0x0000000000006939
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@gluster05-mi ~]# getfattr -m. -e hex -d /brick1/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick1/recorder/Rec218
>>> trusted.ec.version=0x000000000000693a
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff
>>>
>>> [root@gluster05-mi ~]# getfattr -m. -e hex -d /brick2/recorder/Rec218
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: brick2/recorder/Rec218
>>> trusted.ec.version=0x000000000000693a
>>> trusted.gfid=0xa9d904af0d9e4018acb2881bd8b3c2e4
>>> trusted.glusterfs.dht=0x000000010000000000000000ffffffff

_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users
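
The "4 failed bricks" diagnosis in this thread comes from comparing one extended attribute across all ten bricks and counting the ones that disagree with the majority. That comparison can be sketched in shell as below. `find_outliers` is a hypothetical helper written for this illustration (it is not part of Gluster), the "brick value" input format is an assumption, and "-" stands for "attribute not present". The values are the trusted.ec.version entries for /Rec218 copied from the getfattr output quoted in the thread.

```shell
#!/bin/sh
# Hypothetical helper: read "brick value" pairs on stdin and print every
# brick whose value differs from the most common value.
find_outliers() {
    awk '
        { brick[NR] = $1; value[NR] = $2; count[$2]++ }
        END {
            top = 0
            for (v in count)
                if (count[v] > top) { top = count[v]; majority = v }
            for (i = 1; i <= NR; i++)
                if (value[i] != majority) print brick[i], value[i]
        }'
}

# trusted.ec.version of /Rec218 per brick, as reported in the thread.
versions='gluster01-mi:/brick1 0x000000000000693a
gluster01-mi:/brick2 0x000000000000693a
gluster02-mi:/brick1 0x000000000000693a
gluster02-mi:/brick2 0x000000000000693a
gluster03-mi:/brick1 -
gluster03-mi:/brick2 -
gluster04-mi:/brick1 0x0000000000006939
gluster04-mi:/brick2 0x0000000000006939
gluster05-mi:/brick1 0x000000000000693a
gluster05-mi:/brick2 0x000000000000693a'

outliers=$(printf '%s\n' "$versions" | find_outliers)
printf '%s\n' "$outliers"
echo "bricks out of sync: $(printf '%s\n' "$outliers" | grep -c .)"
```

With this data the helper lists both bricks of gluster03-mi and both bricks of gluster04-mi, four in total, which is more than the two failed bricks the configuration supports. The same function can be fed trusted.gfid values to spot the gfid mismatch on gluster03-mi.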
