Re: [Gluster-devel] Rolling upgrades from glusterfs 3.4 to 3.5

2014-06-18 Thread Ravishankar N

On 06/12/2014 11:13 PM, Anand Avati wrote:
On Thu, Jun 12, 2014 at 10:33 AM, Vijay Bellur vbel...@redhat.com wrote:


On 06/12/2014 06:52 PM, Ravishankar N wrote:

Hi Vijay,

Since glusterfs 3.5, posix_lookup() sends ESTALE instead of ENOENT [1]
when a parent gfid (entry) is not present on the brick. In a
replicate setup, this causes a problem because AFR gives more
priority to ESTALE than ENOENT, causing I/O to fail [2]. The fix,
which is client-side only, is in progress at [3] and would make it
into 3.5.2.
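
To illustrate that priority point, here is a purely illustrative sketch
(not AFR's actual reply-merging code) of how an ESTALE reply from the one
upgraded (3.5) brick wins over an ENOENT reply from the other:

/* Illustration only: merge the lookup errnos from the two replicas,
 * treating ESTALE as higher priority than ENOENT, as described above. */
#include <errno.h>

static int
merged_lookup_errno (int errno_brick0, int errno_brick1)
{
        if (errno_brick0 == ESTALE || errno_brick1 == ESTALE)
                return ESTALE;  /* one 3.5 brick is enough to surface ESTALE */
        return errno_brick0 ? errno_brick0 : errno_brick1;
}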

But we will still hit the problem when a rolling upgrade is
performed from 3.4 to 3.5, unless the clients are also upgraded to
3.5. To elaborate with an example:

0) Create a 1x2 volume using 2 nodes and mount it from a client. All
machines are running glusterfs 3.4.
1) Perform: for i in {1..30}; do mkdir $i; tar xf glusterfs-3.5git.tar.gz -C $i; done
2) While this is going on, kill one of the nodes in the replica pair and
upgrade it to glusterfs 3.5 (simulating a rolling upgrade).
3) After a while, kill all tar processes.
4) Create a 'backup' directory and move all of the 1..30 dirs inside it.
5) Start the untar processes in 1) again.
6) Bring up the upgraded node. tar fails with ESTALE errors.

Essentially the errors occur because [3] is a client-side fix, but
rolling upgrades target the servers while the older clients still
need to access them without issues.

A solution is to have a fix in the posix translator wherein the
newer client passes its version (3.5) to posix_lookup(), which then
sends ESTALE if the version is 3.5 or newer, but sends ENOENT
instead if it is an older client. Does this seem okay?
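
A rough sketch of that proposal, purely for illustration (this is not the
actual patch, and the xdata key name is invented):

/* Sketch: in posix_lookup(), pick the errno for a missing parent gfid
 * based on a hypothetical capability flag in the request's xdata dict.
 * "client-supports-estale" is an invented key, not a real GlusterFS one.
 * Assumes the usual glusterfs xlator headers (dict.h etc.). */
#include <errno.h>

static int32_t
missing_parent_errno (dict_t *xdata)
{
        int32_t supports_estale = 0;

        if (xdata &&
            dict_get_int32 (xdata, "client-supports-estale",
                            &supports_estale) == 0 &&
            supports_estale)
                return ESTALE;  /* 3.5+ clients handle ESTALE */

        return ENOENT;          /* keep 3.4 clients working mid-upgrade */
}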


Cannot think of a better solution to this. Seamless rolling
upgrades are necessary for us and the proposed fix does seem okay
for that reason.

Thanks,
Vijay


I also like Justin's proposal of having fixes in 3.4.X and requiring 
clients to be at least 3.4.X in order to do a rolling upgrade to 
3.5.Y. This way we can add the special fix in the 3.4.X client (just 
like the 3.5.2 client). Ravi's proposal works, but all LOOKUPs will 
carry an extra xattr, and we will be carrying the compat code 
burden for a very long time, whereas a 3.4.X client fix will remain in 
the 3.4 branch.


Thanks



I have sent a fix for review (http://review.gluster.org/#/c/8080/). The 
change is on the server side only. I reckon that if we are asking users to 
upgrade clients to a 3.4.x release, which anyway involves app downtime, we 
might as well ask them to upgrade to 3.5.


The fix has only been sent on 3.5; it does not need to go to master, as I 
understand from Pranith that we only support compatibility between the 
current two releases (meaning 3.6 servers require clients to be at 
least 3.5 and not lower).


Regards,
Ravi

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories

2014-06-18 Thread Anders Blomdell
On 2014-06-17 18:47, Anders Blomdell wrote:
 On 2014-06-17 17:49, Shyamsundar Ranganathan wrote:
 You may be looking at the problem being fixed here, [1].

 On a lookup, an attribute mismatch was not being healed across
 directories, and this patch attempts to address that. The current
 version of the patch does not heal the S_ISUID and S_ISGID bits;
 that is work in progress (but easy enough to incorporate and test
 based on the patch at [1]).
 Thanks, will look into it tomorrow.
 
 On a separate note, add-brick just adds a brick to the cluster; the
 lookup is where the heal (or creation of the directory across all
 subvolumes in the DHT xlator) is being done.
 Thanks for the clarification (I guess that a rebalance would trigger it as
 well?)
The attached, slightly modified version of patch [1] seems to work correctly
after a rebalance that is allowed to run to completion on its own. If
directories are traversed during the rebalance, some 0 dirs show spurious
01777, 0 and sometimes end up with the wrong permissions.

Continuing debug tomorrow...
 

 Shyam

 [1] http://review.gluster.org/#/c/6983/

 - Original Message -
 From: Anders Blomdell anders.blomd...@control.lth.se
 To: Gluster Devel gluster-devel@gluster.org
 Sent: Tuesday, June 17, 2014 10:53:52 AM
 Subject: [Gluster-devel] 3.5.1-beta2 Problems with suid and sgid bits on directories

 With glusterfs-3.5.1-0.3.beta2.fc20.x86_64, with commit
 3dc56cbd16b1074d7ca1a4fe4c5bf44400eb63ff reverted (due to a local lack of
 IPv4 addresses), I get weird behavior if I:

 1. Create a directory with suid/sgid/sticky bit set (/mnt/gluster/test)
 2. Make a subdirectory of #1 (/mnt/gluster/test/dir1)
 3. Do an add-brick

 Before add-brick

 755  /mnt/gluster
 7775 /mnt/gluster/test
 2755 /mnt/gluster/test/dir1

 After add-brick

 755  /mnt/gluster
 1775 /mnt/gluster/test
 755  /mnt/gluster/test/dir1

 On the server it looks like this:

 7775 /data/disk1/gluster/test
 2755 /data/disk1/gluster/test/dir1
 1775 /data/disk2/gluster/test
 755  /data/disk2/gluster/test/dir1

 Filed as bug:

 https://bugzilla.redhat.com/show_bug.cgi?id=1110262

 If somebody can point me to where the logic of add-brick is placed, I
 can give it a shot (a find/grep on mkdir didn't immediately point me
 to the right place).


 /Anders
 
/Anders

-- 
Anders Blomdell  Email: anders.blomd...@control.lth.se
Department of Automatic Control
Lund University  Phone:+46 46 222 4625
P.O. Box 118 Fax:  +46 46 138118
SE-221 00 Lund, Sweden

diff -urb glusterfs-3.5.1beta2/xlators/cluster/dht/src/dht-common.c glusterfs-3.5.1.orig/xlators/cluster/dht/src/dht-common.c
--- glusterfs-3.5.1beta2/xlators/cluster/dht/src/dht-common.c  2014-06-10 18:55:22.0 +0200
+++ glusterfs-3.5.1.orig/xlators/cluster/dht/src/dht-common.c  2014-06-17 22:46:28.710636632 +0200
@@ -523,6 +523,28 @@
 }
 
 int
+permission_changed (ia_prot_t *local, ia_prot_t *stbuf)
+{
+        if ((local->owner.read != stbuf->owner.read) ||
+            (local->owner.write != stbuf->owner.write) ||
+            (local->owner.exec != stbuf->owner.exec) ||
+            (local->group.read != stbuf->group.read) ||
+            (local->group.write != stbuf->group.write) ||
+            (local->group.exec != stbuf->group.exec) ||
+            (local->other.read != stbuf->other.read) ||
+            (local->other.write != stbuf->other.write) ||
+            (local->other.exec != stbuf->other.exec) ||
+            (local->suid != stbuf->suid) ||
+            (local->sgid != stbuf->sgid) ||
+            (local->sticky != stbuf->sticky))
+        {
+                return 1;
+        } else {
+                return 0;
+        }
+}
+
+int
 dht_revalidate_cbk (call_frame_t *frame, void *cookie, xlator_t *this,
                     int op_ret, int op_errno,
                     inode_t *inode, struct iatt *stbuf, dict_t *xattr,
@@ -617,12 +639,16 @@
                                      stbuf->ia_ctime_nsec)) {
                                 local->prebuf.ia_gid = stbuf->ia_gid;
                                 local->prebuf.ia_uid = stbuf->ia_uid;
+                                local->prebuf.ia_prot = stbuf->ia_prot;
                         }
                 }
                 if (local->stbuf.ia_type != IA_INVAL)
                 {
                         if ((local->stbuf.ia_gid != stbuf->ia_gid) ||
-                            (local->stbuf.ia_uid != stbuf->ia_uid)) {
+                            (local->stbuf.ia_uid != stbuf->ia_uid) ||
+                            (permission_changed (&(local->stbuf.ia_prot),
+                                                 &(stbuf->ia_prot))))
+                        {
                                 local->need_selfheal = 1;
                         }
                 }
@@ -669,6 +695,8 @@
                 uuid_copy (local->gfid, local->stbuf.ia_gfid);
                 local->stbuf.ia_gid = 

[Gluster-devel] GlusterFS 3.6 Feature Freeze date pushed back 2 weeks

2014-06-18 Thread Justin Clift
Hi all,

Just a small heads up.  We're pushing back the GlusterFS Feature Freeze
date by two weeks.

This lets us focus on fixing bugs in 3.5 that have been reported recently,
so people don't have to burn themselves out developing 3.6 features in a
massive rush at the same time. ;)

Regards and best wishes,

Justin Clift

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift

___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] tests and umount

2014-06-18 Thread Pranith Kumar Karampuri


On 06/16/2014 09:08 PM, Pranith Kumar Karampuri wrote:


On 06/16/2014 09:00 PM, Jeff Darcy wrote:

   I see that most of the tests do umount, and these may sometimes fail
because of EBUSY etc. I am wondering if we should change all
of them to umount -l.
Let me know if you foresee any problems.

I think I'd try umount -f first.  Using -l too much can cause an
accumulation of zombie mounts.  When I'm hacking around on my own, I
sometimes have to do umount -f twice but that's always sufficient.
Cool, I will do some kind of EXPECT_WITHIN with umount -f, maybe 5 
times, just to be on the safe side.
I submitted http://review.gluster.com/8104 for one of the tests as it is 
failing frequently. Will do the next round later.
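
For what it's worth, a purely illustrative C sketch of the idea (the actual
tests are shell scripts; this only shows what "umount -f" (MNT_FORCE) versus
"umount -l" (MNT_DETACH) correspond to, plus the retry-a-few-times approach;
the function name is invented):

/* Illustration only: retry a forced unmount (MNT_FORCE, i.e. "umount -f")
 * a few times before giving up, instead of falling back to a lazy/detached
 * unmount (MNT_DETACH, i.e. "umount -l"), which can leave zombie mounts. */
#include <errno.h>
#include <sys/mount.h>
#include <unistd.h>

static int
force_umount_with_retries (const char *mountpoint, int retries)
{
        int i;

        for (i = 0; i < retries; i++) {
                if (umount2 (mountpoint, MNT_FORCE) == 0)
                        return 0;       /* unmounted */
                if (errno != EBUSY)
                        break;          /* some other error; don't retry */
                sleep (1);              /* busy: wait and try again */
        }
        return -1;
}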


Pranith


If no one has any objections, I will send out a patch for this tomorrow.

Pranith
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://supercolony.gluster.org/mailman/listinfo/gluster-devel

