pvfs2-remove-dirent-count.patch:
=======================
This patch is straightforward; it simply removes the dirent_count check from the sys-remove path. The main problem with this check is that it relies on attributes which may have come from the attribute cache and doesn't give the server a chance to overrule the decision. You can therefore fool it by doing something like this:

bash-2.05b# /sbin/sysctl -w pvfs2.acache.timeout-msecs=100000000
bash-2.05b# mkdir -p /mnt/pvfs2/a/b/c
bash-2.05b# rm -rf /mnt/pvfs2/a
rm: cannot remove directory `/mnt/pvfs2/a/b': Directory not empty

The cache timeout doesn't have to be that high to show the problem. We should be able to stay consistent on a single client regardless of the cache setting at any rate anyway.

I don't think that removing the check will really hurt much:
- the directory entry has already been removed at this point in the operation, so it doesn't do anything to help avoid cleanup scenarios - the server will perform this check on its own when it gets the remove request

So having the check might have made remove _failures_ faster (since a client potentially reach the decision that a directory wasn't empty earlier) but it doesn't change the success path, and doesn't alter semantics or race condition behavior.

pvfs2-rename-dirent-count.patch:
=======================
This fixes a related but different bug in rename, in the tricky case in which someone tries to rename a directory to another directory, but the destination directory is not empty:

bash-2.05b# /sbin/sysctl -w pvfs2.acache.timeout-msecs=10000000
bash-2.05b# mkdir -p /mnt/pvfs2/scratch/boom
bash-2.05b# mkdir -p /mnt/pvfs2/scratch/yeah/1
bash-2.05b# rename boom yeah /mnt/pvfs2/scratch/boom
rename: renaming /mnt/pvfs2/scratch/boom to /mnt/pvfs2/scratch/yeah failed: Directory not empty
bash-2.05b# ls /mnt/pvfs2/scratch/yeah

I hate that the rename system call even lets you rename to an existing file/directory name. Trying to modify multiple points in the name space somewhat atomically isn't fun. At any rate there were two problems:

- We are doing a dirent_count check on the client side, but not asking for the dirent_count in the getattr mask. It therefore usually showed up as zero. - The same caching problem as described in the remove patch theoretically applies here, but in this case we really need to be able to do this check on the client side. I say theoretical because I couldn't come up with a test case to triger it after making the above mask fix, but I am still nervous about it. This fix here was to add a flag to the getattr nested machine to force it to bypass the cache in cases (this one!) where we want the getattr information to be as up to date as possible. The flag could be set by any state machine that uses the nested getattr machine in case we ever find another use for it.

The fundamental problem with renaming directories to existing names is that up to 4 metadata servers may be involved when you consider the two directories and the two parent directories. There is no way to get decent consistency with the client driving the operation in this special case unless you retrieve the dirent_count and check it on the client.

-Phil
---------------------
PatchSet 378 
Date: 2006/01/16 17:34:18
Author: pcarns
Branch: HEAD
Tag: (none) 
Log:
bug fix: make sure that we request the dirent_count attribute when
performing a getattr on the destination directory during rename; it is needed
for a safety check in the following state.
[artf29248]

Members: 
	src/client/sysint/sys-rename.sm:1.9->1.10 

Index: src/client/sysint/sys-rename.sm
diff -u src/client/sysint/sys-rename.sm:1.9 src/client/sysint/sys-rename.sm:1.10
--- src/client/sysint/sys-rename.sm:1.9	Tue Dec  6 13:46:30 2005
+++ src/client/sysint/sys-rename.sm	Mon Jan 16 10:34:18 2006
@@ -773,6 +773,8 @@
                  lld(attr->u.dir.dirent_count));
         gossip_debug(GOSSIP_CLIENT_DEBUG,
                  "rename dest attr mask: %d\n", (int)attr->mask);
+        gossip_debug(GOSSIP_CLIENT_DEBUG,
+                 "rename dest handle %llu\n", llu(sm_p->object_ref.handle));
     }
 
     /* if the destination is a directory, is it empty? */
@@ -1006,9 +1008,13 @@
     sm_p->object_ref = sm_p->u.rename.refns[1];
 
     PINT_SM_GETATTR_STATE_CLEAR(sm_p->getattr);
+    /* NOTE: we ask for the dirent count on the destination.  This is
+     * important so that we can confirm that the destination is empty if it
+     * happens to be a directory rather than a file.
+     */
     PINT_SM_GETATTR_STATE_FILL(sm_p->getattr,
                                sm_p->object_ref,
-                               PVFS_ATTR_COMMON_ALL,
+                               (PVFS_ATTR_COMMON_ALL|PVFS_ATTR_DIR_DIRENT_COUNT),
                                PVFS_TYPE_NONE);
     js_p->error_code = 0;
     return(1);
---------------------
PatchSet 379 
Date: 2006/01/16 18:06:44
Author: pcarns
Branch: HEAD
Tag: (none) 
Log:
Added flag parameter to nested getattr state machine, only supported flag
right now is BYPASS_CACHE, which is useful in rename to make sure that we get
a relatively up to date dirent_count before modifying entries
[artf29248]

Members: 
	src/client/sysint/client-state-machine.h:1.17->1.18 
	src/client/sysint/mgmt-get-dfile-array.sm:1.5->1.6 
	src/client/sysint/remove.sm:1.6->1.7 
	src/client/sysint/sys-create.sm:1.14->1.15 
	src/client/sysint/sys-flush.sm:1.6->1.7 
	src/client/sysint/sys-getattr.sm:1.11->1.12 
	src/client/sysint/sys-io.sm:1.18->1.19 
	src/client/sysint/sys-lookup.sm:1.4->1.5 
	src/client/sysint/sys-mkdir.sm:1.10->1.11 
	src/client/sysint/sys-readdir.sm:1.6->1.7 
	src/client/sysint/sys-rename.sm:1.10->1.11 
	src/client/sysint/sys-symlink.sm:1.10->1.11 
	src/client/sysint/sys-truncate.sm:1.8->1.9 

Index: src/client/sysint/client-state-machine.h
diff -u src/client/sysint/client-state-machine.h:1.17 src/client/sysint/client-state-machine.h:1.18
--- src/client/sysint/client-state-machine.h:1.17	Mon Jan  9 10:12:32 2006
+++ src/client/sysint/client-state-machine.h	Mon Jan 16 11:06:44 2006
@@ -344,6 +344,9 @@
     int persist_config_buffers;
 };
 
+/* flag to disable cached lookup during getattr nested sm */
+#define PINT_SM_GETATTR_BYPASS_CACHE 1
+
 typedef struct PINT_sm_getattr_state
 {
     PVFS_object_ref object_ref;
@@ -363,16 +366,19 @@
 
     PVFS_size * size_array;
     PVFS_size size;
+
+    int flags;
     
 } PINT_sm_getattr_state;
 
-#define PINT_SM_GETATTR_STATE_FILL(_state, _objref, _mask, _reftype) \
+#define PINT_SM_GETATTR_STATE_FILL(_state, _objref, _mask, _reftype, _flags) \
     do { \
         memset(&(_state), 0, sizeof(PINT_sm_getattr_state)); \
         (_state).object_ref.fs_id = (_objref).fs_id; \
         (_state).object_ref.handle = (_objref).handle; \
         (_state).req_attrmask = _mask; \
         (_state).ref_type = _reftype; \
+        (_state).flags = _flags; \
     } while(0)
 
 #define PINT_SM_GETATTR_STATE_CLEAR(_state) \
Index: src/client/sysint/mgmt-get-dfile-array.sm
diff -u src/client/sysint/mgmt-get-dfile-array.sm:1.5 src/client/sysint/mgmt-get-dfile-array.sm:1.6
--- src/client/sysint/mgmt-get-dfile-array.sm:1.5	Tue Oct 25 13:52:45 2005
+++ src/client/sysint/mgmt-get-dfile-array.sm	Mon Jan 16 11:06:44 2006
@@ -91,7 +91,8 @@
         sm_p->getattr,
         ref,
         PVFS_ATTR_META_ALL|PVFS_ATTR_COMMON_TYPE,
-        PVFS_TYPE_METAFILE);
+        PVFS_TYPE_METAFILE, 
+        0);
 
     return PINT_client_state_machine_post(
         sm_p, PVFS_MGMT_GET_DFILE_ARRAY, op_id, user_ptr);
Index: src/client/sysint/remove.sm
diff -u src/client/sysint/remove.sm:1.6 src/client/sysint/remove.sm:1.7
--- src/client/sysint/remove.sm:1.6	Mon Jan 16 09:26:36 2006
+++ src/client/sysint/remove.sm	Mon Jan 16 11:06:44 2006
@@ -145,7 +145,8 @@
         sm_p->getattr,
         sm_p->object_ref,
         PVFS_ATTR_META_ALL|PVFS_ATTR_COMMON_TYPE,
-        PVFS_TYPE_NONE);
+        PVFS_TYPE_NONE,
+        0);
     return 1;
 }
 
Index: src/client/sysint/sys-create.sm
diff -u src/client/sysint/sys-create.sm:1.14 src/client/sysint/sys-create.sm:1.15
--- src/client/sysint/sys-create.sm:1.14	Tue Dec  6 13:46:30 2005
+++ src/client/sysint/sys-create.sm	Mon Jan 16 11:06:44 2006
@@ -421,7 +421,8 @@
         sm_p->getattr,
         sm_p->object_ref,
         PVFS_ATTR_COMMON_ALL, 
-        PVFS_TYPE_DIRECTORY);
+        PVFS_TYPE_DIRECTORY,
+        0);
 
    return 1;
 }
Index: src/client/sysint/sys-flush.sm
diff -u src/client/sysint/sys-flush.sm:1.6 src/client/sysint/sys-flush.sm:1.7
--- src/client/sysint/sys-flush.sm:1.6	Tue Dec  6 13:46:30 2005
+++ src/client/sysint/sys-flush.sm	Mon Jan 16 11:06:44 2006
@@ -119,7 +119,8 @@
         sm_p->getattr,
         ref,
         PVFS_ATTR_META_ALL|PVFS_ATTR_COMMON_TYPE, 
-        PVFS_TYPE_METAFILE);
+        PVFS_TYPE_METAFILE,
+        0);
 
     return PINT_client_state_machine_post(
         sm_p, PVFS_SYS_FLUSH, op_id, user_ptr);
Index: src/client/sysint/sys-getattr.sm
diff -u src/client/sysint/sys-getattr.sm:1.11 src/client/sysint/sys-getattr.sm:1.12
--- src/client/sysint/sys-getattr.sm:1.11	Wed Jan  4 08:50:58 2006
+++ src/client/sysint/sys-getattr.sm	Mon Jan 16 11:06:44 2006
@@ -238,7 +238,8 @@
         ref,
         PVFS_util_sys_to_object_attr_mask(
             attrmask), 
-        PVFS_TYPE_NONE);
+        PVFS_TYPE_NONE,
+        0);
 
     return PINT_client_state_machine_post(
         sm_p, PVFS_SYS_GETATTR, op_id, user_ptr);
@@ -342,6 +343,15 @@
 	sm_p->getattr.req_attrmask |= PVFS_ATTR_META_ALL;
     }
 
+    if(sm_p->getattr.flags & PINT_SM_GETATTR_BYPASS_CACHE)
+    {
+        gossip_debug(GOSSIP_ACACHE_DEBUG, "acache: forced acache miss: "
+                    " [%llu]\n",
+                      llu(object_ref.handle));
+        js_p->error_code = GETATTR_ACACHE_MISS;
+        return 1;
+    }
+
     ret = PINT_acache_get_cached_entry(object_ref,
         &sm_p->getattr.attr,
         &attr_status,
Index: src/client/sysint/sys-io.sm
diff -u src/client/sysint/sys-io.sm:1.18 src/client/sysint/sys-io.sm:1.19
--- src/client/sysint/sys-io.sm:1.18	Thu Jan 12 08:21:59 2006
+++ src/client/sysint/sys-io.sm	Mon Jan 16 11:06:44 2006
@@ -387,7 +387,8 @@
         sm_p->getattr,
         sm_p->object_ref,
         PVFS_ATTR_META_ALL|PVFS_ATTR_COMMON_TYPE, 
-        PVFS_TYPE_METAFILE);
+        PVFS_TYPE_METAFILE,
+        0);
        
     if (js_p->error_code == IO_RETRY)
     {
Index: src/client/sysint/sys-lookup.sm
diff -u src/client/sysint/sys-lookup.sm:1.4 src/client/sysint/sys-lookup.sm:1.5
--- src/client/sysint/sys-lookup.sm:1.4	Tue Dec  6 13:46:30 2005
+++ src/client/sysint/sys-lookup.sm	Mon Jan 16 11:06:44 2006
@@ -709,7 +709,8 @@
             sm_p->getattr,  
             cur_seg->seg_resolved_refn,
             (PVFS_ATTR_COMMON_ALL | PVFS_ATTR_SYMLNK_ALL),
-            PVFS_TYPE_NONE);
+            PVFS_TYPE_NONE,
+            0);
 
         js_p->error_code = 1;
     }
Index: src/client/sysint/sys-mkdir.sm
diff -u src/client/sysint/sys-mkdir.sm:1.10 src/client/sysint/sys-mkdir.sm:1.11
--- src/client/sysint/sys-mkdir.sm:1.10	Tue Dec  6 13:46:30 2005
+++ src/client/sysint/sys-mkdir.sm	Mon Jan 16 11:06:44 2006
@@ -278,7 +278,8 @@
         sm_p->getattr,
         sm_p->object_ref,
         PVFS_ATTR_COMMON_ALL, 
-        PVFS_TYPE_DIRECTORY);
+        PVFS_TYPE_DIRECTORY,
+        0);
 
     return 1;
 }
Index: src/client/sysint/sys-readdir.sm
diff -u src/client/sysint/sys-readdir.sm:1.6 src/client/sysint/sys-readdir.sm:1.7
--- src/client/sysint/sys-readdir.sm:1.6	Tue Dec  6 13:46:30 2005
+++ src/client/sysint/sys-readdir.sm	Mon Jan 16 11:06:44 2006
@@ -199,7 +199,8 @@
         sm_p->getattr,
         sm_p->object_ref,
         PVFS_ATTR_DIR_ALL,
-        PVFS_TYPE_DIRECTORY);
+        PVFS_TYPE_DIRECTORY,
+        0);
     
     assert(js_p->error_code == 0);
 
Index: src/client/sysint/sys-rename.sm
diff -u src/client/sysint/sys-rename.sm:1.10 src/client/sysint/sys-rename.sm:1.11
--- src/client/sysint/sys-rename.sm:1.10	Mon Jan 16 10:34:18 2006
+++ src/client/sysint/sys-rename.sm	Mon Jan 16 11:06:44 2006
@@ -468,7 +468,8 @@
                 sm_p->getattr,
                 sm_p->object_ref,
                 PVFS_ATTR_COMMON_ALL,
-                PVFS_TYPE_NONE);
+                PVFS_TYPE_NONE,
+                0);
 
             /* if both lookups succeeded, then we need chdirent */
             return RENAME_CHDIRENT;
@@ -1015,7 +1016,8 @@
     PINT_SM_GETATTR_STATE_FILL(sm_p->getattr,
                                sm_p->object_ref,
                                (PVFS_ATTR_COMMON_ALL|PVFS_ATTR_DIR_DIRENT_COUNT),
-                               PVFS_TYPE_NONE);
+                               PVFS_TYPE_NONE,
+                               PINT_SM_GETATTR_BYPASS_CACHE);
     js_p->error_code = 0;
     return(1);
 }
Index: src/client/sysint/sys-symlink.sm
diff -u src/client/sysint/sys-symlink.sm:1.10 src/client/sysint/sys-symlink.sm:1.11
--- src/client/sysint/sys-symlink.sm:1.10	Tue Dec  6 13:46:30 2005
+++ src/client/sysint/sys-symlink.sm	Mon Jan 16 11:06:44 2006
@@ -301,7 +301,8 @@
         sm_p->getattr,
         sm_p->object_ref,
         PVFS_ATTR_COMMON_ALL,
-        PVFS_TYPE_DIRECTORY);
+        PVFS_TYPE_DIRECTORY,
+        0);
         
     return ret;
 }
Index: src/client/sysint/sys-truncate.sm
diff -u src/client/sysint/sys-truncate.sm:1.8 src/client/sysint/sys-truncate.sm:1.9
--- src/client/sysint/sys-truncate.sm:1.8	Wed Jan  4 08:50:58 2006
+++ src/client/sysint/sys-truncate.sm	Mon Jan 16 11:06:44 2006
@@ -124,7 +124,8 @@
         sm_p->getattr,
         sm_p->object_ref,
         PVFS_ATTR_META_ALL|PVFS_ATTR_COMMON_TYPE,
-        PVFS_TYPE_METAFILE);
+        PVFS_TYPE_METAFILE,
+        0);
 
     return PINT_client_state_machine_post(
         sm_p, PVFS_SYS_TRUNCATE, op_id, user_ptr);
---------------------
PatchSet 377 
Date: 2006/01/16 16:26:36
Author: pcarns
Branch: HEAD
Tag: (none) 
Log:
bug fix: removed client side dirent_count check because the client's
attribute information may be too out of date to rely on
[artf29238]

Members: 
	src/client/sysint/remove.sm:1.5->1.6 

Index: src/client/sysint/remove.sm
diff -u src/client/sysint/remove.sm:1.5 src/client/sysint/remove.sm:1.6
--- src/client/sysint/remove.sm:1.5	Wed Jan  4 08:50:58 2006
+++ src/client/sysint/remove.sm	Mon Jan 16 09:26:36 2006
@@ -170,11 +170,17 @@
 	    js_p->error_code = REMOVE_MUST_REMOVE_DATAFILES;
             break;
 	case PVFS_TYPE_DIRECTORY:
+#if 0
+/* NOTE: this check is not safe because it relies on cached attributes on the
+ * parent, which may be out of date.  The server will perform this check
+ * locally when we attempt to remove the directory itself.
+ */
             if(attr->u.dir.dirent_count != 0)
             {
                 js_p->error_code = -PVFS_ENOTEMPTY;
                 break;
             }
+#endif
 	case PVFS_TYPE_SYMLINK:
             js_p->error_code = 0;
             break;
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

Reply via email to