DHT developers,

In 3.6, we introduced a non-blocking lock that DHT takes before a rename operation, and we fail the rename if the lock cannot be acquired. Yesterday on IRC I ran into a user who is affected by this behavior change:

"We're seeing a behavior in Gluster 3.7.x that we did not see in 3.4.x and we're not sure how to fix it. When multiple processes are attempting to rename a file to the same destination at once, we're now seeing "Device or resource busy" and "Stale file handle" errors. Here's the command to replicate it: cd /mnt/glustermount; while true; do FILE=$RANDOM; touch $FILE; mv $FILE file-fv; done. The above command would be ran on two or three servers within the same gluster cluster. In the output, one would always be sucessfull in the rename, while the 2 other ones would fail with the above error."

The use case for concurrent renames was described as:

"we generate files and push them to the gluster cluster. Some are generated multiple times and end up being pushed to the cluster at the same time by different data generators; resulting in the 'rename collision'. We use also the cluster.extra-hash-regex to make sure the data is written in place. And this does the rename."

Is a non-blocking lock essential? Could we use a blocking lock instead, or fall back to a blocking lock if the initial non-blocking acquisition fails?
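
To make the fallback idea concrete, here is a minimal user-space sketch. It is not DHT code: it uses POSIX advisory locks through Python's fcntl.flock on a hypothetical lock file as a stand-in for the lock DHT takes, and the helper names lock_with_fallback and rename_under_lock are made up for illustration. It only shows the control flow: try the non-blocking lock first, and wait on a blocking lock instead of failing the rename when there is contention.

```python
import errno
import fcntl
import os


def lock_with_fallback(fd):
    """Take an exclusive lock on fd, preferring the non-blocking path.

    Returns "non-blocking" or "blocking" to show which attempt succeeded.
    """
    try:
        # Fast path: fails immediately if another holder has the lock.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return "non-blocking"
    except OSError as exc:
        # Contention is reported as EAGAIN (or EACCES on some systems);
        # anything else is a real error and is re-raised.
        if exc.errno not in (errno.EAGAIN, errno.EACCES):
            raise
    # Slow path: wait for the current holder instead of failing.
    fcntl.flock(fd, fcntl.LOCK_EX)
    return "blocking"


def rename_under_lock(lock_path, src, dst):
    """Rename src to dst while holding the (hypothetical) lock file."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        how = lock_with_fallback(fd)
        os.rename(src, dst)
        return how
    finally:
        # Unlocking an fd that never got the lock is harmless.
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

With this shape, concurrent callers renaming to the same destination would queue up behind each other rather than error out. Whether waiting is safe inside DHT depends on lock ordering and timeouts, which this sketch does not model.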

Thanks,
Vijay



_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
