Hey there,

We have a number of processes which walk our entire directory tree daily and perform operations on the files they find.

Pre-gluster, these processes were able to complete within 24 hours of starting. After outgrowing that single server and moving to a gluster setup (two bricks, two servers, distribute, 10gig uplink), the processes became unusable.

After turning this option on, we were back to normal run times, with the process completing within 24 hours.
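For reference, the option is enabled with the standard volume-set command (the volume name `bigdata` below is a placeholder, not our actual volume):

```shell
# Enable server-side filtering of directory dentries during readdirp.
# "bigdata" is a placeholder volume name.
gluster volume set bigdata cluster.readdir-optimize on
```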

Our data is heavily nested in a large number of subfolders under /media/ftp.

A subset of our data:

15T of files in 48163 directories under /media/ftp/dig_dis.

Without readdir-optimize:

[root@colossus dig_dis]# time ls|wc -l
48163

real    13m1.582s
user    0m0.294s
sys     0m0.205s


With readdir-optimize:

[root@colossus dig_dis]# time ls | wc -l
48163

real    0m23.785s
user    0m0.296s
sys     0m0.108s


Long story short - this option is super important to me as it resolved an issue that would have otherwise made me move my data off of gluster.


Thank you for all of your work,

Kyle




On 11/07/2016 10:07 PM, Raghavendra Gowdappa wrote:
Hi all,

We have an option called "cluster.readdir-optimize" which alters the 
behavior of readdirp in DHT. This value affects how storage/posix treats dentries 
corresponding to directories (not those for files).

When this value is on,
* DHT asks only one subvol/brick to return dentries corresponding to 
directories.
* Other subvols/bricks filter dentries corresponding to directories and send 
only dentries corresponding to files.

When this value is off (the default),
* All subvols return all dentries stored on them. IOW, bricks don't filter any 
dentries.
* Since a directory has one dentry representing it on each subvol, DHT (loaded 
on the client) picks up the dentry only from the hashed subvol.

Note that, irrespective of the value of this option, _all_ subvols return dentries 
corresponding to the files stored on them.
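The two modes above can be sketched roughly as follows. This is a minimal toy model, not Gluster's actual code; the brick lists, the toy hash function, and the choice of brick 0 as the "one subvol" are all assumptions for illustration:

```python
# Toy model of DHT readdirp dentry handling (assumption: simplified,
# not the real Gluster implementation).

def hashed_subvol(name, nsubvols):
    # Stand-in for DHT's hashing; deterministic for the sketch.
    return sum(ord(c) for c in name) % nsubvols

def list_dir(bricks, readdir_optimize):
    """bricks: one list of (name, is_dir) dentries per brick/subvol."""
    listing = []
    for i, brick in enumerate(bricks):
        for name, is_dir in brick:
            if not is_dir:
                listing.append(name)          # files: every brick returns its own
            elif readdir_optimize:
                if i == 0:                    # only one subvol returns dir dentries
                    listing.append(name)
            elif hashed_subvol(name, len(bricks)) == i:
                listing.append(name)          # client keeps only hashed-subvol copy
    return sorted(listing)

# A directory normally has a dentry on every subvol; both modes list it once.
bricks = [[("docs", True), ("a.txt", False)],
          [("docs", True), ("b.txt", False)]]
print(list_dir(bricks, readdir_optimize=False))  # ['a.txt', 'b.txt', 'docs']
print(list_dir(bricks, readdir_optimize=True))   # ['a.txt', 'b.txt', 'docs']
```

If a directory's dentry is missing from the subvol chosen to report directories, the optimized mode silently drops it from the listing; that is the known issue discussed later in this mail.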

This option was introduced to boost readdir performance: when it is set on, 
filtering of dentries happens on the bricks, which reduces:
1. network traffic (the redundant dentry information is never sent)
2. the number of readdir calls between client and server for the same number of 
dentries returned to the application (if filtering happens on the client, each 
result carries fewer useful dentries and hence more readdir calls are needed. 
IOW, the result buffer is not filled to maximum capacity).

We want to hear from you whether you've used this option, and if yes:
1. Did it really boost readdir performance?
2. Do you have any performance data showing the percentage of 
improvement (or deterioration)?
3. What data set did you have (number of files, number of directories, and 
organisation of directories)?

If we find out that this option is really helping you, we can spend our 
energies on fixing the issues that arise when it is set to on. One 
common issue is that, with this option set, some 
directories might not show up in directory listings [1]. The reason for this is 
that:
1. If a directory can be created on its hashed subvol, mkdir (as reported to the 
application) will be successful, irrespective of the result of mkdir on the rest 
of the subvols.
2. So, the single subvol we pick to return directory dentries need 
not contain all the directories, and we might miss those directories in the 
listing.

Your feedback is important for us and will help us to prioritize and improve 
things.

[1] https://www.gluster.org/pipermail/gluster-users/2016-October/028703.html

regards,
Raghavendra
_______________________________________________
Gluster-users mailing list
gluster-us...@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-users

_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
