Re: [Gluster-devel] regarding GF_CONTENT_KEY and dht2 - perf with small files

Pranith Kumar Karampuri Wed, 03 Feb 2016 02:12:51 -0800

The file data would be located based on its GFID, so before the *first*
lookup/stat for a file, there is no way to know its GFID.
NOTE: Instead of a name hash the GFID hash is used, to get immunity
against renames and the like, as a name hash could change the location
information for the file (among other reasons).
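
To illustrate the note above, here is a minimal sketch of why hashing the GFID rather than the name keeps placement stable across renames. It is not DHT2 code: the subvolume count, the use of SHA-1, and the helper names (`bucket`, `locate_by_name`, `locate_by_gfid`) are all assumptions made for the example.

```python
import hashlib
import uuid

NUM_SUBVOLUMES = 4  # hypothetical number of data servers


def bucket(key: bytes) -> int:
    """Map a key to a subvolume index using a stable hash (SHA-1, for illustration only)."""
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SUBVOLUMES


def locate_by_name(name: str) -> int:
    # Placement depends on the name, so a rename may move the file.
    return bucket(name.encode("utf-8"))


def locate_by_gfid(gfid: uuid.UUID) -> int:
    # Placement depends only on the GFID, which a rename does not change.
    return bucket(gfid.bytes)


gfid = uuid.uuid4()            # assigned once, when the file is created
before = locate_by_gfid(gfid)  # location while the file is called "a.txt"
after = locate_by_gfid(gfid)   # location after renaming it to "b.txt"
```

With the name hash, `locate_by_name("a.txt")` and `locate_by_name("b.txt")` are free to differ, which is the relocation the GFID hash avoids. The cost is the one discussed in this thread: the GFID has to be known before the data can be located.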
Another manner of achieving the same when the GFID of the file is known
(from a readdir) is to wind the lookup and read of size to the
respective MDS and DS, where the lookup would be responded to once the
MDS responds, and the DS response is cached for the subsequent
open+read case. So on the wire we would have a fan out of 2 FOPs, but
still satisfy the quick read requirements.
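
The fan-out described above can be sketched roughly as follows. This is only an illustration of the idea, not Gluster code: `mds_lookup`, `ds_read`, and the `Client` class are hypothetical stand-ins for the real FOPs and xlator state, and the servers are faked with in-process functions.

```python
from concurrent.futures import ThreadPoolExecutor


def mds_lookup(gfid):
    """Hypothetical metadata-server lookup; returns fake stat data."""
    return {"gfid": gfid, "size": 11}


def ds_read(gfid, size):
    """Hypothetical data-server read; returns up to `size` bytes of fake content."""
    return b"hello world"[:size]


class Client:
    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._pending_reads = {}  # gfid -> (requested size, future for the DS read)

    def lookup(self, gfid, read_size):
        # Fan out two FOPs: lookup to the MDS, read to the DS.
        meta_future = self._pool.submit(mds_lookup, gfid)
        read_future = self._pool.submit(ds_read, gfid, read_size)
        self._pending_reads[gfid] = (read_size, read_future)
        # Answer the lookup as soon as the MDS responds; the DS reply stays cached.
        return meta_future.result()

    def open_and_read(self, gfid, read_size):
        cached = self._pending_reads.pop(gfid, None)
        if cached is not None and read_size <= cached[0]:
            # Served from the read that was wound together with the lookup.
            return cached[1].result()[:read_size]
        # Nothing usable cached: fall back to a normal read.
        return ds_read(gfid, read_size)


client = Client()
meta = client.lookup("gfid-1", 64 * 1024)
data = client.open_and_read("gfid-1", 5)
```

Note that this only works because the GFID is already known when the lookup is issued; without it the client cannot pick the DS to send the read to.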

A tar-like workload doesn't have a problem, because we know the GFID
after readdirp.

I would assume the above resolves the problem posted. Are there cases
where we do not know the GFID of the file, i.e. no readdir is performed
and the client knows the file name that it wants to operate on? Do we
have traces of the webserver workload to see if it generates names on
the fly or does a readdir prior to that?

The problem is with workloads that know which files need to be read
without a readdir, such as hyperlinks (webserver), Swift objects, etc.
These are the two I know of that will have this problem, which can't be
improved because we don't have metadata and data co-located. I have been
trying to think of a solution for the past few days. Nothing good is
coming up :-/


Pranith
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
