On 12/09/2015 08:08 PM, Shyam wrote:
On 12/09/2015 12:52 AM, Pranith Kumar Karampuri wrote:


On 12/09/2015 10:39 AM, Prashanth Pai wrote:
However, I’d be even more comfortable with an even simpler approach that
avoids the need to solve what the database folks (who have dealt with
complex transactions for years) would tell us is a really hard problem.
Instead of designing for every case we can imagine, let’s design for the
cases that we know would be useful for improving performance. Open plus
read/write plus close is an obvious one. Raghavendra mentions
create+inodelk as well.
From an object interface (Swift/S3) perspective, this is the fop order
and flow for object operations:

GET: open(), fstat(), fgetxattr()s, read()s, close()
Krutika implemented fstat+fgetxattr (http://review.gluster.org/10180). In
posix there is an implementation of GF_CONTENT_KEY, which quick-read uses
to read a file in lookup. This needs to be exposed for fds as well,
I think. So you can do all this using fstat on an anon-fd.
HEAD: stat(), getxattr()s
Krutika already implemented this for sharding
(http://review.gluster.org/10158). You can do this using the stat fop.

I believe we need to fork this part of the conversation, i.e., the stat + xattr information clubbing.

My view is that a stat in gluster should return the POSIX stat plus gluster's extended information. A file system that stats its inode should get all the information about that inode, not just the POSIX fields. In other local file systems, the inode structure has more fields than POSIX needs, so when the inode is *read* the FS can populate all of its internal inode information and return to the application/syscall only the fields it needs.

I believe gluster should do the same. In the cases above, we should extend our stat information (not elaborating how) to include all information from the brick, i.e., the stat from POSIX and all the extended attrs for the inode (file or dir). This can then be consumed by any layer as needed.
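
To make the idea concrete, here is a minimal sketch of a stat that returns the POSIX fields together with the inode's extended attributes. It runs against a local file, not a brick. The names ExtendedStat and extended_fstat are hypothetical and are not gluster APIs, and how gluster would actually carry this information is left open, as noted above.

```python
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class ExtendedStat:
    """Hypothetical container: POSIX stat fields plus the inode's xattrs."""
    stat: os.stat_result
    xattrs: dict = field(default_factory=dict)


def extended_fstat(fd):
    """Return the POSIX stat and every readable xattr for an open fd."""
    st = os.fstat(fd)
    xattrs = {}
    try:
        # os.listxattr/os.getxattr accept an open fd on Linux.
        for name in os.listxattr(fd):
            xattrs[name] = os.getxattr(fd, name)
    except (OSError, AttributeError):
        # xattrs are unavailable on this platform or file system:
        # fall back to whatever was collected so far.
        pass
    return ExtendedStat(stat=st, xattrs=xattrs)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        info = extended_fstat(fd)
        print(info.stat.st_size, sorted(info.xattrs))
    finally:
        os.close(fd)
        os.remove(path)
```

A caller gets everything in one call and picks the fields it needs, which is the behaviour argued for above.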

Currently, each layer asks for what it needs beyond the stat information as an xattr request in the xdata. This can continue, or it can go away if the relevant FOPs return the whole inode information upward.

This also has useful outcomes in readdirp calls, where we get the extended stat information for each entry.
You can use "list-xattr" in xdata request to get this.

The patches referred to, and older patches (around 2013), suggest this was the direction sought. Is there any reason why it is not prevalent across the stack? Or am I mistaken?
No reason. We can revive it. There didn't seem to be any interest, so I didn't follow up to get it in.

Pranith

PUT: creat(), write()s, setxattr(), fsync(), close(), rename()
This I think should be a new compound fop. Nothing similar exists.
DELETE: getxattr(), unlink()
This can already be clubbed into unlink, because xdata exists on the
wire.

Compounding some of these ops and exposing them as consumable libgfapi
APIs like glfs_get() and glfs_put() similar to librados compound
APIs[1] would greatly improve performance for object based access.
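
As a rough illustration of where the saving would come from, the sketch below counts simulated round trips for a GET done fop by fop versus one done as a single compound call. It is a toy model on local files: CountingClient, get_sequential and get_compound are hypothetical names, each call() stands in for one network round trip, and the fgetxattr() step is left out to keep the example portable. It is not the proposed glfs_get() itself.

```python
import os
import tempfile


class CountingClient:
    """Toy stand-in for a client; each call() counts as one round trip."""

    def __init__(self):
        self.round_trips = 0

    def call(self, *ops):
        """Run one or more operations in a single simulated round trip."""
        self.round_trips += 1
        return [op() for op in ops]


def get_sequential(client, path):
    """GET as separate fops: open, fstat, read, close (one trip each)."""
    (fd,) = client.call(lambda: os.open(path, os.O_RDONLY))
    (st,) = client.call(lambda: os.fstat(fd))
    # A single read is enough for the small local file used here.
    (data,) = client.call(lambda: os.read(fd, st.st_size))
    client.call(lambda: os.close(fd))
    return st, data


def get_compound(client, path):
    """GET as one compound operation executed in a single round trip."""
    def whole_get():
        fd = os.open(path, os.O_RDONLY)
        try:
            st = os.fstat(fd)
            return st, os.read(fd, st.st_size)
        finally:
            os.close(fd)

    (result,) = client.call(whole_get)
    return result


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"object-data")
    os.close(fd)
    seq, cmp_ = CountingClient(), CountingClient()
    get_sequential(seq, path)
    get_compound(cmp_, path)
    os.remove(path)
    print(seq.round_trips, cmp_.round_trips)
```

Both paths return the same stat and data; only the number of trips differs, which is the cost a compound fop is meant to remove.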

[1]:
https://github.com/ceph/ceph/blob/master/src/include/rados/librados.h#L2219


Thanks.

- Prashanth Pai

_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
