Hi all,

This mail initiates a discussion on how to provide QoS in Glusterfs. The 
thoughts I've put in this mail are in the context of the Glusterfs architecture 
up to (and including) the 3.x series. Discussions and suggestions for 4.0 are 
welcome. Please note that this is very much a work in progress, as QoS for 
storage is complex, relatively new (at least in the non-proprietary world) and 
the requirements can vary. One of the objectives of this exercise is to collect 
requirements and determine the scope of QoS.

Some of my own reading [1][2] and discussions with others led me to three QoS 
guarantees:

Note: From what I know, QoS guarantees are usually expressed in terms of 
throughput and measured in IOPS. Pointers to any QoS implementations/solutions 
targeting latency are welcome.

1. Reservations: This is guaranteed performance. However, given our 
architecture, total reservations cannot exceed the capacity of the weakest 
brick in the cluster. This is because that brick can become a hot-spot for I/O, 
and in the worst-case scenario all I/O might be directed to that node. Is this 
acceptable? Note that cluster/distribute and cluster/shard can mitigate 
hot-spots for directories and files respectively, to a certain extent. Even 
with distribute and sharding, the basic exercise of coming up with a capacity 
figure to be used for admission control of clients is still unsolved (in other 
words, what is the total reservation we can provide even with sharding and 
distribute?). Any pointers or suggestions are much appreciated.
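To make the admission-control problem concrete, here is a minimal sketch (not 
Glusterfs code; brick names and capacities are made up) of the worst-case 
policy described above, where the admission budget is the capacity of the 
weakest brick:

```python
# Illustrative sketch: admission control for reservations under the
# worst-case assumption that all I/O can land on the weakest brick.
# Brick names and capacities (in IOPS) are hypothetical.

brick_capacity_iops = {"brick1": 5000, "brick2": 5000, "brick3": 2000}

# Worst case: every reserved IOP may have to be served by the weakest brick.
admission_budget = min(brick_capacity_iops.values())

reservations = {}  # client -> reserved IOPS


def admit(client, reserved_iops):
    """Admit a client only if total reservations stay within the budget."""
    if sum(reservations.values()) + reserved_iops > admission_budget:
        return False  # would overcommit the weakest brick
    reservations[client] = reserved_iops
    return True


print(admit("vm1", 1500))  # True
print(admit("vm2", 1000))  # False: 1500 + 1000 > 2000
```

With distribute/sharding spreading I/O more evenly, the budget could arguably 
be raised above min(), but by how much is exactly the unsolved question above.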

2. Limits: This is the maximum IOPS a client can attain. Note that the total of 
all limits should not exceed the total throughput of the volume. By setting 
limits we can control "noisy neighbors". There have been attempts to solve the 
noisy-neighbor problem using throttling [3].
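For reference, the throttling approach in [3] is token-bucket based; a minimal 
per-client limiter along those lines might look like the following (an 
illustrative sketch, not the actual implementation; all names are made up):

```python
# Illustrative token-bucket limiter enforcing a per-client IOPS limit.
# This is a sketch, not Glusterfs code.
import time


class TokenBucket:
    def __init__(self, rate_iops, burst):
        self.rate = rate_iops          # tokens added per second
        self.capacity = burst          # maximum burst size
        self.tokens = burst            # bucket starts full
        self.last = time.monotonic()

    def allow(self, cost=1):
        """Return True if a fop of the given cost may proceed now."""
        now = time.monotonic()
        # Refill tokens proportional to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller would queue or reject the fop


bucket = TokenBucket(rate_iops=100, burst=10)
print(bucket.allow())  # True: bucket starts full
```

Note the parameter rate_iops: choosing it requires knowing the system's 
capacity up front, which is the token-introduction problem Jeff raised (see the 
dmclock discussion below).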

3. Proportional shares: Once clients have met their reservations but not yet 
exceeded their limits, the remaining capacity is shared among them in 
proportion to their weights.

dmClock [2] seems to fit our requirements. Some of the positives I found are:

1. QoS can be implemented in a distributed system with no communication 
required between the bricks/servers themselves.
2. It claims to adapt to varying capacity. One of the problems Jeff pointed out 
with token-bucket algorithms was determining the number of tokens to be 
introduced into the system; this algorithm has no such requirement. Note that 
unlike CPU, the throughput of storage (at least for magnetic disks) is 
stateful: the cost of an I/O depends on what was served before it.
3. It is intuitive and simple.
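To give a feel for the algorithm, here is a much-simplified, single-server 
sketch of mClock-style tag scheduling (the basis of dmClock). It only shows the 
reservation and weight phases; real dmClock also maintains limit tags and the 
per-request rho/delta counts reported from other servers. All names here are 
illustrative, not from the actual library:

```python
# Simplified, single-server sketch of mClock-style tag scheduling.
# Not the real dmclock implementation; limit tags and the distributed
# rho/delta accounting are deliberately omitted.

class Client:
    def __init__(self, name, reservation, weight):
        self.name = name
        self.reservation = reservation  # guaranteed IOPS
        self.weight = weight            # share of spare capacity
        self.r_tag = 0.0                # next reservation deadline
        self.p_tag = 0.0                # proportional-share tag


def schedule(clients, now):
    """Pick the next client to serve at (virtual) time `now`."""
    # Constraint phase: serve any client that is behind its reservation.
    eligible = [c for c in clients if c.r_tag <= now]
    if eligible:
        chosen = min(eligible, key=lambda c: c.r_tag)
        # Space reservation deadlines 1/reservation apart.
        chosen.r_tag = max(chosen.r_tag, now) + 1.0 / chosen.reservation
    else:
        # Weight phase: share spare capacity in proportion to weights.
        chosen = min(clients, key=lambda c: c.p_tag)
        chosen.p_tag = max(chosen.p_tag, now) + 1.0 / chosen.weight
    return chosen
```

The key property is that the spacing of the tags, not any global token count, 
encodes the guarantees, which is why no capacity estimate is needed up front.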

Some of the questions/open-topics (from my understanding):

1. How do we allocate costs to different fops? What standard can we use to 
compare fops like lookup, readdirp, read, write etc.? This gets more 
complicated because of fop overloading. For example, a lookup can fetch the 
entire content of a file if it is "small enough". How can we convert this 
information into a single number representing reservations/limits? In other 
words, what does the term IOPS represent in a distributed file-system like 
Glusterfs?
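One possible shape for an answer is a cost table that normalizes fops into 
abstract "cost units". The sketch below is purely illustrative: the weights are 
invented for the example, and deriving realistic ones is exactly the open 
question above.

```python
# Illustrative only: hypothetical weights normalizing fops to cost
# units, with a 4KB read as the unit. Real weights would have to be
# measured, which is the open question.

FOP_COST = {
    "lookup": 1,
    "readdirp": 4,    # touches many dentries per call
    "read": 1,        # per 4KB
    "write": 2,       # per 4KB, assuming some write amplification
}


def fop_cost(fop, size_bytes=0):
    """Cost to charge a fop against a client's reservation/limit budget."""
    units = max(1, size_bytes // 4096)
    # An overloaded lookup that also returns file content costs extra,
    # reflecting the "small enough" case mentioned above.
    if fop == "lookup" and size_bytes > 0:
        return FOP_COST["lookup"] + units * FOP_COST["read"]
    if fop in ("read", "write"):
        return FOP_COST[fop] * units
    return FOP_COST[fop]


print(fop_cost("write", 8192))  # 4: two 4KB units at weight 2
```

Even with such a table, the weights would differ between bricks (SSD vs 
magnetic disk) and workloads, so they would likely need to be measured rather 
than fixed.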

2. Since IOPS is workload-dependent, what workloads should we be using to test 
our QoS guarantees?

3. Relation of QoS to throttling. Should throttling be implemented as part of 
QoS? Since throttling concerns itself with enforcing limits on resource 
consumption, I assume similar functionality can be achieved by setting limits 
for different clients (self-heal daemon, rebalance process, regular clients 
etc.).

4. Granularity of "clients". Should it be
   * a single application?
   * a single mount process?
   * a set of applications running on a mount?
   * an abstract tenant which can span multiple mount points?

  How do we pass this "client" information through a POSIX file-system 
interface?

5. Should "security/isolation" be part of QoS guarantees? Multi-tenancy support 
attempts to solve the data isolation problem. QoS tries to solve the 
performance isolation problem. Are there any parallels between them? What is 
the scope of this exercise?

6. The caching layer is on the clients. If we are going to implement the QoS 
engine on the bricks, how do we reconcile the two? What about caching in the 
VFS?

7. Other unknowns/requirements which I am not aware of.

Thanks to Jeff, Vijay and Steve for their inputs till now.

[1] 
https://github.com/kubernetes/kubernetes/blob/release-1.1/docs/proposals/resource-qos.md
[2] https://labs.vmware.com/download/122/
[3] https://www.gluster.org/pipermail/gluster-devel/2016-January/048007.html

regards,
Raghavendra
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel
