Re: [Gluster-devel] FOP ratelimit?

2015-09-11 Thread Joseph Fernandes
Ceph is working on something along these lines, "dmclock":

http://tracker.ceph.com/projects/ceph/wiki/Rados_qos

Maybe we can talk to them?

~Joe


- Original Message -
From: "Jeff Darcy"
To: "Joseph Fernandes"
Cc: "Gluster Devel", "Raghavendra Gowdappa", "Venky Shankar",
    "Pranith Kumar Karampuri", "Shyamsundar Ranganathan"
Sent: Thursday, September 10, 2015 6:57:51 PM
Subject: Re: [Gluster-devel] FOP ratelimit?

> Have we given thought to other IO scheduling algorithms, such as the mClock
> algorithm [1], used by VMware for their QoS solution?
> Another point to keep in mind here is the distributed nature of the
> solution. It's easier to think of a brick
> controlling the throughput for a client or a tenant. But how would this work
> in collaboration, and scale, with all the
> bricks together? What I am talking about is distributed QoS.

At the packet level, this is a core problem that SDN has to solve.  When
we're running in an SDN environment, we should just hand off responsibility
for QoS to them.  Otherwise, we should probably steal their algorithms.  ;)
I believe there are some experts elsewhere at Red Hat whose brains we can
and should pick.
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://www.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] FOP ratelimit?

2015-09-10 Thread Jeff Darcy
> Have we given thought to other IO scheduling algorithms, such as the mClock
> algorithm [1], used by VMware for their QoS solution?
> Another point to keep in mind here is the distributed nature of the
> solution. It's easier to think of a brick
> controlling the throughput for a client or a tenant. But how would this work
> in collaboration, and scale, with all the
> bricks together? What I am talking about is distributed QoS.

At the packet level, this is a core problem that SDN has to solve.  When
we're running in an SDN environment, we should just hand off responsibility
for QoS to them.  Otherwise, we should probably steal their algorithms.  ;)
I believe there are some experts elsewhere at Red Hat whose brains we can
and should pick.


Re: [Gluster-devel] FOP ratelimit?

2015-09-10 Thread Joseph Fernandes
Hi Guys,

Have we given thought to other IO scheduling algorithms, such as the mClock
algorithm [1], used by VMware for their QoS solution?
Another point to keep in mind here is the distributed nature of the
solution. It's easier to think of a brick
controlling the throughput for a client or a tenant. But how would this work
in collaboration, and scale, with all the
bricks together? What I am talking about is distributed QoS.

Regards,
Joe

[1] http://www.gluster.org/community/documentation/index.php/File:Qos.odp

- Original Message -
From: "Venky Shankar" 
To: "Raghavendra Gowdappa" 
Cc: "Gluster Devel" 
Sent: Thursday, September 10, 2015 12:16:41 PM
Subject: Re: [Gluster-devel] FOP ratelimit?

On Thu, Sep 3, 2015 at 11:36 AM, Raghavendra Gowdappa
 wrote:
>
>
> - Original Message -
>> From: "Emmanuel Dreyfus" 
>> To: "Raghavendra Gowdappa" , "Pranith Kumar Karampuri" 
>> 
>> Cc: gluster-devel@gluster.org
>> Sent: Wednesday, September 2, 2015 8:12:37 PM
>> Subject: Re: [Gluster-devel] FOP ratelimit?
>>
>> Raghavendra Gowdappa  wrote:
>>
>> > It's helpful if you can give some pointers on what parameters (like
>> > latency, throughput, etc.) you want us to consider for QoS.
>>
>> Full blown QoS would be nice, but a first line of defense against
>> resource hogs seems just badly required.
>>
>> A bare minimum could be to process client's FOP in a round robin
>> fashion. That way even if one client sends a lot of FOPs, there is
>> always some window for others to slip in.
>>
>> Any opinion?
>
> As of now we depend on epoll/poll events informing servers about incoming
> messages. All sockets are put in the same event-pool represented by a single
> poll-control fd. So the order in which we process msgs from various clients
> really depends on how epoll/poll picks events across multiple sockets. Do
> poll/epoll have any sort of scheduling, or is it random? Any pointers on this
> are appreciated.

I haven't come across any kind of scheduling for picking events for
sockets. Routers use synthetic throttling for traffic shaping. The most
commonly used technique is a TBF (token bucket filter), which
"induces" latency for outbound traffic. Lustre had some work [1] done
on QoS along the lines of TBF.

HTH.

[1]: http://cdn.opensfs.org/wp-content/uploads/2014/10/7-DDN_LiXi_lustre_QoS.pdf

>
>>
>> --
>> Emmanuel Dreyfus
>> http://hcpnet.free.fr/pubz
>> m...@netbsd.org
>>


Re: [Gluster-devel] FOP ratelimit?

2015-09-09 Thread Venky Shankar
On Thu, Sep 3, 2015 at 11:36 AM, Raghavendra Gowdappa
 wrote:
>
>
> - Original Message -
>> From: "Emmanuel Dreyfus" 
>> To: "Raghavendra Gowdappa" , "Pranith Kumar Karampuri" 
>> 
>> Cc: gluster-devel@gluster.org
>> Sent: Wednesday, September 2, 2015 8:12:37 PM
>> Subject: Re: [Gluster-devel] FOP ratelimit?
>>
>> Raghavendra Gowdappa  wrote:
>>
>> > It's helpful if you can give some pointers on what parameters (like
>> > latency, throughput, etc.) you want us to consider for QoS.
>>
>> Full blown QoS would be nice, but a first line of defense against
>> resource hogs seems just badly required.
>>
>> A bare minimum could be to process client's FOP in a round robin
>> fashion. That way even if one client sends a lot of FOPs, there is
>> always some window for others to slip in.
>>
>> Any opinion?
>
> As of now we depend on epoll/poll events informing servers about incoming
> messages. All sockets are put in the same event-pool represented by a single
> poll-control fd. So the order in which we process msgs from various clients
> really depends on how epoll/poll picks events across multiple sockets. Do
> poll/epoll have any sort of scheduling, or is it random? Any pointers on this
> are appreciated.

I haven't come across any kind of scheduling for picking events for
sockets. Routers use synthetic throttling for traffic shaping. The most
commonly used technique is a TBF (token bucket filter), which
"induces" latency for outbound traffic. Lustre had some work [1] done
on QoS along the lines of TBF.

HTH.

[1]: http://cdn.opensfs.org/wp-content/uploads/2014/10/7-DDN_LiXi_lustre_QoS.pdf
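
For readers unfamiliar with TBF, a minimal sketch of the idea in Python
follows. It is not taken from Lustre or from any Gluster code; the class
name, the `rate` and `capacity` parameters and the injectable `clock` are
assumptions made purely for illustration. Tokens refill at a fixed rate up
to a cap, every request spends tokens, and a request that finds the bucket
short is either refused or told how long to wait, which is where the
"induced" latency comes from.

```python
import time


class TokenBucket:
    """Toy token bucket: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second (hypothetical knob)
        self.capacity = capacity  # maximum burst size (hypothetical knob)
        self.clock = clock        # injectable so the sketch can be tested
        self.tokens = capacity
        self.last = clock()

    def _refill(self):
        now = self.clock()
        earned = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def try_consume(self, cost=1):
        """Spend `cost` tokens if they are available; otherwise refuse."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

    def delay_for(self, cost=1):
        """Seconds a caller would need to wait before `cost` tokens exist."""
        self._refill()
        missing = cost - self.tokens
        return max(0.0, missing / self.rate)
```

One way to apply this would be to keep one bucket per client (or per
tenant) and hold a request back for `delay_for()` seconds before
dispatching it, so a burst is absorbed up to `capacity` and then paced at
`rate`.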

>
>>
>> --
>> Emmanuel Dreyfus
>> http://hcpnet.free.fr/pubz
>> m...@netbsd.org
>>


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Raghavendra Gowdappa


- Original Message -
> From: "Emmanuel Dreyfus" 
> To: "Raghavendra Gowdappa" , "Pranith Kumar Karampuri" 
> 
> Cc: gluster-devel@gluster.org
> Sent: Wednesday, September 2, 2015 8:12:37 PM
> Subject: Re: [Gluster-devel] FOP ratelimit?
> 
> Raghavendra Gowdappa  wrote:
> 
> > It's helpful if you can give some pointers on what parameters (like
> > latency, throughput, etc.) you want us to consider for QoS.
> 
> Full blown QoS would be nice, but a first line of defense against
> resource hogs seems just badly required.
> 
> A bare minimum could be to process client's FOP in a round robin
> fashion. That way even if one client sends a lot of FOPs, there is
> always some window for others to slip in.
> 
> Any opinion?

As of now we depend on epoll/poll events informing servers about incoming
messages. All sockets are put in the same event-pool represented by a single
poll-control fd. So the order in which we process msgs from various clients
really depends on how epoll/poll picks events across multiple sockets. Do
poll/epoll have any sort of scheduling, or is it random? Any pointers on this
are appreciated.
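
The dependency described above can be illustrated with a small Python
sketch. This is not Gluster's event code: the function name `drain_once`,
the socket pairs standing in for client connections and the `client-N`
labels are all assumptions. Several sockets are registered with a single
selector, and messages are handled in whatever order `select()` reports
them. The sketch makes no claim about what that order is; it only shows
that, without extra scheduling in the application, processing order
follows the event report.

```python
import selectors
import socket


def drain_once(n_clients=3):
    """Send one message per client and return the order they were handled in."""
    sel = selectors.DefaultSelector()
    # Each socketpair stands in for one client connection (an assumption).
    pairs = [socket.socketpair() for _ in range(n_clients)]
    for i, (_, server_end) in enumerate(pairs):
        sel.register(server_end, selectors.EVENT_READ, data=f"client-{i}")
    for client_end, _ in pairs:
        client_end.sendall(b"fop")

    handled = []
    for _ in range(10):  # bounded so the sketch cannot spin forever
        for key, _events in sel.select(timeout=1.0):
            key.fileobj.recv(16)      # consume the pending message
            handled.append(key.data)  # handled in the order select() reported
        if len(handled) == n_clients:
            break

    sel.close()
    for client_end, server_end in pairs:
        client_end.close()
        server_end.close()
    return handled
```

Any fairness policy (round robin, token buckets, priorities) would have to
sit between the point where events are reported and the point where the
message is actually processed.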

> 
> --
> Emmanuel Dreyfus
> http://hcpnet.free.fr/pubz
> m...@netbsd.org
> 


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Emmanuel Dreyfus
Raghavendra Gowdappa  wrote:

> It's helpful if you can give some pointers on what parameters (like
> latency, throughput, etc.) you want us to consider for QoS.

Full-blown QoS would be nice, but a first line of defense against
resource hogs seems badly needed.

A bare minimum could be to process clients' FOPs in round-robin
fashion. That way, even if one client sends a lot of FOPs, there is
always some window for others to slip in.

Any opinion?
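
A minimal sketch of the round-robin idea, in Python rather than in
Gluster's C, and with hypothetical names throughout (`RoundRobinQueue`,
`submit`, `pop`): each client gets its own FIFO, and the dispatcher takes
one FOP from the client at the head of the rotation before moving that
client to the back.

```python
from collections import OrderedDict, deque


class RoundRobinQueue:
    """Per-client FIFO queues, served one request per client per turn."""

    def __init__(self):
        # client id -> deque of pending FOPs; dict order is the rotation order
        self.queues = OrderedDict()

    def submit(self, client, fop):
        self.queues.setdefault(client, deque()).append(fop)

    def pop(self):
        """Return (client, fop) for the next turn, or None when idle."""
        if not self.queues:
            return None
        client, pending = next(iter(self.queues.items()))
        fop = pending.popleft()
        del self.queues[client]
        if pending:
            # Client still has work: send it to the back of the rotation.
            self.queues[client] = pending
        return client, fop
```

With three renames queued by one client and a single stat queued by
another, the stat is served second instead of fourth, which is the
"window for others to slip in".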

-- 
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
m...@netbsd.org


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Jeff Darcy
> Do you have any ideas here on QoS? Can it be treated as a use case for
> the multi-tenancy work you were doing earlier?

My interpretation of QoS would include rate limiting, but more per
*activity* (e.g. self-heal, rebalance, user I/O) or per *tenant* rather
than per *client*.  Also, it's easier to implement at the message level
(which can be done on the servers) rather than the fop level (which has
to be on clients).  How well does that apply to what we've been
discussing in this thread?


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Emmanuel Dreyfus
On Wed, Sep 02, 2015 at 02:04:32PM +0530, Pranith Kumar Karampuri wrote:
> >And more generally, do we have a way to ratelimit FOPs per client, so
> >that one client cannot make the cluster unusable for the others?
> Do you have profile data?

No, it was on a production setup and I was too focused on restoring
functionality to think about it.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Emmanuel Dreyfus
On Wed, Sep 02, 2015 at 02:05:03PM +0530, Venky Shankar wrote:
> > I understand rename on DHT can be very costly because data really have
> > to be moved from a brick to another one just for a file name change.
> > Is there a workaround for this behavior?
> 
> Not really. DHT uses pointer files (so called link-to) to work around
> moving file contents on rename().

Then I was misled by the huge number of DHT rename operations in
the logs, and the user killed performance some other way. Too bad I did
not collect profile data at that time.

-- 
Emmanuel Dreyfus
m...@netbsd.org


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Raghavendra Gowdappa
+Jeff.

Jeff,

Do you have any ideas here on QoS? Can it be treated as a use case for
the multi-tenancy work you were doing earlier?

regards,
Raghavendra.

- Original Message -
> From: "Raghavendra Gowdappa" 
> To: "Pranith Kumar Karampuri" 
> Cc: gluster-devel@gluster.org
> Sent: Wednesday, September 2, 2015 2:11:35 PM
> Subject: Re: [Gluster-devel] FOP ratelimit?
> 
> 
> 
> - Original Message -
> > From: "Pranith Kumar Karampuri" 
> > To: "Emmanuel Dreyfus" , gluster-devel@gluster.org
> > Sent: Wednesday, September 2, 2015 2:04:32 PM
> > Subject: Re: [Gluster-devel] FOP ratelimit?
> > 
> > 
> > 
> > On 09/02/2015 01:59 PM, Emmanuel Dreyfus wrote:
> > > Hi
> > >
> > > Yesterday I experienced the problem of a single user bringing down
> > > a glusterfs cluster to its knees because of a high amount of rename
> > > operations.
> > >
> > > I understand rename on DHT can be very costly because data really have
> > > to be moved from a brick to another one just for a file name change.
> > > Is there a workaround for this behavior?
> > This is not true.
> 
> Data is not moved across bricks during rename. So maybe something else is
> causing the issue. Were you running rebalance while these renames were being
> done?
> 
> > >
> > > And more generally, do we have a way to ratelimit FOPs per client, so
> > > that one client cannot make the cluster unusable for the others?
> > Do you have profile data?
> > 
> > Raghavendra G is working on some QoS-related enhancements in gluster.
> > Please let us know if you have any inputs here.
> 
> Thanks Pranith.
> 
> @Manu and others,
> 
> It's helpful if you can give some pointers on what parameters (like latency,
> throughput, etc.) you want us to consider for QoS. Also, any ideas (like
> an interface for QoS) in this area are welcome. From my very basic search, it
> seems there are not many filesystems with QoS functionality.
> 
> regards,
> Raghavendra.
> > 
> > Pranith
> > >
> > 


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Raghavendra Gowdappa


- Original Message -
> From: "Pranith Kumar Karampuri" 
> To: "Emmanuel Dreyfus" , gluster-devel@gluster.org
> Sent: Wednesday, September 2, 2015 2:04:32 PM
> Subject: Re: [Gluster-devel] FOP ratelimit?
> 
> 
> 
> On 09/02/2015 01:59 PM, Emmanuel Dreyfus wrote:
> > Hi
> >
> > Yesterday I experienced the problem of a single user bringing down
> > a glusterfs cluster to its knees because of a high amount of rename
> > operations.
> >
> > I understand rename on DHT can be very costly because data really have
> > to be moved from a brick to another one just for a file name change.
> > Is there a workaround for this behavior?
> This is not true.

Data is not moved across bricks during rename. So maybe something else is
causing the issue. Were you running rebalance while these renames were being
done?

> >
> > And more generally, do we have a way to ratelimit FOPs per client, so
> > that one client cannot make the cluster unusable for the others?
> Do you have profile data?
> 
> Raghavendra G is working on some QoS-related enhancements in gluster.
> Please let us know if you have any inputs here.

Thanks Pranith. 

@Manu and others,

It's helpful if you can give some pointers on what parameters (like latency,
throughput, etc.) you want us to consider for QoS. Also, any ideas (like
an interface for QoS) in this area are welcome. From my very basic search, it
seems there are not many filesystems with QoS functionality.

regards,
Raghavendra.
> 
> Pranith
> >
> 


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Venky Shankar
On Wed, Sep 2, 2015 at 2:05 PM, Venky Shankar  wrote:
> On Wed, Sep 2, 2015 at 1:59 PM, Emmanuel Dreyfus  wrote:
>> Hi
>>
>> Yesterday I experienced the problem of a single user bringing down
>> a glusterfs cluster to its knees because of a high amount of rename
>> operations.
>>
>> I understand rename on DHT can be very costly because data really have
>> to be moved from a brick to another one just for a file name change.
>> Is there a workaround for this behavior?
>
> Not really. DHT uses pointer files (so called link-to) to work around
> moving file contents on rename().
>
>>
>> And more generally, do we have a way to ratelimit FOPs per client, so
>> that one client cannot make the cluster unusable for the others?
>
> There is some form of limiting based on priority (w/ client-pids) in
> io-threads. For bit-rot, I had used token bucket
> based throttling[1] during hash calculation. But that resides on the
> client side for bitrot xlator. It may be beneficial
> to have that on the server side.

[1]: https://github.com/gluster/glusterfs/blob/master/xlators/features/bit-rot/src/bitd/bit-rot-tbf.c
>
>>
>> --
>> Emmanuel Dreyfus
>> m...@netbsd.org


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Venky Shankar
On Wed, Sep 2, 2015 at 1:59 PM, Emmanuel Dreyfus  wrote:
> Hi
>
> Yesterday I experienced the problem of a single user bringing down
> a glusterfs cluster to its knees because of a high amount of rename
> operations.
>
> I understand rename on DHT can be very costly because data really have
> to be moved from a brick to another one just for a file name change.
> Is there a workaround for this behavior?

Not really. DHT uses pointer files (so-called link-to files) to avoid
moving file contents on rename().
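
A toy model of the pointer-file idea, as a Python sketch. It is not
GlusterFS code: the `ToyDHT` class, its method names, the use of
`zlib.crc32` as the placement hash and the in-memory dicts standing in for
bricks are all assumptions for illustration. The point it shows is that a
rename leaves the data on the brick that already holds it and, when the
new name hashes elsewhere, only writes a small pointer on the hashed
brick.

```python
import zlib


class ToyDHT:
    """Toy hash placement with link-to pointers; not the real DHT xlator."""

    def __init__(self, n_bricks):
        # Each brick maps name -> ("data", contents) or ("linkto", brick index).
        self.bricks = [{} for _ in range(n_bricks)]

    def hashed_brick(self, name):
        # crc32 is a stand-in; the real hash and layout ranges differ.
        return zlib.crc32(name.encode()) % len(self.bricks)

    def create(self, name, contents):
        self.bricks[self.hashed_brick(name)][name] = ("data", contents)

    def locate(self, name):
        """Return the index of the brick that holds the data for `name`."""
        idx = self.hashed_brick(name)
        kind, value = self.bricks[idx][name]
        return value if kind == "linkto" else idx

    def rename(self, old, new):
        src = self.locate(old)
        old_hashed = self.hashed_brick(old)
        if old_hashed != src:
            del self.bricks[old_hashed][old]  # drop the stale pointer
        # The data entry is re-keyed in place; contents never leave `src`.
        self.bricks[src][new] = self.bricks[src].pop(old)
        dst = self.hashed_brick(new)
        if dst != src:
            self.bricks[dst][new] = ("linkto", src)  # pointer only
```

In this model the cost of a rename is a couple of small metadata updates
regardless of file size, which matches the statement that contents are not
moved.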

>
> And more generally, do we have a way to ratelimit FOPs per client, so
> that one client cannot make the cluster unusable for the others?

There is some form of limiting based on priority (w/ client-pids) in
io-threads. For bit-rot, I had used token bucket
based throttling[1] during hash calculation. But that resides on the
client side for bitrot xlator. It may be beneficial
to have that on the server side.

>
> --
> Emmanuel Dreyfus
> m...@netbsd.org


Re: [Gluster-devel] FOP ratelimit?

2015-09-02 Thread Pranith Kumar Karampuri



On 09/02/2015 01:59 PM, Emmanuel Dreyfus wrote:
> Hi
>
> Yesterday I experienced the problem of a single user bringing down
> a glusterfs cluster to its knees because of a high amount of rename
> operations.
>
> I understand rename on DHT can be very costly because data really have
> to be moved from a brick to another one just for a file name change.
> Is there a workaround for this behavior?

This is not true.

> And more generally, do we have a way to ratelimit FOPs per client, so
> that one client cannot make the cluster unusable for the others?

Do you have profile data?

Raghavendra G is working on some QoS-related enhancements in gluster.
Please let us know if you have any inputs here.

Pranith



