On 12/27/2016 05:47 PM, Marcelo Ricardo Leitner wrote:
> On Tue, Dec 27, 2016 at 09:25:47AM +0100, Matthias Tafelmeier wrote:
>> Oftenly, introducing side effects on packet processing on the other half
>> of the stack by adjusting one of TX/RX via sysctl is not desirable.
>> There are cases of demand for asymmetric, orthogonal configurability.
>>
>> This holds true especially for nodes where RPS for RFS usage on top is
>> configured and therefore use the 'old dev_weight'. This is quite a
>> common base configuration setup nowadays, even with NICs of superior 
>> processing
>> support (e.g. aRFS).
>>
>> A good example use case are nodes acting as noSQL data bases with a
>> large number of tiny requests and rather fewer but large packets as 
>> responses.
>> It's affordable to have large budget and rx dev_weights for the
>> requests. But as a side effect having this large a number on TX
>> processed in one run can overwhelm drivers.
>>
>> This patch therefore introduces an independent configurability via sysctl to
>> userland.
>> ---
>>  include/linux/netdevice.h  |  2 ++
>>  net/core/dev.c             |  4 +++-
>>  net/core/sysctl_net_core.c | 14 ++++++++++++++
>>  net/sched/sch_generic.c    |  2 +-
>>  4 files changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 994f742..bb331e0 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -3795,6 +3795,8 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 
>> *stats64,
>>  extern int          netdev_max_backlog;
>>  extern int          netdev_tstamp_prequeue;
>>  extern int          weight_p;
>> +extern int          dev_w_rx_bias;
>> +extern int          dev_w_tx_bias;
>>  
>>  bool netdev_has_upper_dev(struct net_device *dev, struct net_device 
>> *upper_dev);
>>  struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 8db5a0b..0dcbd28 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -3428,6 +3428,8 @@ EXPORT_SYMBOL(netdev_max_backlog);
>>  int netdev_tstamp_prequeue __read_mostly = 1;
>>  int netdev_budget __read_mostly = 300;
>>  int weight_p __read_mostly = 64;            /* old backlog weight */
>> +int dev_w_rx_bias __read_mostly = 1;            /* bias for backlog weight 
>> */
>> +int dev_w_tx_bias __read_mostly = 1;            /* bias for output_queue 
>> quota */
>>  
>>  /* Called with irq disabled */
>>  static inline void ____napi_schedule(struct softnet_data *sd,
>> @@ -4833,7 +4835,7 @@ static int process_backlog(struct napi_struct *napi, 
>> int quota)
>>              net_rps_action_and_irq_enable(sd);
>>      }
>>  
>> -    napi->weight = weight_p;
>> +    napi->weight = weight_p * dev_w_rx_bias;
>>      while (again) {
>>              struct sk_buff *skb;
>>  
>> diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
>> index 2a46e40..a2ab149 100644
>> --- a/net/core/sysctl_net_core.c
>> +++ b/net/core/sysctl_net_core.c
>> @@ -276,6 +276,20 @@ static struct ctl_table net_core_table[] = {
>>              .proc_handler   = proc_dointvec
>>      },
>>      {
>> +            .procname       = "dev_w_rx_bias",
>> +            .data           = &dev_w_rx_bias,
>> +            .maxlen         = sizeof(int),
>> +            .mode           = 0644,
>> +            .proc_handler   = proc_dointvec
>> +    },
>> +    {
>> +            .procname       = "dev_w_tx_bias",
>> +            .data           = &dev_w_tx_bias,
>> +            .maxlen         = sizeof(int),
>> +            .mode           = 0644,
>> +            .proc_handler   = proc_dointvec
>> +    },
>> +    {
> Please describe these at Documentation/sysctl/net.txt, probably right
> after dev_weight. 
Sure, I'll do that.

> I'm not sure about the abbreviation, maybe it would be better the longer
> name as it doesn't block tab completion.
> dev_weight_tx_bias
> dev_weight_rx_bias
> dev_weight
>
Do not find the abbreviation/naming satisfactory, either. Rather saw
them as a draft. Could think of dev_weight distant naming:

ns_rps_cpu_rx_bias
ns_cpu_tx_bias

Though, makes me concerned about association etc. Maybe, that's nit
picking.


Attachment: 0x8ADF343B.asc
Description: application/pgp-keys

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to