Re: Socket option to configure Ethernet PCP / CoS per-flow

2020-09-11 Thread sthaug
> However, while this allows all traffic sent via a specific interface to be 
> marked with a PCP (priority code point), it defeats the purpose of PFC 
> (priority flow control) which works by individually pausing different queues 
> of an interface, provided there is an actual differentiation of traffic into 
> those various classes.
> 
> Internally, we have added a socket option (SO_VLAN_PCP) to change the PCP 
> specifically for traffic associated with that socket, to be marked 
> differently from whatever the interface default is (unmarked, or the default 
> PCP).
> 
> Does the community see value in having such a socket option widely available? 
> (Linux currently doesn't seem to have a per-socket option either, only a 
> per-interface IOCTL API).

I've been doing quite a bit of network testing using iperf3 and
similar tools, and have wanted this type of functionality since the
interface option became available. Having this on a socket level would
make it possible to teach iperf3, ping and other tools to set PCP and
facilitate/simplify testing of L2 networks.

So the answer is a definite yes! This would be valuable.

Steinar Haug, Nethelp consulting, sth...@nethelp.no
___
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"


RE: Socket option to configure Ethernet PCP / CoS per-flow

2020-09-11 Thread Scheffenegger, Richard
Thank you for the quick feedback.

On a related note: it just occurred to me that the PCP functionality could be 
extended to make more effective use of PFC (priority flow control) without 
explicitly managing it at the application level.

Right now, PFC typically degenerates to plain old flow control, as all traffic 
is handled in the default class (0, or whatever is set up using the IOCTL 
interface API).

Typically, the different Ethernet classes come with a notion of prioritization 
between them - traffic in a "higher" class may be forwarded prior to traffic in 
a lower class. But that is not a strong requirement - using WRR with 1/8th 
bandwidth "reserved" for each class in a switch, and assigning flows to a random 
PCP value, PFC could work in a more scalable fashion - only blocking the 
fraction of traffic that is actually building a queue (because it has to go 
over a lower bandwidth link, or a NIC is excessively pausing its ingress), thus 
reducing the chance of congestion trees forming...

PCP values run from 0 (default) to 7.

Adding a socket option to explicitly assign traffic to one of these classes 
would allow testing and configuring applications to make use of the "real" 
prioritization capabilities of modern switches.

And what I was just pondering was a special interface level setting (e.g. 8), 
which makes each socket pick a "random" value when it is created, to distribute 
packets across all the queues available in hardware, so that PFC no longer 
collapses in effect to old FC style "on"/"off" for all traffic...
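
To make the idea of spreading sockets across the eight classes concrete, here is 
a minimal C sketch. It is only an illustration under stated assumptions, not the 
code discussed in this thread: the helper names, the use of a per-flow hash, and 
the round-robin counter are all hypothetical.

```c
#include <stdint.h>

#define PCP_CLASSES 8u	/* 802.1Q PCP is a 3-bit field: values 0..7 */

/*
 * Hypothetical helper: map a per-flow hash (for example one derived from
 * the connection 4-tuple) onto one of the eight PCP values, so that a
 * given flow always lands in the same class.
 */
static inline uint8_t
pcp_from_flow_hash(uint32_t flow_hash)
{
	return (uint8_t)(flow_hash % PCP_CLASSES);
}

/*
 * Hypothetical alternative: hand out PCP values round-robin as sockets
 * are created. The caller owns the counter; not thread-safe as written.
 */
static inline uint8_t
pcp_next_round_robin(uint32_t *counter)
{
	return (uint8_t)((*counter)++ % PCP_CLASSES);
}
```

Either variant keeps all packets of one socket in a single class, which avoids 
reordering within a flow while still letting PFC pause only a subset of flows.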

Perhaps someone here has experience with congestion tree formation in multi-hop 
switching environments, and can comment if the above approach would be feasible 
to address that FC issue?


Richard Scheffenegger




Re: Socket option to configure Ethernet PCP / CoS per-flow

2020-09-11 Thread Matthew Grooms

Hey There Richard,

I live in Austin where we are fortunate enough to have Google Fiber. And 
while I love the service, I hate the idea of being forced to use the 
Google Fiber black box as my edge device. But to get full use of the 
service, you have to set VLAN + PCP values appropriately or you hit a 
Google-imposed traffic shaping bottleneck. In any case, I was able to do 
this using pf as the packet classifier. You simply write a rule to match 
the traffic and assign the desired value. Perhaps this may be a way to 
accomplish what you're trying to do without having to add a new socket 
option. Have a look at the pf.conf man page and search for 'set prio'. I 
assume ipfw has an equivalent feature as well ...


 set prio priority | (priority, priority)
   Packets matching this rule will be assigned a specific queueing
   priority.  Priorities are assigned as integers 0 through 7.  If the
   packet is transmitted on a vlan(4) interface, the queueing priority
   will be written as the priority code point in the 802.1Q VLAN
   header.  If two priorities are given, packets which have a TOS of
   lowdelay and TCP ACKs with no data payload will be assigned to the
   second one.
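
The two-priority rule in that excerpt can be modelled in a few lines of C. This 
is only a sketch of the documented behaviour, not pf's actual code; the struct 
and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a "set prio (first, second)" rule; not pf's code. */
struct prio_pair {
	uint8_t first;	/* priority for ordinary matching packets */
	uint8_t second;	/* priority for lowdelay packets and pure TCP ACKs */
};

/*
 * Pick the queueing priority for one packet: packets with a TOS of
 * lowdelay and TCP ACKs without payload get the second value, all
 * other matching packets get the first.
 */
static inline uint8_t
select_prio(struct prio_pair p, bool tos_lowdelay, bool pure_tcp_ack)
{
	return (tos_lowdelay || pure_tcp_ack) ? p.second : p.first;
}
```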

Hope this helps,

-Matthew



Re: Socket option to configure Ethernet PCP / CoS per-flow

2020-09-24 Thread Ryan Stone
On Fri, Sep 11, 2020 at 12:33 PM Scheffenegger, Richard wrote:
>
> Hi,
>
> Currently, upstream head has only an IOCTL API to set up interface-wide 
> default PCP marking:
>
> #define SIOCGVLANPCP SIOCGLANPCP /* Get VLAN PCP */
> #define SIOCSVLANPCP SIOCSLANPCP /* Set VLAN PCP */
>
> And the interface is via ifconfig <interface> pcp <value>.
>
> However, while this allows all traffic sent via a specific interface to be 
> marked with a PCP (priority code point), it defeats the purpose of PFC 
> (priority flow control) which works by individually pausing different queues 
> of an interface, provided there is an actual differentiation of traffic into 
> those various classes.
>
> Internally, we have added a socket option (SO_VLAN_PCP) to change the PCP 
> specifically for traffic associated with that socket, to be marked 
> differently from whatever the interface default is (unmarked, or the default 
> PCP).
>
> Does the community see value in having such a socket option widely available? 
> (Linux currently doesn't seem to have a per-socket option either, only a 
> per-interface IOCTL API).
>
> Best regards,
>
> Richard Scheffenegger
>
> ___
> freebsd-transp...@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-transport
> To unsubscribe, send any mail to "freebsd-transport-unsubscr...@freebsd.org"

Hi Richard,

At $WORK we're running into situations where PFC support would be very
useful, so I think that this would be a good thing to add.  I have a
question: does your work also communicate the priority value for an
mbuf down to the Ethernet driver, so that it can put the packet in the
proper queue?


RE: Socket option to configure Ethernet PCP / CoS per-flow

2020-09-24 Thread Scheffenegger, Richard
Hi Ryan,

As you can see in the code, when a specific PCP value is associated with a 
session, a vlan header is added to the mbuf, before all that gets handed off to 
the device drivers.

(I did improve upon the $work code base by allowing "default" and "explicit" 
pcp values, rather than assuming an underlying interface will always have a 
default PCP of 0).

I'm not perfectly happy with the pcp value living in the socket struct, but 
frankly, there is no more appropriate layer anyway, and this approach should be 
pretty speed-efficient.

I'm not a hw driver person, so whatever happens to the mbuf after the vlan tag 
is added (a pure pcp=x, vlan=0 may be attached) is all up to how the driver / 
hardware deals with that header as part of the mbuf chain...
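
For readers unfamiliar with the tag layout, the following C sketch shows how a 
PCP and VLAN ID combine into the 16-bit 802.1Q Tag Control Information field. 
The field layout is the standard one; the function name is hypothetical and this 
is not code from the patch under review.

```c
#include <stdint.h>

/*
 * Build the 16-bit 802.1Q Tag Control Information field:
 *   bits 15..13  PCP (priority code point)
 *   bit  12      DEI (drop eligible indicator)
 *   bits 11..0   VID (VLAN identifier); VID 0 marks a priority-tagged frame
 * Out-of-range inputs are masked. The result is in host byte order and
 * must be converted with htons() before being written to the wire.
 */
static inline uint16_t
vlan_tci(uint8_t pcp, uint8_t dei, uint16_t vid)
{
	return (uint16_t)(((pcp & 0x7u) << 13) | ((dei & 0x1u) << 12) |
	    (vid & 0x0FFFu));
}
```

With this layout, the "pure pcp=x, vlan=0" case mentioned above for x = 5 is 
vlan_tci(5, 0, 0), which evaluates to 0xA000.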

Also, if you do have an account on reviews.freebsd.org, perhaps you want to 
comment on the Diff, that this is valuable work... As this is outside my normal 
scope of tweaks, I would certainly need some positive reviews around this to 
get it approved for committing.

Were you able to patch your kernel and achieve what you were trying to do?

Do you see any value in an interface default, that effectively lets each new 
session rotate through all PCPs, to make PFC more useful and not degrade into 
simple xon/xoff "global" flow control?


Richard Scheffenegger



RE: Socket option to configure Ethernet PCP / CoS per-flow

2020-10-09 Thread Scheffenegger, Richard
Hi Ryan,

D26409 was committed in r366569. The socket option is now "living" under the 
IPPROTO_IP or IPPROTO_IPV6 level (depending on the address family used by the 
socket), and is stored with the INPCB for more efficient processing.

Would be grateful if anyone could look at D26627, which adds a "-C <pcp>" 
option to ping, to validate the functionality (host + network).

That patch for ping also shows a simple example as to how to use this new 
functionality (basically, perform a setsockopt between a bind/connect or 
bind/listen of the socket, similar to other such setsockopt calls). 
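
As a rough illustration of that setsockopt step, here is a minimal C sketch. It 
assumes the option constants are named IP_VLAN_PCP and IPV6_VLAN_PCP, that they 
take an int, and that -1 selects the interface default; none of this is spelled 
out in this thread, so check netinet/in.h and the ping patch before relying on 
it. The helper name set_socket_pcp is hypothetical.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <errno.h>

/*
 * Sketch: request a PCP for all traffic sent on socket `fd`.
 * Assumptions: the option is named IP_VLAN_PCP / IPV6_VLAN_PCP, takes an
 * int, and accepts 0..7 plus -1 for "use the interface default".
 * Returns 0 on success, -1 with errno set on failure.
 */
static inline int
set_socket_pcp(int fd, int family, int pcp)
{
	(void)fd;	/* unused when built without the option */

	if (pcp < -1 || pcp > 7) {
		errno = EINVAL;
		return (-1);
	}
	switch (family) {
#ifdef IP_VLAN_PCP
	case AF_INET:
		return (setsockopt(fd, IPPROTO_IP, IP_VLAN_PCP,
		    &pcp, sizeof(pcp)));
#endif
#ifdef IPV6_VLAN_PCP
	case AF_INET6:
		return (setsockopt(fd, IPPROTO_IPV6, IPV6_VLAN_PCP,
		    &pcp, sizeof(pcp)));
#endif
	default:
		/* Unknown family, or built on a system without the option. */
		errno = ENOPROTOOPT;
		return (-1);
	}
}
```

A caller would create the socket, call something like 
set_socket_pcp(s, AF_INET, 5), and then proceed with connect() or listen(). On 
systems whose headers lack the option, the sketch fails with ENOPROTOOPT 
instead of refusing to compile.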

Best regards,


Richard Scheffenegger

