RE: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-20 Thread Y.b. Lu
> -Original Message-
> From: Vladimir Oltean 
> Sent: 2021年4月18日 17:19
> To: Y.b. Lu 
> Cc: net...@vger.kernel.org; Richard Cochran ;
> Vladimir Oltean ; David S . Miller
> ; Jakub Kicinski ; Jonathan Corbet
> ; Kurt Kanzenbach ; Andrew Lunn
> ; Vivien Didelot ; Florian
> Fainelli ; Claudiu Manoil ;
> Alexandre Belloni ;
> unglinuxdri...@microchip.com; linux-...@vger.kernel.org;
> linux-kernel@vger.kernel.org
> Subject: Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling
> 
> On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> > Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> > drivers should adapt to it.
> >
> > - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
> >   port_txtstamp, so that most skbs not requiring tx timestamp just return.
> 
> Agree that this is a trivial performance optimization with no downside that we
> should be making.
> 
> > - No longer to identify PTP packets, and limit tx timestamping only for PTP
> >   packets. If device driver likes, let device driver do.
> 
> Agree that DSA has a way too heavy hand in imposing upon the driver which
> packets should be timestampable and which ones shouldn't.
> 
> For example, I have a latency measurement tool called isochron which is based
> on hardware timestamping of non-PTP packets (in order to not disturb the PTP
> state machines):
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.
> com%2Fvladimiroltean%2Ftsn-scriptsdata=04%7C01%7Cyangbo.lu%40
> nxp.com%7C3772f63deb6f4491933208d9024af750%7C686ea1d3bc2b4c6fa9
> 2cd99c5c301635%7C0%7C0%7C637543343267914018%7CUnknown%7CTW
> FpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXV
> CI6Mn0%3D%7C1000sdata=deOn03j6biPEUwppNkv%2F9if1u5HIdLEtzz
> AkWW0p1rc%3Dreserved=0
> 
> I can't use it on DSA interfaces, for rather artificial reasons.
> 
> > - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
> >   For one-step timestamping, a clone is not needed. For any failure of
> >   port_txtstamp (this may usually happen), the skb clone has to be freed.
> >   So put skb cloning into port_txtstamp where it really needs.
> 
> Mostly agree. For two-step timestamping, it is an operation which all drivers
> need to do, so it is in the common potion. If we want to support one-step, we
> need to avoid cloning the PTP packets.
> 

Thanks a lot for your ACK.

> > Signed-off-by: Yangbo Lu 
> > ---
> >  Documentation/networking/timestamping.rst |  7 +--
> >  .../net/dsa/hirschmann/hellcreek_hwtstamp.c   | 20 --
> >  .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |  2 +-
> >  drivers/net/dsa/mv88e6xxx/hwtstamp.c  | 21
> +--
> >  drivers/net/dsa/mv88e6xxx/hwtstamp.h  |  6 +++---
> >  drivers/net/dsa/ocelot/felix.c| 11 ++
> >  drivers/net/dsa/sja1105/sja1105_ptp.c |  6 +-
> >  drivers/net/dsa/sja1105/sja1105_ptp.h |  2 +-
> >  include/net/dsa.h |  2 +-
> >  net/dsa/slave.c   | 20 +-
> >  10 files changed, 57 insertions(+), 40 deletions(-)
> >
> > diff --git a/Documentation/networking/timestamping.rst
> > b/Documentation/networking/timestamping.rst
> > index f682e88fa87e..7f04a699a5d1 100644
> > --- a/Documentation/networking/timestamping.rst
> > +++ b/Documentation/networking/timestamping.rst
> > @@ -635,8 +635,8 @@ in generic code: a BPF classifier
> > (``ptp_classify_raw``) is used to identify  PTP event messages (any
> > other packets, including PTP general messages, are not  timestamped), and
> provides two hooks to drivers:
> >
> > -- ``.port_txtstamp()``: The driver is passed a clone of the
> > timestampable skb
> > -  to be transmitted, before actually transmitting it. Typically, a
> > switch will
> > +- ``.port_txtstamp()``: A clone of the timestampable skb to be
> > +transmitted
> > +  is needed, before actually transmitting it. Typically, a switch
> > +will
> >have a PTP TX timestamp register (or sometimes a FIFO) where the
> timestamp
> >becomes available. There may be an IRQ that is raised upon this
> timestamp's
> >availability, or the driver might have to poll after invoking @@
> > -645,6 +645,9 @@ timestamped), and provides two hooks to drivers:
> >later use (when the timestamp becomes available). Each skb is annotated
> with
> >a pointer to its clone, in ``DSA_SKB_CB(skb)->clone``, to ease the 
> > driver's
> >job of keeping track of which clo

RE: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-20 Thread Y.b. Lu
> -Original Message-
> From: Richard Cochran 
> Sent: 2021年4月18日 23:06
> To: Y.b. Lu 
> Cc: net...@vger.kernel.org; Vladimir Oltean ;
> David S . Miller ; Jakub Kicinski ;
> Jonathan Corbet ; Kurt Kanzenbach ;
> Andrew Lunn ; Vivien Didelot ;
> Florian Fainelli ; Claudiu Manoil
> ; Alexandre Belloni
> ; unglinuxdri...@microchip.com;
> linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling
> 
> On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> > Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> > drivers should adapt to it.
> >
> > - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
> >   port_txtstamp, so that most skbs not requiring tx timestamp just return.
> >
> > - No longer to identify PTP packets, and limit tx timestamping only for PTP
> >   packets. If device driver likes, let device driver do.
> >
> > - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
> >   For one-step timestamping, a clone is not needed. For any failure of
> >   port_txtstamp (this may usually happen), the skb clone has to be freed.
> >   So put skb cloning into port_txtstamp where it really needs.
> 
> This patch changes three things ^^^ at once.  These are AFAICT independent
> changes, and so please break this one patch into three for easier review.

Will split it in next version. Thank you.

> 
> Thanks,
> Richard


RE: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-20 Thread Y.b. Lu
> -Original Message-
> From: Richard Cochran 
> Sent: 2021年4月18日 23:11
> To: Vladimir Oltean 
> Cc: Y.b. Lu ; net...@vger.kernel.org; Vladimir Oltean
> ; David S . Miller ; Jakub
> Kicinski ; Jonathan Corbet ; Kurt
> Kanzenbach ; Andrew Lunn ; Vivien
> Didelot ; Florian Fainelli ;
> Claudiu Manoil ; Alexandre Belloni
> ; unglinuxdri...@microchip.com;
> linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling
> 
> On Sun, Apr 18, 2021 at 12:18:42PM +0300, Vladimir Oltean wrote:
> >
> > How about not passing "clone" back to DSA as an argument by reference,
> > but instead require the driver to populate DSA_SKB_CB(skb)->clone if
> > it needs to do so?
> >
> > Also, how about changing the return type to void? Returning true or
> > false makes no difference.

Thank you. That's good idea.
And how about letting driver store the skb clone pointer, or not? I copied my 
comments in patch #3 here,

" Can we totally drop dsa_skb_cb in dsa core? The only usage of it is holding a 
skb clone pointer, for only felix and sja1105.
Actually we can move such pointer in _skb_cb, instead of reserving the 
space of skb for any drivers."

> 
> +1
> 
> Thanks,
> Richard


Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-19 Thread Kurt Kanzenbach
On Fri Apr 16 2021, Yangbo Lu wrote:
> Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> drivers should adapt to it.
>
> - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
>   port_txtstamp, so that most skbs not requiring tx timestamp just return.
>
> - No longer to identify PTP packets, and limit tx timestamping only for PTP
>   packets. If device driver likes, let device driver do.
>
> - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
>   For one-step timestamping, a clone is not needed. For any failure of
>   port_txtstamp (this may usually happen), the skb clone has to be freed.
>   So put skb cloning into port_txtstamp where it really needs.
>
> Signed-off-by: Yangbo Lu 

PTP still works.

Tested-by: Kurt Kanzenbach  # hellcreek


signature.asc
Description: PGP signature


Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-18 Thread Richard Cochran
On Sun, Apr 18, 2021 at 12:18:42PM +0300, Vladimir Oltean wrote:
> 
> How about not passing "clone" back to DSA as an argument by reference,
> but instead require the driver to populate DSA_SKB_CB(skb)->clone if it
> needs to do so?
> 
> Also, how about changing the return type to void? Returning true or
> false makes no difference.

+1

Thanks,
Richard


Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-18 Thread Richard Cochran
On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> @@ -628,9 +620,7 @@ static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, 
> struct net_device *dev)
>  
>   DSA_SKB_CB(skb)->clone = NULL;
>  
> - /* Identify PTP protocol packets, clone them, and pass them to the
> -  * switch driver
> -  */
> + /* Handle tx timestamp request if has */

"if has" what?

>   dsa_skb_tx_timestamp(p, skb);
>  
>   if (dsa_realloc_skb(skb, dev)) {
> -- 
> 2.25.1
> 

Thanks,
Richard


Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-18 Thread Richard Cochran
On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> drivers should adapt to it.
> 
> - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
>   port_txtstamp, so that most skbs not requiring tx timestamp just return.
> 
> - No longer to identify PTP packets, and limit tx timestamping only for PTP
>   packets. If device driver likes, let device driver do.
> 
> - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
>   For one-step timestamping, a clone is not needed. For any failure of
>   port_txtstamp (this may usually happen), the skb clone has to be freed.
>   So put skb cloning into port_txtstamp where it really needs.

This patch changes three things ^^^ at once.  These are AFAICT
independent changes, and so please break this one patch into three for
easier review.

Thanks,
Richard


Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling

2021-04-18 Thread Vladimir Oltean
On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> drivers should adapt to it.
> 
> - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
>   port_txtstamp, so that most skbs not requiring tx timestamp just return.

Agree that this is a trivial performance optimization with no downside
that we should be making.

> - No longer to identify PTP packets, and limit tx timestamping only for PTP
>   packets. If device driver likes, let device driver do.

Agree that DSA has a way too heavy hand in imposing upon the driver
which packets should be timestampable and which ones shouldn't.

For example, I have a latency measurement tool called isochron which is
based on hardware timestamping of non-PTP packets (in order to not
disturb the PTP state machines):
https://github.com/vladimiroltean/tsn-scripts

I can't use it on DSA interfaces, for rather artificial reasons.

> - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
>   For one-step timestamping, a clone is not needed. For any failure of
>   port_txtstamp (this may usually happen), the skb clone has to be freed.
>   So put skb cloning into port_txtstamp where it really needs.

Mostly agree. For two-step timestamping, it is an operation which all
drivers need to do, so it is in the common potion. If we want to support
one-step, we need to avoid cloning the PTP packets.

> Signed-off-by: Yangbo Lu 
> ---
>  Documentation/networking/timestamping.rst |  7 +--
>  .../net/dsa/hirschmann/hellcreek_hwtstamp.c   | 20 --
>  .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |  2 +-
>  drivers/net/dsa/mv88e6xxx/hwtstamp.c  | 21 +--
>  drivers/net/dsa/mv88e6xxx/hwtstamp.h  |  6 +++---
>  drivers/net/dsa/ocelot/felix.c| 11 ++
>  drivers/net/dsa/sja1105/sja1105_ptp.c |  6 +-
>  drivers/net/dsa/sja1105/sja1105_ptp.h |  2 +-
>  include/net/dsa.h |  2 +-
>  net/dsa/slave.c   | 20 +-
>  10 files changed, 57 insertions(+), 40 deletions(-)
> 
> diff --git a/Documentation/networking/timestamping.rst 
> b/Documentation/networking/timestamping.rst
> index f682e88fa87e..7f04a699a5d1 100644
> --- a/Documentation/networking/timestamping.rst
> +++ b/Documentation/networking/timestamping.rst
> @@ -635,8 +635,8 @@ in generic code: a BPF classifier (``ptp_classify_raw``) 
> is used to identify
>  PTP event messages (any other packets, including PTP general messages, are 
> not
>  timestamped), and provides two hooks to drivers:
>  
> -- ``.port_txtstamp()``: The driver is passed a clone of the timestampable skb
> -  to be transmitted, before actually transmitting it. Typically, a switch 
> will
> +- ``.port_txtstamp()``: A clone of the timestampable skb to be transmitted
> +  is needed, before actually transmitting it. Typically, a switch will
>have a PTP TX timestamp register (or sometimes a FIFO) where the timestamp
>becomes available. There may be an IRQ that is raised upon this timestamp's
>availability, or the driver might have to poll after invoking
> @@ -645,6 +645,9 @@ timestamped), and provides two hooks to drivers:
>later use (when the timestamp becomes available). Each skb is annotated 
> with
>a pointer to its clone, in ``DSA_SKB_CB(skb)->clone``, to ease the driver's
>job of keeping track of which clone belongs to which skb.
> +  But one-step timestamping request is handled differently with above 
> two-step
> +  timestamping. The skb clone is no longer needed since hardware will insert
> +  TX time information on packet during egress.

Bonus points for updating the documentation, but I don't quite like the
end result. Please feel free to restructure more, in order to have a
clearer and more coherent explanation.

Also, this paragraph from right above is no longer true:

In code, DSA provides for most of the infrastructure for timestamping 
already,
in generic code: a BPF classifier (``ptp_classify_raw``) is used to 
identify
PTP event messages (any other packets, including PTP general messages, 
are not
timestamped), and provides two hooks to drivers:

It's nothing like that anymore. It's more of a passthrough now with your
changes, the BPF classifier is not run by the DSA core but optionally by
individual taggers.

Here is my attempt of rewriting this documentation paragraph, feel free
to take which parts you consider relevant:

-[cut here]-

In the generic layer, DSA provides the following infrastructure for PTP
timestamping:

- ``.port_txtstamp()``: a hook called prior to the transmission of
  packets with a hardware TX timestamping request from user space.
  This is required for two-step timestamping, since the hardware
  timestamp becomes available after the actual MAC transmission, so