On 09/02/2017 7:10 PM, Eric Dumazet wrote:
From: Eric Dumazet <eduma...@google.com>

Using a reader-writer lock in fast path is silly, when we can
instead use RCU or a seqlock.

For mlx4 hwstamp clock, a seqlock is the way to go, removing
two atomic operations and false sharing.

Signed-off-by: Eric Dumazet <eduma...@google.com>
Cc: Tariq Toukan <tar...@mellanox.com>
---
  drivers/net/ethernet/mellanox/mlx4/en_clock.c |   35 ++++++++--------
  drivers/net/ethernet/mellanox/mlx4/mlx4_en.h  |    2
  2 files changed, 19 insertions(+), 18 deletions(-)


Hi Eric,

When my colleague Shay modified mlx5 to adopt the same locking scheme, he noticed a degradation in packet rate. He then went back and tested mlx4, and saw that this patch introduces a degradation there as well.

Perf numbers (single ring):

mlx4:
With rw-lock: ~8.54M pps
With seq-lock: ~8.51M pps

mlx5:
With rw-lock: ~14.94M pps
With seq-lock: ~14.48M pps
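
For scale, the numbers above work out to a relative drop of roughly 0.35% on mlx4 and roughly 3.1% on mlx5. A small sketch of that arithmetic follows; the helper name rel_drop_pct() is made up for illustration only:

```c
/* Relative drop, in percent, from a baseline packet rate to a new rate.
 * Both rates must be in the same unit (here: Mpps). Hypothetical helper. */
double rel_drop_pct(double baseline_mpps, double new_mpps)
{
    return (baseline_mpps - new_mpps) / baseline_mpps * 100.0;
}
```

Calling rel_drop_pct(8.54, 8.51) gives the mlx4 figure and rel_drop_pct(14.94, 14.48) gives the mlx5 figure.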

Actually, this can be explained by the analysis below.
In short, the number of readers is significantly larger than the number of writers, so optimizing the reader flow would give better numbers. The issue is that a read/write lock might cause writer starvation. Maybe RCU fits best here?

Degradation analysis:
The patch changes the type of the lock that protects reads and updates of a variable, (struct mlx4_en_dev).clock.
This variable is used to convert the hw timestamp into skb->hwtstamps.
This variable is read for each transmitted/received packet, and is updated only via the ptp module and a periodic overflow work we have (at most 10 times per second). This means there are many more readers than writers, and it's best to optimize the reader flow.
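
To make the read-mostly pattern above concrete, here is a minimal userspace sketch of a sequence-counter read path in C11. It is not the driver code: struct hw_clock, clock_write() and clock_read_ns() are hypothetical names, a single writer is assumed (the kernel's seqlock_t serializes writers with a spinlock), and the conversion treats one cycle as one nanosecond instead of doing the real timecounter math.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the clock state; not the mlx4 layout. */
struct hw_clock {
    atomic_uint seq;              /* even: stable, odd: write in progress */
    _Atomic uint64_t cycle_last;  /* hw cycle count at the last update */
    _Atomic uint64_t nsec;        /* nanoseconds at the last update */
};

/* Writer side (rare). Assumes exactly one writer at a time. */
void clock_write(struct hw_clock *c, uint64_t cycles, uint64_t nsec)
{
    unsigned s = atomic_load_explicit(&c->seq, memory_order_relaxed);

    /* Make the counter odd so readers know an update is in flight. */
    atomic_store_explicit(&c->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&c->cycle_last, cycles, memory_order_relaxed);
    atomic_store_explicit(&c->nsec, nsec, memory_order_relaxed);

    /* Make the counter even again, publishing the new values. */
    atomic_store_explicit(&c->seq, s + 2, memory_order_release);
}

/* Reader side (per packet). Only loads from shared memory; retries if a
 * write was in progress or completed while the fields were being read. */
uint64_t clock_read_ns(struct hw_clock *c, uint64_t cycles)
{
    unsigned s0, s1;
    uint64_t last, ns;

    do {
        s0 = atomic_load_explicit(&c->seq, memory_order_acquire);
        last = atomic_load_explicit(&c->cycle_last, memory_order_relaxed);
        ns = atomic_load_explicit(&c->nsec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&c->seq, memory_order_relaxed);
    } while ((s0 & 1) || s0 != s1);

    /* Simplification for the sketch: 1 cycle == 1 ns. */
    return ns + (cycles - last);
}
```

The reader in this sketch never stores to shared memory, whereas taking a reader-writer lock for read modifies the lock word on every acquire and release. The trade-off is that a reader may have to retry when an update lands in the middle of its read.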

Best,
Tariq
