On 11/01/2013 10:36 AM, Markus Pargmann wrote:
> This patch speeds up the rx_poll function by reducing the number of
> register reads.
> 
> Replace the 32bit register read by a 16bit register read. Currently
> the 32bit register read is implemented by using 2 16bit reads. This is
> inefficient as we only use the lower 16bit in rx_poll.
> 
> The for loop reads the pending interrupts in every iteration. This
> leads up to 16 reads of pending interrupts. The patch introduces a new
> outer loop to read the pending interrupts as long as 'quota' is above 0.
> This reduces the total number of reads.
> 
> The third change is to replace the for-loop by a ffs loop.
> 
> Tested on AM335x. I removed all 'static' and 'inline' from c_can.c to
> see the timings for all functions. I used the function tracer with
> trace_stats.
> 
> 125kbit:
>   Function                               Hit    Time            Avg           
>   s^2
>   --------                               ---    ----            ---           
>   ---
>   c_can_do_rx_poll                     63960    10168178 us     158.977 us    
>   1493056 us
> With patch:
>   c_can_do_rx_poll                     63941    3764057 us      58.867 us     
>   776162.2 us
> 
> 1Mbit:
>   Function                               Hit    Time            Avg           
>   s^2
>   --------                               ---    ----            ---           
>   ---
>   c_can_do_rx_poll                     69489    30049498 us     432.435 us    
>   9271851 us
> With patch:
>   c_can_do_rx_poll                    207109    24322185 us     117.436 us    
>   171469047 us
> 
> Signed-off-by: Markus Pargmann <m...@pengutronix.de>
> ---
> 
> Notes:
>     Changes in v3:
>      - Update commit message (measurements and ffs)
>     
>     Changes in v2:
>      - Small changes, find_next_bit -> ffs and other
> 
>  drivers/net/can/c_can/c_can.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
> index a668cd4..428681e 100644
> --- a/drivers/net/can/c_can/c_can.c
> +++ b/drivers/net/can/c_can/c_can.c
> @@ -798,17 +798,19 @@ static int c_can_do_rx_poll(struct net_device *dev, int 
> quota)
>       u32 num_rx_pkts = 0;
>       unsigned int msg_obj, msg_ctrl_save;
>       struct c_can_priv *priv = netdev_priv(dev);
> -     u32 val = c_can_read_reg32(priv, C_CAN_INTPND1_REG);
> +     u16 val;
> +
> +     /*
> +      * It is faster to read only one 16bit register. This is only possible
> +      * for a maximum number of 16 objects.
> +      */
> +     BUILD_BUG_ON_MSG(C_CAN_MSG_OBJ_RX_LAST > 16,
> +                     "Implementation does not support more message objects 
> than 16");
> +
> +     while (quota > 0 && (val = priv->read_reg(priv, C_CAN_INTPND1_REG))) {
> +             while ((msg_obj = ffs(val)) && quota > 0) {
> +                     val &= ~BIT(msg_obj - 1);

IIRC, we should avoid assignment in if/while statements.

Wolfgang.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to