RE: [git patches] net driver updates

2008-02-20 Thread Krishna Kumar2
Hi Divy,

But the race doesn't exist even for LLTX drivers these days. There is no
way two cpu's can execute the xmit handler at the same time.

Thanks,

- KK

> > > > The first part of the patch removes the !netif_queue_stopped(dev).
> > > > It opens the race discussed a while ago between Stephen hemminger
> > and
> > > > David Miller:
> > > > http://marc.info/?l=linux-netdev=113383224512427=2
> > >
> > > I feel this race cannot happen anymore. I think the fix for that
> race
> > was
> > > to introduce the
> > > __LINK_STATE_QDISC_RUNNING bit thus eliminating any races between
> > CPU's. If
> > > one
> > > CPU has called xmit, the other CPU will enqueue skbs (by holding the
> > > queue_lock) and
> > > exit from qdisc_run since it finds the bit set already.
> >
> > And the race is talking about LLTX, which S2IO doesn't use as
> > far as I can tell.
>
> Dave,
>
> The driver is cxgb3 here, it uses LLTX.
>
> Cheers,
> Divy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git patches] net driver updates

2008-02-20 Thread Krishna Kumar2
Hi Divy,

> > Explain why, so I can include it in the changelog please...
>
> Hi Jeff,
>
> The first part of the patch removes the !netif_queue_stopped(dev).
> It opens the race discussed a while ago between Stephen hemminger and
> David Miller:
> http://marc.info/?l=linux-netdev=113383224512427=2

I feel this race cannot happen anymore. I think the fix for that race was
to introduce the
__LINK_STATE_QDISC_RUNNING bit thus eliminating any races between CPU's. If
one
CPU has called xmit, the other CPU will enqueue skbs (by holding the
queue_lock) and
exit from qdisc_run since it finds the bit set already.

Thanks,

- KK

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [git patches] net driver updates

2008-02-20 Thread Krishna Kumar2
Hi Divy,

  Explain why, so I can include it in the changelog please...

 Hi Jeff,

 The first part of the patch removes the !netif_queue_stopped(dev).
 It opens the race discussed a while ago between Stephen hemminger and
 David Miller:
 http://marc.info/?l=linux-netdevm=113383224512427w=2

I feel this race cannot happen anymore. I think the fix for that race was
to introduce the
__LINK_STATE_QDISC_RUNNING bit thus eliminating any races between CPU's. If
one
CPU has called xmit, the other CPU will enqueue skbs (by holding the
queue_lock) and
exit from qdisc_run since it finds the bit set already.

Thanks,

- KK

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [git patches] net driver updates

2008-02-20 Thread Krishna Kumar2
Hi Divy,

But the race doesn't exist even for LLTX drivers these days. There is no
way two cpu's can execute the xmit handler at the same time.

Thanks,

- KK

The first part of the patch removes the !netif_queue_stopped(dev).
It opens the race discussed a while ago between Stephen hemminger
  and
David Miller:
http://marc.info/?l=linux-netdevm=113383224512427w=2
  
   I feel this race cannot happen anymore. I think the fix for that
 race
  was
   to introduce the
   __LINK_STATE_QDISC_RUNNING bit thus eliminating any races between
  CPU's. If
   one
   CPU has called xmit, the other CPU will enqueue skbs (by holding the
   queue_lock) and
   exit from qdisc_run since it finds the bit set already.
 
  And the race is talking about LLTX, which S2IO doesn't use as
  far as I can tell.

 Dave,

 The driver is cxgb3 here, it uses LLTX.

 Cheers,
 Divy

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-rc8-mm1 : net tcp_input.c warnings

2008-01-24 Thread Krishna Kumar2
 ? sock_def_readable+0x48/0xa0
> >  [] ? sock_queue_rcv_skb+0xb1/0x1a0
> >  [] ? sock_queue_rcv_skb+0xf7/0x1a0
> >  [] ip_rcv+0x18f/0x290
> >  [] ? packet_rcv_spkt+0xd0/0x130
> >  [] netif_receive_skb+0x2b6/0x330
> >  [] ? netif_receive_skb+0x127/0x330
> >  [] ? process_backlog+0x83/0x100
> >  [] process_backlog+0x8e/0x100
> >  [] net_rx_action+0x13c/0x230
> >  [] ? net_rx_action+0x59/0x230
> >  [] __do_softirq+0x93/0x120
> >  [] do_softirq+0x7a/0x80
> >  [] irq_exit+0x65/0x90
> >  [] do_IRQ+0x41/0x80
> >  [] ? trace_hardirqs_on+0xb9/0x130
> >  [] common_interrupt+0x2e/0x34
> >  [] ? mwait_idle_with_hints+0x40/0x50
> >  [] ? mwait_idle+0x0/0x20
> >  [] mwait_idle+0x12/0x20
> >  [] cpu_idle+0x61/0x110
> >  [] rest_init+0x5d/0x60
> >  [] start_kernel+0x1fa/0x260
> >  [] ? unknown_bootoption+0x0/0x130
> >  ===
> > ---[ end trace 14b601818e6903ac ]---
>
> ...But this no longer is, and even more, L: 5 is not valid state at this
> point all (should only happen if we went to RTO but it would reset S to
> zero with newreno):
>
> > P: 5 L: 5 vs 5 S: 0 vs 3 w: 2044790889-2044796616 (0)
> > TCP wq(s) l<
> > TCP wq(h) +++h+<
> > l5 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616
>
> Surprisingly, it was the first time the WARN_ON for left_out returned
> correct location. This also explains why the patch I sent to Krishna
> didn't print anything (it didn't end up into printing because I forgot
> to add L+S>P check into to the state checking if).
>
> ...so please, could you (others than Denys) try this patch, it should
> solve the issue. And Denys, could you confirm (and if necessary double
> check) that the kernel you saw this similar problem with is the pure
> Linus' mainline, i.e., without any net-2.6.25 or mm bits please, if so,
> that problem persists. And anyway, there were some fackets_out related
> problems reported as well and this doesn't help for that but I think I've

> lost track of who was seeing it due to large number of reports :-), could

> somebody refresh my memory because I currently don't have time to dig it
> up from archives (at least on this week).
>
>
> --
>  i.
>
> --
> [PATCH] [TCP]: NewReno must count every skb while marking losses
>
> NewReno should add cnt per skb (as with FACK) instead of depending
> on SACKED_ACKED bits which won't be set with it at all.
> Effectively, NewReno should always exists after the first
> iteration anyway (or immediately if there's already head in
> lost_out.
>
> This was fixed earlier in net-2.6.25 but got reverted among other
> stuff and I didn't notice that this is still necessary (actually
> wasn't even considering this case while trying to figure out the
> reports because I lived with different kind of code than it in
> reality was).
>
> This should solve the WARN_ONs in TCP code that as a result of
> this triggered multiple times in every place we check for this
> invariant.
>
> Special thanks to Dave Young <[EMAIL PROTECTED]> and
> Krishna Kumar2 <[EMAIL PROTECTED]> for trying with my debug
> patches.
>
> Signed-off-by: Ilpo Järvinen <[EMAIL PROTECTED]>
> ---
>  net/ipv4/tcp_input.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 295490e..aa409a5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -2156,7 +2156,7 @@ static void tcp_mark_head_lost(struct sock *sk, int

> packets, int fast_rexmit)
>tp->lost_skb_hint = skb;
>tp->lost_cnt_hint = cnt;
>
> -  if (tcp_is_fack(tp) ||
> +  if (tcp_is_fack(tp) || tcp_is_reno(tp) ||
>(TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
>   cnt += tcp_skb_pcount(skb);
>
> --
> 1.5.2.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.24-rc8-mm1 : net tcp_input.c warnings

2008-01-24 Thread Krishna Kumar2
   [c03c8ea3] ? process_backlog+0x83/0x100
   [c03c8eae] process_backlog+0x8e/0x100
   [c03c90bc] net_rx_action+0x13c/0x230
   [c03c8fd9] ? net_rx_action+0x59/0x230
   [c0136f63] __do_softirq+0x93/0x120
   [c013706a] do_softirq+0x7a/0x80
   [c0137135] irq_exit+0x65/0x90
   [c01078b1] do_IRQ+0x41/0x80
   [c0155659] ? trace_hardirqs_on+0xb9/0x130
   [c01059ca] common_interrupt+0x2e/0x34
   [c0103390] ? mwait_idle_with_hints+0x40/0x50
   [c01033a0] ? mwait_idle+0x0/0x20
   [c01033b2] mwait_idle+0x12/0x20
   [c0103141] cpu_idle+0x61/0x110
   [c04339fd] rest_init+0x5d/0x60
   [c05a47fa] start_kernel+0x1fa/0x260
   [c05a4190] ? unknown_bootoption+0x0/0x130
   ===
  ---[ end trace 14b601818e6903ac ]---

 ...But this no longer is, and even more, L: 5 is not valid state at this
 point all (should only happen if we went to RTO but it would reset S to
 zero with newreno):

  P: 5 L: 5 vs 5 S: 0 vs 3 w: 2044790889-2044796616 (0)
  TCP wq(s) l
  TCP wq(h) +++h+
  l5 s3 f0 p5 seq: su2044790889 hs2044795029 sn2044796616

 Surprisingly, it was the first time the WARN_ON for left_out returned
 correct location. This also explains why the patch I sent to Krishna
 didn't print anything (it didn't end up into printing because I forgot
 to add L+SP check into to the state checking if).

 ...so please, could you (others than Denys) try this patch, it should
 solve the issue. And Denys, could you confirm (and if necessary double
 check) that the kernel you saw this similar problem with is the pure
 Linus' mainline, i.e., without any net-2.6.25 or mm bits please, if so,
 that problem persists. And anyway, there were some fackets_out related
 problems reported as well and this doesn't help for that but I think I've

 lost track of who was seeing it due to large number of reports :-), could

 somebody refresh my memory because I currently don't have time to dig it
 up from archives (at least on this week).


 --
  i.

 --
 [PATCH] [TCP]: NewReno must count every skb while marking losses

 NewReno should add cnt per skb (as with FACK) instead of depending
 on SACKED_ACKED bits which won't be set with it at all.
 Effectively, NewReno should always exists after the first
 iteration anyway (or immediately if there's already head in
 lost_out.

 This was fixed earlier in net-2.6.25 but got reverted among other
 stuff and I didn't notice that this is still necessary (actually
 wasn't even considering this case while trying to figure out the
 reports because I lived with different kind of code than it in
 reality was).

 This should solve the WARN_ONs in TCP code that as a result of
 this triggered multiple times in every place we check for this
 invariant.

 Special thanks to Dave Young [EMAIL PROTECTED] and
 Krishna Kumar2 [EMAIL PROTECTED] for trying with my debug
 patches.

 Signed-off-by: Ilpo Järvinen [EMAIL PROTECTED]
 ---
  net/ipv4/tcp_input.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
 index 295490e..aa409a5 100644
 --- a/net/ipv4/tcp_input.c
 +++ b/net/ipv4/tcp_input.c
 @@ -2156,7 +2156,7 @@ static void tcp_mark_head_lost(struct sock *sk, int

 packets, int fast_rexmit)
tp-lost_skb_hint = skb;
tp-lost_cnt_hint = cnt;

 -  if (tcp_is_fack(tp) ||
 +  if (tcp_is_fack(tp) || tcp_is_reno(tp) ||
(TCP_SKB_CB(skb)-sacked  TCPCB_SACKED_ACKED))
   cnt += tcp_skb_pcount(skb);

 --
 1.5.2.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/