Re: em(4) watchdog timeouts

2015-11-17 Thread Alexis VACHETTE

Hi Gregor,

Thank you for your feedback.

Did you see any timeouts on 5.6?

On the amd64 version, I experienced some under heavy network load. Is that related?

Regards,
Alexis VACHETTE.
On 11/11/2015 21:19, Gregor Best wrote:

Hi Alexis,

On Wed, Nov 11, 2015 at 08:11:15PM +0000, Alexis VACHETTE wrote:

[...]
Even under heavy network load?
[...]

So far, yes. I've saturated the device for about 45 minutes with
something like this (the other end is my laptop):

## on the router
$ dd if=/dev/zero bs=8k | nc 172.31.64.174 55000
## on my laptop
$ nc -l 55000 | dd of=/dev/null bs=8k

(with two or three streams in parallel). There were about 6k
interrupts per second and bandwidth was about 250 Mbps, which seems
to be the maximum the tiny CPU in this router can do. No watchdog
timeouts appeared, whereas previously a relatively low-bandwidth
transfer (the SSDs in router and laptop suck) like this caused one
every 20 or 30 seconds:

## on the router
$ pax -w /home | nc 172.31.64.174 55000

I'll keep an eye on things, but so far it looks good. Regular usage
works out so far as well. If you need me to run some special workload
for you, I'd be more than happy to do that.
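
As a rough reading of the figures quoted above (about 250 Mbps at about 6k interrupts per second), the following C sketch works out how much data each interrupt accounts for on average. The helper names and the 1500-byte frame size in the usage note are assumptions made for illustration; they do not come from the thread or from if_em.c, and the inputs are only the approximate numbers reported.

```c
#include <assert.h>

/*
 * Average number of bytes transferred per interrupt, given a
 * bandwidth in megabits per second (10^6 bits) and an interrupt rate.
 * Hypothetical helper, not part of any driver.
 */
static double
bytes_per_interrupt(double mbps, double intr_per_sec)
{
	assert(intr_per_sec > 0.0);
	return (mbps * 1e6 / 8.0) / intr_per_sec;
}

/*
 * Average number of frames handled per interrupt for a given frame
 * size in bytes. The frame size is supplied by the caller because the
 * thread does not state one.
 */
static double
frames_per_interrupt(double bytes_per_intr, double frame_bytes)
{
	assert(frame_bytes > 0.0);
	return bytes_per_intr / frame_bytes;
}
```

With the quoted figures, bytes_per_interrupt(250.0, 6000.0) comes to roughly 5208 bytes per interrupt. If full-size 1500-byte frames are assumed, that is about 3.5 frames per interrupt.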





Re: em(4) watchdog timeouts

2015-11-17 Thread Alexis VACHETTE

Hi Gregor,

I use the same revision as yours:

- "Intel 82583V" rev 0x00: msi

Regards,
Alexis VACHETTE.







Re: em(4) watchdog timeouts

2015-11-11 Thread Alexis VACHETTE

Hi Gregor,

Even under heavy network load?

Regards,
Alexis.


From: owner-t...@openbsd.org on behalf of Gregor Best
Sent: Wednesday, 11 November 2015 15:20
To: Mark Kettenis
Cc: tech@openbsd.org; m...@openbsd.org
Subject: Re: em(4) watchdog timeouts

I've done some further testing and I think I've narrowed it down to the
"Unlocking em(4) a bit further"-patch [0]. With the patch reverted, I
haven't seen any watchdog timeouts yet. I'm currently running the router
with the patch reverted to make sure the timeouts don't happen again.

[0]: https://www.marc.info/?l=openbsd-tech&m=144347723907388&w=4

--
Gregor



Re: Possible em(4) fix

2015-11-05 Thread Alexis VACHETTE

Hi Mark,

If you need a box for testing this issue, I can provide bug reports
once I get a spare box that triggers the watchdog timeout.

In my case, it has only happened with a trunk device in failover mode so far.

Regards,
Alexis VACHETTE
On 05/10/2015 22:45, Mark Kettenis wrote:

Several people seem to complain on misc@ that they're seeing watchdog
timeouts on em(4).  But none of them bother to submit a proper bug
report to bugs@.  Anyway, here is a diff that might fix the issue.
Please test, even if you're not experiencing any problems.

Thanks,

Mark


Index: if_em.c
===================================================================
RCS file: /home/cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.306
diff -u -p -r1.306 if_em.c
--- if_em.c	30 Sep 2015 11:25:08 -0000	1.306
+++ if_em.c	5 Oct 2015 20:35:13 -0000
@@ -1210,12 +1210,6 @@ em_encap(struct em_softc *sc, struct mbu
 		}
 	}
 
-	sc->next_avail_tx_desc = i;
-	if (sc->pcix_82544)
-		atomic_sub_int(&sc->num_tx_desc_avail, txd_used);
-	else
-		atomic_sub_int(&sc->num_tx_desc_avail, map->dm_nsegs);
-
 #if NVLAN > 0
 	/* Find out if we are in VLAN mode */
 	if (m_head->m_flags & M_VLANTAG) {
@@ -1249,6 +1243,14 @@ em_encap(struct em_softc *sc, struct mbu
 	tx_buffer = &sc->tx_buffer_area[first];
 	tx_buffer->next_eop = last;
 
+	membar_producer();
+
+	sc->next_avail_tx_desc = i;
+	if (sc->pcix_82544)
+		atomic_sub_int(&sc->num_tx_desc_avail, txd_used);
+	else
+		atomic_sub_int(&sc->num_tx_desc_avail, map->dm_nsegs);
+
 	/*
 	 * Advance the Transmit Descriptor Tail (Tdt),
 	 * this tells the E1000 that this frame is
@@ -2377,6 +2379,8 @@ em_transmit_checksum_setup(struct em_sof
 
 	tx_buffer->m_head = NULL;
 	tx_buffer->next_eop = -1;
+
+	membar_producer();
 
 	if (++curr_txd == sc->num_tx_desc)
 		curr_txd = 0;
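
The diff above moves the updates of next_avail_tx_desc and num_tx_desc_avail to after the point where the tx_buffer fields are filled in, and puts a membar_producer() in between. Presumably the intent is that any code observing the new counters also observes the completed buffer bookkeeping; the mail itself does not spell this out. Below is a minimal userland sketch of that "fill, barrier, then publish" ordering. It is not driver code: it swaps in C11 fences for the kernel's membar_producer() (a release fence is somewhat stronger than a store-store barrier), and struct ring, struct slot, RING_SIZE, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8	/* hypothetical size, unrelated to num_tx_desc */

struct slot {
	int payload;	/* stands in for the descriptor contents */
	int next_eop;	/* stands in for tx_buffer->next_eop */
};

struct ring {
	struct slot slots[RING_SIZE];
	atomic_uint head;	/* next slot to fill, written by the producer */
	atomic_uint tail;	/* next slot to clean, written by the consumer */
};

static void
ring_init(struct ring *r)
{
	atomic_init(&r->head, 0);
	atomic_init(&r->tail, 0);
}

/* Producer: fill the slot first, then fence, then publish the index. */
static bool
ring_publish(struct ring *r, int payload)
{
	unsigned int head = atomic_load_explicit(&r->head, memory_order_relaxed);
	unsigned int next = (head + 1) % RING_SIZE;

	/* One slot stays empty so that "full" differs from "empty". */
	if (next == atomic_load_explicit(&r->tail, memory_order_acquire))
		return false;

	r->slots[head].payload = payload;
	r->slots[head].next_eop = (int)head;

	/* Userland analogue of membar_producer(): slot stores come first. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->head, next, memory_order_relaxed);
	return true;
}

/* Consumer: read the index, fence, and only then read the slot. */
static bool
ring_consume(struct ring *r, int *payload)
{
	unsigned int tail = atomic_load_explicit(&r->tail, memory_order_relaxed);

	if (tail == atomic_load_explicit(&r->head, memory_order_relaxed))
		return false;	/* nothing published yet */

	/* Pairs with the release fence in ring_publish(). */
	atomic_thread_fence(memory_order_acquire);
	assert(r->slots[tail].next_eop == (int)tail);
	*payload = r->slots[tail].payload;

	atomic_store_explicit(&r->tail, (tail + 1) % RING_SIZE,
	    memory_order_release);
	return true;
}
```

If the index were published before the slot was filled, as in the ordering the diff removes, a consumer on another CPU could see the new index while the slot still held stale contents. In this sketch the assert in ring_consume() is where such a stale slot would most likely show up.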