On Wed, Sep 14, 2016 at 21:46 +0200, Mike Belopuhov wrote:
> On Tue, Sep 13, 2016 at 08:50 +0000, Olivier Cherrier wrote:
> > >Synopsis: crash with oce(4)
> > >Category: network
> > >Environment:
> > System : OpenBSD 6.0
> > Details : OpenBSD 6.0 (GENERIC.MP) #2319: Tue Jul 26 13:00:43 MDT 2016
> >
> > [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> >
> > Architecture: OpenBSD.amd64
> > Machine : amd64
> > >Description:
> >
> > After upgrading systems from 5.9 (with patch 004) to 6.0, I am getting
> > a crash a few seconds after the network is configured. The problem seems
> > to be linked to oce(4) and pool, and at least not to carp/vlan, since
> > I can reproduce the crash with just «ifconfig ocex up» commands, as
> > shown here while booting in single user:
> >
>
> I didn't test CARP, but I couldn't reproduce this with vlans on
> top of a trunk on top of two oce's with 6.0-release. I will
> double check -current tomorrow. I don't see a good reason for
> the "missing descriptor in rxeof" unless it's a stray interrupt
> with a valid completion queue entry which is a bit too weird.
>
> Perhaps we're not filling the Rx ring with enough slots and get
> a heavily fragmented jumbo frame that the card has managed to
> only partially fit into provided space. How about this diff?
>
Nah. This is rubbish, otherwise the driver wouldn't have been
usable at all. I would still ask you to try it for possible
side effects and for the extended printf.
Could you downgrade to 5.9 and retry your configuration?
I wonder if there could be some hardware or RAM related
issues.
> diff --git sys/dev/pci/if_oce.c sys/dev/pci/if_oce.c
> index ee74185..a74b35b 100644
> --- sys/dev/pci/if_oce.c
> +++ sys/dev/pci/if_oce.c
> @@ -1078,7 +1078,7 @@ oce_init(void *arg)
> rq->ring->index = 0;
>
> /* oce splits jumbos into 2k chunks... */
> - if_rxr_init(&rq->rxring, 8, rq->nitems);
> + if_rxr_init(&rq->rxring, OCE_MAX_TX_ELEMENTS, rq->nitems);
>
> if (!oce_alloc_rx_bufs(rq)) {
> printf("%s: failed to allocate rx buffers\n",
> @@ -1560,8 +1560,8 @@ oce_rxeof(struct oce_rq *rq, struct oce_nic_rx_cqe *cqe)
>
> for (i = 0; i < cqe->u0.s.num_fragments; i++) {
> if ((pkt = oce_pkt_get(&rq->pkt_list)) == NULL) {
> - printf("%s: missing descriptor in rxeof\n",
> - sc->sc_dev.dv_xname);
> + printf("%s: missing descriptor in rxeof, frag %d/%u\n",
> + sc->sc_dev.dv_xname, i, cqe->u0.s.num_fragments);
> goto exit;
> }
>
>