Re: bce(4) - descriptor error

2015-01-23 Thread Stefan Sperling
On Thu, Jan 22, 2015 at 07:59:07PM -0500, John Merriam wrote:
> This diff from mark.kettenis also seems to work with 4GB of RAM installed.
> I also did the ping -f and scp tests.  It seems that performance may be a
> bit better with this patch but it is hard to tell for sure.  Thanks!

I like Mark's diff better because the approach is simpler to use in drivers.

Thanks for testing.



Re: bce(4) - descriptor error

2015-01-22 Thread John Merriam

On 1/22/2015 3:40 PM, Mark Kettenis wrote:
> > Date: Thu, 22 Jan 2015 19:38:42 +0100 (CET)
> > From: Mark Kettenis
> >
> > > Date: Thu, 22 Jan 2015 18:04:44 +0100
> > > From: Stefan Sperling
> > >
> > > It looks as if some ring descriptor data is still being allocated with
> > > bus_dmamem_alloc(). That function probably doesn't respect the mapping
> > > constraints bce(4) hardware requires.
> > >
> > > This diff makes bce use the same memory allocation APIs as bwi(4) is
> > > using. Some bwi devices have the same 1GB problem and I know the bwi
> > > code works fine with such devices. So perhaps applying the same approach
> > > to bce will fix your issue.
> >
> > I'd really prefer it if we'd solve this issue by promoting
> > _bus_dmamem_alloc_range() to a first class citizen.  I'll try to
> > recover an old diff that did this tonight.
> >
> > No harm in testing this diff though to verify that it indeed solves
> > the issue.
>
> So here is the alternative diff.  Only amd64/i386 are actually
> implemented, but those are the only platforms we support bce(4) on.



This diff from mark.kettenis also seems to work with 4GB of RAM
installed.  I also did the ping -f and scp tests.  It seems that 
performance may be a bit better with this patch but it is hard to tell 
for sure.  Thanks!


--

John Merriam



Re: bce(4) - descriptor error

2015-01-22 Thread John Merriam

On 1/22/2015 12:04 PM, Stefan Sperling wrote:
> It looks as if some ring descriptor data is still being allocated with
> bus_dmamem_alloc(). That function probably doesn't respect the mapping
> constraints bce(4) hardware requires.
>
> This diff makes bce use the same memory allocation APIs as bwi(4) is
> using. Some bwi devices have the same 1GB problem and I know the bwi
> code works fine with such devices. So perhaps applying the same approach
> to bce will fix your issue.
>
> Diff is compile-tested only due to lack of hardware at my end.



This diff from stsp seems to work with 4GB of RAM installed.  In 
addition to the ping -f test I also did an scp test and performance 
looks good.  Thanks!


--

John Merriam



Re: bce(4) - descriptor error

2015-01-22 Thread John Merriam

On 2015-01-22 15:40, Mark Kettenis wrote:
> > Date: Thu, 22 Jan 2015 19:38:42 +0100 (CET)
> > From: Mark Kettenis
> >
> > > Date: Thu, 22 Jan 2015 18:04:44 +0100
> > > From: Stefan Sperling
> > >
> > > On Thu, Jan 22, 2015 at 11:34:47AM -0500, John Merriam wrote:
> > > > So, what could be the problem then?  Theoretically it did work as of the
> > > > 1.35 if_bce.c revision which seems to have shipped in OpenBSD 5.0.  This
> > > > message:
> > > >
> > > > http://marc.info/?l=openbsd-tech&m=130217668909255
> > > >
> > > > seems to verify that it did actually work at one point.
> > >
> > > It looks as if some ring descriptor data is still being allocated with
> > > bus_dmamem_alloc(). That function probably doesn't respect the mapping
> > > constraints bce(4) hardware requires.
> > >
> > > This diff makes bce use the same memory allocation APIs as bwi(4) is
> > > using. Some bwi devices have the same 1GB problem and I know the bwi
> > > code works fine with such devices. So perhaps applying the same approach
> > > to bce will fix your issue.
> >
> > I'd really prefer it if we'd solve this issue by promoting
> > _bus_dmamem_alloc_range() to a first class citizen.  I'll try to
> > recover an old diff that did this tonight.
> >
> > No harm in testing this diff though to verify that it indeed solves
> > the issue.
>
> So here is the alternative diff.  Only amd64/i386 are actually
> implemented, but those are the only platforms we support bce(4) on.



Your diff also seems to work and pass the ping -f test with <= 1GB RAM.  
I will also test this tonight with 4GB RAM installed in the machine.  
Thanks


--

John Merriam



Re: bce(4) - descriptor error

2015-01-22 Thread Mark Kettenis
> Date: Thu, 22 Jan 2015 19:38:42 +0100 (CET)
> From: Mark Kettenis 
> 
> > Date: Thu, 22 Jan 2015 18:04:44 +0100
> > From: Stefan Sperling 
> > 
> > On Thu, Jan 22, 2015 at 11:34:47AM -0500, John Merriam wrote:
> > > So, what could be the problem then?  Theoretically it did work as of the
> > > 1.35 if_bce.c revision which seems to have shipped in OpenBSD 5.0.  This
> > > message:
> > > 
> > > http://marc.info/?l=openbsd-tech&m=130217668909255
> > > 
> > > seems to verify that it did actually work at one point.
> > 
> > It looks as if some ring descriptor data is still being allocated with
> > bus_dmamem_alloc(). That function probably doesn't respect the mapping
> > constraints bce(4) hardware requires.
> > 
> > This diff makes bce use the same memory allocation APIs as bwi(4) is
> > using. Some bwi devices have the same 1GB problem and I know the bwi
> > code works fine with such devices. So perhaps applying the same approach
> > to bce will fix your issue.
> 
> I'd really prefer it if we'd solve this issue by promoting
> _bus_dmamem_alloc_range() to a first class citizen.  I'll try to
> recover an old diff that did this tonight.
> 
> No harm in testing this diff though to verify that it indeed solves
> the issue.

So here is the alternative diff.  Only amd64/i386 are actually
implemented, but those are the only platforms we support bce(4) on.

Index: dev/pci/if_bce.c
===
RCS file: /home/cvs/src/sys/dev/pci/if_bce.c,v
retrieving revision 1.40
diff -u -p -r1.40 if_bce.c
--- dev/pci/if_bce.c22 Dec 2014 02:28:51 -  1.40
+++ dev/pci/if_bce.c22 Jan 2015 20:38:05 -
@@ -315,8 +315,9 @@ bce_attach(struct device *parent, struct
 * XXX PAGE_SIZE is wasteful; we only need 1KB + 1KB, but
 * due to the limition above. ??
 */
-   if ((error = bus_dmamem_alloc(sc->bce_dmatag, 2 * PAGE_SIZE,
-   PAGE_SIZE, 2 * PAGE_SIZE, &seg, 1, &rseg, BUS_DMA_NOWAIT))) {
+   if ((error = bus_dmamem_alloc_range(sc->bce_dmatag, 2 * PAGE_SIZE,
+   PAGE_SIZE, 2 * PAGE_SIZE, &seg, 1, &rseg, BUS_DMA_NOWAIT,
+   (bus_addr_t)0, (bus_addr_t)0x3fffffff))) {
printf(": unable to alloc space for ring descriptors, "
"error = %d\n", error);
uvm_km_free(kernel_map, (vaddr_t)sc->bce_data,
Index: arch/amd64/amd64/bus_dma.c
===
RCS file: /home/cvs/src/sys/arch/amd64/amd64/bus_dma.c,v
retrieving revision 1.46
diff -u -p -r1.46 bus_dma.c
--- arch/amd64/amd64/bus_dma.c  16 Nov 2014 12:30:56 -  1.46
+++ arch/amd64/amd64/bus_dma.c  22 Jan 2015 20:38:05 -
@@ -424,7 +424,7 @@ _bus_dmamem_alloc(bus_dma_tag_t t, bus_s
 * memory under the 4gig boundary.
 */
return (_bus_dmamem_alloc_range(t, size, alignment, boundary,
-   segs, nsegs, rsegs, flags, (paddr_t)0, (paddr_t)0xffffffff));
+   segs, nsegs, rsegs, flags, (bus_addr_t)0, (bus_addr_t)0xffffffff));
 }
 
 /*
@@ -662,7 +662,7 @@ _bus_dmamap_load_buffer(bus_dma_tag_t t,
 int
 _bus_dmamem_alloc_range(bus_dma_tag_t t, bus_size_t size, bus_size_t alignment,
 bus_size_t boundary, bus_dma_segment_t *segs, int nsegs, int *rsegs,
-int flags, paddr_t low, paddr_t high)
+int flags, bus_addr_t low, bus_addr_t high)
 {
paddr_t curaddr, lastaddr;
struct vm_page *m;
Index: arch/amd64/include/bus.h
===
RCS file: /home/cvs/src/sys/arch/amd64/include/bus.h,v
retrieving revision 1.31
diff -u -p -r1.31 bus.h
--- arch/amd64/include/bus.h29 Mar 2014 18:09:28 -  1.31
+++ arch/amd64/include/bus.h22 Jan 2015 20:38:05 -
@@ -594,6 +594,9 @@ struct bus_dma_tag {
 */
int (*_dmamem_alloc)(bus_dma_tag_t, bus_size_t, bus_size_t,
bus_size_t, bus_dma_segment_t *, int, int *, int);
+   int (*_dmamem_alloc_range)(bus_dma_tag_t, bus_size_t, bus_size_t,
+   bus_size_t, bus_dma_segment_t *, int, int *, int,
+   bus_addr_t, bus_addr_t);
 	void	(*_dmamem_free)(bus_dma_tag_t,
bus_dma_segment_t *, int);
int (*_dmamem_map)(bus_dma_tag_t, bus_dma_segment_t *,
@@ -622,6 +625,9 @@ struct bus_dma_tag {
 
 #define	bus_dmamem_alloc(t, s, a, b, sg, n, r, f)   \
 	(*(t)->_dmamem_alloc)((t), (s), (a), (b), (sg), (n), (r), (f))
+#define	bus_dmamem_alloc_range(t, s, a, b, sg, n, r, f, l, h)   \
+   (*(t)->_dmamem_alloc_range)((t), (s), (a), (b), (sg),   \
+   (n), (r), (f), (l), (h))
 #define	bus_dmamem_free(t, sg, n)   \
 	(*(t)->_dmamem_free)((t), (sg), (n))
 #define	bus_dmamem_map(t, sg, n, s, k, f)   \
@@ -686,6 +692,6 @@ paddr_t _bus_dmamem_mmap(bus_dma_tag_t t
 int_bus_dmamem_alloc_range(bus_dma_tag_t ta

Re: bce(4) - descriptor error

2015-01-22 Thread John Merriam
On Thu, 22 Jan 2015, Stefan Sperling wrote:
> On Thu, Jan 22, 2015 at 12:47:28PM -0500, John Merriam wrote:
> > The machine currently has the 1GB DIMM in it.  I tested the patch and it
> > seems to work with <= 1GB RAM.  ping -f test returned similar results so
> > performance likely isn't affected.
> 
> You sure? This would be the first time I wrote a driver diff
> that didn't panic the machine on first try  :)
> 

:)  Yeah, it does seem to work at least with <= 1GB RAM.

I am not a C expert.  Even less so when it comes to OS kernels.  And even 
less when it comes to hardware drivers.  But, when I look over your patch, 
it looks fine to me.

Will let you know tonight (EST) how I fare with 4GB installed in the 
machine.  Thanks

-- 

John Merriam



Re: bce(4) - descriptor error

2015-01-22 Thread Mark Kettenis
> Date: Thu, 22 Jan 2015 18:04:44 +0100
> From: Stefan Sperling 
> 
> On Thu, Jan 22, 2015 at 11:34:47AM -0500, John Merriam wrote:
> > So, what could be the problem then?  Theoretically it did work as of the
> > 1.35 if_bce.c revision which seems to have shipped in OpenBSD 5.0.  This
> > message:
> > 
> > http://marc.info/?l=openbsd-tech&m=130217668909255
> > 
> > seems to verify that it did actually work at one point.
> 
> It looks as if some ring descriptor data is still being allocated with
> bus_dmamem_alloc(). That function probably doesn't respect the mapping
> constraints bce(4) hardware requires.
> 
> This diff makes bce use the same memory allocation APIs as bwi(4) is
> using. Some bwi devices have the same 1GB problem and I know the bwi
> code works fine with such devices. So perhaps applying the same approach
> to bce will fix your issue.

I'd really prefer it if we'd solve this issue by promoting
_bus_dmamem_alloc_range() to a first class citizen.  I'll try to
recover an old diff that did this tonight.

No harm in testing this diff though to verify that it indeed solves
the issue.
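
Editor's sketch of what "first class citizen" means here: the range-limited allocator becomes an entry in the DMA tag, and drivers reach it through a macro, the same way they reach the plain allocator. This is a userland toy, not the kernel code. Every toy_* name is hypothetical, and the backend only pretends to allocate by returning the highest page-aligned address that fits the range.

```c
#include <stdint.h>

typedef uint64_t toy_addr_t;
struct toy_dma_tag;
typedef struct toy_dma_tag *toy_dma_tag_t;

/* Toy DMA tag: one function pointer per operation (hypothetical layout). */
struct toy_dma_tag {
	toy_addr_t default_high;	/* bound used by the plain allocator */
	int (*_alloc)(toy_dma_tag_t, toy_addr_t *);
	int (*_alloc_range)(toy_dma_tag_t, toy_addr_t *,
	    toy_addr_t, toy_addr_t);
};

/* Drivers call these macros and never touch the backend directly. */
#define toy_dmamem_alloc(t, out) \
	(*(t)->_alloc)((t), (out))
#define toy_dmamem_alloc_range(t, out, l, h) \
	(*(t)->_alloc_range)((t), (out), (l), (h))

/*
 * Pretend backend: hand out the highest 4KB-aligned page that lies
 * entirely inside [low, high]. Returns 0 on success, -1 if no page fits.
 */
static int
toy_alloc_range(toy_dma_tag_t t, toy_addr_t *out, toy_addr_t low,
    toy_addr_t high)
{
	(void)t;
	if (high < low || high - low < 0xfff)
		return -1;
	*out = (high - 0xfff) & ~(toy_addr_t)0xfff;
	return (*out >= low) ? 0 : -1;
}

/* The plain allocator is the range allocator with the tag's default bound. */
static int
toy_alloc(toy_dma_tag_t t, toy_addr_t *out)
{
	return toy_alloc_range(t, out, 0, t->default_high);
}
```

With a tag whose default bound is 4GB - 1, toy_dmamem_alloc() hands back a page just under 4GB, while toy_dmamem_alloc_range(tag, &pa, 0, 0x3fffffff) stays under 1GB. A device limited to 1GB needs the second form.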



Re: bce(4) - descriptor error

2015-01-22 Thread Stefan Sperling
On Thu, Jan 22, 2015 at 12:47:28PM -0500, John Merriam wrote:
> The machine currently has the 1GB DIMM in it.  I tested the patch and it
> seems to work with <= 1GB RAM.  ping -f test returned similar results so
> performance likely isn't affected.

You sure? This would be the first time I wrote a driver diff
that didn't panic the machine on first try  :)



Re: bce(4) - descriptor error

2015-01-22 Thread John Merriam

On 2015-01-22 12:04, Stefan Sperling wrote:
> On Thu, Jan 22, 2015 at 11:34:47AM -0500, John Merriam wrote:
> > So, what could be the problem then?  Theoretically it did work as of the
> > 1.35 if_bce.c revision which seems to have shipped in OpenBSD 5.0.  This
> > message:
> >
> > http://marc.info/?l=openbsd-tech&m=130217668909255
> >
> > seems to verify that it did actually work at one point.
>
> It looks as if some ring descriptor data is still being allocated with
> bus_dmamem_alloc(). That function probably doesn't respect the mapping
> constraints bce(4) hardware requires.
>
> This diff makes bce use the same memory allocation APIs as bwi(4) is
> using. Some bwi devices have the same 1GB problem and I know the bwi
> code works fine with such devices. So perhaps applying the same approach
> to bce will fix your issue.
>
> Diff is compile-tested only due to lack of hardware at my end.



The machine currently has the 1GB DIMM in it.  I tested the patch and it 
seems to work with <= 1GB RAM.  ping -f test returned similar results so 
performance likely isn't affected.


I will put the 2GB DIMMs back in tonight when I can physically access 
the machine then test again and report results for > 1GB RAM.


--

John Merriam



Re: bce(4) - descriptor error

2015-01-22 Thread Stuart Henderson
On 2015/01/22 17:05, Stuart Henderson wrote:
> Would presumably be a change in uvm somewhere. (paddr_t)(0x40000000 - 1)
> is passed as 'high' to uvm_km_kmemalloc_pla -> uvm_pglistalloc and is
> meant to constrain the addresses.
> 
> Identifying when (at least which release) it broke might be a good start..
> 

...but try stsp's diff first :)



Re: bce(4) - descriptor error

2015-01-22 Thread Stuart Henderson
On 2015/01/22 11:34, John Merriam wrote:
> On 2015-01-21 18:36, John Merriam wrote:
> >On 1/21/2015 1:43 PM, Stefan Sperling wrote:
> >>There is supposed to be a bounce buffer in bce to cope with
> >>systems with more than 1GB but perhaps it is broken.
> >>
> >
> >I installed the old 1GB DIMM that came with the machine when I
> >acquired it, and you are correct, it seems to work fine with <= 1GB
> >RAM.
> >
> >Next question is how to fix it so it works with > 1GB RAM.
> 
> Hmmm.  I looked at sys/dev/pci/if_bce.c and it looks to me like this patch:
> 
> http://marc.info/?l=openbsd-tech&m=130183146308043
> 
> is still in place.
> 
> I then downloaded version 1.35 (dated 4/3/2011) of if_bce.c from
> cvsweb.openbsd.org and did a diff against version 1.40 (in -current now) of
> if_bce.c and there are hardly any differences.  The main change is the call
> to m_devget() removing the 5th parameter which was NULL.  That first
> parameter to m_devget looks funny to me.  After thinking about it I think I
> know what it's doing but it just looks weird to me.  Anyway, the only other
> changes besides includes from 1.35 to 1.40 are in the bce_ioctl() function
> and they don't look like they would make any difference to me.
> 
> So, what could be the problem then?  Theoretically it did work as of the
> 1.35 if_bce.c revision which seems to have shipped in OpenBSD 5.0.  This
> message:
> 
> http://marc.info/?l=openbsd-tech&m=130217668909255
> 
> seems to verify that it did actually work at one point.

Would presumably be a change in uvm somewhere. (paddr_t)(0x40000000 - 1)
is passed as 'high' to uvm_km_kmemalloc_pla -> uvm_pglistalloc and is
meant to constrain the addresses.

Identifying when (at least which release) it broke might be a good start..
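
Editor's sketch of what an upper bound on physical addresses is meant to do, taking the 1GB limit discussed in this thread as the bound. This is a userland toy and not UVM's allocator. The struct and function names are hypothetical, and page frames are represented as plain numbers.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Toy physical-address constraint with inclusive bounds. */
struct pa_range {
	uint64_t low;
	uint64_t high;
};

/*
 * Return the index of the first free frame whose whole page lies inside
 * the range, or -1 if none qualifies. frames[i] is the start address of
 * frame i; is_free[i] is nonzero when that frame is available.
 */
static long
pick_frame(const uint64_t *frames, const int *is_free, size_t n,
    const struct pa_range *r)
{
	for (size_t i = 0; i < n; i++) {
		uint64_t first = frames[i];
		uint64_t last = first + TOY_PAGE_SIZE - 1;

		if (is_free[i] && first >= r->low && last <= r->high)
			return (long)i;
	}
	return -1;
}
```

If the bound is ignored anywhere along the call chain, the first free frame wins regardless of its address. On a machine with more than 1GB of RAM, that frame can be one the device cannot reach.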



Re: bce(4) - descriptor error

2015-01-22 Thread Stefan Sperling
On Thu, Jan 22, 2015 at 11:34:47AM -0500, John Merriam wrote:
> So, what could be the problem then?  Theoretically it did work as of the
> 1.35 if_bce.c revision which seems to have shipped in OpenBSD 5.0.  This
> message:
> 
> http://marc.info/?l=openbsd-tech&m=130217668909255
> 
> seems to verify that it did actually work at one point.

It looks as if some ring descriptor data is still being allocated with
bus_dmamem_alloc(). That function probably doesn't respect the mapping
constraints bce(4) hardware requires.

This diff makes bce use the same memory allocation APIs as bwi(4) is
using. Some bwi devices have the same 1GB problem and I know the bwi
code works fine with such devices. So perhaps applying the same approach
to bce will fix your issue.

Diff is compile-tested only due to lack of hardware at my end.

Index: if_bce.c
===
RCS file: /cvs/src/sys/dev/pci/if_bce.c,v
retrieving revision 1.40
diff -u -p -r1.40 if_bce.c
--- if_bce.c22 Dec 2014 02:28:51 -  1.40
+++ if_bce.c22 Jan 2015 17:02:05 -
@@ -185,6 +185,13 @@ bce_probe(struct device *parent, void *m
nitems(bce_devices)));
 }
 
+struct uvm_constraint_range bce_constraint = { 0x0, (0x40000000 - 1) };
+struct kmem_pa_mode bce_pa_mode = {
+   .kp_align = 0x1000,
+   .kp_constraint = &bce_constraint,
+   .kp_zero = 1
+};
+
 void
 bce_attach(struct device *parent, struct device *self, void *aux)
 {
@@ -194,8 +201,6 @@ bce_attach(struct device *parent, struct
pci_intr_handle_t ih;
const char *intrstr = NULL;
caddr_t kva;
-   bus_dma_segment_t seg;
-   int rseg;
struct ifnet *ifp;
pcireg_t memtype;
bus_addr_t memaddr;
@@ -255,9 +260,9 @@ bce_attach(struct device *parent, struct
bce_reset(sc);
 
/* Create the data DMA region and maps. */
-   if ((sc->bce_data = (caddr_t)uvm_km_kmemalloc_pla(kernel_map,
-   uvm.kernel_object, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES, 0,
-   UVM_KMF_NOWAIT, 0, (paddr_t)(0x40000000 - 1), 0, 0, 1)) == NULL) {
+   sc->bce_data = (caddr_t)km_alloc((BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES,
+   &kv_intrsafe, &bce_pa_mode, &kd_nowait);
+   if (sc->bce_data == NULL) {
printf(": unable to alloc space for ring");
return;
}
@@ -267,8 +272,8 @@ bce_attach(struct device *parent, struct
1, BCE_NRXDESC * MCLBYTES, 0, BUS_DMA_NOWAIT | BUS_DMA_ALLOCNOW,
&sc->bce_rxdata_map))) {
printf(": unable to create ring DMA map, error = %d\n", error);
-   uvm_km_free(kernel_map, (vaddr_t)sc->bce_data,
-   (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES);
+   km_free(sc->bce_data, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES,
+   &kv_intrsafe, &bce_pa_mode);
return;
}
 
@@ -276,9 +281,9 @@ bce_attach(struct device *parent, struct
if (bus_dmamap_load(sc->bce_dmatag, sc->bce_rxdata_map, sc->bce_data,
BCE_NRXDESC * MCLBYTES, NULL, BUS_DMA_READ | BUS_DMA_NOWAIT)) {
printf(": unable to load rx ring DMA map\n");
-   uvm_km_free(kernel_map, (vaddr_t)sc->bce_data,
-   (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES);
bus_dmamap_destroy(sc->bce_dmatag, sc->bce_rxdata_map);
+   km_free(sc->bce_data, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES,
+   &kv_intrsafe, &bce_pa_mode);
return;
}
 
@@ -287,9 +292,9 @@ bce_attach(struct device *parent, struct
1, BCE_NTXDESC * MCLBYTES, 0, BUS_DMA_NOWAIT | BUS_DMA_ALLOCNOW,
&sc->bce_txdata_map))) {
printf(": unable to create ring DMA map, error = %d\n", error);
-   uvm_km_free(kernel_map, (vaddr_t)sc->bce_data,
-   (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES);
bus_dmamap_destroy(sc->bce_dmatag, sc->bce_rxdata_map);
+   km_free(sc->bce_data, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES,
+   &kv_intrsafe, &bce_pa_mode);
return;
}
 
@@ -298,10 +303,10 @@ bce_attach(struct device *parent, struct
sc->bce_data + BCE_NRXDESC * MCLBYTES,
BCE_NTXDESC * MCLBYTES, NULL, BUS_DMA_WRITE | BUS_DMA_NOWAIT)) {
printf(": unable to load tx ring DMA map\n");
-   uvm_km_free(kernel_map, (vaddr_t)sc->bce_data,
-   (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES);
bus_dmamap_destroy(sc->bce_dmatag, sc->bce_rxdata_map);
bus_dmamap_destroy(sc->bce_dmatag, sc->bce_txdata_map);
+   km_free(sc->bce_data, (BCE_NTXDESC + BCE_NRXDESC) * MCLBYTES,
+   &kv_intrsafe, &bce_pa_mode);
return;
}
 
@@ -315,26 +320,14 @@ bce_attach(struct device *parent, struct
 * XXX PAGE_SIZE is wasteful; we only need 1KB + 1KB, but
   

Re: bce(4) - descriptor error

2015-01-22 Thread John Merriam

On 2015-01-21 18:36, John Merriam wrote:
> On 1/21/2015 1:43 PM, Stefan Sperling wrote:
> > There is supposed to be a bounce buffer in bce to cope with
> > systems with more than 1GB but perhaps it is broken.
>
> I installed the old 1GB DIMM that came with the machine when I
> acquired it, and you are correct, it seems to work fine with <= 1GB
> RAM.
>
> Next question is how to fix it so it works with > 1GB RAM.


Hmmm.  I looked at sys/dev/pci/if_bce.c and it looks to me like this 
patch:


http://marc.info/?l=openbsd-tech&m=130183146308043

is still in place.

I then downloaded version 1.35 (dated 4/3/2011) of if_bce.c from 
cvsweb.openbsd.org and did a diff against version 1.40 (in -current now) 
of if_bce.c and there are hardly any differences.  The main change is 
the call to m_devget() removing the 5th parameter which was NULL.  That 
first parameter to m_devget looks funny to me.  After thinking about it 
I think I know what it's doing but it just looks weird to me.  Anyway, 
the only other changes besides includes from 1.35 to 1.40 are in the 
bce_ioctl() function and they don't look like they would make any 
difference to me.


So, what could be the problem then?  Theoretically it did work as of the 
1.35 if_bce.c revision which seems to have shipped in OpenBSD 5.0.  This 
message:


http://marc.info/?l=openbsd-tech&m=130217668909255

seems to verify that it did actually work at one point.

--

John Merriam



Re: bce(4) - descriptor error

2015-01-21 Thread John Merriam

On 1/21/2015 1:43 PM, Stefan Sperling wrote:
> On Wed, Jan 21, 2015 at 12:21:46PM -0500, j...@johnmerriam.net wrote:
> > >Synopsis:  bce(4) infinite loop of descriptor error
> > >Category:  kernel
> > >Environment:
> >   System  : OpenBSD 5.7
> >   Details : OpenBSD 5.7-beta (GENERIC.MP) #0: Tue Jan 20 12:38:44 EST 2015
> >     r...@test.johnmerriam.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> >
> >   Architecture: OpenBSD.amd64
> >   Machine : amd64
> > >Description:
> >   If I try to use the bce(4) on-board Broadcom NIC in OpenBSD
> >   I get bce0: descriptor error repeated on the console in an
> >   infinite loop. The machine is unusable once it gets stuck in
> >   the infinite loop and a power cycle is required. It looks to
> >   me like the descriptor error infinite loop is occurring in
> >   the bce_intr() function in sys/dev/pci/if_bce.c but I don't
> >   see how to diagnose/fix it further. I am currently using a
> >   rl(4) add-in NIC to make the machine usable. It is running
> >   -current amd64 from CVS sync'ed 1/20/2015.
> > >How-To-Repeat:
> >   ifconfig bce0 up
> > >Fix:
> >   Using a different add-in NIC makes the machine usable but
> >   being able to use the on-board NIC would be nice.
> >
> > dmesg:
> > OpenBSD 5.7-beta (GENERIC.MP) #0: Tue Jan 20 12:38:44 EST 2015
> > r...@test.johnmerriam.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > real mem = 3437608960 (3278MB)
> > avail mem = 3342286848 (3187MB)
>
> Try putting less than 1GB of memory in your machine and see if that
> fixes it. IIRC the chip cannot do DMA access to memory above 1GB.
>
> There is supposed to be a bounce buffer in bce to cope with
> systems with more than 1GB but perhaps it is broken.



I installed the old 1GB DIMM that came with the machine when I acquired 
it, and you are correct, it seems to work fine with <= 1GB RAM.


I haven't tested it for a long time but I did a ping -f to it from 
another machine and it only dropped 9 out of about 1 million packets 
when using the bce interface with 1GB RAM:


# ping -n -f -s 1472 1.2.3.4
PING 1.2.3.4 (1.2.3.4): 1472 data bytes
--- 1.2.3.4 ping statistics ---
1026488 packets transmitted, 1026479 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.439/0.710/14.295/0.249 ms

Much better than hanging the machine in an infinite loop when just 
trying to use the bce interface at all.
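
Editor's note on the figures above: 9 lost packets out of roughly a million is well below the single decimal place ping prints, which is why the summary shows 0.0%. A minimal sketch of the arithmetic, with hypothetical helper names:

```c
#include <stdint.h>

/* Packets that never came back; clamped at 0 in case of duplicate replies. */
static uint64_t
packets_lost(uint64_t sent, uint64_t received)
{
	return sent >= received ? sent - received : 0;
}

/* Loss as a percentage of packets sent; 0.0 when nothing was sent. */
static double
loss_percent(uint64_t sent, uint64_t received)
{
	if (sent == 0)
		return 0.0;
	return 100.0 * (double)packets_lost(sent, received) / (double)sent;
}
```

For 1026488 sent and 1026479 received this gives 9 lost packets, a loss of slightly under 0.001%.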


Next question is how to fix it so it works with > 1GB RAM.

--

John Merriam



Re: bce(4) - descriptor error

2015-01-21 Thread Stefan Sperling
On Wed, Jan 21, 2015 at 12:21:46PM -0500, j...@johnmerriam.net wrote:
> >Synopsis:  bce(4) infinite loop of descriptor error
> >Category:  kernel
> >Environment:
>   System  : OpenBSD 5.7
>   Details : OpenBSD 5.7-beta (GENERIC.MP) #0: Tue Jan 20 12:38:44 EST 2015
>     r...@test.johnmerriam.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
>   If I try to use the bce(4) on-board Broadcom NIC in OpenBSD 
>   I get bce0: descriptor error repeated on the console in an 
>   infinite loop. The machine is unusable once it gets stuck in 
>   the infinite loop and a power cycle is required. It looks to 
>   me like the descriptor error infinite loop is occurring in 
>   the bce_intr() function in sys/dev/pci/if_bce.c but I don't 
>   see how to diagnose/fix it further. I am currently using a 
>   rl(4) add-in NIC to make the machine usable. It is running 
>   -current amd64 from CVS sync'ed 1/20/2015.
> >How-To-Repeat:
>   ifconfig bce0 up
> >Fix:
>   Using a different add-in NIC makes the machine usable but 
>   being able to use the on-board NIC would be nice.
> 
> 
> dmesg:
> OpenBSD 5.7-beta (GENERIC.MP) #0: Tue Jan 20 12:38:44 EST 2015
> r...@test.johnmerriam.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> real mem = 3437608960 (3278MB)
> avail mem = 3342286848 (3187MB)

Try putting less than 1GB of memory in your machine and see if that
fixes it. IIRC the chip cannot do DMA access to memory above 1GB.

There is supposed to be a bounce buffer in bce to cope with
systems with more than 1GB but perhaps it is broken.
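
Editor's sketch of the bounce-buffer idea mentioned above, for readers unfamiliar with it: when a packet sits at a physical address the device cannot reach, the driver copies it into a buffer that is known to lie below the limit and points the device at the copy. This is a userland toy, not the bce(4) code. Physical addresses are passed in as plain numbers, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Highest address the device can reach: 1GB - 1, per the thread. */
#define DMA_LIMIT ((uint64_t)0x40000000 - 1)

/* Does [pa, pa + len) lie entirely within the device's reach? */
static int
dma_reachable(uint64_t pa, size_t len)
{
	return len > 0 && pa <= DMA_LIMIT && len - 1 <= DMA_LIMIT - pa;
}

/* A staging buffer that was allocated below the limit at attach time. */
struct bounce {
	uint64_t pa;	/* pretend physical address of buf */
	uint8_t *buf;
	size_t size;
};

/*
 * Return the address the device should read from for a transmit:
 * the original one if it is reachable, otherwise the bounce buffer
 * after the payload has been copied into it.
 */
static uint64_t
dma_prepare_tx(struct bounce *b, uint64_t src_pa, const void *src, size_t len)
{
	if (dma_reachable(src_pa, len))
		return src_pa;
	assert(len <= b->size && dma_reachable(b->pa, len));
	memcpy(b->buf, src, len);
	return b->pa;
}
```

The scheme only helps if the bounce buffer itself, and any descriptor memory the device reads, really is below the limit. That is the assumption the rest of this thread ends up questioning.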



bce(4) - descriptor error

2015-01-21 Thread john
>Synopsis:  bce(4) infinite loop of descriptor error
>Category:  kernel
>Environment:
System  : OpenBSD 5.7
Details : OpenBSD 5.7-beta (GENERIC.MP) #0: Tue Jan 20 12:38:44 EST 2015
  r...@test.johnmerriam.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:
If I try to use the bce(4) on-board Broadcom NIC in OpenBSD 
I get bce0: descriptor error repeated on the console in an 
infinite loop. The machine is unusable once it gets stuck in 
the infinite loop and a power cycle is required. It looks to 
me like the descriptor error infinite loop is occurring in 
the bce_intr() function in sys/dev/pci/if_bce.c but I don't 
see how to diagnose/fix it further. I am currently using a 
rl(4) add-in NIC to make the machine usable. It is running 
-current amd64 from CVS sync'ed 1/20/2015.
>How-To-Repeat:
ifconfig bce0 up
>Fix:
Using a different add-in NIC makes the machine usable but 
being able to use the on-board NIC would be nice.


dmesg:
OpenBSD 5.7-beta (GENERIC.MP) #0: Tue Jan 20 12:38:44 EST 2015
r...@test.johnmerriam.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 3437608960 (3278MB)
avail mem = 3342286848 (3187MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xf0450 (68 entries)
bios0: vendor Dell Inc. version "1.1.12" date 06/17/2009
bios0: Dell Inc. OptiPlex 320
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC BOOT MCFG HPET SLIC SSDT SSDT SSDT
acpi0: wakeup devices VBTN(S4) PCI0(S5) PCI7(S5) MAC1(S5) MOU_(S3) USB0(S3) USB1(S3) USB2(S3) USB3(S3) USB4(S3) USB5(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM)2 Duo CPU E4400 @ 2.00GHz, 2000.34 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,NXE,LONG,LAHF,PERF
cpu0: 2MB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 200MHz
cpu0: mwait min=64, max=64, C-substates=0.2.2.0.0, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM)2 Duo CPU E4400 @ 2.00GHz, 2000.08 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,EST,TM2,SSSE3,CX16,xTPR,PDCM,NXE,LONG,LAHF,PERF
cpu1: 2MB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 21, 24 pins
ioapic0: misconfigured as apic 0, remapped to apid 8
acpimcfg0 at acpi0 addr 0xe000, bus 0-255
acpihpet0 at acpi0: 14318180 Hz
acpiprt0 at acpi0: bus 1 (PCI1)
acpiprt1 at acpi0: bus -1 (PCI2)
acpiprt2 at acpi0: bus -1 (PCI3)
acpiprt3 at acpi0: bus -1 (PCI4)
acpiprt4 at acpi0: bus -1 (PCI5)
acpiprt5 at acpi0: bus -1 (PCI6)
acpiprt6 at acpi0: bus -1 (PCI8)
acpiprt7 at acpi0: bus 2 (PCI7)
acpiprt8 at acpi0: bus 0 (PCI0)
acpicpu0 at acpi0: PSS
acpicpu1 at acpi0: PSS
acpibtn0 at acpi0: VBTN
cpu0: Enhanced SpeedStep 2000 MHz: speeds: 2000, 1600, 1200 MHz
pci0 at mainbus0 bus 0
0:20:0: mem address conflict 0xfed0/0x400
pchb0 at pci0 dev 0 function 0 "ATI RC410 Host" rev 0x01
ppb0 at pci0 dev 1 function 0 "ATI RS480 PCIE" rev 0x00
pci1 at ppb0 bus 1
radeondrm0 at pci1 dev 5 function 0 "ATI Radeon XPRESS 200" rev 0x00
drm0 at radeondrm0
radeondrm0: apic 8 int 17
ahci0 at pci0 dev 18 function 0 "ATI SB600 SATA" rev 0x00: apic 8 int 23, AHCI 1.1
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0:  SCSI3 0/direct fixed naa.50014ee201ce1451
sd0: 476940MB, 512 bytes/sector, 976773168 sectors
cd0 at scsibus1 targ 1 lun 0:  ATAPI 5/cdrom removable
ohci0 at pci0 dev 19 function 0 "ATI SB600 USB" rev 0x00: apic 8 int 16, version 1.0, legacy support
ohci1 at pci0 dev 19 function 1 "ATI SB600 USB" rev 0x00: apic 8 int 17, version 1.0, legacy support
ohci2 at pci0 dev 19 function 2 "ATI SB600 USB" rev 0x00: apic 8 int 18, version 1.0, legacy support
ohci3 at pci0 dev 19 function 3 "ATI SB600 USB" rev 0x00: apic 8 int 17, version 1.0, legacy support
ohci4 at pci0 dev 19 function 4 "ATI SB600 USB" rev 0x00: apic 8 int 18, version 1.0, legacy support
ehci0 at pci0 dev 19 function 5 "ATI SB600 USB2" rev 0x00: apic 8 int 19
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "ATI EHCI root hub" rev 2.00/1.00 addr 1
piixpm0 at pci0 dev 20 function 0 "ATI SBx00 SMBus" rev 0x13: SMI
iic0 at piixpm0
spdmem0 at iic0 addr 0x50: 2GB DDR2 SDRAM non-parity PC2-5300CL5
spdmem1 at iic0 addr 0x51: 2GB DDR2 SDRAM non-parity PC2-5300CL5
pciide0 at pci0 dev 20 function 1 "ATI SB600 IDE" rev 0x00: DMA, channel 0