USB "bwfm" users -- seeking victims^Wvolunteers

2018-04-28 Thread Jason Thorpe
Folks —

I’m working on pulling in a bunch of updates from the OpenBSD “bwfm” driver to 
support some additional hardware, but other than my RPI3 (which is the hardware 
I’m trying to support), I don’t have any devices to test on.

I don’t want devices — I want volunteers to test the extant USB attachment to 
make sure I haven’t broken anything.  Must be willing to build a kernel 
yourself.  Risking life and limb not required.

Ping me off-list if you can help.

Thx.

-- thorpej



Re: Two fatal errors with yesterday's -current

2018-04-28 Thread Martin Husemann
On Sat, Apr 28, 2018 at 05:37:54PM +0800, Paul Goyette wrote:
> It would seem that there's something broken with this newest version
> of acpica ...  Something that causes some systems to miss interrupts,
> perhaps.

Is there a difference in dmesg between the old and the new kernel?

Martin


Re: Two fatal errors with yesterday's -current

2018-04-28 Thread Paul Goyette

I've narrowed this down to the recent acpica update done on 2018-04-07
around 14:00 - 16:00 UTC.  A kernel built from 2018-04-07 14:00 (before
the import) works just fine, while a kernel built from 2018-04-07 16:00 
(just after the conflict merge commit) fails with these messages.  (A
kernel built in between these two commits doesn't build...)

It would seem that there's something broken with this newest version
of acpica ...  Something that causes some systems to miss interrupts,
perhaps.

I'll be happy to try any suggested fixes, but I know next-to-nothing
about the acpica code, so I'm not likely to have any brilliant ideas
on my own!  :)




On Sun, 22 Apr 2018, Paul Goyette wrote:

With sources built from 2018-04-20 23:11:20 UTC I encountered two fatal 
errors while booting:


First, the system was unable to identify either of my hard drives:

...
[   6.4285988] ums0 at uhidev2: 3 buttons and Z dir
[   6.4285988] wsmouse0 at ums0 mux 0
[   6.6787124] wd0: IDENTIFY failed
[   6.6787124] wd0: fixing 0 sector size
[   6.6787124] wd0: secperunit and ncylinders are zero
[   9.6800696] wd0(ahcisata0:0:0): using PIO mode 0
[   9.6800696] wd1 at atabus1 drive 0
[   9.7100829] ehci_sync_hc: timed out
[  10.7105355] ehci_sync_hc: timed out
[  12.6814270] wd1: IDENTIFY failed
[  12.6814270] wd1: fixing 0 sector size
[  12.6814270] wd1: secperunit and ncylinders are zero
...


Second, I received a whole bunch of the "ehci_sync_hc: timed out"
messages.  The two occurrences above were the first, but there were
several more later on in the boot.

After all the messages, the boot process prompted me for a root device
(since the real root device suffered from the IDENTIFY failed).  And
just as soon as I entered one character on the keyboard, BOOM it crashed
and dropped into ddb.

Unfortunately, I was unable to get a crash dump (no hard drive!), and I
also forgot to transcribe the backtrace.  If necessary, I can repeat the
experiment.

For comparison purposes, I have attached the dmesg from my earlier (sources 
dated 2018-03-20 11:25:00 UTC) working kernel.
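
One way to compare a working and a failing boot is to strip the bracketed
timestamps from each dmesg and diff what remains, so that only real
differences in the probe/attach messages show up.  The sketch below is an
illustration only: the function names and the sample lines in the usage
part are hypothetical, and the timestamp pattern is assumed from the
excerpt above.

```python
import difflib
import re

# Leading "[   6.4285988] " style timestamp, as seen in the excerpt above.
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def strip_timestamps(dmesg_text):
    """Return the dmesg lines with any leading timestamp removed."""
    return [_TIMESTAMP.sub("", line) for line in dmesg_text.splitlines()]


def dmesg_diff(old_text, new_text):
    """Unified diff of two dmesg captures, ignoring timestamps."""
    old_lines = strip_timestamps(old_text)
    new_lines = strip_timestamps(new_text)
    return list(difflib.unified_diff(old_lines, new_lines,
                                     "old-kernel", "new-kernel",
                                     lineterm=""))


if __name__ == "__main__":
    # Hypothetical two-line captures, for illustration only.
    old = "[   6.40] wsmouse0 at ums0 mux 0\n[   6.60] wd0 at atabus0 drive 0\n"
    new = "[   6.42] wsmouse0 at ums0 mux 0\n[   6.67] wd0: IDENTIFY failed\n"
    for line in dmesg_diff(old, new):
        print(line)
```

Lines that differ only in their timestamp drop out of the diff, which
matters here because boot timings shift between kernels even when the
hardware attaches identically.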



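+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+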




Re: Running out of buffers?

2018-04-28 Thread Roy Marples

On 27/04/2018 23:58, Robert Elz wrote:

>  Date: Fri, 27 Apr 2018 21:34:49 +0100
>  From: Roy Marples
>  Message-ID:
>
>    | Hopefully this fixes the issues and won't impact small memory devices
>    | too much.
>
> While those are probably useful changes to make, they don't fix anything,
> merely make it less likely.


Until we can dynamically size the buffer in the kernel on demand you are 
correct.



> We really need to turn off the error on recv() by default - and allow it
> to be turned on by applications that actually want to deal with this.


Why should we special case reporting this error instead of others?
While NetBSD might be the first BSD to report ENOBUFS for recv(), it's 
certainly not the first OS to do so.


Looking at Paul's logs, ntpd is reporting this a fair bit.
Looking at ntpd, it already *has* logic to deal exclusively with this 
error - it logs it and continues. Any other error and it closes the 
socket and gives up.
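
The log-and-continue pattern described above can be sketched roughly as
follows.  This is not ntpd's code: the function name read_packet and its
return convention are made up for illustration, and the sketch only
assumes that recv() surfaces ENOBUFS as an ordinary error when incoming
data was dropped for lack of buffer space.

```python
import errno
import logging

log = logging.getLogger("recv-sketch")


def read_packet(sock, bufsize=2048):
    """Read one datagram; return (data, keep_open).

    ENOBUFS is treated as "some input was lost": the socket is still
    usable, so log it and carry on.  Any other error is treated as
    fatal: close the socket and give up.
    """
    try:
        return sock.recv(bufsize), True
    except OSError as e:
        if e.errno == errno.ENOBUFS:
            log.warning("recv: out of buffer space, input was lost")
            return None, True
        log.error("recv failed: %s", e)
        sock.close()
        return None, False
```

A caller would keep polling the socket while keep_open is true and treat
a None payload as "nothing to process this time".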


Roy