config system question

2023-07-11 Thread Phil Nelson
Hi all,

  I'm working on getting the iwlwifi driver for intel devices into
the new wifi framework.  In going over the files as named I see
that there are multiple different files "xxx/tx.c" and "yyy/tx.c".
Those names would both compile to tx.o in a kernel compile directory.
I haven't seen any existing instance of same-named files and was wondering
whether the config system has a way to make the .o file names different
or whether I need to rename those to something like "xxx/xxx_tx.c" and
"yyy/yyy_tx.c"?

--Phil


sdmmc question.

2023-05-23 Thread Phil Nelson
Hello,

   I'm working with a student to get NetBSD working on the SiFive
HiFive Unleashed board.  I know this is no longer being made, but
we have one.  He has the kernel running on the board until it wants
to mount root.  We want to get a sd driver working.  The issue
is that the sd device is on the spi bus on that hardware and sdmmc
doesn't work on the spi bus.

   We want to figure out how to connect the sdmmc to the spi bus.
We have tried a number of things in the config framework to get
sdmmc connected, but we don't know the internals of config well
enough and we have not found any prior work that will help us
get that done.

   Can anyone help us with the config framework and what is needed
to get sdmmc to talk to the spi bus?

   I'm presuming that we'll need something in the dev/spi/files.spi,
but haven't figured out what to say to get it to work.   And I'm
assuming there is a .c file that needs to implement the interface
between the sdmmc and the spi, but I'm not sure.   Do we need
something in another place?  

   Pointers to examples and/or documentation would be appreciated.

   Thanks.

--Phil


PR 52347?

2019-03-04 Thread Phil Nelson
Hello,

   I was wondering if anyone is working on solving PR 52347, "ww mutex class
mismatch"?  I'm getting it about once every other working day, on average, on
my main machine.

   8.99.34 did crash, but I couldn't get a good backtrace.  This is a backtrace
for 8.99.30 from a few days ago.

# crash -M netbsd.34.core  -N netbsd.34
Crash version 8.99.34, image version 8.99.30.
WARNING: versions differ, you may not be able to examine this image.
System panicked: kernel diagnostic assertion "(ctx->wwx_class == mutex->wwm_u.ctx->wwx_class)" failed: file "../../../../external/bsd/drm2/linux/linux_ww_mutex.c", line 304 ww mutex class mismatch: 0x816994c0 != 0x809aed64
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NARCNET() at 0
?() at ecccbcb54a38
vpanic() at vpanic+0x178
ch_voltag_convert_in() at ch_voltag_convert_in
ww_mutex_lock_wait_sig() at ww_mutex_lock_wait_sig+0x199
linux_ww_mutex_lock_interruptible() at linux_ww_mutex_lock_interruptible+0x10a
ttm_eu_reserve_buffers() at ttm_eu_reserve_buffers+0xce
radeon_bo_list_validate() at radeon_bo_list_validate+0x79
radeon_cs_ioctl() at radeon_cs_ioctl+0x842
drm_ioctl() at drm_ioctl+0x23b
sys_ioctl() at sys_ioctl+0x11c
syscall() at syscall+0x173
--- syscall (number 54) ---
7d704bd1a72a:
crash> 

--Phil


Re: workqueues ....

2018-07-28 Thread Phil Nelson
On Thursday 26 July 2018 23:23:13 Taylor R Campbell wrote:
> static void
> foo_intr(...)
> {
> 	...
> 	mutex_enter(&sc->sc_work_lock);
> 	if (!sc->sc_work_scheduled) {
> 		workqueue_enqueue(sc->sc_wq, &sc->sc_work, NULL);
> 		sc->sc_work_scheduled = true;
> 	}
> 	mutex_exit(&sc->sc_work_lock);
> 	...
> }

I was just wondering ... you show the intr enqueuing the work.
Is it OK to have a callout enqueue work so one can enqueue
work at some point in the future?

--Phil


Re: workqueues ....

2018-07-27 Thread Phil Nelson
On Friday 27 July 2018 11:39:16 Mindaugas Rasiukevicius wrote:
> This is an indication that you are trying to acquire an adaptive lock
> while holding a spin-lock.  Adaptive mutex (using IPL_NONE) blocks and,
> by design, you cannot block while holding a spin-mutex (> IPL_NONE).
> If you will inspect the callers of urtwn_get_tx_data(), I guess you
> will find something holding a spin-mutex at a higher level.

Thanks.   This helps.

--Phil


Re: workqueues ....

2018-07-27 Thread Phil Nelson
On Thursday 26 July 2018 23:23:13 Taylor R Campbell wrote:
> Is this a conceptual problem, or do you have a symptom that you're
> actually hitting with specific code?  If the latter, can you describe
> the symptom and quote the code?

Yes, this is a real problem I'm having.

This is my real "f()":

struct urtwn_tx_data *
urtwn_get_tx_data(struct urtwn_softc *sc, size_t pidx)
{
	struct urtwn_tx_data *data = NULL;

	mutex_enter(&sc->sc_tx_mtx);
	if (!TAILQ_EMPTY(&sc->tx_free_list[pidx])) {
		data = TAILQ_FIRST(&sc->tx_free_list[pidx]);
		TAILQ_REMOVE(&sc->tx_free_list[pidx], data, next);
	}
	mutex_exit(&sc->sc_tx_mtx);

	return data;
}

I'm getting a mutex error here in that the lock is held.
Backtrace:
System panicked: LOCKDEBUG: Mutex error: mutex_vector_enter,528: spin lock held
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NARCNET() at 0
_KERNEL_OPT_ACPI_SCANPCI() at _KERNEL_OPT_ACPI_SCANPCI
vpanic() at vpanic+0x17d
snprintf() at snprintf
lockdebug_more() at lockdebug_more
mutex_enter() at mutex_enter+0x6b6
urtwn_get_tx_data() at urtwn_get_tx_data+0x22
urtwn_raw_xmit() at urtwn_raw_xmit+0x3e
ieee80211_raw_output() at ieee80211_raw_output+0x68
ieee80211_send_probereq() at ieee80211_send_probereq+0x326
scan_curchan() at scan_curchan+0x3c
scan_start() at scan_start+0x2b0
workqueue_worker() at workqueue_worker+0xe9

I'm seeing no evidence that scan_start() has been run twice and
I'm not seeing any other debug messages that even say that 
urtwn_get_tx_data is being called again.  I can't snoop at crash
time because my usb keyboard quits working on a panic.   I've
been using crash to snoop but I'm not that good at it yet.

An "ifconfig urtwn0 up" started the scan but it appears that ifconfig
is no longer running.   I see only one lwp running "net80211_wq".
This particular mutex gets called from the usb softintr at the
end of a transmit.   So with ifconfig no longer running, it can't
be that ifconfig is calling a transmit function from the original
thread and then calling urtwn_get_tx_data().

During normal running, this mutex is taken in the transmit paths
(urtwn_start(), the if_start function, and urtwn_raw_xmit(), which
the 80211 layer uses in areas like the scan where, afaict, the frames
are management frames) and in urtwn_txeof(), which is the report
back that a transmit was done.

I'm assuming that the softintr and the workqueue don't look like
the same owner.   So I'm stuck wondering what is happening here.

Even though I don't see the scan_start called twice, I do need 
to protect against that.   I'll see if that fixes the problem.

--Phil


workqueues ....

2018-07-26 Thread Phil Nelson
Hello all,

I'm trying to work with workqueues and am having a locking problem.

Let's say I have a function f() as follows:

int  f() {
   mutex_enter(&some_mutex);
   .. code .
   mutex_exit(&some_mutex);
}

and now let's say that I start another function running via a workqueue, g():
g() {
  . some code 
  if (f()) {
   do something else ...
  } else {
  error
  }
}

Now the code that starts the workqueue is something like:

 some setup code 
 workqueue_enqueue(myworkqueue, work_that_runs_g(), NULL);
 some more code
 if (f()) {
   success ...
} else {
error.
}

So, my question is:  if g() is running f() and holds the mutex and then the
main code calls f() ... will this be detected as already holding the lock?

If it will be detected as already holding the lock,  how can I do locking
between the code that does the enqueue and the code in the work item?

--Phil


KGDB

2018-07-17 Thread Phil Nelson
Hello,

Does anyone know if KGDB will work across a pcie serial card or does
it need to be one of the old motherboard ones?  How about usb?

--Phil


Re: usb/xhci lock issue on HEAD

2018-07-12 Thread Phil Nelson
On Thursday 12 July 2018 00:54:45 Martin Husemann wrote:
> You commented out a bit too much and it does not find your boot device?
> 
>         /*
>          * If wildcarded root and we the boot device wasn't determined,
>          * ask the user.
>          */
>         if (rootspec == NULL && bootdv == NULL)
>                 boothowto |= RB_ASKNAME;
> 
> (kern_subr.c:257)
> 
> You normally would see something like:
> 
> [..]
> boot device: wd0
> root on wd0a dumps on wd0b
> root file system type: ffs
> 
> but if it can not identify the device, it tries to ask you for it (and
> runs into an unrelated usb bug).
> 

No, I didn't comment out anything to do with the boot and root devices.
It is true I ran into an unrelated usb bug, but there must also be a bug
somewhere where the bootdv ends up NULL.   I got past this problem
by fixing the root device in the kernel config.   The problems are the 
same on my wifi branch and head.  So this is not my bug.

The kernel (from both branches) says:

boot device: 

I copied GENERIC and then removed a bunch of drivers and changed 
options, but left in all the 802.11 devices.   It works and I have been 
running it for a while.   I then removed all the 802.11 devices except 
urtwn and it can't find the boot device.   That is the only difference 
between the working kernel and the kernel that doesn't know about
the boot device.  This is on HEAD.

--Phil


Re: usb/xhci lock issue on HEAD

2018-07-11 Thread Phil Nelson
On Wednesday 11 July 2018 22:32:13 Patrick Welche wrote:
> "boot netbsd -a" ?

No, just "boot netbsd.wifi" to boot my special wifi kernel, which I'm sure
will crash and which I don't want to be autobooted.

--Phil


usb/xhci lock issue on HEAD

2018-07-11 Thread Phil Nelson
Hello,

   Has anyone run into this?   I created a special kernel for my 802.11 work
that removes a lot of unneeded drivers from my setup, stuff like raid, ntfs
and so forth.   I got a working kernel out of it.   Then, to work on the 802.11,
I commented out every 802.11 driver except the urtwn driver.   This new
kernel without the other 802.11 drivers now panics as follows:

panic: kernel diagnostic assertion "mutex_owned(&sc->sc_lock)" failed: file "../../../../dev/usb/xhci.c", line 2049 
[6.459749] cpu0: Begin traceback...
[6.459749] vpanic() at netbsd:vpanic+0x16f
[6.459749] ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
[6.459749] xhci_softintr() at netbsd:xhci_softintr+0x5d7
[6.459749] xhci_poll() at netbsd:xhci_poll+0x37
[6.459749] ukbd_cngetc() at netbsd:ukbd_cngetc+0x113
[6.459749] wskbd_cngetc() at netbsd:wskbd_cngetc+0xc8
[6.459749] wsdisplay_getc() at netbsd:wsdisplay_getc+0x2f
[6.459749] cngetc() at netbsd:cngetc+0x4d
[6.459749] cngetsn() at netbsd:cngetsn+0x71
[6.459749] setroot() at netbsd:setroot+0x46f
[6.459749] main() at netbsd:main+0x4a5
[6.459749] cpu0: End traceback...
[6.459749] fatal breakpoint trap in supervisor mode
[6.459749] trap type 1 code 0 rip 0x8021de15 cs 0x8 rflags 0x202 cr2 0 ilevel 0x8 rsp 0x81451a70
[6.459749] curlwp 0x81020920 pid 0.1 lowest kstack 0x8144d2c0
fatal protection fault in supervisor mode
[6.459749] trap type 4 code 0 rip 0x8087caea cs 0x8 rflags 0x10282 cr2 0 ilevel 0x8 rsp 0x81451480
[6.459749] curlwp 0x81020920 pid 0.1 lowest kstack 0x8144d2c0
rebooting...

I'm not sure why this kernel is calling cngetsn() at setroot() time. 

Has anyone seen this before?

--Phil


mutex question

2018-07-06 Thread Phil Nelson
Hello,

The FreeBSD 802.11 code is using a call to mtx_sleep().  The define is:

#define mtx_sleep(chan, mtx, pri, wmesg, timo)  \
        _sleep((chan), &(mtx)->lock_object, (pri), (wmesg), \
            tick_sbt * (timo), 0, C_HARDCLOCK)


Just in case I can save time by getting an answer by asking before digging
deep ... does anyone know what I should translate this to in NetBSD?   Our
mutex routines do not appear to have any similar call.

--Phil


Re: new errno ?

2018-07-06 Thread Phil Nelson
On Friday 06 July 2018 15:59:12 Jason Thorpe wrote:
> Anyway... in what situations is this absurd error code used in the 802.11 
> code?  EFAULT seems wrong because it means something very specific. 

The code is in ieee80211_output.c and says:

/* locate destination node */
switch (wh->i_fc[1] & IEEE80211_FC1_DIR_MASK) {
case IEEE80211_FC1_DIR_NODS:
case IEEE80211_FC1_DIR_FROMDS:
	ni = ieee80211_find_txnode(vap, wh->i_addr1);
	break;
case IEEE80211_FC1_DIR_TODS:
case IEEE80211_FC1_DIR_DSTODS:
	ni = ieee80211_find_txnode(vap, wh->i_addr3);
	break;
default:
	senderr(EDOOFUS);
}

I agree,  EINVAL sounds closer.   Thanks.

--Phil


Re: new errno ?

2018-07-06 Thread Phil Nelson
On Friday 06 July 2018 12:09:55 Greg Troxel wrote:
>  I might just map it to EFAULT or EINVAL.

I like this suggestion.  EFAULT

--Phil


new errno ?

2018-07-06 Thread Phil Nelson
Hello,

In working on the 802.11 refresh, I ran into a new errno code from FreeBSD:

#define EDOOFUS 88  /* Programming error */

Shall we add this one?  (Most likely with a different number since 88 is
taken in the NetBSD errno.h.)

   I could use EPROTO instead, but 

--Phil


Re: NetBSD 8.0 RC1 issue

2018-04-26 Thread Phil Nelson
On Wednesday 25 April 2018 12:40:32 Tom Spindler (moof) wrote:
> > I'm a little late in trying 8.0 ... but I just tried to install 8.0 RC1 on a
> > Dell Optiplex 745.   During boot it blanks the display and nothing can
> > be seen on the display from that point on.  
> 
> IIRC, I had similar issues with going from 6.x to 7.x; in my /etc/boot.cfg
> I had to change the entries along the lines of
> 
> menu=Boot normally:rndseed /var/db/entropy-file;vesa 1024x768x8;boot netbsd
> menu=Drop to boot prompt:vesa 1024x768x8;prompt
> 
> Alternately, "vesa on" supposedly works, but I never got that to work
> consistently.

None of that worked for me.   The "vesa on" mode blanked the display faster
than the regular boot.  I tried a number of the vesa settings to no avail.

I also tried 8.0 RC1 on a Dell 990 and it wouldn't even boot.  6.1.5 is running
just great on that machine, but the amd64 8.0 RC1 kernel couldn't even find a
root device.   *sigh*

--Phil


NetBSD 8.0 RC1 issue

2018-04-25 Thread Phil Nelson
I'm a little late in trying 8.0 ... but I just tried to install 8.0 RC1 on a
Dell Optiplex 745.   During boot it blanks the display and nothing can
be seen on the display from that point on.  

The vga/display parts of the dmesg from 6.1.5 says:

vga1 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
vga1: WARNING: ignoring 64-bit BAR @ 0x18
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
i915drm0 at vga1: Intel i965Q
i915drm0: AGP at 0xc000 256MB
i915drm0: Initialized i915 1.6.0 20080730
vendor 0x8086 product 0x2993 (miscellaneous display, revision 0x02) at pci0 dev 2 function 1 not configured

The monitor is  1920x1200 dell.

I was expecting it to switch from large characters to small characters at
that point like it does on other recent hardware using similar sized monitors.

--Phil


Re: meltdown

2018-01-05 Thread Phil Nelson
On Thursday 04 January 2018 12:49:22 m...@netbsd.org wrote:
> I wonder if we can count the number of SEGVs and if we get a few, turn
> on the workaround? 

How about turning on the workaround for any process that ignores
or catches SEGV?  Any process that is terminated by a SEGV should
be safe, shouldn't it?

--Phil

-- 
Phil Nelson, http://pcnelson.net



File systems on 4k sector devices?

2012-06-07 Thread Phil Nelson
Hello,

   I recently acquired a 3TB USB disk that attaches as a sd0 disk.  disklabel
reports that it has 4k sectors.   I can write a disklabel on the disk, but newfs
can not create a fs.

(parts of the disk label)
bytes/sector: 4096
sectors/track: 32
tracks/cylinder: 64
sectors/cylinder: 2048
cylinders: 357698
total sectors: 732566642

  d: 732566642 0 unused  0 0# (Cyl.  0 - 357698*)
 e: 563 4.2BSD   1024  8192 0  # (Cyl.  0*- 24*)
 f: 732516579 50063 4.2BSD   4096 32768 0  # (Cyl. 24*- 357698*)
bash-4.1# newfs -S 4096 /dev/rsd0e
/dev/rsd0e: 195.3MB (5 sectors) block size 32768, fragment size 4096
using 4 cylinder groups of 48.84MB, 1563 blks, 3072 inodes.
rdfs: read error for sector 1: Invalid argument
bash-4.1# newfs -S 4096 /dev/rsd0f
/dev/rsd0f: 2861392.9MB (732516579 sectors) block size 32768, fragment size 4096
using 3860 cylinder groups of 741.31MB, 23722 blks, 47104 inodes.
wtfs: write error for sector 732516578: Invalid argument

I'm running 5.1/i386.

(The e: above was just to try a small size.)

Is there support for 4K sectors in 5.1?   I've read that other people have 
successfully created file systems on 4k sector devices.

So, what am I doing wrong?

--Phil

-- 
Phil Nelson, http://pcnelson.net
life: http://goallpower.com
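
As a sanity check on the numbers in the label above, the sketch below works
out the sizes from the sector counts.  The helper names are made up for
illustration.  The 32-bit check reflects the assumption that the label stores
the sector count in a 32-bit field, which would explain why a disk this large
reports 4k sectors instead of 512-byte ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Total size in bytes for a given sector count and sector size. */
static uint64_t
disk_bytes(uint64_t sectors, uint32_t secsize)
{
	assert(secsize != 0);
	return sectors * secsize;
}

/* Would this many bytes fit in a 32-bit count of secsize-byte sectors? */
static bool
fits_u32_sector_count(uint64_t bytes, uint32_t secsize)
{
	assert(secsize != 0);
	return bytes / secsize <= UINT32_MAX;
}
```

With 732566642 sectors of 4096 bytes, disk_bytes() gives 3000592965632 bytes,
about 3.0 TB.  The f: partition, 732516579 sectors, comes to about 2861392.9
MB, matching what newfs printed.  The same byte count in 512-byte sectors
would need 5860533136 sectors, more than a 32-bit count can hold.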