config system question
Hi all, I'm working on getting the iwlwifi driver for Intel devices into the new wifi framework. In going over the files as named, I see that there are multiple files with the same base name in different directories, such as "xxx/tx.c" and "yyy/tx.c". Both would compile to tx.o in a kernel compile directory. I haven't seen any instance of same-named files and was wondering whether the config system has a way to make the .o file names different, or whether I need to rename them to something like "xxx/xxx_tx.c" and "yyy/yyy_tx.c"? --Phil
sdmmc question.
Hello, I'm working with a student to get NetBSD working on the SiFive HiFive Unleashed board. I know this is no longer being made, but we have one. He has the kernel running on the board up to the point where it wants to mount root. We want to get an sd driver working. The issue is that the sd device is on the spi bus on that hardware and sdmmc doesn't work on the spi bus. We want to figure out how to connect sdmmc to the spi bus. We have tried a number of things in the config framework to get sdmmc connected, but we don't know the internals of config well enough and we have not found any prior work that will help us get that done. Can anyone help us with the config framework and what is needed to get sdmmc to talk to the spi bus? I'm presuming that we'll need something in dev/spi/files.spi, but haven't figured out what to say to get it to work. And I'm assuming there is a .c file that needs to implement the interface between sdmmc and spi, but I'm not sure. Do we need something in another place? Pointers to examples and/or documentation would be appreciated. Thanks. --Phil
PR 52347?
Hello, I was wondering if anyone is working on solving PR 52347, "ww mutex class mismatch"? I'm hitting it about once every other working day, on average, on my main machine. 8.99.34 did crash, but I couldn't get a good backtrace. This is a backtrace from an 8.99.30 kernel a few days ago.

# crash -M netbsd.34.core -N netbsd.34
Crash version 8.99.34, image version 8.99.30.
WARNING: versions differ, you may not be able to examine this image.
System panicked: kernel diagnostic assertion "(ctx->wwx_class == mutex->wwm_u.ctx->wwx_class)" failed: file "../../../../external/bsd/drm2/linux/linux_ww_mutex.c", line 304 ww mutex class mismatch: 0x816994c0 != 0x809aed64
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NARCNET() at 0
?() at ecccbcb54a38
vpanic() at vpanic+0x178
ch_voltag_convert_in() at ch_voltag_convert_in
ww_mutex_lock_wait_sig() at ww_mutex_lock_wait_sig+0x199
linux_ww_mutex_lock_interruptible() at linux_ww_mutex_lock_interruptible+0x10a
ttm_eu_reserve_buffers() at ttm_eu_reserve_buffers+0xce
radeon_bo_list_validate() at radeon_bo_list_validate+0x79
radeon_cs_ioctl() at radeon_cs_ioctl+0x842
drm_ioctl() at drm_ioctl+0x23b
sys_ioctl() at sys_ioctl+0x11c
syscall() at syscall+0x173
--- syscall (number 54) ---
7d704bd1a72a:
crash>

--Phil
Re: workqueues ....
On Thursday 26 July 2018 23:23:13 Taylor R Campbell wrote:
> static void
> foo_intr(...)
> {
>         ...
>         mutex_enter(&sc->sc_work_lock);
>         if (!sc->sc_work_scheduled) {
>                 workqueue_enqueue(sc->sc_wq, &sc->sc_work, NULL);
>                 sc->sc_work_scheduled = true;
>         }
>         mutex_exit(&sc->sc_work_lock);
>         ...
> }

I was just wondering ... you show the intr enqueuing the work. Is it OK to have a callout enqueue work so one can enqueue work at some point in the future?

--Phil
Re: workqueues ....
On Friday 27 July 2018 11:39:16 Mindaugas Rasiukevicius wrote:
> This is an indication that you are trying to acquire an adaptive lock
> while holding a spin-lock. Adaptive mutex (using IPL_NONE) blocks and,
> by design, you cannot block while holding a spin-mutex (> IPL_NONE).
> If you will inspect the callers of urtwn_get_tx_data(), I guess you
> will find something holding a spin-mutex at a higher level.

Thanks. This helps.

--Phil
Re: workqueues ....
On Thursday 26 July 2018 23:23:13 Taylor R Campbell wrote:
> Is this a conceptual problem, or do you have a symptom that you're
> actually hitting with specific code? If the latter, can you describe
> the symptom and quote the code?

Yes, this is a real problem I'm having. This is my real "f()":

struct urtwn_tx_data *
urtwn_get_tx_data(struct urtwn_softc *sc, size_t pidx)
{
        struct urtwn_tx_data *data = NULL;

        mutex_enter(&sc->sc_tx_mtx);
        if (!TAILQ_EMPTY(&sc->tx_free_list[pidx])) {
                data = TAILQ_FIRST(&sc->tx_free_list[pidx]);
                TAILQ_REMOVE(&sc->tx_free_list[pidx], data, next);
        }
        mutex_exit(&sc->sc_tx_mtx);
        return data;
}

I'm getting a mutex error here in that the lock is held. Backtrace:

System panicked: LOCKDEBUG: Mutex error: mutex_vector_enter,528: spin lock held
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NARCNET() at 0
_KERNEL_OPT_ACPI_SCANPCI() at _KERNEL_OPT_ACPI_SCANPCI
vpanic() at vpanic+0x17d
snprintf() at snprintf
lockdebug_more() at lockdebug_more
mutex_enter() at mutex_enter+0x6b6
urtwn_get_tx_data() at urtwn_get_tx_data+0x22
urtwn_raw_xmit() at urtwn_raw_xmit+0x3e
ieee80211_raw_output() at ieee80211_raw_output+0x68
ieee80211_send_probereq() at ieee80211_send_probereq+0x326
scan_curchan() at scan_curchan+0x3c
scan_start() at scan_start+0x2b0
workqueue_worker() at workqueue_worker+0xe9

I'm seeing no evidence that scan_start() has been run twice and I'm not seeing any other debug messages that even say that urtwn_get_tx_data() is being called again. I can't snoop at crash time because my USB keyboard quits working on a panic. I've been using crash to snoop but I'm not that good at it yet. An "ifconfig urtwn0 up" started the scan but it appears that ifconfig is no longer running. I see only one lwp running "net80211_wq". This particular mutex is taken from the USB softintr at the end of a transmit.
So with ifconfig no longer running, it can't be that ifconfig is calling a transmit function from the original thread and then calling urtwn_get_tx_data(). During normal running, this mutex is taken in the transmit path (urtwn_start(), the if_start function, and urtwn_raw_xmit(), which the 802.11 layer uses in areas like the scan where, AFAICT, the frames are management frames) and in urtwn_txeof(), which reports back that a transmit was done. I'm assuming that the softintr and the workqueue don't look like the same owner. So I'm stuck wondering what is happening here. Even though I don't see scan_start() called twice, I do need to protect against that. I'll see if that fixes the problem. --Phil
workqueues ....
Hello all, I'm trying to work with workqueues and am having a locking problem. Let's say I have a function f() as follows:

int
f()
{
        mutex_enter(&some_mutex);
        .. code .
        mutex_exit(&some_mutex);
}

and now let's say that I start another function running via a workqueue, g():

g()
{
        . some code
        if (f()) {
                do something else ...
        } else {
                error
        }
}

Now the code that starts the workqueue is something like:

some setup code
workqueue_enqueue(myworkqueue, work_that_runs_g(), NULL);
some more code
if (f()) {
        success ...
} else {
        error.
}

So, my question is: if g() is running f() and holds the mutex and then the main code calls f() ... will this be detected as already holding the lock? If it will be detected as already holding the lock, how can I do locking between the code that does the enqueue and the code in the work item?

--Phil
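For readers following along, here is a minimal userland sketch of the situation in this post. It uses POSIX threads rather than the NetBSD kernel API, so it is an analogy only. The names shared_counter and run_demo, and the iteration count, are made up for illustration. The worker thread stands in for the work item g(), and both it and the starting thread call f(). Because the two callers are different threads, the mutex sees two different owners: whichever arrives second simply waits, and nothing is reported as "already held". My understanding is that an adaptive kmutex behaves the same way between a workqueue worker and the thread that enqueued the work, but that is an assumption this sketch does not demonstrate.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state, protected by some_mutex. */
static pthread_mutex_t some_mutex = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter;

/* f(): take the lock, touch the shared state, drop the lock. */
static bool
f(void)
{
        if (pthread_mutex_lock(&some_mutex) != 0)
                return false;
        shared_counter++;
        pthread_mutex_unlock(&some_mutex);
        return true;
}

/* g(): stands in for the work item; it runs in its own thread. */
static void *
g(void *arg)
{
        const int *iterations = arg;

        for (int i = 0; i < *iterations; i++)
                (void)f();
        return NULL;
}

/*
 * Start the "work item", call f() from the starting thread as well,
 * then wait for the worker.  Returns the final counter, or -1 on error.
 */
static int
run_demo(int iterations)
{
        pthread_t worker;

        assert(iterations >= 0);
        shared_counter = 0;
        if (pthread_create(&worker, NULL, g, &iterations) != 0)
                return -1;
        for (int i = 0; i < iterations; i++)
                (void)f();
        if (pthread_join(worker, NULL) != 0)
                return -1;
        return shared_counter;
}
```

Built with -pthread, a call such as run_demo(1000) should come back with 2000 if no increment was lost, since each of the two threads performs 1000 locked increments.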
KGDB
Hello, Does anyone know if KGDB will work across a PCIe serial card, or does it need to be one of the old motherboard ones? How about USB? --Phil
Re: usb/xhci lock issue on HEAD
On Thursday 12 July 2018 00:54:45 Martin Husemann wrote:
> You commented out a bit too much and it does not find your boot device?
>
>         /*
>          * If wildcarded root and we the boot device wasn't determined,
>          * ask the user.
>          */
>         if (rootspec == NULL && bootdv == NULL)
>                 boothowto |= RB_ASKNAME;
>
> (kern_subr.c:257)
>
> You normally would see something like:
>
> [..]
> boot device: wd0
> root on wd0a dumps on wd0b
> root file system type: ffs
>
> but if it can not identify the device, it tries to ask you for it (and
> runs into an unrelated usb bug).
>

No, I didn't comment out anything to do with the boot and root devices. It is true I ran into an unrelated usb bug, but there must also be a bug somewhere where the bootdv ends up NULL. I got past this problem by fixing the root device in the kernel config. The problems are the same on my wifi branch and head. So this is not my bug. The kernel (from both branches) says:

boot device:

I copied GENERIC and then removed a bunch of drivers and changed options, but left in all the 802.11 devices. It works and I have been running it for a while. I then removed all the 802.11 devices except urtwn and it can't find the boot device. That is the only difference between the working kernel and the kernel that doesn't know about the boot device. This is on HEAD.

--Phil
Re: usb/xhci lock issue on HEAD
On Wednesday 11 July 2018 22:32:13 Patrick Welche wrote: > "boot netbsd -a" ? No, just "boot netbsd.wifi" to boot my special wifi kernel that I'm sure will crash and don't want it doing an autoboot to. --Phil
usb/xhci lock issue on HEAD
Hello, Has anyone run into this? I created a special kernel for my 802.11 work that removes a lot of unneeded drivers from my setup, stuff like raid, ntfs and so forth. I got a working kernel out of it. Then, to work on the 802.11, I commented out every 802.11 driver except the urtwn driver. This new kernel without the other 802.11 drivers now panics as follows:

panic: kernel diagnostic assertion "mutex_owned(&sc->sc_lock)" failed: file "../../../../dev/usb/xhci.c", line 2049
[6.459749] cpu0: Begin traceback...
[6.459749] vpanic() at netbsd:vpanic+0x16f
[6.459749] ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
[6.459749] xhci_softintr() at netbsd:xhci_softintr+0x5d7
[6.459749] xhci_poll() at netbsd:xhci_poll+0x37
[6.459749] ukbd_cngetc() at netbsd:ukbd_cngetc+0x113
[6.459749] wskbd_cngetc() at netbsd:wskbd_cngetc+0xc8
[6.459749] wsdisplay_getc() at netbsd:wsdisplay_getc+0x2f
[6.459749] cngetc() at netbsd:cngetc+0x4d
[6.459749] cngetsn() at netbsd:cngetsn+0x71
[6.459749] setroot() at netbsd:setroot+0x46f
[6.459749] main() at netbsd:main+0x4a5
[6.459749] cpu0: End traceback...
[6.459749] fatal breakpoint trap in supervisor mode
[6.459749] trap type 1 code 0 rip 0x8021de15 cs 0x8 rflags 0x202 cr2 0 ilevel 0x8 rsp 0x81451a70
[6.459749] curlwp 0x81020920 pid 0.1 lowest kstack 0x8144d2c0
fatal protection fault in supervisor mode
[6.459749] trap type 4 code 0 rip 0x8087caea cs 0x8 rflags 0x10282 cr2 0 ilevel 0x8 rsp 0x81451480
[6.459749] curlwp 0x81020920 pid 0.1 lowest kstack 0x8144d2c0
rebooting...

I'm not sure why this kernel is calling cngetsn() at setroot() time. Has anyone seen this before?

--Phil
mutex question
Hello, The FreeBSD 802.11 code is using a call to mtx_sleep(). The define is:

#define mtx_sleep(chan, mtx, pri, wmesg, timo) \
        _sleep((chan), &(mtx)->lock_object, (pri), (wmesg), \
            tick_sbt * (timo), 0, C_HARDCLOCK)

Just in case I can save time by getting an answer by asking before digging deep ... does anyone know what I should translate this to in NetBSD? Our mutex routines do not appear to have any similar call.

--Phil
Re: new errno ?
On Friday 06 July 2018 15:59:12 Jason Thorpe wrote:
> Anyway... in what situations is this absurd error code used in the 802.11
> code? EFAULT seems wrong because it means something very specific.

The code is in ieee80211_output.c and says:

        /* locate destination node */
        switch (wh->i_fc[1] & IEEE80211_FC1_DIR_MASK) {
        case IEEE80211_FC1_DIR_NODS:
        case IEEE80211_FC1_DIR_FROMDS:
                ni = ieee80211_find_txnode(vap, wh->i_addr1);
                break;
        case IEEE80211_FC1_DIR_TODS:
        case IEEE80211_FC1_DIR_DSTODS:
                ni = ieee80211_find_txnode(vap, wh->i_addr3);
                break;
        default:
                senderr(EDOOFUS);
        }

I agree, EINVAL sounds closer. Thanks.

--Phil
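As a rough illustration of the mapping discussed in this thread, the sketch below defines EDOOFUS as an alias for EINVAL on systems that lack it, so imported FreeBSD code can keep using the name. This is an assumption about how one might do it, not what was committed anywhere. The function classify_direction() and the DIR_* constants are hypothetical stand-ins for the real switch on the frame-control direction bits.

```c
#include <assert.h>
#include <errno.h>

/*
 * EDOOFUS is FreeBSD-specific.  Where it does not exist, fall back to
 * EINVAL (the choice suggested in this thread; an assumption here).
 */
#ifndef EDOOFUS
#define EDOOFUS EINVAL
#endif

/* Hypothetical direction codes, for illustration only. */
enum { DIR_A = 0, DIR_B = 1, DIR_C = 2, DIR_D = 3 };

/*
 * Return 0 for a recognised direction, or EDOOFUS for anything else.
 * The default case should be unreachable when the caller masks the
 * value properly, which is why it counts as a programming error.
 */
static int
classify_direction(int dir)
{
        assert(EDOOFUS != 0);   /* an error code must be non-zero */

        switch (dir) {
        case DIR_A:
        case DIR_C:
        case DIR_B:
        case DIR_D:
                return 0;
        default:
                return EDOOFUS;
        }
}
```

The #ifndef guard keeps the same source usable on FreeBSD, where the real EDOOFUS is already defined and the fallback is skipped.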
Re: new errno ?
On Friday 06 July 2018 12:09:55 Greg Troxel wrote: > I might just map it to EFAULT or EINVAL. I like this suggestion. EFAULT --Phil
new errno ?
Hello, In working on the 802.11 refresh, I ran into a new errno code from FreeBSD: #define EDOOFUS 88 /* Programming error */ Shall we add this one? (Most likely with a different number since 88 is taken in the NetBSD errno.h.) I could use EPROTO instead, but --Phil
Re: NetBSD 8.0 RC1 issue
On Wednesday 25 April 2018 12:40:32 Tom Spindler (moof) wrote:
> > I'm a little late in trying 8.0 ... but I just tried to install 8.0 RC1 on a
> > Dell Optiplex 745. During boot it blanks the display and nothing can
> > be seen on the display from that point on.
>
> IIRC, I had similar issues with going from 6.x to 7.x; in my /etc/boot.cfg
> I had to change the entries along the lines of
>
> menu=Boot normally:rndseed /var/db/entropy-file;vesa 1024x768x8;boot netbsd
> menu=Drop to boot prompt:vesa 1024x768x8;prompt
>
> Alternately, "vesa on" supposedly works, but I never got that to work
> consistently.

None of that worked for me. The "vesa on" mode blanked the display faster than the regular boot. I tried a number of the vesa settings to no avail. I also tried 8.0 RC1 on a Dell 990 and it wouldn't even boot. 6.1.5 is running just great on that machine, but I couldn't get the amd64 8.0 RC1 to even find a root device when booting. *sigh*

--Phil
NetBSD 8.0 RC1 issue
I'm a little late in trying 8.0 ... but I just tried to install 8.0 RC1 on a Dell Optiplex 745. During boot it blanks the display and nothing can be seen on the display from that point on. The vga/display parts of the dmesg from 6.1.5 say:

vga1 at pci0 dev 2 function 0: vendor 0x8086 product 0x2992 (rev. 0x02)
vga1: WARNING: ignoring 64-bit BAR @ 0x18
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
i915drm0 at vga1: Intel i965Q
i915drm0: AGP at 0xc000 256MB
i915drm0: Initialized i915 1.6.0 20080730
vendor 0x8086 product 0x2993 (miscellaneous display, revision 0x02) at pci0 dev 2 function 1 not configured

The monitor is a 1920x1200 Dell. I was expecting it to switch from large characters to small characters at that point like it does on other recent hardware using similar sized monitors.

--Phil
Re: meltdown
On Thursday 04 January 2018 12:49:22 m...@netbsd.org wrote:
> I wonder if we can count the number of SEGVs and if we get a few, turn
> on the workaround?

How about turning on the workaround for any process that ignores or catches SEGV? Any process that is terminated by a SEGV should be safe, shouldn't it?

--Phil -- Phil Nelson, http://pcnelson.net
File systems on 4k sector devices?
Hello, I recently acquired a 3TB USB disk that attaches as a sd0 disk. disklabel reports that it has 4k sectors. I can write a disklabel on the disk, but newfs can not create a fs.

(parts of the disk label)
bytes/sector: 4096
sectors/track: 32
tracks/cylinder: 64
sectors/cylinder: 2048
cylinders: 357698
total sectors: 732566642

 d: 732566642 0 unused 0 0 # (Cyl. 0 - 357698*)
 e: 563 4.2BSD 1024 8192 0 # (Cyl. 0*- 24*)
 f: 732516579 50063 4.2BSD 4096 32768 0 # (Cyl. 24*- 357698*)

bash-4.1# newfs -S 4096 /dev/rsd0e
/dev/rsd0e: 195.3MB (5 sectors) block size 32768, fragment size 4096
        using 4 cylinder groups of 48.84MB, 1563 blks, 3072 inodes.
rdfs: read error for sector 1: Invalid argument
bash-4.1# newfs -S 4096 /dev/rsd0f
/dev/rsd0f: 2861392.9MB (732516579 sectors) block size 32768, fragment size 4096
        using 3860 cylinder groups of 741.31MB, 23722 blks, 47104 inodes.
wtfs: write error for sector 732516578: Invalid argument

I'm running 5.1/i386. (The e: above was just to try a small size.) Is there support for 4K sectors in 5.1? I've read that other people have successfully created file systems on 4k sector devices. So, what am I doing wrong?

--Phil -- Phil Nelson, http://pcnelson.net life: http://goallpower.com
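As a sanity check on the numbers in this post, the small sketch below redoes the sector arithmetic with 4096-byte sectors. The helper names are made up for illustration. The inputs are the "total sectors" value and the size and offset of partition f from the label above. Multiplying f's size by 4096 and dividing by 2^20 gives about 2861392.9 MB, which agrees with what newfs printed. f's offset plus its size also equals the total sector count, so the sector newfs failed to write (732516578, relative to f) is the very last sector of the disk. This only checks the arithmetic and says nothing about why the write fails.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a sector count to bytes for a given sector size. */
static uint64_t
sectors_to_bytes(uint64_t sectors, uint32_t sector_size)
{
        assert(sector_size > 0);
        return sectors * (uint64_t)sector_size;
}

/* Express a byte count in units of 2^20 bytes, as newfs reports sizes. */
static double
bytes_to_mb(uint64_t bytes)
{
        return (double)bytes / (1024.0 * 1024.0);
}

/* First sector past the end of a partition (offset + size). */
static uint64_t
partition_end(uint64_t offset, uint64_t size)
{
        return offset + size;
}

/* Absolute sector number of the last sector inside a partition. */
static uint64_t
partition_last_sector(uint64_t offset, uint64_t size)
{
        assert(size > 0);
        return offset + size - 1;
}
```

For example, partition_end(50063, 732516579) can be compared against the label's total of 732566642 sectors to confirm that f runs to the end of the disk.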