Re: how do I preset ddb's LINES to zero
On Sat, Dec 16, 2023 at 02:58:59PM +1100, matthew green wrote:
> Andrew Cagney writes:
> > > > thanks, I'll add that (it won't help with my immediate problem of a
> > > > panic during boot though)
> > >
> > > From DDB command prompt "set $lines = 0" ...
> >
> > Um, the test framework's VM is stuck waiting for someone to hit the
> > space bar :-)
> >
> > I guess I could modify my pexpect script to do just that, but I was
> > kind of hoping I could do something like add ddb.lines=0 to the boot
> > line.
>
> try "options DB_MAX_LINE=0" in your kernel?
>
> we have poor boot-command line support if you compare against
> say what linux can do.

I have added to userconf(4) (this has not been merged in NetBSD)
support for "aliases" (variables that can be macros), patterns, etc.

Support has been added to config(1) to generate "commands" to be
interpreted by userconf(4) at start-up time. userconf(4) interprets
whatever has been added by config(1); then whatever is passed by the
bootloader; and then, perhaps, enters an interactive session if the -c
flag was given. What is added via config(1) is always interpreted.

It wouldn't be difficult to add to userconf(4) a command to set such
parameters. The user could then, at will, add such "commands" via
config(1) to be interpreted at start-up time, pass them from the
bootloader, or type them in a userconf(4) interactive session.

userconf(4), being M.I., is the correct place to add these. And the
majority of the work has already been done to allow such extensions
(see https://github.com/tlaronde/BeSiDe for the code).

--
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
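The start-up order described in this message (config(1) commands always, then bootloader commands, then an interactive session only with -c) can be sketched as below. This is only an illustration: uc_run(), uc_startup() and the log array are hypothetical names, not the functions of the modified userconf(4), and the "interpreter" merely records each line.

```c
#include <stddef.h>

#define UC_MAXLOG 16

/* Hypothetical log of the command lines seen, in the order they ran. */
static const char *uc_log[UC_MAXLOG];
static size_t uc_nlog;

/* Stand-in for the real interpreter: it only records the line. */
static void
uc_run(const char *line)
{
	if (uc_nlog < UC_MAXLOG)
		uc_log[uc_nlog++] = line;
}

/*
 * Run the three sources in the order given in the message.
 * Each array is NULL-terminated; cflag models the kernel -c flag.
 */
static void
uc_startup(const char *const *kconf, const char *const *boot,
    const char *const *typed, int cflag)
{
	const char *const *p;

	for (p = kconf; *p != NULL; p++)	/* from config(1): always */
		uc_run(*p);
	for (p = boot; *p != NULL; p++)		/* from the bootloader */
		uc_run(*p);
	if (cflag)				/* interactive session */
		for (p = typed; *p != NULL; p++)
			uc_run(*p);
}
```

Called without -c, only the compiled-in and bootloader lines are recorded; with -c the typed lines come last.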
Re: __futex(2): use outside Linux compat
On Mon, Dec 11, 2023 at 01:00:38PM +, Robert Swindells wrote:
>
> tlaro...@kergis.com wrote:
> > In Mesa code implementations for futex_wake() and futex_wait() are
> > provided for Linux, Windows, FreeBSD and OpenBSD.
> >
> > There is a __futex(2) syscall in NetBSD, used only for now, if I'm not
> > mistaken, to implement Linux compat.
>
> The Linux emulation of futexes in NetBSD does not work correctly.
>
> > Is it OK to use for NetBSD "native" code since it is not "advertised"
> > by a man page?
>
> No.

OK, thanks for the clarification.

To state the problem: NetBSD userland is probably the sole user of the
non-futex code in Mesa. Userland on NetBSD therefore doesn't follow the
same code path as the same apps on other OSes, and this code
(!UTIL_FUTEX_SUPPORTED) is a second-class citizen, since the main
development (Linux) takes another path. So it could be that the apps
(the various Mesa libs components) are exercising bugs in this part:
the "tearing" or "threaded" rendering (incorrect lines in a window)
that can be observed on NetBSD in certain circumstances would then be
caused by user-level concurrent accesses, and not by kernel cache
problems. (There have been reports that these defects decrease under
heavy load; this is perhaps only because, under heavy load, fewer
threads run concurrently for the X clients, so they have no occasion
to trash shared zones that should normally be protected by futexes.)

So, 3 options:

1) Fix the futex support on NetBSD (the "ideal" solution, but quite
   involved, at least for me);

2) Debug the non-futex code in Mesa (meaning only finding out whether
   the problems seen can come from there);

3) Let it be for now...

I will probably opt for 3), since I wanted to debug Mesa for other, more
disastrous infelicities (crashes with xine(1) or vlc(1), and probably
others, since this comes from the Mesa libs and probably not from the
way the API is used in the clients).
__futex(2): use outside Linux compat
In Mesa code, implementations for futex_wake() and futex_wait() are
provided for Linux, Windows, FreeBSD and OpenBSD.

There is a __futex(2) syscall in NetBSD, used only for now, if I'm not
mistaken, to implement Linux compat.

Is it OK to use for NetBSD "native" code since it is not "advertised"
by a man page?
Re: Puzzling crash and strange reporting Re: ATI video card not recognized
On Thu, Dec 07, 2023 at 03:36:41PM +0100, Reinoud Zandijk wrote:
> Hi,
>
> On Wed, Dec 06, 2023 at 09:52:54AM +0100, Reinoud Zandijk wrote:
> > On Mon, Dec 04, 2023 at 04:15:39PM +0100, Reinoud Zandijk wrote:
> > > On Tue, Jun 23, 2020 at 01:26:21PM +0200, Reinoud Zandijk wrote:
> > > > my old videocard died and I replaced it with a slightly newer one
> > > > but it isn't recognized and nothing other than vga0 attaches. Its
> > > > an Gigabyte Radeon RX460 with 2 GB ram.
> > > >
> > > > 002:00:0: ATI Technologies Radeon RX460 (VGA display, revision 0xcf)
> > > > 002:00:1: ATI Technologies Radeon RX 460/550/640SP, RX 560/560X HD
> > > > Audio Controller (mixed mode multimedia)
> > >
> > > Back again :) I tried out the videocard again in 10.0 i(beta) and got
> > > a lot further. However I still stumble on a panic when starting X :
>
> A puzzling report and a worrysome crash occured while resizing a Firefox
> window:
>
> ...
> [ 1.00] NetBSD 10.99.10 (GENERIC) #0: Mon Dec 4 16:01:51 CET 2023
> [ 1.00] rein...@gorilla.13thmonkey.org:/usr/sources/cvs.netbsd.org/src-clean/sys/arch/amd64/compile/obj/GENERIC
> [ 1.00] total memory = 65456 MB
> [ 1.00] avail memory = 63301 MB
> ...
> [ 4.627885] kern.module.path=/stand/amd64/10.99.10/modules
> [ 4.640006] [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67EF 0x1458:0x22D6 0xCF).
> [ 4.640006] [drm] register mmio base: 0xFCE0
> [ 4.640006] [drm] register mmio size: 262144
> [ 4.640006] [drm] PCIE atomic ops is not supported
> [ 4.640006] [drm] add ip block number 0
> [ 4.640006] [drm] add ip block number 1
> [ 4.640006] [drm] add ip block number 2
> [ 4.640006] [drm] add ip block number 3
> [ 4.640006] [drm] add ip block number 4
> [ 4.640006] [drm] add ip block number 5
> [ 4.640006] [drm] add ip block number 6
> [ 4.648106] [drm] add ip block number 7
> [ 4.648106] [drm] add ip block number 8
> [ 4.807888] ATOM BIOS: 113-TIC15322-X01
> [ 4.807888] [drm] UVD is enabled in VM mode
> [ 4.807888] [drm] UVD ENC is enabled in VM mode
> [ 4.807888] [drm] VCE enabled in VM mode
> [ 4.807888] [drm] vm size is 256 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
> [ 4.818504] amdgpu0: VRAM: 2048M 0x00F4 - 0x00F47FFF (2048M used)
> [ 4.818504] amdgpu0: GART: 256M 0x00FF - 0x00FF0FFF
> [ 4.818504] [drm] Detected VRAM RAM=2048M, BAR=256M
> [ 4.818504] [drm] RAM width 128bits GDDR5
> [ 4.818504] Zone kernel: Available graphics memory: 9007199252279140 KiB
> ?

For this one, see my message on the list (today), subject:

DRMKMS: bug in pseudo linux si_meminfo
DRMKMS: bug in pseudo linux si_meminfo
When initializing drmkms, the kernel prints bogus things like:

[ 4.193896] Zone kernel: Available graphics memory: 9007199254113272 KiB
[ 4.193896] Zone dma32: Available graphics memory: 2097152 KiB

The reason is to be found in

sys/external/bsd/drm2/include/linux/mm.h

which fills a pseudo Linux sysinfo struct (limited to the members
used). But:

- Linux sysinfo(2) specifies that totalram is in bytes, while
  totalhigh is in pages. In mm.h, totalram is initialized in pages (not
  bytes) and totalhigh is defined from kernel_map->size, that is, a
  virtual address (?), converted to pages;

- then in:

  sys/external/bsd/drm2/dist/drm/ttm/ttm_memory.c:320

  mem = si->totalram - si->totalhigh;

The problem is that this is subtracting apples from oranges. On my
node I have these (added aprint_*):

[ 4.224447] si_meminfo: totalram: 1756268; totalhigh: 8479211520; memunit: 4096

It's clear that totalram (pages) - totalhigh (changed to pages, but of
virtual memory) leads to a negative result which, once cast to
unsigned long long, yields the bogus number seen.

Furthermore, when setting zone->max_mem, the memory is divided by two
(>> 1). But why? Is it to force reserving at most half of what is
available for graphics? A comment explaining the reason would be
welcome.

This explains the number found for dma32: since the available memory
exceeds 2^32, 2^32 is taken as the max but, once more, divided by 2.

Does somebody know the Linux guts well enough to clarify what totalhigh
refers to? (Certainly not a virtual address.)

Isn't it dangerous to change the "units" of totalram (bytes in Linux,
but pages here)? I have not traced the use of the pseudo structure in
the remaining code, but if the values are used elsewhere in the
drivers, this is likely to wreak havoc in the Linux code.
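The wraparound described in this message can be reproduced in isolation. The sketch below only mirrors the subtraction quoted from ttm_memory.c on 64-bit unsigned values; zone_mem_pages() is a hypothetical helper, and the later scaling and shifting done by the driver is deliberately left out.

```c
#include <stdint.h>

/*
 * Same shape as "mem = si->totalram - si->totalhigh" with unsigned
 * 64-bit operands: when totalhigh > totalram, the result wraps
 * modulo 2^64 instead of going negative.
 */
static uint64_t
zone_mem_pages(uint64_t totalram, uint64_t totalhigh)
{
	return totalram - totalhigh;
}

/*
 * With the values printed in the message (totalram 1756268,
 * totalhigh 8479211520) the difference is -8477455252, which is
 * stored as 2^64 - 8477455252 = 18446744065232096364.
 */
```

Any later arithmetic on such a value (multiplication by the page size, shifts) operates on the wrapped number, which is how an absurdly large "available memory" figure can appear.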
Kernel startup: the blob (animal) approach
When the kernel panics, it can enter ddb or reboot. There could be
another mode implementing a blob (animal) approach this way:

At startup, userconf(4) [1] parses an array of instructions compiled in
by config(1), then processes, for archs supporting this, instructions
passed by bootinfo, and eventually enters an interactive session.

For this interactive session, userconf(4) could register the modifying
commands in a memory zone accessible by the other routines (at the
moment, there is a history recorded, but in a static array with no use
at all).

When the kernel continues, before climbing down a dev node, a "disable
this" (shortened as "D this") could be added to this shared zone. If
everything goes well, the next node will overwrite the instruction with
its own.

If something goes wrong, the panic will add a "print " (shortened as
"P ") and there could be a third mode: instead of entering ddb or
rebooting, the kernel restarts. It is not reloaded: it restarts from
the beginning.

Userconf then replays: the instructions compiled in by config(1), the
bootinfo ones, and, instead of an interactive session, the instructions
in the shared zone, thus disabling the offending device.

Then the kernel continues and will worm its way around this panicking
point until, eventually, reaching userland, where remote connections
can be made to display what instructions (and what debugging
information) are in the shared zone.

So: trial and error but, on error, trying another path. Or, instead of
trying artificial intelligence, trying the natural one.

[1] This is the userconf(4) I have modified:
https://github.com/tlaronde/netbsd-src/tree/tsjl
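The restart loop proposed in this message can be modelled in a few lines of user-space C. Everything here is hypothetical (struct zone, boot_once(), demo_attach() and the fixed device count are invented for the illustration); a "panic" is modelled by an attach function returning false, and the shared zone by a structure that survives from one attempt to the next.

```c
#include <stdbool.h>

#define NDEV 4

/* Hypothetical shared zone that survives a restart. */
struct zone {
	bool disabled[NDEV];	/* devices disabled by earlier replays */
	int probing;		/* device noted before attaching, -1 if none */
};

/* Hypothetical attach routine: returning false models a panic. */
typedef bool (*attach_fn)(int dev);

/* One boot attempt; true means userland would be reached. */
static bool
boot_once(struct zone *z, attach_fn attach)
{
	for (int dev = 0; dev < NDEV; dev++) {
		if (z->disabled[dev])
			continue;
		z->probing = dev;	/* the "D this" note */
		if (!attach(dev))
			return false;	/* panic: the note stays in the zone */
	}
	z->probing = -1;		/* nothing left to blame */
	return true;
}

/* Restart until userland is reached; returns the number of restarts. */
static int
boot_until_userland(struct zone *z, attach_fn attach)
{
	int restarts = 0;

	while (!boot_once(z, attach)) {
		z->disabled[z->probing] = true;	/* replay "D this" */
		restarts++;
	}
	return restarts;
}

/* Demo only: device 2 always "panics". */
static bool
demo_attach(int dev)
{
	return dev != 2;
}
```

Each restart disables exactly one device, so the loop ends after at most NDEV restarts in this model.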
[PATCHES] config(1) / cons(9) / userconf(4) : Extensions
Code is at:

https://github.com/tlaronde/netbsd-src/tree/tsjl

in 3 commits:

- config(1): it now also accepts context-neutral "userconf"
  directives. These add the double-quoted string given as argument to
  the userconf_kconf[] array. This array is interpreted by userconf(4)
  during startup (see below). Typically, MI userconf instructions
  should go in sys/conf/std, like these:

#
# Userconf MI aliases.
#
userconf "alias azerty qaQAwzWZaqAQ;m:MzwZWm,M?,;<..:>/"
# start to define an executable macro "fr"
# - printing a message
userconf "alias -c fr print changing to pseudo-fr kbd mapping"
# - mapping from def of azerty (-a azerty) and mapping * and -
userconf "alias -c fr kmap -a azerty `*~-"
# - printing a hint: * and - are mapped at the upper left key
userconf "alias -c fr p * and - are mapped to upper left ^2"
# Here, another macro: the drmkms alias is defined in MD code
userconf "alias -c nodrmkms disable -a drmkms"

  and then, in the kernel config, MD directives can be added, for
  example to define drmkms (an alias; each instruction creates or adds
  to the definition):

# DRMKMS drivers
i915drmkms*     at pci? dev ? function ?
intelfb*        at intelfbbus?
userconf "alias drmkms i915drmkms*"

radeon*         at pci? dev ? function ?
radeondrmkmsfb* at radeonfbbus?
userconf "alias drmkms radeon*"

#amdgpu*        at pci? dev ? function ?
#amdgpufb*      at amdgpufbbus?

nouveau*        at pci? dev ? function ?
nouveaufb*      at nouveaufbbus?
userconf "alias drmkms nouveau*"

- cons(9): two new routines, cnmapreset() and cnmap(), allow a "late"
  mapping of chars in the startup console (works only with cnget*()),
  allowing a kind of keyboard mapping for use during this step;

- userconf(4): in order for interaction, and for the config(1)
  generated userconf_kconf[] array of instructions, to be more useful,
  a lot of things have been added to userconf(4):

  o At init time, userconf interprets instructions (cmdlines) in
    userconf_kconf[] (generated by config(1)) before processing
    bootinfo directives and, perhaps, entering an interactive session
    if the "-c" flag was passed to the kernel;

  o aliases: one can create aliases, including executable ones
    (macros). Userconf does its own alloc/free stuff for this;
    => userconf_parse() thus handles taking the definition of aliases
    and recursing for macros;

  o patterns: one can select devices using patterns. This works for
    change, disable, enable, find and list;

  o new built-ins:
    * aliases: create or add definition to an alias (that can be
      executable); allocated;
    * kmap: maps characters on the console (calling the routines added
      to cons(9)), allowing a kind of keyboard mapping for non US
      ASCII keyboards;
    * print: echoes tokens, including dereferencing of aliases;
    * unalias: delete an alias; freed;
    * vis: visualize (show) the definition of an alias
      (uninterpreted)---show and 'S' were not chosen, to keep 'S' for
      "set" in the future; see FUTURE DIRECTIONS;
    * debug0: display config(1) added instructions parsed at startup
      time;
    * debug1: display debugging information about userconf memory and
      structures allocations;
    * debug2: display debugging information about userconf defined
      aliases;

  o Ergonomics: in order to limit the number of characters that have
    to be typed:
    * input is case insensitive;
    * built-ins can be given with a single letter key (in all cases
      but one, this is the initial); a macro is at least two chars,
      starting with a letter. Single letters are reserved for
      built-ins;
    * no special character is needed for a pattern or an alias: a flag
      has to be given, with a hyphen and a letter, to change the
      interpretation of the next token (this was proposed by RVP).

  o FUTURE DIRECTIONS: I have reserved 'S' for set: a lot of things presently in MD
userconf_parse() return status
With some delay, I'm finishing the modification of cons/userconf/config
(having implemented more in userconf than initially projected):

* aliases, hence local malloc/free;
* executable aliases (macros without parameters, but multiple lines are
  possible, meaning that one can define drmkms as an alias with a list
  of devices and one can define "nodrmkms" as "disable -a drmkms", for
  example);
* multiple arguments (instead of only one), aliases being replaced and
  their definition parsed;
* patterns;
* kmap (char mapping for console input);
* single letter support for built-ins (case insensitive: 'e'|'E' for
  enable and so on).

Questions about current usage:

It is not obvious, but userconf_parse() was a function returning
something: 0 in general (including on error) and -1 when quitting, the
latter being used (not obviously) to leave interactive mode
(kernel -c).

There are two outside usages of userconf_parse():

sys/arch/x86/x86/x86_userconf.c
sys/dev/fdt/fdt_userconf.c

where the return status is not tested (it should be, since one could
imagine adding a "quit" or 'Q' in the series of instructions to
"comment out" the remaining instructions). So I will correct these.

I have modified userconf_parse() to return negative on error, 0 if OK,
and 1 if quitting. Is this OK? (Even if this is not of great use,
returning different values---here: negative ones---on error documents
the code.)

The other question concerns the "history" in userconf. I have corrected
a blunder (a 'd' as "command" where an 'e'---enable---was expected) and
I have added single letter support. But if the history is now correct
and could be executed (with support for single letters), it is not
accessible to the user. So was this intended to record what was done at
boot time for post-mortem or debugging purposes (which it seems), or
was it intended for interactive user comfort? I don't think the latter,
because I fail to see a benefit: the user will not repeat the same
command again and again...

Does someone know the history of... "history"?
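The proposed return convention (negative on error, 0 if OK, 1 when quitting) suggests a caller of the following shape. This is a sketch under assumptions: uc_parse() is a toy stand-in, not userconf_parse(), and its rules (empty line is an error, a line starting with 'q' or 'Q' quits) are invented for the example.

```c
#include <stddef.h>

enum { UC_ERR = -1, UC_OK = 0, UC_QUIT = 1 };

/* Toy stand-in for userconf_parse() using the proposed convention. */
static int
uc_parse(const char *line)
{
	if (line[0] == '\0')
		return UC_ERR;		/* nothing to parse */
	if (line[0] == 'q' || line[0] == 'Q')
		return UC_QUIT;		/* "quit" or 'Q' */
	return UC_OK;
}

/*
 * Run a NULL-terminated series of instructions and test the status
 * of each: a quit "comments out" the remaining instructions, an
 * error is counted and the series goes on.
 * Returns the number of instructions that succeeded.
 */
static int
uc_run_all(const char *const *lines, int *nerrors)
{
	int done = 0;

	*nerrors = 0;
	for (; *lines != NULL; lines++) {
		int rc = uc_parse(*lines);

		if (rc > 0)
			break;
		if (rc < 0) {
			(*nerrors)++;
			continue;
		}
		done++;
	}
	return done;
}
```

Whether an error should stop the series or only be counted is a policy choice; the sketch continues, which is one possibility among others.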
cons(9): char mapping
For userconf at least, I have added a means to define a char mapping
for the console during startup. The userconf char-to-char mapping
command will be "kmap", key 'k'; a series of instructions, assembled by
config(1), will be processed during userconf_init() before
userconf_bootinfo(), allowing one to add a pseudo keyboard mapping in
the kernel config for early interaction.

Attached is the diff.

Comments?

diff --git a/share/man/man9/cons.9 b/share/man/man9/cons.9
index 42db38b25d5b..af96362d8f32 100644
--- a/share/man/man9/cons.9
+++ b/share/man/man9/cons.9
@@ -24,11 +24,13 @@
 .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 .\" POSSIBILITY OF SUCH DAMAGE.
 .\"
-.Dd June 8, 2010
+.Dd November 16, 2023
 .Dt CONS 9
 .Os
 .Sh NAME
 .Nm cnbell ,
+.Nm cncmap ,
+.Nm cncmapreset ,
 .Nm cnflush ,
 .Nm cngetc ,
 .Nm cngetsn ,
@@ -41,6 +43,10 @@
 .Ft void
 .Fn cnbell "u_int pitch" "u_int period" "u_int volume"
 .Ft void
+.Fn cncmap "u_char from" "u_char to"
+.Ft void
+.Fn cncmapreset "void"
+.Ft void
 .Fn cnflush "void"
 .Ft int
 .Fn cngetc "void"
@@ -80,10 +86,17 @@ milliseconds at given
 Note that the
 .Fa volume
 value is ignored commonly.
+.It Fn cncmap
+Maps a char to another char. The mapping is only used inside
+.Fn cngetc
+and the facility exists to allow a kind of keyboard mapping during
+startup interaction. The nul char is never mapped.
+.It Fn cncmapreset
+Resets the char mapping to the identity mapping.
 .It Fn cnflush
 Waits for all pending output to finish.
 .It Fn cngetc
-Poll (busy wait) for an input and return the input key.
+Poll (busy wait) for an input and return the mapped input key.
 Returns 0 if there is no console input device.
 .Fn cnpollc
 .Em must
@@ -154,6 +167,7 @@ cnpollc(0);
 .Xr pckbd 4 ,
 .Xr pcppi 4 ,
 .Xr tty 4 ,
+.Xr userconf 4 ,
 .Xr wscons 4 ,
 .Xr wskbd 4 ,
 .Xr printf 9 ,
diff --git a/sys/dev/cons.c b/sys/dev/cons.c
index f3a2387fbceb..cc8db394b339 100644
--- a/sys/dev/cons.c
+++ b/sys/dev/cons.c
@@ -95,6 +95,8 @@ struct	tty *volatile constty;	/* virtual console output device */
 struct	consdev *cn_tab;	/* physical console device info */
 struct	vnode *cn_devvp[2];	/* vnode for underlying device. */
 
+static unsigned char cn_cmap[UCHAR_MAX+1]; /* char mapping for cngetc() */
+
 void
 cn_set_tab(struct consdev *tab)
 {
@@ -109,6 +111,15 @@ cn_set_tab(struct consdev *tab)
 	 * cn_tab updates.
 	 */
 	cn_tab = tab;
+
+	/*
+	 * Char mapping is only done in cngetc() i.e. in kernel
+	 * startup when the console is not a tty. Assuming here that
+	 * if there were more than one console, there would be a
+	 * different terminal, that is a different keyboard attached
+	 * to the console so a different mapping.
+	 */
+	cncmapreset();
 }
 
 int
@@ -315,6 +326,29 @@ cnkqfilter(dev_t dev, struct knote *kn)
 	return error;
 }
 
+void
+cncmapreset(void)
+{
+	unsigned int c;	/* wider than u_char, so that c <= UCHAR_MAX can fail */
+
+	/* Consistency, a keyboard is supposed attached to a cons */
+	if (cn_tab == NULL)
+		return;
+
+	for (c = 0; c <= UCHAR_MAX; c++)
+		cn_cmap[c] = c;
+}
+
+void
+cncmap(unsigned char from, unsigned char to)
+{
+	if (cn_tab == NULL)
+		return;
+
+	if (from) /* Nul is never mapped */
+		cn_cmap[from] = to;
+}
+
 int
 cngetc(void)
 {
@@ -325,7 +359,9 @@ cngetc(void)
 		const int rv = (*cn_tab->cn_getc)(cn_tab->cn_dev);
 		if (rv >= 0) {
 			splx(s);
-			return rv;
+			/* Nul is never mapped */
+			return (rv && rv <= UCHAR_MAX)?
+				(int) cn_cmap[(unsigned char)rv] : rv;
 		}
 		docritpollhooks();
 	}
diff --git a/sys/dev/cons.h b/sys/dev/cons.h
index 9fed7cb0eb00..aba8def6a743 100644
--- a/sys/dev/cons.h
+++ b/sys/dev/cons.h
@@ -79,6 +79,8 @@ extern	struct consdev *cn_tab;
 void	cn_set_tab(struct consdev *);
 
 void	cninit(void);
+void	cncmapreset(void);
+void	cncmap(unsigned char, unsigned char);
 int	cngetc(void);
 int	cngetsn(char *, int);
 void	cnputc(int);
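The mapping logic of the patch can be tried outside the kernel with a user-space sketch like the one below. The names cmap_reset(), cmap_set() and cmap_apply() are hypothetical stand-ins for the routines in the diff, and the cn_tab checks are left out. One point worth noting when writing the reset loop: a counter of type unsigned char can never exceed UCHAR_MAX, so a loop bounded by "c <= UCHAR_MAX" needs a wider counter to terminate.

```c
#include <limits.h>

/* Identity-initialised table, one entry per possible byte value. */
static unsigned char cmap[UCHAR_MAX + 1];

static void
cmap_reset(void)
{
	/*
	 * The index is an unsigned int on purpose: an unsigned char
	 * would wrap from UCHAR_MAX back to 0 and the loop would
	 * never end.
	 */
	for (unsigned int c = 0; c <= UCHAR_MAX; c++)
		cmap[c] = (unsigned char)c;
}

/* Map one char to another; nul is never mapped. */
static void
cmap_set(unsigned char from, unsigned char to)
{
	if (from != 0)
		cmap[from] = to;
}

/* Apply the mapping to a value as returned by a getc-like routine. */
static int
cmap_apply(int rv)
{
	return (rv > 0 && rv <= UCHAR_MAX) ? (int)cmap[rv] : rv;
}
```

Values outside 1..UCHAR_MAX (nul, negative error codes, wider key codes) pass through unchanged, as in the cngetc() hunk.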
Re: [RFC 2] userconf(4): 2nd proposal
FWIW, various things I have modified can be seen here:

https://github.com/tlaronde/netbsd-src/tree/tsjl

The userconf version present at the moment on the published branch was
my first attempt (patterns introduced by slashes: working, but not
solving the problem about drmkms). There are other bits listed at the
root in CHANGES.tsjl (the other files, WIP.tsjl and GOALS.tsjl, are
self-explanatory, but GOALS is empty at the moment except for the title
and the date, and will have to be filled in when the documentation is
ready).

The next step will be to implement patterns and groups for
userconf/config(1), but with a revised syntax as proposed by RVP. For
now, I will go with groups defined by config(1), without allowing
variables (aka aliases): first, to avoid implementing custom
allocation; second, to allow the feature to be of use for all archs,
and not only the ones using boot.cfg.

I will probably not report on the list when this is done, or even about
what I will be doing next: this will be explained in the files
mentioned above and the sources will be published in the branch. If
somebody finds something useful, just cherry-pick (my work published
here is under a 2-clause BSD licence).
Re: [RFC 2] userconf(4): 2nd proposal
On Sun, Nov 05, 2023 at 11:17:02AM +, RVP wrote:
>
> Oh, I like the idea (I've always wanted a mechanism to list drivers
> etc. using patterns); it's just the syntax that sticks in the craw.
> Too many meta-chars. there.
>
> OTOH, `cmd -p xyz* *abc' doesn't need much thought. And, aliases
> are pretty standard too. But, this is your show, n'est pas...?
> Don't let me stop you!

I like this more: flags introduced by '-', since if a flag is not a
number there is no ambiguity with negative numbers (allowed for the
"more" builtin facility).

So -p would mean pattern, -g use groups instead of driver names, -pg
apply a pattern to group names, and even -s meaning STARred (-pgs,
letters in whatever order, meaning apply a pattern to group names for
STARred devices)... And without flags, this is the present syntax
untouched.

This will also address the legitimate concern of Staffan Thomén about
keyboard mapping: this adds fewer characters, and no special ones.

For variables, I will refrain for the moment, because this would
require adding a fraction of a page (1/4, 1/2 or 1/1) as scratch
"memory" to allocate from, and a simple allocation scheme (a la Unix
version 6 for example: see Lions' Commentary), in order not to allocate
with kernel facilities. Not difficult, not adding much, but if not
necessary...
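The flag scheme discussed in this message (letters after a hyphen, in any order, and no clash with negative numbers) could be decoded along these lines. The function sel_flags() and the SEL_* bit names are hypothetical, chosen for the illustration only.

```c
enum { SEL_PATTERN = 1, SEL_GROUP = 2, SEL_STAR = 4 };

/*
 * Decode a token such as "-p", "-pg" or "-sgp" into flag bits.
 * Returns -1 when the token is not a flag token: no leading hyphen,
 * a lone hyphen, or any character other than p, g, s. A negative
 * number like "-3" is thus rejected and left for the caller to
 * treat as a number.
 */
static int
sel_flags(const char *tok)
{
	int flags = 0;

	if (tok[0] != '-' || tok[1] == '\0')
		return -1;
	for (const char *p = tok + 1; *p != '\0'; p++) {
		switch (*p) {
		case 'p':
			flags |= SEL_PATTERN;
			break;
		case 'g':
			flags |= SEL_GROUP;
			break;
		case 's':
			flags |= SEL_STAR;
			break;
		default:
			return -1;
		}
	}
	return flags;
}
```

Because the bits are OR-ed together, "-pgs" and "-sgp" decode to the same value, which matches the "letters in whatever order" rule.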
Re: [RFC 2] userconf(4): 2nd proposal
On Sun, Nov 05, 2023 at 01:53:31PM +0200, Staffan Thomén wrote:
> One thing I'd like to point out is that I often find I don't have the
> right keyboard layout or am restricted in some way in from typing in the
> bootloader (glitchy serial connection or really fast repeating keyboard
> or something), so keeping the syntax brief and with as few non-
> alphabetical characters as possible would probably be a good idea.
>
> Just throwing some cents on the pile,

I have been annoyed by that too (a GENERIC kernel has a US qwerty
default compiled in) and I wondered whether a supplementary short
command to switch the mapping, in userconf, would not be convenient too
(no need to deal with accented characters or whatever: just providing
the ASCII chars where the engraving of a different keyboard puts them).

For the extra characters, I think what is accessible on the numpad is
handy (I even had * not accessible with some USB keyboards...). This
leaves the braces (for the groups), which are more problematic.
Re: [RFC 2] userconf(4): 2nd proposal
On Sat, Nov 04, 2023 at 11:59:00AM +0100, Martin Husemann wrote:
> On Sat, Nov 04, 2023 at 11:25:01AM +0100, tlaro...@kergis.com wrote:
> > I think that my second proposal is the simplest, allowing not breaking
> > existing and introducing extensions without much typing.
>
> This whole thing still makes no sense to me. You can do what you want
> with userconf already and this is not a common operation so any simplification
> for something that only makes sense (1) for ad hoc testing or (2) encoded
> in boot.cfg does not gain us anything for real.
>
> For the real world issue at hand (bugs in kernel drivers that claim the
> console but then do not work) either a boot flag (like RB_MD4 on x86)
> or what you call "ad hoc mechanism" makes a lot more sense to me.

An "ad hoc mechanism" would be to construct a list of drivers in order
to disable them as a block, i.e. exactly the same as what can be done
already with userconf if you know the driver names. The only advantage
of this ad hoc solution would be to require a single generic
instruction, instead of several commands to disable them all or the
necessity to know exactly which one to disable.

This is what my proposal is about but, instead of polluting the sources
with an "ad hoc" solution, it adds a feature that can be of more
general use in other cases, for debugging or for disabling a collection
of devices (a group).

So how can you discard my proposal as making "no sense", when your ad
hoc solution is only a variation on the same thing?

Secondly, a more fine-grained solution to disable a portion of the
drivers dealing with the console is more involved (because if it was
not, it would have already been done, no?).

And this is the problem: the drm2/ source is 206 MB (!!!). Our drmkms
sources are already not in sync with the Linux ones (I'm watching them
and there have already been major changes, for i915 and particularly
for amdgpu). So the NetBSD turtle may beat the Linux hare in the end,
but certainly not in a speed race...

And there is the NetBSD 10 release. A definitive, or even only correct,
solution will not be found if 10 has to be released soon. I'm just
proposing something simple enough to improve the "crude solution", on
par with the Linux/GRUB feature. That's the best that can be done for
the time being, due to the size (that's the word...) of the problem...
Re: [RFC 2] userconf(4): 2nd proposal
On Sat, Nov 04, 2023 at 09:30:53AM +, RVP wrote:
> On Sat, 4 Nov 2023, tlaro...@kergis.com wrote:
>
> > > > No...: this is a break of existing. Trailing `*' selects STARred devices
> > > > (I'm not the inventor of this). So `*' can not be used as a joker ;-)
> > > >
> > >
> > > You can allow escapes for those:
> > >
> > > uc> disable i915drmkms\* # exact match STARred
> > > uc> disable *kms\*       # only STARed `*kms'
> > >
> >
> > But this breaks existing...
> >
>
> Fine. You can introduce the notion of flags.
> For example `-p' for pattern:
>
> uc> disable i915drmkms* # std. starred device
> uc> disable -p *drm*    # disable using pattern
>
> You can also add, let's say, a `-g' group flag:
>
> uc> list -g             # list all "groups"
> uc> list -g drmkms      # list devices in group drmkms
> uc> disable -g drmkms   # disable group drmkms
>

Yep, but now you see what became of the simplifications ;-) I covered
the same ground as you and ended up with my first proposal: the '='
allowed keeping the present syntax alone while introducing a new,
different one, and the double quoting of strings was imposed so that,
if needed later, variable names could be left unquoted, precisely for
use in boot.cfg.

I think that my second proposal is the simplest: it does not break the
existing syntax and it introduces extensions without much typing.
Re: [RFC 2] userconf(4): 2nd proposal
On Sat, Nov 04, 2023 at 08:31:09AM +, RVP wrote:
> On Sat, 4 Nov 2023, tlaro...@kergis.com wrote:
>
> > > 1) Allowing shell-like patterns (not hard to implement):
> > >
> > > uc> disable drm* # all starting with `drm'
> >
> > No...: this is a break of existing. Trailing `*' selects STARred devices
> > (I'm not the inventor of this). So `*' can not be used as a joker ;-)
> >
>
> You can allow escapes for those:
>
> uc> disable i915drmkms\* # exact match STARred
> uc> disable *kms\*       # only STARed `*kms'
>

But this breaks existing...

> > I have contemplated, too, adding for example "variables" to userconf and
> > rejected it because this would be only useful for arch supporting
> > boot.cfg,
> >
>
> Definition in boot.cfg was the intent.

Yes, this could be useful, but only for boot.cfg, and boot.cfg is not
supported by all archs. Hence the "grouping" proposal, which is
independent from boot.cfg and which can have, IMO, a usage not limited
to drmkms: a group is defined so that the devices enabled/disabled as a
group can indeed work, or at least not impede the behavior of the
kernel if disabled. I made experiments disabling devices with pattern
matching for drmkms, and ended up disabling a child, with the simple
result that I painted myself into a corner: there was no more
display... So grouping is also supposed to be a "safe" definition of
variables.
Re: [RFC 2] userconf(4): 2nd proposal
On Sat, Nov 04, 2023 at 08:20:43AM +, Michael van Elst wrote: > tlaro...@kergis.com writes: > > >disable {drmkms} # NEW: disable devices belonging to group "drmkms" > > Almost noone would need to turn off all drmkms drivers. What you may > want to control is that a GPU isn't used as a console. Disabling a driver > is just our crude workaround to achieve this. The problem is, at the moment, that we can not separate the GPU handling from the drmkms stuff, meaning that one can not modify "at run time" because, in some cases, one never gets to "run time": it crashes. The drmkms code (drm2/) has increased the size of the kernel sources by... 50% (!). A "correct" solution can not be found now by diving in the drmkms code. So the crude workaround has to be achieved in a simpler way than listing all the drmkms related drivers: a user trying GENERIC does not necessarily know what is present on his hardware and does not have to find what particular drivers he has to disable/enable. > > I don't think that autoconf is the right place for such a control, > it should be a boot parameter, maybe even something that can be > changed at runtime later. > > The current system of boot parameters is limited and differs a lot > between platforms. We need a common way to set boot parameters and > these should be mostly defined in a platform-agnostic way. > For the moment, putting definition of groups in config(1) and handling in userconf, achieves this goal of arch independence. And since the problems with drmkms are mainly for x86 machines, there is for x86 boot.cfg in which by default we could disable drmkms and simply instruct user to enable it (try once) at userconf console with "enable {drmkms}" and, if this works, to comment out the "disable {drmkms}" in boot.cfg. > > >Hint: Linuces distributions "work" as proposed images on servers, > >where NetBSD fails. > > Servers usually do no have drmkms capable hardware, and if they have, > you probably want to use that hardware. 
Been there and seen this (I mean: didn't see anything...): to use the hardware, you have to know it is there; when drmkms makes the kernel crash on a remote node without remote boot administration/console, you will never know what hardware it has and you will think that NetBSD simply doesn't work... So, disabling drmkms to verify that NetBSD works without it allows you to know what the hardware is and, after that, you can try to enable drmkms, at least knowing that if it crashes (if you don't have access anymore...), this does not mean that NetBSD can not drive it, simply that this has to be without drmkms (we need to have a boot-once feature too, so that if a remote node crashes, rebooting restores a working boot sequence). -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [RFC 2] userconf(4): 2nd proposal
On Sat, Nov 04, 2023 at 07:41:19AM +, RVP wrote: > On Sat, 4 Nov 2023, tlaro...@kergis.com wrote: > > > - 1) No change to the general form of current syntax; > > > > - 2) Selection can be as presently: by number (index in cfdata), by > > name (driver name), but also (NEW) by pattern: a pattern is > > between slashes, it is a fix substring, that can be optionnally > > anchored at the beginning with `^' and at the end with `$'; > > > > - 3) (NEW) If the selector (will this word do?) in 2) is surrounded by > > braces `{' `}', the selector is for a group of devices; > > > > - 4) The STAR (existing) is still handled as a suffix. > > > > Examples: > > > > disable i915drmkms # existing syntax > > > > disable {drmkms}# NEW: disable devices belonging to group "drmkms" > > > > disable {/^drm/}* # NEW: disable devices belonging to groups > > # whose name begins with the substr "drm" if > > # they are STARred ones. > > > > I think you can simplify things a bit by: > > 1) Allowing shell-like patterns (not hard to implement): > > uc> disable drm* # all starting with `drm' No...: this is a break of existing. Trailing `*' selects STARred devices (I'm not the inventor of this). So `*' can not be used as a joker ;-) > uc> disable *drm* *usb$ # all with `drm' anywhere and those ending in > `usb' > uc> disable foo # exact match `foo' > uc> disable 1 # exact match 1 (index) > > 2) Having an alias facility: > > uc> alias drm_disable=disable i915*; disable *radeon*; ... > uc> drm_disable # executes: RHS text (no recursive expansion) > uc> alias drm_disable=# remove alias `drm_disable' I have contemplated, too, adding for example "variables" to userconf and rejected it because this would be only useful for arch supporting boot.cfg, and useless in userconf per se. 
It is useless in userconf per se, because it is not persistent: the time one would spend defining the aliases would be longer than the time to type the disabling of several devices directly at the userconf prompt ;-) The goal, for me, is to have something generic, available on all archs (hence put it in kern/subr_userconf.c and config(1)), and not an ad hoc trick for drmkms, so that there is nothing we have to remember to update when something changes (groups will be set for the benefit of userconf by config(1) with a macro added for the purpose). -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
[RFC 2] userconf(4): 2nd proposal
Revised proposition:

- 1) No change to the general form of the current syntax;

- 2) Selection can be as at present: by number (index in cfdata), by name (driver name), but also (NEW) by pattern: a pattern is written between slashes; it is a fixed substring that can optionally be anchored at the beginning with `^' and at the end with `$';

- 3) (NEW) If the selector (will this word do?) in 2) is surrounded by braces `{' `}', the selector is for a group of devices;

- 4) The STAR (existing) is still handled as a suffix.

Examples:

    disable i915drmkms  # existing syntax

    disable {drmkms}    # NEW: disable devices belonging to group "drmkms"

    disable {/^drm/}*   # NEW: disable devices belonging to groups
                        # whose name begins with the substr "drm" if
                        # they are STARred ones.

This works for all actions: change, enable, disable, find and list.

Reminder: Drmkms is crashing the kernel in various configurations. The drivers can not be modloaded; they have to be compiled into the kernel. Hence a way to disable them at boot time is needed.

Hint: Linuces distributions "work" as proposed images on servers, where NetBSD fails. But this is because GRUB has a switch to disable drmkms. And the switch is on. Even Linux does not try to use drmkms in server configurations...

-- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
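To make point 2) concrete, here is a minimal sketch in C of matching a fixed substring with optional `^' and `$' anchors. It is not the code from the author's tree: the function name uc_pattern_match is hypothetical, the argument is assumed to be the text found between the slashes, and the comparison is assumed to be case sensitive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: match `name' against `pat', where `pat' is the
 * text between the slashes (e.g. "^drm", "usb$", "^radeon$", "drm").
 * This is a fixed-substring match with optional anchors, not a regex.
 */
bool
uc_pattern_match(const char *pat, const char *name)
{
    size_t plen, nlen;
    bool head, tail;

    assert(pat != NULL && name != NULL);

    plen = strlen(pat);
    nlen = strlen(name);

    /* Strip the optional leading `^' anchor. */
    head = plen > 0 && pat[0] == '^';
    if (head) {
        pat++;
        plen--;
    }
    /* Strip the optional trailing `$' anchor. */
    tail = plen > 0 && pat[plen - 1] == '$';
    if (tail)
        plen--;

    if (plen > nlen)
        return false;
    if (head && tail)
        return plen == nlen && memcmp(pat, name, plen) == 0;
    if (head)
        return memcmp(pat, name, plen) == 0;
    if (tail)
        return memcmp(pat, name + (nlen - plen), plen) == 0;

    /* Unanchored: the substring may appear anywhere in the name. */
    for (size_t i = 0; i + plen <= nlen; i++) {
        if (memcmp(pat, name + i, plen) == 0)
            return true;
    }
    return false;
}
```

With this sketch, "^drm" matches "drmkms" but not "i915drmkms", while "kms$" and the unanchored "drm" both match "i915drmkms".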
Re: [RFC] userconf(4) modification
On Fri, Nov 03, 2023 at 09:41:14AM +0100, Martin Husemann wrote:
> On Thu, Nov 02, 2023 at 05:32:20PM +0100, tlaro...@kergis.com wrote:
> > On Thu, Nov 02, 2023 at 05:05:53PM +0100, Martin Husemann wrote:
> [..]
> > > Something like:
> > >
> > > uc> drm off
> > >
> > > and then have the drm command use a fixed build-in table of driver names
> > > to disable individual drivers.
> >
> > This is precisely what I dislike: an ad hoc addition with the
> > necessity to be careful about what objects have to be regenerated
> > whenever something is touched or changed.
>
> Well, there are two parts to it:
>
> 1) the user interface: for a user following hints from the internet
> because their new machine blanks the screen at boot time the command
> has to be as simple as possible. We may work around that by adding
> the required magic to the standard boot menu on install media.
>
> 2) the implementation: a very simple and scalable implementation
> (instead of the static list of known DRI devices, which IMO is not
> that hard to maintain either) is a global kernel variable like
> "drm_enabled" and all DRM related drivers checking for that in their
> probe function.
>

When booting (boot(8)), there are switches to disable multiprocessor (-1), ACPI (-2), SVS (-3) and some MD (-4). Do you mean adding a -5, for example? That is, this would have nothing to do with userconf?

Alternatively, since my proposed syntax and my proposed explanations failed to find any support, I could go with something simpler (from some off-list input): do not change the current syntax, but accept a pattern between slashes '/^?pattern$?/', and a group between braces '{drmkms}', allowing:

    disable {drmkms}
    list /^usb/

Note: this will be drmkms and not drm because there is still something "different": drm, the old drivers.

-- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
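For comparison, the global-variable approach quoted above could look roughly like the following C sketch. Only the variable name drm_enabled comes from the quoted message; the struct demo_pci_id type and the demo_drm_match function are hypothetical stand-ins and do not use the real autoconf match signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Global switch named in the quoted proposal; how it would be set
 * (boot flag, userconf command, ...) is left open here. */
bool drm_enabled = true;

/* Hypothetical stand-in for the bus attachment arguments. */
struct demo_pci_id {
    unsigned vendor;
    unsigned product;
};

/*
 * Hypothetical probe/match function for one DRM driver: it declines
 * every device when the global switch is off, and otherwise falls
 * through to its normal ID check (here: Intel's PCI vendor ID only).
 */
int
demo_drm_match(const struct demo_pci_id *id)
{
    assert(id != NULL);

    if (!drm_enabled)
        return 0;   /* behave as if the hardware were absent */
    return id->vendor == 0x8086 ? 1 : 0;
}
```

The cost of this design is that every DRM driver has to carry the check, whereas the group selector keeps the knowledge in config(1) and userconf.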
Re: [RFC] userconf(4) modification
On Thu, Nov 02, 2023 at 04:12:43PM +, Robert Swindells wrote:
>
> tlaro...@kergis.com wrote:
> > As stated in a message before, disabling, via userconf(4), all the
> > drmkms drivers can not rely on a pattern matching since, for historical
> > reasons (several versions of DRM), the namespace of the drivers is not
> > "ruled".
>
> I don't see the need for this, it would be unusual for more than one
> drm driver to attach.
>
> You can also build a custom kernel.

The problem is when installing a GENERIC kernel. You don't know, a priori, what will be encountered. Been there: I installed a "generic" distribution on a lent bare-metal remote server (without any access except ssh when it comes up), and NetBSD failed to work because of drmkms.

And there can be more than one drmkms driver. It is not unusual to have an integrated GPU (on board) and a discrete one (on an expansion card).

The goal is to provide a way to disable drmkms entirely without knowing what the kernel will actually encounter.

-- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [RFC] userconf(4) modification
On Thu, Nov 02, 2023 at 05:05:53PM +0100, Martin Husemann wrote:
> I would prefer to have a special new command that does all the magic
> internaly, and don't waste code and complexity on pattern matching
> and generalizations.
>
> Something like:
>
> uc> drm off
>
> and then have the drm command use a fixed build-in table of driver names
> to disable individual drivers.

This is precisely what I dislike: an ad hoc addition with the necessity to be careful about what objects have to be regenerated whenever something is touched or changed.

The pattern matching (already implemented in a previous attempt) doesn't cost much in code (it is not a regex implementation), and it also allows narrowing the listing of devices so as not to have literally hundreds of entries to browse.

userconf is, for me, a debugging/developing tool too. So it can be useful for more than drmkms to disable a whole range of devices (whether by group name or by pattern matching if the namespace is "ruled").

-- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [RFC] userconf(4) modification
On Thu, Nov 02, 2023 at 06:59:50PM +0300, Valery Ushakov wrote: > On Thu, Nov 02, 2023 at 16:29:42 +0100, tlaro...@kergis.com wrote: > > > You will find attached the man page in order to be able to comment > > about the proposed new syntax---supplementary syntax: it does not > > replace the "legacy" one. > > The man page is super-confusing. Someone who needs to use userconf to > get their system to boot needs a clear reference, but the proposed > version tries to be overly formal and ends up a bit opaque. > > I also don't understand why it is necessary to call the old syntax - > "legacy". From the man page my impression is that the command can be > either > > command dev > > or > > command property = value I called it "legacy" because (I'm not an english native speaker) I didn't find (or didn't know) how to call it differently. In the present syntax (what I call "legacy"), you can give as a device specification whether a number or a driver name. If I want to introduce something else: a group name, I have to change the syntax if I don't want to introduce something extra fancy to stipulate it's a group name and not a driver name. Hence the '=' that permits to clearly identify the "new" syntax against the old one; specifying what property we are matching against allow further extensions without syntax modifications if needed (not proposed here). > > both are in a sense a kind of device selector, why do you have to > declare one of them "legacy"? The user probably doesn't care much > either way, they need to get the kernel booting and are not interested > in the lore. > > Why the thing after = is called "expression"? That position only > accepts two kinds of literals, one of which is a shorthand for the > other (but I had to re-read that paragraph several times and I'm still > not quite sure it actually clearly says that). It's an expression because it depends. 
It can be a number (positive integer) for devno; it can be a string literal (exact match) or a pattern (substring match). I retained the shorthand (literal string) because of the present syntax. But it could be discarded in favor of only the /^drmkms$/ syntax, i.e. a special case of pattern matching: matching against the whole string. Since I'm not a native English speaker, I tend to put in text a pseudo KNF. This is why it is "formal". It seems my attempt to be "boring but clear" failed... The current (not mine) man page is not formal. But it doesn't tell the true story either. The STARred devices are not explained. The devno is not explained either---and the range is not checked in the code, which allows access with a negative index into the cfdata vector. I will be grateful to some native English speaker, or someone comfortable enough with English, to fix the man page and/or propose a syntax that will not require more acrobatics to "understand" that what is wanted is neither a device by index (number), nor a device by driver name, but something else. I like strong typing of variables... Awk(1), perl(1) and whatever loosely typed languages are not my cup of tea. -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
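The missing devno range check mentioned above can be illustrated with a small C sketch. The names uc_entry, uc_devno_valid and uc_set_enabled are hypothetical; struct uc_entry is only a stand-in for a cfdata slot and does not reproduce the real structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one slot of the cfdata vector. */
struct uc_entry {
    const char *name;
    bool enabled;
};

/* A devno is valid only if it is non-negative and below the number
 * of entries in the vector. */
bool
uc_devno_valid(long devno, size_t nentries)
{
    return devno >= 0 && (size_t)devno < nentries;
}

/* Enable or disable an entry by index; refuse out-of-range indexes
 * instead of reading or writing outside the vector. */
int
uc_set_enabled(struct uc_entry *tab, size_t nentries, long devno, bool on)
{
    assert(tab != NULL);

    if (!uc_devno_valid(devno, nentries))
        return -1;
    tab[devno].enabled = on;
    return 0;
}
```

Checking the sign before the cast matters: casting a negative long to size_t first would wrap it into a huge value, which the upper bound would still reject here, but the explicit test keeps the intent readable.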
[RFC] userconf(4) modification
As stated in a message before, disabling, via userconf(4), all the drmkms drivers can not rely on a pattern matching since, for historical reasons (several versions of DRM), the namespace of the drivers is not "ruled". So I want to add a "group" member to the cfdata structure, with modifications to config(1) to set it, in order to allow to disable devices by a group name. Additionnaly, because I had already implemented it, there is a pattern matching feature too. You will find attached the man page in order to be able to comment about the proposed new syntax---supplementary syntax: it does not replace the "legacy" one. -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C .\" $NetBSD: userconf.4,v 1.14 2019/05/27 21:19:55 wiz Exp $ .\" .\" Copyright (c) 2001 The NetBSD Foundation, Inc. .\" All rights reserved. .\" .\" This code is derived from software contributed to The NetBSD Foundation .\" by Gregory McGarry. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\"notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\"notice, this list of conditions and the following disclaimer in the .\"documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS .\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED .\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR .\" PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS .\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR .\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF .\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS .\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN .\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE .\" POSSIBILITY OF SUCH DAMAGE. .\" .Dd November 1, 2023 .Dt USERCONF 4 .Os .Sh NAME .Nm userconf .Nd in-kernel device configuration manager .Sh SYNOPSIS .Cd options USERCONF .Sh DESCRIPTION .Nm is the in-kernel device configuration manager. It is used to alter the kernel autoconfiguration framework at runtime. .Nm is activated from the boot loader by passing the .Fl c option to the kernel. .Sh COMMAND SYNTAX There is a subset of meta-commands described immediately below and action commands, described after in separated sections, in two syntaxes: legacy and new, the new syntax extending the possibilities offered by the legacy one. .Pp .Nm has a .Xr more 1 Ns -like functionality; if a number of lines in a command's output exceeds the number defined in the lines variable, then .Nm displays .Dq "-- more --" and waits for a response, which may be one of: .Bl -tag -offset indent -width "" .It one more line. .It one more page. .It Ic q abort the current command, and return to the command input mode. .El .Pp The common meta-commands are the following: .Bl -tag -width 5n .It Ic lines Ar count Specify the number of lines before more. A negative number suppresses the paging. .It Ic base Ar 8 | 10 | 16 Base for displaying large numbers. .It Ic exit A synonym for .Ic quit . .It Ic help Display online help, including ranges of device number, list of device names and groups. .It Ic quit Leave userconf. .It Ic \&? A synonym for .Ic help . 
.El .Sh LEGACY SYNTAX AND COMMANDS .Nm supports the legacy syntax: .Bd -ragged -offset indent .Ic command Op Ar option .Ed .Pp and offers the following commands: .Bl -tag -width 5n .It Ic change Ar devno | dev Change devices. .It Ic disable Ar devno | dev Disable devices. .It Ic enable Ar devno | dev Enable devices. .It Ic find Ar devno | dev Find devices. .It Ic list List current configuration. .El .Sh NEW SYNTAX AND COMMANDS .Nm supports the new syntax: .Bd -ragged -offset indent .Ic command Ar property Li \&= Ar expression .Ed .Pp The .Li \&= has to be interpreted as meaning: defines the collection of devices on which to apply the command with devices whose stated property matches the expression. .Pp The commands are the following (same as with legacy syntax): .Bl -tag -width 7n .It Ic change Change devices. .It Ic disable Disable devices. .It Ic enable Enable devices. .It Ic find Find devices. .It Ic list List devices .El .Pp A .Ar property is one of the literals .Bl -tag -width 5n .It Li devno the index number of the device in the cfdata vector. The expression shall be a positive or nul integer value, less than the cardinal of devices in the cfdata vector. .It
[PATCHES] Adding Xorg libdrm rst2man(1) translated man pages
[Note: no need to Cc me anymore. Culprit (me...) being found; and problem solved.]

I have added the translated man pages in xsrc/local/man/man[37] and an UPDATING file at the root of xsrc. Can be pulled from:

https://github.com/tlaronde/xsrc

    commit b24a2c96577617a6297efde04ce5628985291eb4 (HEAD -> trunk, origin/trunk, origin/HEAD)
    Author: Thierry LARONDE
    Date: Mon Oct 23 11:45:48 2023 +0200

    Adding rst2man translated libdrm man pages.

and updated src/ as well to process the supplementary man pages, these man pages (in whatever format) being added to the xcomp set. Can be pulled from:

https://github.com/tlaronde/src

    commit 0645b8ce539a57e81ea8ee1e8102e66bff1d9c15 (HEAD -> trunk, origin/trunk, origin/HEAD)
    Author: Thierry LARONDE
    Date: Mon Oct 23 11:49:07 2023 +0200

    Adding the Xorg libdrm rst2man(1) generated man pages.

-- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes
On Thu, Oct 19, 2023 at 10:41:53AM -0400, Mouse wrote: > >>> [...DV_DRMKMS...userconf...] > >> [...devices in multiple classes...maybe use a separate namespace, > >> used by only config(1) and userconf?...] > > This is precisely why I ask for comment ;-) > > :-) > > > I have two requirements: > > > - that the solution is not ad hoc i.e. that it can provide, in > > userconf, facilities not limited to drmkms (I don't want to implement > > a special case to recognize "drmkms" and to expand to all the STARred > > driver names implied); > > I agree with this; special-casing drmkms would be...suboptimal. > > > - that it will not imply to have to maintain special data for > > userconf to recognize some "magic" strings. > > You already need that, in that userconf has to have some way to > recognize the string "drmkms" as a device category (hinted by the > "class =" syntax, but it still needs error-checking) and map it into > the corresponding DV_ value. I don't see it as significantly worse for > config(1) to generate some data structure mapping device class names > into whatever userconf would need to affect all devices of that class. > > Though it occurs to me that there are too many things called "class" > here. "Group"? "Category"? "Collection"? I concluded too that config(1) can do the generation of the tables during the translation so there should not be a need to "manually" keep up-to-date data files. I think it would make sense to use "Group" and that this should be in fact special to userconf: ability to handle, with userconf, a group of devices, the list of groups being defined at config time, with some USERCONF(USERCONF_GROUP_DRIVER, string) macro. And adding the command in userconf to "set" a variable to a list, so that for example: "disable name in $var" or "disable group in $var" works (but for drmkms it will be defined at config time so this would be: 'disable group = "drmkms"'. 
This will allow customization both for a developer in the source, and for an end user who wants to set, for userconf, a group of devices to enable or disable. (In this case, when the group is composed of devices not necessarily related in some way, "collection" would be a better term than "group"; I'm with von Neumann when it comes to set theory, but let's not be pedantic.) -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
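As an illustration of the idea that config(1) generates the group data and userconf only consults it, here is a minimal C sketch. Everything in it is an assumption: the table layout, the names uc_group_member, uc_group_tab and uc_in_group, and the choice of drivers placed in the "drmkms" group (two driver names that appear elsewhere in this thread).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shape of one generated record: a driver belongs to a
 * named group. A driver may appear under several groups. */
struct uc_group_member {
    const char *group;
    const char *driver;
};

/* Stand-in for a table that config(1) could emit at configuration
 * time; the membership shown here is illustrative only. */
static const struct uc_group_member uc_group_tab[] = {
    { "drmkms", "i915drmkms" },
    { "drmkms", "radeon" },
};

/* Return true if `driver' is listed as a member of `group'. */
bool
uc_in_group(const char *group, const char *driver)
{
    size_t i;
    size_t n = sizeof(uc_group_tab) / sizeof(uc_group_tab[0]);

    assert(group != NULL && driver != NULL);

    for (i = 0; i < n; i++) {
        if (strcmp(uc_group_tab[i].group, group) == 0 &&
            strcmp(uc_group_tab[i].driver, driver) == 0)
            return true;
    }
    return false;
}
```

A "disable group" command would then walk the device table and disable each entry for which uc_in_group returns true, with no hand-maintained list inside userconf itself.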
Re: DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes
On Thu, Oct 19, 2023 at 09:10:36AM -0400, Mouse wrote: > > I propose to add a DV_DRMKMS class to sys/device.h:enum_devclass; to > > augment cfdata with a devclass member [...] > > > Comments? > > This is not intended as criticism; I am just trying to examine all > sides of this question. > > Why use the sys/sys/device.h kind of device class for userconf? Is > there some reason to think it will be useful to userconf other device > classes, or do you expect other device-class machinery to have a use > for DV_DRMKMS, or is it a question of just reusing the existing device > class rather than creating a new kind of device class, or what? I'm just trying to stay in the vicinity of cfdata, for the headers and for the benefit (consumption) of config(1) uphill and userconf downhill. For the moment, the drivers are given the DV_DULL class, while for modules several classes are given. But userconf doesn't deal with modules... The other reason is that with drmkms multiple module classes are provided. It seems to me that, even if it would be useful to disable specific children (if only for debugging purposes), at the moment there should be a "main" class to disable everything uphill. So the DV_DRMKMS is not exactly the "drmkms" class of modules... > > I'm also thinking it could be useful for a device to fall into multiple > classes for userconf, but I _think_ DV_* classes don't support a device > being in multiple classes. Yes: the DV_* are exclusive: a device can not appear in several classes. This is emphasized in the man page and in the source. > It also could be useful for custom kernels > to have custom modifications to device classification. So I'm > wondering if it would be better for this to be a namespace specific to > config(1) and userconf rather than having anything to do with DV_* > values. This is precisely why I ask for comment ;-) I have two requirements: - that the solution is not ad hoc i.e. 
that it can provide, in userconf, facilities not limited to drmkms (I don't want to implement a special case to recognize "drmkms" and to expand to all the STARred driver names implied); - that it will not require maintaining special data for userconf to recognize some "magic" strings. But the second item, generating data according to the configuration, is the task of config(1). So config(1) should do the job. Indeed, good question: devclass or module classes or something else? The usr.bin/config/TODO is already listing the problem of the two kinds of classes. -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes
[Please do CC me on reply since I _am_ subscribed to the list but don't get the messages...]

Note: code can be seen on https://github.com/tlaronde/src .

I have implemented "patterns" in sys/kern/subr_userconf.c, in order to allow manipulating (change, disable, enable, find, list) a device matching a possibly anchored substring. But this doesn't solve the problem for drmkms (to be able to disable all with a single well-known instruction) since the names don't match a regular pattern.

I propose to add a DV_DRMKMS class to sys/device.h:enum_devclass; to augment cfdata with a devclass member and modify config(1) accordingly, so that a new syntax (supplementary for now; not replacing) can be introduced in sys/kern/subr_userconf.c:

    exp: number | string | magic | pattern
    string: '"' alpha alphanum* '"' /* case insensitive */
    magic: alpha alphanum /* case insensitive */
    pattern: '/' ['^'] alphanum ['$'] '/' /* case insensitive */

    {change, disable, enable, find, list} name = exp
    {change, disable, enable, find, list} class = magic

so that:

    disable class = drmkms

does the trick.

There is already in usr.bin/config/TODO a paragraph about classes, so it seems this proposal leans towards what was expected.

Comments?

-- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
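The grammar above can be read as a small token classifier. The following C sketch is one possible reading, not the code in the author's tree: the names uc_exp_kind and uc_exp_classify are hypothetical, and it assumes that `alphanum' in the magic and pattern rules stands for a run of alphanumeric characters (zero or more after the leading letter for magic, one or more for pattern).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum uc_exp_kind {
    UC_EXP_INVALID,
    UC_EXP_NUMBER,   /* 42 */
    UC_EXP_STRING,   /* "i915drmkms" */
    UC_EXP_MAGIC,    /* drmkms */
    UC_EXP_PATTERN   /* /^usb/ */
};

static bool
all_alnum(const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!isalnum((unsigned char)s[i]))
            return false;
    return true;
}

static bool
all_digit(const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!isdigit((unsigned char)s[i]))
            return false;
    return true;
}

/* alpha alphanum*: a letter followed by letters or digits. */
static bool
is_ident(const char *s, size_t n)
{
    return n > 0 && isalpha((unsigned char)s[0]) && all_alnum(s, n);
}

/* Classify one expression token according to the grammar. */
enum uc_exp_kind
uc_exp_classify(const char *tok)
{
    size_t n;

    assert(tok != NULL);
    n = strlen(tok);
    if (n == 0)
        return UC_EXP_INVALID;

    if (isdigit((unsigned char)tok[0]))
        return all_digit(tok, n) ? UC_EXP_NUMBER : UC_EXP_INVALID;

    if (tok[0] == '"')
        return (n >= 3 && tok[n - 1] == '"' && is_ident(tok + 1, n - 2))
            ? UC_EXP_STRING : UC_EXP_INVALID;

    if (tok[0] == '/') {
        const char *p = tok + 1;
        size_t m;

        if (n < 3 || tok[n - 1] != '/')
            return UC_EXP_INVALID;
        m = n - 2;                       /* text between the slashes */
        if (m > 0 && p[0] == '^') {      /* optional head anchor */
            p++;
            m--;
        }
        if (m > 0 && p[m - 1] == '$')    /* optional tail anchor */
            m--;
        return (m > 0 && all_alnum(p, m))
            ? UC_EXP_PATTERN : UC_EXP_INVALID;
    }

    return is_ident(tok, n) ? UC_EXP_MAGIC : UC_EXP_INVALID;
}
```

Under this reading, `disable class = drmkms' carries a magic token, `list name = /^usb/' carries a pattern, and a token such as `4x' is rejected rather than silently treated as a name.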
Re: drm.4 man page and import of X11 drm-kms.7 and al.
On Wed, Oct 18, 2023 at 10:35:26AM +, Taylor R Campbell wrote:
> > Date: Tue, 17 Oct 2023 14:39:57 +0200
> > From: tlaro...@kergis.com
> >
> > I have modified drm.4 to state that the drivers are obsolete and
> > to suppress a mention of viadrm that was removed long ago (now superseded by
> > viadrmums, provided in drm2/ ---drmkms--- part).
> >
> > Patch can be retrieved from https://github.com/tlaronde/src
>
> Thanks, I took the opportunity to update the whole man page. Didn't
> realize until now that our drm(4) man page was a local creation
> requiring local maintenance.
>

It sure is an improvement! Thanks for doing so!

-- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: drm.4 man page and import of X11 drm-kms.7 and al.
On Wed, Oct 18, 2023 at 10:35:26AM +, Taylor R Campbell wrote: > > > There is no man page for drmkms (the kernel part), but there are man > > pages in the X sources, in the rst format > > (external/mit/libdrm/dist/man/drm-kms.7.rst) with a bunch of related > > resources that provide a view of the DRI thing (from the X POV). > > > > There is rst2man-3.10 (pkgsrc py310-docutils) to convert these to man > > pages. > > > > Should this be done (it is the X11/DRI interface, not the kernel one, so > > should reside in the X11R7 realm)? > > It might be reasonable to ship libdrm man pages in /usr/X11R7/man but > we would need to import the pregenerated rst2man output into > xsrc/external. Not hard in principle but somewhat annoying to deal > with. That said, a cursory skim suggests there's a lot missing here. > I see a lot of API functions cross-referenced, but I don't see their > documentation here? So I'm not sure how useful this would be. What exists is probably better than nothing and, at the very least, drm-kms.7 gives (part of) the view---unfortunately, a comment in a header of the old version gave a view of what was wanted on the kernel side, which is not mentioned in drm-kms.7, and I didn't find the equivalent in the new sources (and the documentation provided on the Web by the Linux team is not up-to-date either---there are for example mentions of drmP.h, which doesn't exist anymore). So I'm for providing what exists, once more for programmers writing X11 clients (the kernel part is another problem). -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: drm.4 man page and import of X11 drm-kms.7 and al.
On Tue, Oct 17, 2023 at 02:33:43PM +0100, Robert Swindells wrote: > > tlaro...@kergis.com wrote: > > So to clarify: I'm proposing to convert the rst doc pages to man > > pages (with for example the utility I cite), and to add the man pages, > > in man format, to the sources (in order for the sources to not depend > > on a supplementary external tool) and to install the man pages in > > /usr/X11R7/man/. > > I wouldn't bother installing man pages for this, someone working on the > kernel already has the source tree. > > Maybe the drm.4 manpage could be extended to describe the current > status. But someone writing an X11 client should have the information: NetBSD is also a system for development. The man pages should be at least in Xcomp. There is still a big part in user space. And it will also help those who want to have a clue about what it is---not to mention that this will clarify the fact that this is heavily X11 linked, which is part of the problem: how could a GPU be used not for "rendering", but as a General Purpose Graphics Processor, if it's not the kernel that is arbitrating but the X11 server? This does mean that an arbitrary application could not work without being converted to use the X11 interface. (In my view, the kernel should detect all the resources (it's its role: a kernel is a resource manager enforcing a policy for resource access), including auxiliary processors like GPUs, and the rendering is only a specialized usage of these resources.) -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: drm.4 man page and import of X11 drm-kms.7 and al.
[I mean to reply to Mouse, but I have a hell of a time with majordomo or this mailing list: I'm subscribed but I don't get the messages! Can someone look at this, please. TIA.] So to clarify: I'm proposing to convert the rst doc pages to man pages (with for example the utility I cite), and to add the man pages, in man format, to the sources (in order for the sources to not depend on a supplementary external tool) and to install the man pages in /usr/X11R7/man/. The X11 part, for the interface, has changed in 2012, but seems (again: for the interface) stable but the implementation changes and the Linux kernel implementation is still changing frequently and heavily (the drm2/ sources are already significantly behind the Linux sources with not trivial changes; it's, for me, a lost race...). On Tue, Oct 17, 2023 at 02:39:58PM +0200, tlaronde wrote: > I have modified drm.4 to state that the drivers are obsolete and > to suppress a mention of viadrm that was removed long ago (now superseded by > viadrmums, provided in drm2/ ---drmkms--- part). > > Patch can be retrieved from https://github.com/tlaronde/src > > There is no man page for drmkms (the kernel part), but there are man > pages in the X sources, in the rst format > (external/mit/libdrm/dist/man/drm-kms.7.rst) with a bunch of related > resources that provide a view of the DRI thing (from the X POV). > > There is rst2man-3.10 (pkgsrc py310-docutils) to convert these to man > pages. > > Should this be done (it is the X11/DRI interface, not the kernel one, so > should reside in the X11R7 realm)? > -- > Thierry Laronde > http://www.kergis.com/ > http://kertex.kergis.com/ > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
drm.4 man page and import of X11 drm-kms.7 and al.
I have modified drm.4 to state that the drivers are obsolete and to suppress a mention of viadrm that was removed long ago (now superseded by viadrmums, provided in drm2/ ---drmkms--- part). Patch can be retrieved from https://github.com/tlaronde/src There is no man page for drmkms (the kernel part), but there are man pages in the X sources, in the rst format (external/mit/libdrm/dist/man/drm-kms.7.rst) with a bunch of related resources that provide a view of the DRI thing (from the X POV). There is rst2man-3.10 (pkgsrc py310-docutils) to convert these to man pages. Should this be done (it is the X11/DRI interface, not the kernel one, so should reside in the X11R7 realm)? -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: DRM/KMS: report
On Sun, Oct 15, 2023 at 10:13:18AM +0000, Taylor R Campbell wrote:
> > DRM or now DRM2 (aka DRM/KMS) are inherently and heavily linked to
> > X11 and to Linux. Due to the size of the thing, NetBSD is deriving
> > a version from the one FreeBSD tries to derive.
>
> Not sure what you mean about FreeBSD, but our drm2 code base was
> developed largely independent of whatever is in FreeBSD, and as far as
> I know was started well before FreeBSD adopted the same approach of
> writing Linux API shims.

Then this differs from the first version, at least according to drm(4), which refers exclusively to DRM (the first version) and is either partial (it does not mention DRM2 / DRMKMS) or obsolete.

For the record, I first tried to review _all_ the commits starting from 2007-03-20 (the first import of the first version, for NetBSD)... But after some time, considering the pace at which I was getting through them, I realised it was hopeless. So I have a view, but it is far from complete or even accurate.

> > To make things even worse, the abuse of acronyms is blurring things that
> > didn't need to be made even less clear. Not to mention the fact that DRM
> > is also used for Digital Rights Management---that has strictly nothing
> > to do with the thing---, DRI (a part of the X11 stuff) is also used
> > instead of DRM for the X11 part, and DRM2 is also referred to as
> > DRM/KMS.
>
> sys/external/bsd/drm is the previous generation of the drm code base,
> from before it did any kernel mode-setting (KMS). Display
> configuration was done by peeking and poking device registers from
> userland through /dev/mem and /dev/pci -- the legacy user mode-setting
> approach (UMS). The /dev/dri/ nodes were used by userland only to map
> some registers and manage graphics buffers bound into the GPU address
> space.
>
> sys/external/bsd/drm2 is the current generation of the drm code base,
> including both UMS and KMS. With KMS, display configuration is done
> by a set of structured ioctls on /dev/dri/ nodes, with all device
> register access done by the kernel. (The /dev/dri/ nodes are also
> used to manage graphics buffers.)
>
> When I more or less started over from scratch, I called it drm2 just
> so it would have a distinct place in the source tree while people
> still relied on the previous generation of the code.
>
> By now I think we should just delete sys/external/bsd/drm; it has been
> unmaintained for so long it is unlikely to work. If there's interest
> in the legacy UMS drivers, they should all still be in the drm2 tree
> and can be adapted like I did with viadrmums. But I have no hardware
> for most of them.

I will put all the documentation bits together some place for reference. Thanks for the clarifications!

> > The drivers using the new API have sometimes "kms" in the name (for
> > i915, I guess to make a difference with the previous "legacy"
> > i915drm), but generally not, or if this is the case, this is not the
> > device attaching early:
> >
> > # DRMKMS drivers
> > i915drmkms* at pci? dev ? function ?
> > intelfb* at intelfbbus?
>
> `i915drmkms' happened because `i915' is not allowed (ends with a
> digit) and `i915drm' was already taken.
>
> > To illustrate the namespace problem, take "radeon":
> >
> > radeondrm* is the legacy DRM driver and:
> >
> > radeon* is the DRM2 and this is its child, the fb, that has the "kms"
> > substring:
> >
> > radeondrmkmsfb* at radeonfbbus?
>
> `radeondrmkms' happened because `radeonfb' was already taken.
>
> I'm not attached to these names, but they've been around for long
> enough they are probably named in existing boot.cfg files, so changing
> them is likely to break people's bootloaders.
>
> Not hard to imagine creating a new way to tag drivers that can be
> referenced by userconf so that renaming isn't necessary.

If the driver names followed a rule: I have already implemented, in sys/kern/subr_userconf.c (in my git fork at https://github.com/tlaronde/src), the use of "patterns" to change, disable, enable, find and list matching driver names. I could add specifiers to the "patterns" to match a parent device or a child device.

I could also extend cfdata so that a devclass is taken into account and can be matched against. Modules already set a class, and the simplest would be to use such a tag in userconf to disable devices without resorting to ad hoc lists---or, even worse, to expanding a magic name in the MD bootinfo code, with the obligation to update lists and the risk of having to increase the size of the bootinfo data.

I wanted and still want to implement something gene
DRM/KMS: report
[I'm sending this to tech-kern since the previous message on tech-userlevel is the only one there: the list seems dead?]

[CAVEATS: Please remember that I'm not a native English speaker, and that what follows is not a "lecture" or a judgement about what has been done, but a home-made translation, into some sort of English, of some of my notes---there is more documentation to come later. If I wanted to look at the DRM/KMS stuff, it was because I felt (and still feel...) that I would never have embarked on such an appalling task as trying to tame a thing like that ;-) I'm not "blaming" or "naming and shaming"---or whatever the term is---or despising work or people.]

Three months ago, I undertook to take a look at the DRM/KMS object, with the goal of ensuring that the NetBSD kernel could be severed from it at will.

Here is the report. I will start with code for the impatient, continue with documentation / comments, and end with future directions (for me).

Note: I finally have an optical fiber Internet connection again (after infelicities with a previous provider), so I have been able to pull and push on a fork that is here: https://github.com/tlaronde/src

WHAT IS IN THESE SOURCES

commit 6d715506703ed9f0bec6a39fec8794b5b8eb
Author: Thierry LARONDE
Date:   Fri Oct 13 18:39:03 2023 +0200

    In order to allow devices to be changed, disabled, enabled, found or
    listed according to a pattern (specified between slashes; can be
    anchored at the beginning with '^'; at the end with '$'; but no
    wildcard dot, or count or range...), the userconf parsing is modified.

    It works... but not for what I wanted. Giving /drm/ for example as a
    pattern will actually disable all matching devices, but since
    "radeondrmkmsfb" matches, you end up with no display at all because
    the drm is nonetheless attempted. "/kms$/" and "/drm$/" could work.
    But this is more a debugging feature (except for find or list) than
    something to use bluntly for the moment. Should we have
    /pattern/@/parent_pattern/? Or enforce a namespace policy? At least,
    one should use "list /pattern/" or "find /pattern/" before modifying
    blindly.

commit e62e0b293986bfb3a749ab499d8367b5c6a161a2
Author: Thierry LARONDE
Date:   Thu Oct 12 18:07:13 2023 +0200

    Just add the precision that the pmap_pv_untrack() users are DRM2 aka
    DRMKMS drivers (not "legacy" DRM ones).

commit 930cf9cd86c51551b7731777df2882a64ba655b7
Author: Thierry LARONDE
Date:   Thu Oct 12 09:00:56 2023 +0200

    For consistency, what is related to monitors is not taken from
    XFree86 but from the latest VESA DMT (v 1.0, Rev. 13). So modelines
    are removed, and dmt added, and the code fixed to work with this with
    no user visible change for the moment. And some modes not defined in
    the VESA DMT are put in an extradmt file, with fixes for Mac monitors
    (taken from parameters in the Linux framebuffer code).

    For consistency too, published strings like "800x600x60" are replaced
    by "800x600@60Hz" to avoid multiplying apples by oranges and ambiguity
    about exactly what the last number describes.

    The double scan entries were not used and are not generated.

DRM, DRM2 aka DRM/KMS: SOME NOTES

DRM or now DRM2 (aka DRM/KMS) are inherently and heavily linked to X11 and to Linux. Due to the size of the thing, NetBSD is deriving a version from the one FreeBSD tries to derive.

To make things worse, the API is changing significantly. So we can only adapt late; and, de facto, we always lag behind.

The important thing to keep in mind is that this is heavily linked to X11. It's not something independent.

To make things even worse, the abuse of acronyms is blurring things that didn't need to be made even less clear. Not to mention the fact that DRM is also used for Digital Rights Management---which has strictly nothing to do with the thing---, DRI (a part of the X11 stuff) is also used instead of DRM for the X11 part, and DRM2 is also referred to as DRM/KMS.

The "legacy" ("first" version, at least in NetBSD) DRM drivers are these (for x86):

#i915drm*   at drm?  # Intel i915, i945 DRM driver
#mach64drm* at drm?  # mach64 (3D Rage Pro, Rage) DRM driver
#mgadrm*    at drm?  # Matrox G[24]00, G[45]50 DRM driver
#r128drm*   at drm?  # ATI Rage 128 DRM driver
#radeondrm* at drm?  # ATI Radeon DRM driver
#savagedrm* at drm?  # S3 Savage DRM driver
#sisdrm*    at drm?  # SiS DRM driver
#tdfxdrm*   at drm?  # 3dfx (voodoo) DRM driver

The drivers using the new API have sometimes "kms" in the name (for i915, I guess to make a difference with the previous "legacy" i915drm), but generally not, or if this is the case, this is not the devi
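The pattern semantics described in the first commit message (a literal string between slashes, optionally anchored with '^' and/or '$', no wildcard) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `userconf_match` is a hypothetical helper, not the C code in sys/kern/subr_userconf.c, and the driver names are simply those quoted in this thread.

```python
def userconf_match(pattern: str, name: str) -> bool:
    """Hypothetical re-creation of the /pattern/ rule described above:
    literal text, optional '^' anchor at the start, optional '$' anchor
    at the end, no wildcard dot, count or range."""
    if len(pattern) < 2 or not (pattern.startswith("/") and pattern.endswith("/")):
        raise ValueError("pattern must be written between slashes")
    body = pattern[1:-1]
    at_start = body.startswith("^")
    if at_start:
        body = body[1:]
    at_end = body.endswith("$")
    if at_end:
        body = body[:-1]
    if at_start and at_end:
        return name == body
    if at_start:
        return name.startswith(body)
    if at_end:
        return name.endswith(body)
    return body in name


# Driver names quoted in this thread (not an exhaustive list).
NAMES = ["i915drmkms", "intelfb", "radeon", "radeondrmkmsfb",
         "radeondrm", "viadrmums"]


def select(pattern: str) -> list[str]:
    """Names a 'list /pattern/' style command would report in this sketch."""
    return [n for n in NAMES if userconf_match(pattern, n)]


if __name__ == "__main__":
    for pat in ("/drm/", "/drm$/", "/kms$/"):
        print(pat, select(pat))
```

With these names, /drm/ also catches "radeondrmkmsfb", which is the over-matching the commit message warns about, while the anchored forms /drm$/ and /kms$/ leave the framebuffer child alone.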
ISA: a book
FWIW---and this is probably already known by most---I found "The RISC-V Reader: An Open Architecture Atlas", by David Patterson and Andrew Waterman, Strawberry Canyon LLC, ISBN 9780999249116, to be a great help to "put things together". I mean that it is a short book (about a hundred pages excluding appendices) giving a kind of "root tree" of hardware/software considerations, with comparisons with other architectures; a root tree on which one can "mount" the various pieces of information picked up here and there, so that the whole picture can take shape. (It is not a textbook or a high-level overview: you can start programming for RISC-V with it.)

If some read this list to try to get into kernel work, this is perhaps a reference to add to the books you could or should read.

FWIW,
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: How to submit patches?
On Sun, May 07, 2023 at 06:14:35PM +0200, Martin Husemann wrote:
> On Sun, May 07, 2023 at 04:56:33PM +0200, tlaro...@polynum.com wrote:
> > I'm a bit reluctant to put all the platform lists in copy, since this
> > is typically generic: it deals with the monitor capacities, updating
> > the VESA DMT specs...
>
> I pointed a few people at your mail, but maybe you could describe the
> motivation of the changes a bit more verbosely - at first it all looked
> like a lot of churn for no particular reason (but that is probably because
> I don't know anything about that part of the source).

With NetBSD 10 beta, the resolution picked for the framebuffer changed compared to 9.3: I have a 16:9 LCD; 9.3 picked this ratio and a correct resolution and font for the framebuffer, while 10 beta does not (on the very same hardware).

Investigating the reason for this difference, I found nothing obvious. So I started to track back, starting from the end: the monitor. Since I didn't know anything about this stuff, I downloaded the specs published by VESA and read them.

So the first step was to update to the latest VESA DMT and to fix some things that were wrong: there are discrepancies in the Mac monitor "modelines"; some historical modes were missing; Xorg (the source used in NetBSD at least) is not up to date either, and it was used as the reference for the VESA DMT modes while, de facto and de jure, the VESA DMT itself should be.

Since I can work on this only during very scarce hours, instead of waiting (how long?) to finish everything, I prefer to commit a step that is an independent unit by itself and is finished, without breaking anything (it adds modes; it corrects: printing "800x600x60" for a mode is multiplying apples by bananas, since 60 is a frequency, so it now appears as "800x600@60Hz" for example; it removes unused things that just complicate matters for someone trying to update the code; etc.), so that at least _this_ will not have to be done by someone else. Of course, some of the fields added from the VESA DMT will be needed in the future when updating the EDID code. So it is not gratuitous.

The next step will be to review the EDID code, and I will continue back (praying not to have to deal with drm...) until I find why it doesn't work correctly and understand the framebuffer stuff.

Is this clearer?
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
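As an illustration of why the fields added from the VESA DMT matter for EDID work, the two-byte STD id carried in the posted dmt file (for example 3140 for 640x480@60Hz) can be decoded back into a mode. The Python sketch below assumes the EDID 1.3 and later standard timing encoding (first byte is horizontal pixels / 8 - 31; the top two bits of the second byte select the aspect ratio; the low six bits are the refresh rate minus 60); `decode_std_id` is a hypothetical helper, not code from the patch.

```python
# Aspect ratio selectors, assuming the EDID 1.3+ meaning of bits 7-6
# (older EDID revisions read the value 0 as 1:1 instead of 16:10).
ASPECT = {0: (16, 10), 1: (4, 3), 2: (5, 4), 3: (16, 9)}


def decode_std_id(std_id: str) -> tuple[int, int, int]:
    """Decode a two-byte STD id given as four hex digits.

    Returns (horizontal pixels, vertical pixels, refresh rate in Hz).
    """
    if len(std_id) != 4:
        raise ValueError("expected four hex digits")
    byte1 = int(std_id[:2], 16)
    byte2 = int(std_id[2:], 16)
    hpix = (byte1 + 31) * 8
    num, den = ASPECT[byte2 >> 6]
    vpix = hpix * den // num
    refresh = (byte2 & 0x3F) + 60
    return hpix, vpix, refresh


if __name__ == "__main__":
    # STD ids taken from the Ids column of the dmt file in this thread.
    for std in ("3119", "3140", "4540", "6159"):
        h, v, r = decode_std_id(std)
        print(f"{std} -> {h}x{v}@{r}Hz")
```

Decoding the ids this way gives back the same "800x600@60Hz" style of name that the patch now publishes, which is a cheap consistency check on the typed-in table.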
Re: How to submit patches?
On Sun, May 07, 2023 at 09:40:56AM -0400, Thor Lancelot Simon wrote:
> On Sat, May 06, 2023 at 12:12:54PM +0200, tlaro...@polynum.com wrote:
> >
> > How to submit patches without wasting time? (mine included)
>
> It might be that you get quicker response on one of the mailing lists
> for platforms where the patches are particularly useful. It might not,
> too - but the set of people with the knowledge to review work in this area
> is not so large, and copying the per-port lists might help get their
> attention.

I'm a bit reluctant to put all the platform lists in copy, since this is typically generic: it deals with the monitor capabilities, updating the VESA DMT specs...
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
PING sys/dev/videomode: latest DMT and complete Established Timings
Since there are some infelicities in the handling of the framebuffer resolution (10.0_BETA doesn't behave as 9.3 does), I have started to review the code, starting from the end: the monitor.

The monitor being the reference, I have replaced the modelines, derived from XFree86, with the reference: the latest VESA DMT (v 1.0, Rev. 13), which is ahead of /usr/xsrc/external/mit/xorg-server/dist/hw/xfree86/common/vesamodes. This file is "dmt".

I have also put the modes not found in the VESA DMT but referenced in the Established Timings (hence in VESA EDID) in a file "extradmt".

XFree86 modelines can easily be computed from the DMT. The reverse is not true. Furthermore, there are various VESA identifiers (one, two or three bytes) that will be used in the future.

It is interesting to note, too, that there are discrepancies between the XFree86 modelines and the modelines in the Linux framebuffer code: for one Established Timing mode, I had to resort to the Linux parameters since what is found in XFree86 (at least in the 10.0 xsrc) is not accurate.

"dmt" replaces "modelines".
"extradmt" is new.
"dmt2c.awk" replaces "modelines2c.awk".
"videomode.c" has to be regenerated using Makefile.videomode.

The remaining diff is adjustments for the new parameters. For ergonomics and consistency, I have replaced strings like "800x600x60" with "800x600@60Hz".

There are now 93 modes instead of 46 (the double scan entries and the related code weren't used; and this is not used in the present code either).

For safety, not knowing whether this has hardware implications, the new "reduced blanking" entries are skipped.

This is only a first step and does not solve the problem I see. The next step will be reviewing and perhaps updating the edid code. And I will follow the track until I find why the preferences passed by the monitor are not handled correctly.

Note: this one infelicity, for me, is not severe enough to hinder, per se, the release of 10.0.
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C

# $NetBSD$
# These values were typed by Thierry Laronde,
# 2023-02-27, from:
#
# --
# VESA and Industry Standards and Guidelines
# for Computer Display Monitor Timing (DMT)
# Version 1.0, Rev. 13
# February 8, 2013
# Copyright 1994--2013 Video Electronics Standards Association. All
# other rights reserved.
# --
#
# In brief the document above states: USE AT YOUR OWN RISKS.
#
# This master file has only values as given in the specification
# identified above.
#
# The values should have been taken as is. From these values, others can
# be derived and there is even some redundancy (see the processing
# script for the computations). The records are in the same order as in
# the document: first line corresponds to page 18; last line to page
# 105. There hence should be 88 different records here.
#
# In this file, empty or blank lines or lines beginning with a '#' are
# ignored.
#
# Remaining are a sequence of line terminated records, with the
# following blank separated fields:
#
#	Timing_Name /* Hor_Pixels 'x' Ver_Pixels '@' Refresh_Rate 'Hz' suffix */
#	Ids /* DMT_Id ',' STD_Id ',' CVT_Id (1 hexabyte, [2h] , [3h]) */
#	Hor_Pixels
#	Ver_Pixels
#	Pixel_Clock /* MHz */
#	Character_Width
#	Flags /* Scan_Type ('I' | 'N') ',' Reduced_Blanking ('RB' | 'N') */
#	Hor_Sync_Polarity /* '+' | '-' */
#	Ver_Sync_Polarity /* '+' | '-' */
#	H_Right_Border
#	H_Front_Porch
#	Hor_Sync_Time
#	H_Back_Porch
#	H_Left_Border
#	V_Bottom_Border
#	V_Front_Porch
#	Ver_Sync_Time
#	V_Back_Porch
#	V_Top_Border
#
640x350@85Hz 01,, 640 350 31.500 8 N,N + - 0 4 8 12 0 0 32 3 60 0
640x400@85Hz 02,3119, 640 400 31.500 8 N,N - + 0 4 8 12 0 0 1 3 41 0
720x400@85Hz 03,, 720 400 35.500 9 N,N - + 0 4 8 12 0 0 1 3 42 0
640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8
640x480@72Hz 05,314C, 640 480 31.500 8 N,N - - 1 2 5 15 1 8 1 3 20 8
640x480@75Hz 06,314F, 640 480 31.500 8 N,N - - 0 2 8 15 0 0 1 3 16 0
640x480@85Hz 07,3159, 640 480 36.000 8 N,N - - 0 7 7 10 0 0 1 3 25 0
800x600@56Hz 08,, 800 600 36.000 8 N,N + + 0 3 9 16 0 0 1 2 22 0
800x600@60Hz 09,4540, 800 600 40.000 8 N,N + + 0 5 16 11 0 0 1 4 23 0
800x600@72Hz 0A,454C, 800 600 50.000 8 N,N + + 0 7 15 8 0 0 37 6 23 0
800x600@75Hz 0B,454F, 800 600 49.500 8 N,N + + 0 2 10 20 0 0 1 3 21 0
800x600@85Hz 0C,4559, 800 600 56.250 8 N,N + + 0 4 8 19 0 0 1 3 27 0
800x600@120Hz_rb 0D,, 800 600 73.25 8 N,RB + - 0 6 4 10 0 0 3 4 29 0
848x480@60Hz 0E,, 848 480 33.750 8 N,N + + 0 2 14 14 0 0 6 8 23 0
1024x768@43Hz_i 0F,, 1024 768 44.900 8 I,N + + 0 1 22 7 0 0 0 4 20 0
1024x768@60Hz 10,6140, 1024 768 65.000 8 N,N - - 0 3 17 20 0 0 3 6 29 0
1024x768@70Hz 11,614A, 1024 768 75.000 8 N,N - - 0 3 17 18 0 0 3 6 29 0
1024x768@75Hz 12,614F, 1024 768 78.750 8 N,N + + 0 2 12 22 0 0 1 3 28 0
1024x768@85Hz 13,6159, 1024 768
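The record layout documented in the header of the dmt file can be exercised with a short Python sketch that parses one record and recomputes the refresh rate from the pixel clock and the total frame size. It assumes, from the values themselves, that the horizontal borders, porches and sync time are expressed in character cells (to be multiplied by Character_Width) and the vertical ones in lines; `parse_dmt_record` and `refresh_hz` are hypothetical names, the patch itself does this processing in dmt2c.awk, and interlaced entries are deliberately left out.

```python
def parse_dmt_record(line: str) -> dict:
    """Split one 19-field record of the dmt file into named values."""
    f = line.split()
    if len(f) != 19:
        raise ValueError(f"expected 19 blank separated fields, got {len(f)}")
    dmt_id, std_id, cvt_id = f[1].split(",")
    scan_type, blanking = f[6].split(",")
    timings = [int(x) for x in f[9:19]]
    return {
        "name": f[0],
        "dmt_id": dmt_id, "std_id": std_id, "cvt_id": cvt_id,
        "hor_pixels": int(f[2]), "ver_pixels": int(f[3]),
        "pixel_clock_mhz": float(f[4]),
        "char_width": int(f[5]),
        "interlaced": scan_type == "I",
        "reduced_blanking": blanking == "RB",
        "hsync_polarity": f[7], "vsync_polarity": f[8],
        # Right border, front porch, sync, back porch, left border.
        "h_blank_chars": sum(timings[0:5]),
        # Bottom border, front porch, sync, back porch, top border.
        "v_blank_lines": sum(timings[5:10]),
    }


def refresh_hz(rec: dict) -> float:
    """Refresh rate = pixel clock / (total pixels per line * total lines)."""
    if rec["interlaced"]:
        raise ValueError("interlaced modes are not handled in this sketch")
    h_total = rec["hor_pixels"] + rec["h_blank_chars"] * rec["char_width"]
    v_total = rec["ver_pixels"] + rec["v_blank_lines"]
    return rec["pixel_clock_mhz"] * 1e6 / (h_total * v_total)


if __name__ == "__main__":
    records = (
        "640x350@85Hz 01,, 640 350 31.500 8 N,N + - 0 4 8 12 0 0 32 3 60 0",
        "800x600@60Hz 09,4540, 800 600 40.000 8 N,N + + 0 5 16 11 0 0 1 4 23 0",
    )
    for line in records:
        rec = parse_dmt_record(line)
        print(rec["name"], round(refresh_hz(rec), 3))
```

For the two records shown, the computed rates come out at roughly 85.08 Hz and 60.32 Hz, close to the nominal rates in the timing names, which supports the unit assumption stated above.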
Re: How to submit patches?
On Sat, May 06, 2023 at 02:13:58PM +0200, Martin Husemann wrote:
> On Sat, May 06, 2023 at 12:12:54PM +0200, tlaro...@polynum.com wrote:
> > Hello,
> >
> > On Mon, 27 Feb 2023 12:33:32 +0100, I sent to this list a collection of
> > patches for sys/dev/videomode/, starting by updating the DMT to the
> > latest, and planning to review further the code (sending patches
> > when I have achieved a complete step in the course, because I'm having
> > a hard time finding some spare hours to work on this).
> >
> > There has been no comment; no reaction.
>
> Sorry, this happens sometimes - e.g. when topics are slightly special
> and no one who is familiar with that code has time to review immediately.
>
> Just ping after some reasonable time of no reaction (I'd say min one max
> two weeks or so) by resending the patches.

OK, I will resend the patches.

Best,
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
How to submit patches?
Hello,

On Mon, 27 Feb 2023 12:33:32 +0100, I sent to this list a collection of patches for sys/dev/videomode/, starting by updating the DMT to the latest, and planning to review the code further (sending patches when I have completed a step along the way, because I'm having a hard time finding spare hours to work on this).

There has been no comment; no reaction.

How to submit patches without wasting time? (mine included)

TIA
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: kernel goes dark on boot
On Tue, Feb 21, 2023 at 10:00:10AM -0400, Jared McNeill wrote:
> Yeah sorry you can't just not exit boot services and boot the OS. UEFI code
> has certain expectations around the execution environment (MMU on, 1:1 PA to
> VA for example) that starting the kernel is going to interfere with. The
> moment the kernel touches the MMU, all of the resident UEFI code will cease
> to function. This includes code that may be running asynchronously (timers
> etc) that are not stopped properly due to the missing ExitBootServices call.
>

FWIW, having followed the EDK II development list for a while: further modifications are being made at the moment because a lot of people are focusing on VMs (there is quite a market for this) and want to use the EDK II UEFI code as an emulated BIOS. And it must be noted that qemu seems to be the main target.

Just a caveat for the brave who want to follow this...

T. Laronde

> > On Feb 21, 2023, at 9:46 AM, Emmanuel Dreyfus wrote:
> >
> > On Tue, Feb 21, 2023 at 08:05:00AM -0400, Jared McNeill wrote:
> >> After calling ExitBootServices(), the only things that work are UEFI
> >> runtime services. You'll have to find another way to print to the
> >> console.
> >
> > I can skip the ExitBootServices call and keep printing, I have already
> > done that at least in C functions. I have no experience of doing that
> > from assembly code.
> >
> > The same bug exists with a XEN3_DOM0 kernel. Xen starts up, and the
> > kernel crash without displaying anything. I wonder if there are tools
> > to trace the dom0 with help from Xen.
> >
> > --
> > Emmanuel Dreyfus
> > m...@netbsd.org
>
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
[PATCHES] sys/dev/videomode
Since the choice of the resolution (with 10.0 BETA) is not optimal, I have started to review sys/dev/videomode in order to fix the preferences. The first step was to update the timings. Since, with whatever choice for the resolution, a monitor will not do what it is not able to do, I replaced the modelines, derived from XFree86, with the specifications taken directly from the latest VESA DMT specification (this is an update even compared to the current xsrc/external/mit/xorg-server/dist/hw/xfree86/common/vesamodes). One of the main difference is that I do not put the specification in the XFree86 modeline format, but I take all the relevant informations from the spec, from which the modelines can also---obviously---be derived (the reverse is not true: a modeline doesn't distinguish between back/front porch and borders; and the VESA identifiers are not present even for VESA DMT modes). I attach the "dmt" file for reference (the awk script is updated and some modifications to other files are made in order to fit this in the present code without modifying its behavior for now; so it is useless alone). To my surprise, some of the "Established timings" (there is a bitmap in the EDID for these) are not specified in the VESA DMT. So I added an "extradmt". I had to derive the pseudo DMT timings from the XFree86 extramodes modelines, the problem being that, as said above, the distinction between porch and border is not made. So some values are fake ones. A supplementary problem is that some of the Mac II modes are not in the XFree86 modelines; but I found them in linux/drivers/video/macmodes.c i.e. in a Linux source file, allowing me to describe all the Established timing thus getting rid of the disturbing DIAGNOSTIC "no data for est. mode %s\n". The Linux source file is GPL 2. The question is: when it comes to parameters/hardware specs (I'm not taking code, I'm taking numbers), what is the license? Is it considered as "public" information or is the license binding? 
Or is an acknowledgement of the source enough, without being tied to the license of the file where the information (not code) was found? Note: these go into an extradmt file, i.e. they are kept separate from the VESA DMT data. -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
$NetBSD$
# These values were typed by Thierry Laronde,
# 2023-02-07, from:
#
# --
# VESA and Industry Standards and Guidelines
# for Computer Display Monitor Timing (DMT)
# Version 1.0, Rev. 12
# November 17, 2008
# Copyright 1994--2008 Video Electronics Standards Association. All
# other rights reserved.
# --
#
# In brief the document above states: USE AT YOUR OWN RISKS.
#
# This master file has only values as given in the specification
# identified above.
#
# The values should have been taken as is. From these values, others can
# be derived and there is even some redundancy (see the processing
# script for the computations). The records are in the same order as in
# the document: first line corresponds to page 15; last line to page
# 100. There hence should be 86 different records here.
#
# In this file, empty lines or lines beginning with a '#' are ignored.
# Remaining are a sequence of line terminated records, with the
# following blank separated fields:
#
# Timing_Name /* Hor_Pixels 'x' Ver_Pixels '@' Refresh_Rate 'Hz' suffix */
# Ids /* DMT_Id ',' STD_Id ',' CVT_Id (1 hexabyte, [2h] , [3h]) */
# Hor_Pixels
# Ver_Pixels
# Pixel_Clock /* MHz */
# Character_Width
# Flags /* Scan_Type ('I' | 'N') ',' Reduced_Blanking ('RB' | 'N') */
# Hor_Sync_Polarity /* '+' | '-' */
# Ver_Sync_Polarity /* '+' | '-' */
# H_Right_Border
# H_Front_Porch
# Hor_Sync_Time
# H_Back_Porch
# H_Left_Border
# V_Bottom_Border
# V_Front_Porch
# Ver_Sync_Time
# V_Back_Porch
# V_Top_Border
#
640x350@85Hz 01,, 640 350 31.500 8 N,N + - 0 4 8 12 0 0 32 3 60 0
640x400@85Hz 02,3119, 640 400 31.500 8 N,N - + 0 4 8 12 0 0 1 3 41 0
720x400@85Hz 03,, 720 400 35.500 9 N,N - + 0 4 8 12 0 0 1 3 42 0
640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8
640x480@72Hz 05,314C, 640 480 31.500 8 N,N - - 1 2 5 15 1 8 1 3 20 8
640x480@75Hz 06,314F, 640 480 31.500 8 N,N - - 0 2 8 15 0 0 1 3 16 0
640x480@85Hz 07,3159, 640 480 36.000 8 N,N - - 0 7 7 10 0 0 1 3 25 0
800x600@56Hz 08,, 800 600 36.000 8 N,N + + 0 3 9 16 0 0 1 2 22 0
800x600@60Hz 09,4540, 800 600 40.000 8 N,N + + 0 5 16 11 0 0 1 4 23 0
800x600@72Hz 0A,454C, 800 600 50.000 8 N,N + + 0 7 15 8 0 0 37 6 23 0
800x600@75Hz 0B,454F, 800 600 49.500 8 N,N + + 0 2 10 20 0 0 1 3 21 0
800x600@85Hz 0C,4559, 800 600 56.250 8 N,N + + 0 4 8 19 0 0 1 3 27 0
800x600@120Hz_rb 0D,, 800 600 73.25 8 N,RB + - 0 6 4 10 0 0 3 4 29 0
848x480@60Hz 0E,, 848 480 33.750 8 N,N + + 0 2 14 14 0 0 6 8 23 0
1024x768@43Hz_i 0F,, 1024 768 44.900 8 I,N + + 0 1 22 7 0 0 0 4 20 0
1024x768@60Hz 10,6140, 1024 768 65.000 8 N,N - - 0 3 17 20 0 0 3 6 29
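As an illustration of the remark that modelines can be derived from these records, here is a minimal Python sketch (not the awk script mentioned above, which is not shown in the thread). It rests on an assumption that the file header does not state explicitly: horizontal border, porch and sync fields are counted in characters of Character_Width pixels, while vertical ones are counted in lines. The function names are hypothetical. Under that assumption the 640x480@60Hz record reproduces the usual 640 656 752 800 / 480 490 492 525 timings, with the borders folded into the porches as a modeline does.

```python
# Sketch only: derive XFree86-style modeline numbers from one "dmt" record.
# Assumption: horizontal timing fields are in characters (Character_Width
# pixels each), vertical timing fields are in lines.

def parse_dmt_record(line):
    """Split one blank-separated record into named values."""
    f = line.split()
    if len(f) != 19:
        raise ValueError("expected 19 fields, got %d" % len(f))
    return {
        "name": f[0],
        "hor_pixels": int(f[2]),
        "ver_pixels": int(f[3]),
        "pixel_clock_mhz": float(f[4]),
        "char_width": int(f[5]),
        # right border, front porch, sync, back porch, left border
        "h": [int(x) for x in f[9:14]],
        # bottom border, front porch, sync, back porch, top border
        "v": [int(x) for x in f[14:19]],
    }


def modeline(rec):
    """Return (hdisp, hsyncstart, hsyncend, htotal, vdisp, vss, vse, vtotal)."""
    cw = rec["char_width"]
    rb, fp, sync, bp, lb = rec["h"]
    hdisp = rec["hor_pixels"]
    hss = hdisp + (rb + fp) * cw      # border and porch are merged here
    hse = hss + sync * cw
    htot = hse + (bp + lb) * cw
    bb, vfp, vsync, vbp, tb = rec["v"]
    vdisp = rec["ver_pixels"]
    vss = vdisp + bb + vfp
    vse = vss + vsync
    vtot = vse + vbp + tb
    return (hdisp, hss, hse, htot, vdisp, vss, vse, vtot)


def refresh_hz(rec):
    """Vertical refresh rate: pixel clock divided by the total frame size."""
    m = modeline(rec)
    return rec["pixel_clock_mhz"] * 1e6 / (m[3] * m[7])


rec = parse_dmt_record(
    "640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8")
print(modeline(rec), round(refresh_hz(rec), 2))
```

The reverse direction is lossy, as noted in the message: once the borders have been added to the porches, the modeline no longer says how much of each blanking interval was border.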
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 29, 2023 at 05:23:00PM +, Taylor R Campbell wrote: > > Date: Sun, 29 Jan 2023 16:44:08 +0100 > > From: tlaro...@polynum.com > > > > I will look (silently) to dev/pci/radeonfb.c to understand better the > > logics and try to find if there is a way to obtain a better console > > display. > > FYI, dev/pci/radeonfb.c is the legacy radeon framebuffer driver only > for very old (~20-year-old) devices, not the modern drm driver. > Yep. I realized that when the debugging output I added to this file did not show up... > > BTW, the problem is with VGA and DVI(-D) connections. With another monitor > > connected with HDMI (so more recent than this present 16:9 monitor, that > > have only VGA and DVI-D connectors and was manufactured in > > 2012 according to the EDID), the framebuffer has a better resolution. > > Comparing dmesg output from `boot -vx' with the two connectors may > help to diagnose what's happening. > > (If you already sent it, sorry -- haven't had time to look closely > yet.) Yes: I have already sent you the various dmesg outputs :-) In the meantime, I will try to worm my way into the sources. Even if I don't succeed in finding a cure, I will undoubtedly learn things along the way... -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 29, 2023 at 03:59:45PM +0100, tlaro...@polynum.com wrote: > On Sun, Jan 29, 2023 at 02:54:39PM +0100, tlaro...@polynum.com wrote: > > On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote: > > > > > > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not > > > production). Only kernel and modules (not userland); and kernel is not > > > GENERIC but a special config one matching the previous 9.2 config > > > running on the node. > > > > > > No problem so far. As a user (and as advertised), I had simply to use > > > audiocfg(1) to set the new correct default for audio in order to have > > > sound back where I used to expect it. > > > > > > The main difference is about the framebuffer: previous kernel version > > > picked the correct mode. NetBSD 10.0 does not and uses "entry level" > > > mode 640x480x67, resulting in stretched fat big characters; message: > > > > > > no data for est. mode 640x480x67 > > > > I think we are looking at the wrong place. The problem is the depth > > in the mode looked for: 67! The only depths the cards know about are > > multiples of 2^3. > > > > So where does this come from? > > Replying to myself: it is not the depth, but the frequency and it comes > from sys/dev/videomode/edid.c. > > Now trying to find why, at least, it does not find 640x480x60, which > exists---and 720x400x70 that exists also. I had it backwards: the failure is reported, under DIAGNOSTIC, for one mode that is not found, but this does not mean that the others are not found. The monitor's EDID advertises only two modes: 640x480x60 and 720x400x70 (while it can do others). The screen being 16:9 (nominal resolution is 1600x900), the VESA mode chosen leads to this "ugly" rendering with stretched, fat characters---which was not the case with 9.2. But this is correct with respect to the logic implemented, if I'm not (this time) mistaken. I will look (silently) at dev/pci/radeonfb.c to better understand the logic and try to find whether there is a way to obtain a better console display. BTW, the problem is with VGA and DVI(-D) connections. With another monitor connected via HDMI (so more recent than the present 16:9 monitor, which has only VGA and DVI-D connectors and was manufactured in 2012 according to the EDID), the framebuffer has a better resolution. -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
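For readers unfamiliar with the "est. mode" wording: the EDID base block carries an "Established timings" bitmap in bytes 35 to 37, one bit per legacy mode. The Python sketch below decodes that bitmap. The bit assignment is given as I recall it from the VESA E-EDID standard and should be checked against the specification; the mode strings merely imitate the WIDTHxHEIGHTxFREQ form of the kernel message and are not taken from sys/dev/videomode/edid.c. The function name is hypothetical.

```python
# Sketch only: decode the EDID "Established timings" bitmap (bytes 35-37).
# The table is ordered from bit 7 of byte 35 down to bit 7 of byte 37; the
# remaining bits of byte 37 are manufacturer specific and ignored here.
ESTABLISHED_TIMINGS = [
    # byte 35, bit 7 .. bit 0
    "720x400x70", "720x400x88", "640x480x60", "640x480x67",
    "640x480x72", "640x480x75", "800x600x56", "800x600x60",
    # byte 36, bit 7 .. bit 0
    "800x600x72", "800x600x75", "832x624x75", "1024x768x87i",
    "1024x768x60", "1024x768x70", "1024x768x75", "1280x1024x75",
    # byte 37, bit 7
    "1152x870x75",
]


def decode_established(b35, b36, b37):
    """Return the names of the modes whose bit is set in the bitmap."""
    bits = (b35 << 16) | (b36 << 8) | b37
    return [name for i, name in enumerate(ESTABLISHED_TIMINGS)
            if bits & (1 << (23 - i))]


# A monitor advertising only 720x400x70 and 640x480x60:
print(decode_established(0xA0, 0x00, 0x00))
# The bit behind the "640x480x67" message:
print(decode_established(0x10, 0x00, 0x00))
```

640x480x67 and the 832x624 and 1152x870 entries are the Apple (Mac II) modes, which fits the observation elsewhere in this thread that some established timings have no counterpart in the VESA DMT document.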
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote: > > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not > production). Only kernel and modules (not userland); and kernel is not > GENERIC but a special config one matching the previous 9.2 config > running on the node. > > No problem so far. As a user (and as advertised), I had simply to use > audiocfg(1) to set the new correct default for audio in order to have > sound back where I used to expect it. > > The main difference is about the framebuffer: previous kernel version > picked the correct mode. NetBSD 10.0 does not and uses "entry level" > mode 640x480x67, resulting in stretched fat big characters; message: > > no data for est. mode 640x480x67 I think we are looking in the wrong place. The problem is the depth in the mode looked for: 67! The only depths the cards know about are multiples of 2^3. So where does this come from? -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 29, 2023 at 02:54:39PM +0100, tlaro...@polynum.com wrote: > On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote: > > > > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not > > production). Only kernel and modules (not userland); and kernel is not > > GENERIC but a special config one matching the previous 9.2 config > > running on the node. > > > > No problem so far. As a user (and as advertised), I had simply to use > > audiocfg(1) to set the new correct default for audio in order to have > > sound back where I used to expect it. > > > > The main difference is about the framebuffer: previous kernel version > > picked the correct mode. NetBSD 10.0 does not and uses "entry level" > > mode 640x480x67, resulting in stretched fat big characters; message: > > > > no data for est. mode 640x480x67 > > I think we are looking at the wrong place. The problem is the depth > in the mode looked for: 67! The only depths the cards know about are > multiples of 2^3. > > So where does this come from? Replying to myself: it is not the depth but the refresh frequency, and it comes from sys/dev/videomode/edid.c. Now trying to find out why it does not, at least, find 640x480x60, which exists---and 720x400x70, which exists also. -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [VGA connector] NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote: > [Please feel free to redirect me to another list if this is not the > correct one for kernel beta testing] > > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not > production). Only kernel and modules (not userland); and kernel is not > GENERIC but a special config one matching the previous 9.2 config > running on the node. > > No problem so far. As a user (and as advertised), I had simply to use > audiocfg(1) to set the new correct default for audio in order to have > sound back where I used to expect it. > > The main difference is about the framebuffer: previous kernel version > picked the correct mode. NetBSD 10.0 does not and uses "entry level" > mode 640x480x67, resulting in stretched fat big characters; message: > > no data for est. mode 640x480x67 > > while in dmesg the framebuffer has the same dimensions as with the > 9.2 kernel: > > 9.2: > -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, > stride 6400 > > 10.0: > +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride > 6400 > > I have not updated the boot blocks. Is the 10.0 kernel expecting to have > hints about the modes from the bootloader i.e. a new install would > have updated the boot blocks and I would not have seen this? I wondered whether the problem was linked to the connection between the graphics card and the monitor. My monitor is an "old" one with VGA and DVI connectors (the DVI is DVI-D, not DVI-I). I'm using a VGA cable. Since I don't have a DVI cable, I tested by connecting a small monitor (originally for a Raspberry) for which I have a cable with a DVI connector for the graphics card and an HDMI connector for the monitor. With this monitor alone, nothing is displayed, but what is "interesting" is that if I connect both monitors to the same graphics card, one via VGA and the other via DVI (on card)--HDMI (on monitor), the kernel gets the information from the DVI-connected monitor and displays "correctly" (as far as the size of the fonts is concerned)... on the VGA-connected monitor. And I do not get the message about the mode not found. If I try to connect the DVI-D (on the old monitor) to the HDMI (on card), the monitor works, but the problem is the same as with VGA (this is not that surprising, since it is DVI-D with a cable translating DVI to HDMI; DVI-D is not the same as DVI-I and something is probably lost in translation). So it has something to do with the connection, apparently the VGA one (and DVI-D). For VGA, there was a change concerning the EDID (an enhanced version, E-EDID, having been designed in 2007). So was there a change in the VGA-related code, which now expects E-EDID while old monitors "speak" only EDID (over a VGA connection)? -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Mon, Jan 23, 2023 at 05:17:29AM +0700, Robert Elz wrote: > Date: Sun, 22 Jan 2023 20:27:24 +0100 > From: tlaro...@polynum.com > Message-ID: > > > | +Zone kernel: Available graphics memory: 9007199254079374 KiB > > I see something like that too, but while it is obviously absurd, > I'm not sure that it actually does any harm (maybe) - my system > mostly works -- though I am still using wsfb - the last time I > tried to start X with nouveau and no X server config at all > (a week or so ago) the kernel crashed very soon after. > > In every case I have looked that big number has been (when converted > to bytes, which the actual value being printed is - the output simply > divides by 2^10 (ie: >>10) for our convenience, a value of the same > general form, in your case > > 9007199254079374 KiB == 9223372036177278976 bytes == 0x7FFFFFFFD79E3800 > > To me that suggests that probably something has a 64 bit value set to > MAXINT, and then writes a 32 bit value on top of it (and then treats that > as a 64 bit value). The top 32 bits being 0x7FFFFFFF seems always there. > [...] Another possibility is a pointer subtraction that gave the correct value previously and is no longer valid because the memory address has changed: 9.2: -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, stride 6400 while 10.0 is: +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 6400 FWIW, -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
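The arithmetic quoted above can be checked in a few lines of Python. This is only a sketch of the hypothesis described in the message (a 64-bit value initialised to the maximum signed value, with a 32-bit quantity then stored over its low half); it says nothing about where in the drm code this might happen, and the variable names are illustrative.

```python
# Sketch only: check the KiB -> bytes conversion and the "32-bit write over
# a 64-bit maximum" hypothesis.
kib_printed = 9007199254079374          # value from the 10.0 dmesg
nbytes = kib_printed << 10              # the kernel printed bytes >> 10

INT64_MAX = (1 << 63) - 1               # 0x7FFFFFFFFFFFFFFF
low32 = nbytes & 0xFFFFFFFF             # what a 32-bit store would have written
high32 = nbytes >> 32                   # what was left from the initial value

# Rebuild the value the way the hypothesis describes it.
rebuilt = (INT64_MAX & ~0xFFFFFFFF) | low32

print(nbytes, hex(nbytes))
print(hex(high32), hex(low32), rebuilt == nbytes)
```

The byte count works out to 0x7FFFFFFFD79E3800: the upper 32 bits are exactly those of the 64-bit signed maximum, and only the lower half carries information.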
Re: NetBSD 10.0 BETA kernel testing: framebuffer
Hello, On Sun, Jan 22, 2023 at 04:59:19PM +0100, Martin Husemann wrote: > On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote: > > no data for est. mode 640x480x67 > > [..] > > > I have not updated the boot blocks. Is the 10.0 kernel expecting to have > > hints about the modes from the bootloader i.e. a new install would > > have updated the boot blocks and I would not have seen this? > > Boot blocks should be unrelated to this, but boot method (UEFI or BIOS) > may play a role (that is not fully analyzed). > > We need more details, like full dmesg. > > Does the kernel probe the correct display connection? > > There are a few i915 PRs open that are caused by the wrong connector being > used or the proper connector not responding, so the display capabilities > can not be read, but there may be other reasons why the kernel can not > read the EDID data. Please find attached the 10.0 dmesg and the diff from the 9.2 dmesg to the 10.0 dmesg (unedited, although the vast majority of the differences are just PCI ids being translated to vendor and product strings). Best, -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023 The NetBSD Foundation, Inc. All rights reserved. Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved.
NetBSD 10.0_BETA (CONFIG) #0: Sun Jan 22 11:01:04 CET 2023 tlaronde@cauchy.polynum.local:/usr/obj/polynum.NODECONF-cauchy.polynum.local_netbsd-9.2-amd64_netbsd-amd64/netbsd/obj/sys/arch/amd64/compile/CONFIG total memory = 8120 MB avail memory = 7834 MB timecounter: Timecounters tick every 10.000 msec timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100 mainbus0 (root) ACPI: RSDP 0x000F04A0 24 (v02 ALASKA) ACPI: XSDT 0xDDF9A078 74 (v01 ALASKA A M I01072009 AMI 00010013) ACPI: FACP 0xDDFA7AC8 00010C (v05 ALASKA A M I01072009 AMI 00010013) ACPI: DSDT 0xDDF9A188 00D940 (v02 ALASKA A M I0034 INTL 20120711) ACPI: FACS 0xDDFC7F80 40 ACPI: APIC 0xDDFA7BD8 62 (v03 ALASKA A M I01072009 AMI 00010013) ACPI: FPDT 0xDDFA7C40 44 (v01 ALASKA A M I01072009 AMI 00010013) ACPI: SSDT 0xDDFA7C88 000539 (v01 PmRef Cpu0Ist 3000 INTL 20120711) ACPI: SSDT 0xDDFA81C8 000AD8 (v01 PmRef CpuPm3000 INTL 20120711) ACPI: MCFG 0xDDFA8CA0 3C (v01 ALASKA A M I01072009 MSFT 0097) ACPI: HPET 0xDDFA8CE0 38 (v01 ALASKA A M I01072009 AMI. 0005) ACPI: SSDT 0xDDFA8D18 00036D (v01 SataRe SataTabl 1000 INTL 20120711) ACPI: SSDT 0xDDFA9088 0034E1 (v01 SaSsdt SaSsdt 3000 INTL 20091112) ACPI: ASF! 
0xDDFAC570 A5 (v32 INTEL HCG 0001 TFSM 000F4240) ACPI: 5 ACPI AML tables successfully acquired and loaded ioapic0 at mainbus0 apid 8: pa 0xfec0, version 0x20, 24 pins cpu0 at mainbus0 apid 0 cpu0: Use lfence to serialize rdtsc cpu0: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3 cpu0: node 0, package 0, core 0, smt 0 cpu1 at mainbus0 apid 2 cpu1: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3 cpu1: node 0, package 0, core 1, smt 0 acpi0 at mainbus0: Intel ACPICA 20221020 acpi0: X/RSDT: OemId , AslId acpi0: MCFG: segment 0, bus 0-63, address 0xf800 ACPI: Dynamic OEM Table Load: ACPI: SSDT 0x8E7E9B90F808 0005AA (v01 PmRef ApIst3000 INTL 20120711) acpi0: SCI interrupting at int 9 acpi0: fixed power button present timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000 hpet0 at acpi0: high precision event timer (mem 0xfed0-0xfed00400) timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000 acpiec0 at acpi0 (H_EC, PNP0C09-1): not present TPMX (PNP0C01) at acpi0 not configured FWHD (INT0800) at acpi0 not configured attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 irq 0 com0 at acpi0 (UAR1, PNP0501-1): io 0x3f8-0x3ff irq 4 com0: ns16550a, 16-byte FIFO lpt0 at acpi0 (LPTE, PNP0400): io 0x378-0x37f irq 5 acpiwmi0 at acpi0 (WMI1, PNP0C14-MXM2): ACPI WMI Interface acpiwmibus at acpiwmi0 not configured acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button acpiwmi1 at acpi0 (WMIO, PNP0C14-0): ACPI WMI Interface acpiwmibus at acpiwmi1 not configured acpifan0 at acpi0 (FAN0, PNP0C0B-0): ACPI Fan acpifan1 at acpi0 (FAN1, PNP0C0B-1): ACPI Fan acpifan2 at acpi0 (FAN2, PNP0C0B-2): ACPI Fan acpifan3 at acpi0 (FAN3, PNP0C0B-3): ACPI Fan acpifan4 at acpi0 (FAN4, PNP0C0B-4): ACPI Fan acpitz0 at acpi0 (TZ00) acpitz0: active cooling level 0: 80.0C acpitz
Re: NetBSD 10.0 BETA kernel testing: framebuffer
On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote: > [...] > > The main difference is about the framebuffer: previous kernel version > picked the correct mode. NetBSD 10.0 does not and uses "entry level" > mode 640x480x67, resulting in stretched fat big characters; message: > > no data for est. mode 640x480x67 > > while in dmesg the framebuffer has the same dimensions as with the > 9.2 kernel: > > 9.2: > -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, > stride 6400 > > 10.0: > +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride > 6400 > The differences between 9.2 (/^-/) and 10.0 (/^+/), extracted: -kern info: [drm] initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164). +initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164 0x00). -Zone kernel: Available graphics memory: 2601178 kiB -Zone dma32: Available graphics memory: 2097152 kiB +Zone kernel: Available graphics memory: 9007199254079374 KiB +Zone dma32: Available graphics memory: 2097152 KiB Note the "Zone kernel" value on 10.0 and compare it with the correct (9.2) one. PR #56847 mentions this for "nouveau" (I have "radeon") and attributes the problem to UEFI as opposed to BIOS: this is incorrect, since my node uses legacy boot: it boots via BIOS and the value is still wrong. So the problem is not UEFI vs. BIOS. There is also a third argument (0x00) about CEDAR in 10.0 that does not exist in 9.2. It may be the same as for the sound: 10.0 does not enumerate in the same order, and what succeeded previously because the first entry happened to be the correct one is now failing. Note: I had stumbled upon PR #56847 earlier, while searching for something else, and had quite a hard time, now, remembering it and finding it again with the PR search tools. And then, while trying to find a way to find it again... I stumbled on a page by D. Holland stating that the bug report system should be revamped. It's difficult not to concur... May I suggest that a future system should send candidate PRs to a mailing list, so that keywords and sorting are handled by knowledgeable people who can group PRs according to the moon they are (probably) pointing at, instead of the finger of the reporter? (This is no mockery of the reporter---me included; the reporter reports what he sees: symptoms.) -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
NetBSD 10.0 BETA kernel testing: framebuffer
[Please feel free to redirect me to another list if this is not the correct one for kernel beta testing] Context: I'm testing NetBSD 10.0 BETA on an isolated node (not production). Only kernel and modules (not userland); and the kernel is not GENERIC but a custom config matching the previous 9.2 config running on the node. No problem so far. As a user (and as advertised), I simply had to use audiocfg(1) to set the new correct default for audio in order to have sound back where I expected it. The main difference is about the framebuffer: the previous kernel version picked the correct mode. NetBSD 10.0 does not and uses the "entry level" mode 640x480x67, resulting in stretched, fat, big characters; message: no data for est. mode 640x480x67 while in dmesg the framebuffer has the same dimensions as with the 9.2 kernel: 9.2: -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, stride 6400 10.0: +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 6400 I have not updated the boot blocks. Is the 10.0 kernel expecting to have hints about the modes from the bootloader, i.e. would a new install have updated the boot blocks so that I would not have seen this? Best, -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Pulling to netbsd-9 branch fixes for #54977
Hello, I have experienced a USB failure with an excessive amount of file cache, while the mounted filesystems shouldn't have that many blocks in cache: this was likely due to an rsync(1) failure on a USB-connected disk. The USB disk was detached ("file system full") while rsync(1) was operating, but the files stayed in cache and umass0 could not be reattached when trying a "drvctl -r umass0". This is very likely PR #54977 (in my case: an ARM SoC with only 1GB of memory). There are fixes in current (mentioned in the PR). Could they be pulled up to the netbsd-9 branch? Best, -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
UEFI: caveats about not utf-8 dir entries
I don't know if this is for tech-kern or tech-userlevel (perhaps both). I just read today, on the UEFI edk2 devel list, in patches for ext4, a comment on the problem of the encoding of dir entries. The problem is that, generally, no encoding is specified in a fs: dir entries are just a sequence of bytes, whether nul byte terminated or with the length of the entry given (the latter for ext4). UEFI (edk2) deals, internally, with UCS-2 strings. With ext4 (and I expect this is the same for other fs drivers), conversion is attempted from utf-8. Here, if the "from utf-8" conversion fails (the name is not utf-8), the dir entry is skipped, meaning that not everything on a readable fs can be reached by the UEFI code. This has to be kept in mind when populating a msdos partition for booting, and by people wandering in a filesystem using the UEFI shell: even if the fs is readable, perhaps not everything will be accessible. FWIW, -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
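To make the consequence concrete, here is a small Python sketch that mimics the behaviour described above: raw directory entry names are byte strings, and those that do not decode as utf-8 are silently dropped. It is an illustration, not edk2 code; the function name is hypothetical, and the extra check for characters outside the 16-bit range is my own assumption about what a UCS-2 target implies, not something stated in the patches.

```python
# Sketch only: which raw directory entry names would survive a
# "convert from utf-8 or skip" policy.

def names_visible_to_firmware(raw_names):
    """Return the names that decode as utf-8 and fit in UCS-2."""
    visible = []
    for raw in raw_names:
        try:
            name = raw.decode("utf-8")
        except UnicodeDecodeError:
            continue                     # not utf-8: the entry is skipped
        if any(ord(ch) > 0xFFFF for ch in name):
            continue                     # assumption: no UCS-2 representation
        visible.append(name)
    return visible


entries = [
    b"boot.cfg",
    "caf\u00e9.txt".encode("utf-8"),     # utf-8 encoded name: kept
    b"caf\xe9.txt",                      # same name in ISO 8859-1: skipped
]
print(names_visible_to_firmware(entries))
```

A directory populated from a system using a legacy 8-bit locale can therefore look partly empty from the UEFI shell even though the filesystem itself is perfectly readable.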
UEFI edk2 NetBSD support
I'm about to start committing modifications to the UEFI edk2 sources so that it can be built and tested under NetBSD. Why is this related to the kernel? Because UEFI is not limited to one arch (so it's not linked to some port); because the edk can be compiled and used on non-UEFI hardware in order to provide some UEFI support, and it could be an alternative to U-Boot; because on a machine that is only remotely accessible, a persistent runtime UEFI network driver could allow exploring the machine and, if the kernel supports it, could allow remote debugging of the kernel on a machine where there is no other direct means to know what is going on, particularly in the early stages of booting. Am I stepping on somebody else's toes regarding UEFI edk2? (I'm not talking about UEFI kernel support: I'm not working on that.) -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: debugging a kernel that doesn't start
On Mon, Sep 12, 2022 at 09:17:52PM +0200, Edgar Fuß wrote: > I'm trying to run NetBSD on a Dell PowerEdge R6515, and the kernel is being > loaded (PXE or USB) but then the machine hangs hard. > > What's the way to debug a kernel that hangs so early that you can't printf > or drop into ddb? I guess that's a phenomenon quite common for a new port > or changes to locore.s (or whatever that's called today), but it's completely > new to me. > > I have virtually no clue about PeCee hardware. At the point the kernel is > started, are BIOS routines still available? Start by trying to boot without KMS. I had the same kind of problem, a kernel not reaching init, on a remote server with no other access (no serial console, no IPMI). See: http://notes.kergis.com/netbsd_on_OVH_baremetal.html -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Notes about booting/debugging NetBSD on an OVH baremetal server
FWIW, I have put online notes about the installation, booting and dual booting of NetBSD on an OVH baremetal server: https://notes.kergis.com/netbsd_on_OVH_baremetal.html The part that could be of interest to kernel developers is at the end: what I found handy, or what could be handy, when trying to find out what was going on and failing, with very limited means to get information. It concerns UEFI, boot and the kernel. I'd like to know whether some of the suggestions make sense (or not) and whether there is already work in progress on some of these points. When I have some time, I plan to tackle UEFI, but first to see whether it could be installed to allow remote hardware exploration. FWIW, -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [SUCCESS] Debugging/fixing a kernel stalled not crashing
On Sun, Aug 21, 2022 at 03:25:36PM +, Emmanuel Dreyfus wrote: > On Sun, Aug 21, 2022 at 02:16:58PM +0200, tlaro...@polynum.com wrote: > > Addition (asked by Taylor R Campbell): a current GENERIC boots only > > with i915drmkms disabled. > > > > With the framebuffer stuff enabled, it does not boot, and does not even > > panic and reboot. It freezes somewhere. The same as the 9.x series. > > I have a machine that randomly crashes during boot since we had the Linux 5.x > DRM import. The feature is still an asset, since it supports the GPU > that was not supported before, but it suggests booting with DRM based > framebuffer is more fragile than booting without. Perhaps we need a boot > flag to disable framebuffer? It is my feeling too that a generic flag to disable it via userconf would be a good thing, instead of explicitly listing all the drivers. And, at the very least, people installing on a server should be advised to try with the framebuffer disabled first, to see if NetBSD boots, and to enable it only afterwards. When one installs on a remote server, without seeing anything of the boot process[*], it is quite frustrating. *: I plan to play a little with UEFI EDKII to see whether installing it and using an EFI Runtime driver for the ethernet card (persistent after exiting boot) could be a solution for remote debugging. But no schedule is set, so don't hold your breath; it's vaporware for the moment. Another idea: write messages to memory in a place left untouched by UEFI and NetBSD so that, after rebooting (in case of crash) into UEFI, a UEFI application could dump the memory somewhere on the disk, in the EFI partition, for post-mortem inspection. -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [SUCCESS] Debugging/fixing a kernel stalled not crashing
Addition (requested by Taylor R Campbell): a -current GENERIC kernel boots only with i915drmkms disabled. With the framebuffer code enabled, it does not boot, and does not even panic and reboot. It freezes somewhere. Same as the 9.x series. On Sat, Aug 20, 2022 at 09:03:52PM +0200, tlaro...@polynum.com wrote: > A final point: > > Context: I rent a baremetal server (OVH) that has an Intel Xeon > quadcore, IvyBridge, with 16Gb of RAM, 3 2TB disks, an Intel PRO 1000 > ethernet card (but the bandwidth is limited to 100Mib). It is an entry > level offer, that I wanted only for an IPv4 address (there is an IPv6 > address too). > > The images to install include no BSD but only Linux/Debian variants. > > Following instructions from a helpful wiki page, I try to install using > a Linux rescue disk (provided by OVH), running all in memory, and having > qemu-system-x86_64 allowing to use a CDROM install image. > > Nothing booted. > > Since it was unclear from the web interface if the boot process was > depending or not on the information about an image being installed (to > allow booting from the disk), I then installed a Linux/Debian on only > one disk (one can select, 1, 2 or 3 disks, but if multiple disks this > is software RAID). > > Using the rescue system, I then resized the Debian partition and > installed NetBSD on another partition (dual booting) and, to bypass a > possible limitation in the booting process (only booting GRUB and > accessing directly GRUB), I chainloaded the NetBSD stage1 from the GRUB2 > menu, and verified, under qemu, this will boot, using GRUB2 boot once > feature so that if the NetBSD crashed and reboots, I can go back to > Debian to try something else. > > Still no success. > > It was almost certain there was a problem with the kernel. > > So I wrote a special /boot.cfg to test various things, custom compiling > a kernel (since the GENERIC installation one was not running), and tried > to validate step by step the booting procedure in order to try, after > to insert a cpu_reboot() instruction in the kernel to see where the > problem occurred (since when rebooting, I will be able to connect to > Debian, I would have known that before the instruction, it was OK). > > In order to limit the work, I used once more qemu but to install NetBSD > on another disk (so that I can in fact use qemu not with the rescue > system, but directly under Debian without trashing the very disk Debian > runs from). > > The first test was to see if, indeed, NetBSD stage2 was loaded. The > menu in /boot.cfg was simple: the instruction "quit". > > => First lesson: this does not work, because the rebooting is not a > total one, and mapping the drives (in GRUB2) to ensure that the booting > succeeds, the stage2 reboots but finally back to itself, so the machine > was endlessly rebooting and I had no connection. > > It took me various modifications before realizing it was the case (under > qemu) so I abandoned the idea and tried to boot a custom kernel, > without SMP and without framebuffer (i915drmkms). > > This succeeded. > > I then get back to test letting the framebuffer. It didn't work. > I then disable the framebuffer for everything, and tried with SMP. It > worked. > Then, I tried 9.2 GENERIC and 9.3 GENERIC without framebuffer. Both > work. > > So the final lesson: NetBSD can be installed on such a machine but the > framebuffer is a problem. And NetBSD is not far behind Linux, because > the Debian distribution is a recent one, and the main clue was in the > Linux dmesg: > > Command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-14-amd64 > root=UUID=eea6d0a4-03b6-44e6-8588-ff6c4eba2095 ro nomodeset iommu=pt > > The: nomodeset. > > Linux doesn't work with the embedded graphics (HD 4000) either. > > So it is partly a kernel problem (kernel stalling with framebuffer > initializations) but mainly an install problem (framebuffer in such > cases should be disabled). > > If someone thinks there can be interest in how I set dual booting, > chainloading NetBSD from GRUB2, and configuring the boot procedure, I > can write a mini-page about it. > > For the rest: problem solved. NetBSD can install on an OVH baremetal > (at least this kind of machine). > -- > Thierry Laronde > http://www.kergis.com/ > http://kertex.kergis.com/ >http://www.sbfa.fr/ > Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [SUCCESS] Debugging/fixing a kernel stalled not crashing
A final point.

Context: I rent a bare-metal server (OVH) with an Intel Xeon quad-core (Ivy Bridge), 16 GB of RAM, three 2 TB disks and an Intel PRO 1000 Ethernet card (but the bandwidth is limited to 100 Mbit/s). It is an entry-level offer that I wanted only for the IPv4 address (there is an IPv6 address too).

The images offered for installation include no BSD, only Linux/Debian variants. Following instructions from a helpful wiki page, I tried to install from a Linux rescue system (provided by OVH) running entirely in memory, using qemu-system-x86_64 to boot a CD-ROM install image. Nothing booted.

Since it was unclear from the web interface whether the boot process depended on the information that an image had been installed (to allow booting from the disk), I then installed a Linux/Debian on only one disk (one can select 1, 2 or 3 disks, but with multiple disks this is software RAID).

Using the rescue system, I then resized the Debian partition and installed NetBSD on another partition (dual booting). To bypass a possible limitation in the boot process (only GRUB being booted, and GRUB being accessed directly), I chainloaded the NetBSD stage1 from the GRUB2 menu and verified, under qemu, that this boots, using the GRUB2 "boot once" feature so that if NetBSD crashed and rebooted, I could go back to Debian to try something else. Still no success.

It was almost certain that there was a problem with the kernel.

So I wrote a special /boot.cfg to test various things, compiled a custom kernel (since the GENERIC installation one was not running), and tried to validate the boot procedure step by step, the plan being to insert a cpu_reboot() call in the kernel afterwards to see where the problem occurred (since after a reboot I would be able to connect to Debian, I would have known that everything before the call was OK).

In order to limit the work, I once more used qemu, but to install NetBSD on another disk (so that I could use qemu not from the rescue system, but directly under Debian, without trashing the very disk Debian runs from).

The first test was to see whether the NetBSD stage2 was indeed loaded. The menu in /boot.cfg was simple: the instruction "quit".

=> First lesson: this does not work, because the reboot is not a total one and, with the drives mapped (in GRUB2) to ensure that booting succeeds, the stage2 reboots but finally comes back to itself, so the machine was rebooting endlessly and I had no connection.

It took me various modifications before realizing that this was the case (under qemu), so I abandoned the idea and tried to boot a custom kernel, without SMP and without framebuffer (i915drmkms).

This succeeded.

I then went back to testing with the framebuffer left in. It didn't work. I then disabled the framebuffer for everything, and tried with SMP. It worked. Then I tried 9.2 GENERIC and 9.3 GENERIC without framebuffer. Both work.

So the final lesson: NetBSD can be installed on such a machine, but the framebuffer is a problem. And NetBSD is not far behind Linux, because the Debian distribution is a recent one, and the main clue was in the Linux dmesg:

    Command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-14-amd64 root=UUID=eea6d0a4-03b6-44e6-8588-ff6c4eba2095 ro nomodeset iommu=pt

Note the nomodeset: Linux doesn't work with the embedded graphics (HD 4000) either.

So it is partly a kernel problem (the kernel stalling during framebuffer initialization) but mainly an install problem (the framebuffer should be disabled in such cases).

If someone thinks there is interest in how I set up dual booting, chainloading NetBSD from GRUB2 and configuring the boot procedure, I can write a mini-page about it.

For the rest: problem solved. NetBSD can be installed on an OVH bare-metal server (at least on this kind of machine).

-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
[PARTIAL SUCCESS] Debugging/fixing a kernel stalled not crashing
On Thu, Aug 18, 2022 at 04:33:04PM +0200, tlaro...@polynum.com wrote:
> Context: I rent a baremetal server and try to install NetBSD on it. I
> finally installed a Linux (Debian) and installed NetBSD as a dual boot.
> But NetBSD doesn't come up (in case there was a
> network misconfiguration, I verified that no log, no dmesg was written)
> and neither does it crashes and reboots (because I use GRUB2 boot once
> feature and, if it was the case, the server will go back to Debian, and
> it doesn't).
>

So:

- I have installed a Linux/Debian and I'm using GRUB2 to chainload the stage1 block in order to load the NetBSD kernel, using the "boot once" feature of GRUB2 so that if something goes wrong, I can go back to the Linux/Debian;

- I have set up (since I can see nothing of the boot process) a /boot.cfg with several choices, and I set its default entry, reached through the chainloading done by GRUB2, in order to try various things (since I haven't found a way to mount FFS read-write under Linux, I use qemu-system-x86_64, under Debian, to write and modify the NetBSD partitions);

- The machine is an Intel Xeon, quad-core, Ivy Bridge.

Since the GENERIC kernel does not boot, I have compiled a custom 9.3, stripping everything unneeded and adding this feature (commented out in the GENERIC config):

    acpismbus* at acpi?    # ACPI SMBus CMI (experimental)

since, judging from x86/pci/imcsmb/imc.c, there are some peculiarities about the (Sandy,Ivy) Bridge with the Xeon.

With the framebuffer (i915drmkms) disabled via userconf, and SMP disabled, NetBSD boots on the machine. The dmesg is here:

http://downloads.kergis.com/misc/rpt_netbsd9.3_monocore_no-fb.dmesg

Since I fought quite a lot with Debian, GRUB2 and so on for the installation and the boot process, I still have to verify whether an SMP version of the same kernel boots or not. If an SMP kernel does not boot, I will come back to the list for tips on how best to gather information about what is going wrong, in order to try to fix it or help to fix it.

-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Debugging/fixing a kernel stalled not crashing
Hello,

On Fri, Aug 19, 2022 at 02:36:33PM +0100, David Brownlee wrote:
> Tangentially...
>
> If it's an issue picking up the root filesystem, you could boot an
> INSTALL type kernel with a built in ramdisk with dhcpcd and sshd
> enabled, and see if you can ssh into the box (I think someone had
> pre-built arm images which did just that, so the code should be out
> there :)

Yes, I plan to test this too, depending on the stage at which my reboot tactic indicates the problem is. The aim is to be able to connect to a running kernel. Once that is achieved, the hardest part will be done.

I have already built a custom kernel (with acpismbus* added, since the machine is an Ivy Bridge, the driver is related to it, and it's not in GENERIC) and will start to debug tomorrow.

Best,
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Debugging/fixing a kernel stalled not crashing
Context: I rent a bare-metal server and am trying to install NetBSD on it. I finally installed a Linux (Debian) and installed NetBSD as a dual boot. But NetBSD doesn't come up (in case there was a network misconfiguration, I verified that no log and no dmesg was written), nor does it crash and reboot (I use the GRUB2 "boot once" feature, so if that were the case the server would go back to Debian, and it doesn't).

I can't "see" the boot process (no IPMI for this entry-level offer), but I have at least the dmesg from Linux as a description of the machine, and I'd like to give it a try to see if I can find the culprit and, once it is identified, manage to correct it.

In order to bisect the problem, the simplest approach seems to be to place a cpu_reboot() at various steps: if the machine reboots, I will be back in Debian and hence will know that everything "until this point" is OK.

Questions:

1) Is src/sys/kern/init_main.c the correct file to start the bisection with?

2) From what stage on would a problem almost certainly cause a reboot (DDB_ONPANIC being unset), so that I can know that the problem very likely occurs before that stage? I would then perhaps work backwards from that point;

3) Are there places where cpu_reboot() may leave the hardware in such a state that a soft reset might not bring the machine back in a condition allowing the boot sequence to succeed (or is cpu_reboot() immune from this)?

TIA,
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
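
The bisection described above can be modelled in userland. What follows is only a sketch under stated assumptions, not kernel code: early initialisation is treated as a numbered sequence of steps, and `probe` is a hypothetical stand-in for "build a kernel with cpu_reboot() placed right after step N, boot it once, and check whether the machine comes back to Debian". It also assumes that once a step stalls, every probe placed after it stalls too. None of these names exist in the NetBSD tree.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical probe: returns 1 if a kernel carrying cpu_reboot() right
 * after step `step` rebooted (steps 0..step completed), 0 if it stalled.
 */
typedef int (*probe_fn)(size_t step, void *ctx);

/*
 * Binary search for the first stalling step.
 * Returns its index, or nsteps if no step stalls.
 */
size_t
first_stalling_step(size_t nsteps, probe_fn probe, void *ctx)
{
	size_t lo = 0;		/* every step below lo is known good */
	size_t hi = nsteps;	/* the culprit, if any, is at or below hi */

	assert(probe != NULL);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (probe(mid, ctx))
			lo = mid + 1;	/* rebooted: steps 0..mid are fine */
		else
			hi = mid;	/* stalled: culprit is at or before mid */
	}
	return lo;
}

/* Simulated machine for illustration: ctx points to the culprit index. */
int
simulated_probe(size_t step, void *ctx)
{
	const size_t *culprit = ctx;

	return step < *culprit;
}
```

With N candidate insertion points this needs about log2(N) kernel builds and "boot once" cycles instead of N, which matters when each probe means a rebuild and a remote reboot.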
Re: pcc [was Re: valgrind]
On Mon, Mar 21, 2022 at 08:54:43AM -0400, Mouse wrote:
> >> I've been making very-spare-time progress on building my own
> >> compiler on and off for some years now; perhaps I'll eventually get
> >> somewhere. [...]
> > Have you looked at pcc? http://pcc.ludd.ltu.se/ and in our source
> > tree in src/external/bsd/pcc .
>
> No, I haven't. I should - it may well end up being quicker to move an
> existing compiler in the directions I want to go than to write my own.
>

And FWIW, there is also the collection of compilers in Plan9, which has now been released under the MIT license: https://p9f.org/ (Plan9 foundation).

-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Kernel 9.1 panic with azalia
On Sat, Jun 26, 2021 at 06:49:17AM +0200, Martin Husemann wrote:
> Also any reason to use 9.1 instead of 9.2 or 9.2_STABLE?
> (Not that I think it would make a difference for azalia)

Practical reason: I first update the node on which I do my main programming/development work. Then, once I have verified that things are running, and with some delay (especially if the node is a remote production server that cannot be updated easily and that, for safety, I only update when I have physical access to it in case of a problem; this time there was one), I bring the other nodes in sync so as not to have to cross-compile between NetBSD versions. When I updated the development node, NetBSD was at 9.1.

Since, as far as I know (not much), virtualization always(?) presents a defined, common pseudo-hardware interface, I imagine there is no virtualization that would allow testing a kernel in a VM with access to an image of the real hardware present, so that one could verify that a candidate kernel will run on the actual hardware before switching kernels?

I still have to verify whether a UEFI bootloader allows a "boot once" to be implemented by scripting, so that if a new kernel (on a remote host) crashes, the host reboots with a kernel that is known to work. This is probably possible thanks to the persistent storage of UEFI variables.

-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Kernel 9.1 panic with azalia
On Fri, Jun 25, 2021 at 09:32:40PM +, RVP wrote:
> On Fri, 25 Jun 2021, RVP wrote:
>
> >On Fri, 25 Jun 2021, tlaro...@polynum.com wrote:
> >
> >>But if azalia is not supported anymore because it crashes the
> >>kernel, shouldn't it be removed and not simply be commented out?
> >>
> >
> >I think that your message is the first indication that azalia(4)
> >is slowly bit-rotting...
> >
>
> Just checked, and azalia no longer exists in the 9.99[.82] tree.

Thanks for checking!

-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Kernel 9.1 panic with azalia
Hello,

On Fri, Jun 25, 2021 at 08:47:30PM +, RVP wrote:
> On Fri, 25 Jun 2021, tlaro...@polynum.com wrote:
>
> >The new kernel panics at boot time with azalia (it is not crucial since
> >it is a server and I have no use with it but I have added the support
> >since it's here and 7.1.1 has no problem with it).
> >
>
> You must've compiled a custom kernel. 9.1 GENERIC has the `azalia'
> driver commented out; hdaudio(4) is used instead. Try the same.

Sure. But if azalia is not supported anymore because it crashes the kernel, shouldn't it be removed and not simply be commented out?

(To give some context: when I build a new kernel, I just adjust the previous config for things that have been removed or changed, so I'm mainly diffing GENERIC against GENERIC to see the changes, while my configs are not GENERIC: I remove support for whatever hardware is not here or whatever filesystems the node will never use, for example.)

Thanks for the tip though.

Best,
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Kernel 9.1 panic with azalia
Hello,

I was trying to update a server running NetBSD 7.1.1 (amd64) to NetBSD 9.1.

The new kernel panics at boot time with azalia (it is not crucial, since it is a server and I have no use for it, but I had added the support since it's here and 7.1.1 has no problem with it).

It's a production server, so I cannot easily do tests.

Here is the message (reconstructed by hand from written notes; it may contain blunders):

azalia0: codec[2]: 0x1106/0x0441 (rev. 1.0), HDA rev. 1.0
panic: kmem_free(0xce000801,11) != allocated size 1844660333743030159360
vpanic() at netbsd:vpanic +0x143
snprintf() at netbsd:snprintf
kmem_alloc() at netbsd:kmem_alloc
generic_mixer_ensure_capacity() at netbsd:generic_mixer_ensure_capacity +0x7b
generic_mixer_init() at netbsd:generic_mixer_init +0x1143
azalia_attach_intr() at netbsd:azalia_attach_intr +0xbf8
config_interrupts_thread() at netbsd:config_interrupts_thread +0x7e
cpu0: End Traceback
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x8021cc2d cs 0x8 rflags 0x202 0 ilevel 0 rsp 0xcc006741
curlwp 0x060092cacd80 pid 0.37 lowest kstack 0xcc00674192e0
stopped in pid 0.37 (system) at netbsd:breakpoint: 0x5: leave

Note: with the 7.1.1 kernel, for azalia I have:

azalia0: codec[2]: 0x1106/0x0441 (rev. 1.0), HDA rev. 1.0
azalia0: codec[3]: 0x8086/0x2806 (rev. 0.0), HDA rev. 1.0

The size in the panic is nonsense.

Hoping this gives enough of a clue to debug.

TIA,
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Ext4 support
On Fri, Apr 30, 2021 at 06:51:10AM -0500, Jonathan A. Kollasch wrote:
> On Fri, Apr 30, 2021 at 12:56:04PM +0200, tlaro...@polynum.com wrote:
> > There is excellent support, thanks to Reinoud Zandijk, in NetBSD for
> > UDF. And this is cross-system (I use it to share---not distribute: it's
> > not a NFS or a Samba---back-ups between NetBSD and MS Windows).
> >
>
> It's only excellent if you have access to a functional UDF fsck
> program. NetBSD and Linux do not have a functional UDF fsck.

Yes, this is what is lacking. Some time ago I proposed to give some money (not thousands of euros, but at least some hundreds) so that someone(TM), maybe Reinoud Zandijk, could work on this.

IMHO, it's something that is worth adding, since a cross-system FS is the solution for sharing (once more: not serving distributed data, but at least sharing).

-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Ext4 support
On Thu, Apr 29, 2021 at 10:06:05PM +0200, Vincent DEFERT wrote: > > On 29/04/2021 20:34, Christos Zoulas wrote: > >Some ext4 features were implemented as part of GSoC 2016 (extents, > >htrees). > >I am sure that there are other unimplemented features. What are you looking > >for? > > > >christos > > > > I'd like to have full ext4 support so an ext4-formatted disk could be used > to exchange data between Linux and NetBSD, for instance. > > If some features have already been implemented, I guess it has been decided > to put them in /usr/src/sys/ufs/ext2fs and to keep that name. > So now, I know where to start. :) > > There is also the question of the specifications: for now, I just have the > Linux kernel sources and the wiki (https://ext4.wiki.kernel.org/). > I'm not aware of a more formal specification, but if one exists it would > help avoid the risk of being too influenced by GPL'd source code. There is excellent support, thanks to Reinoud Zandijk, in NetBSD for UDF. And this is cross-system (I use it to share---not distribute: it's not a NFS or a Samba---back-ups between NetBSD and MS Windows). So you might try this instead of a Linux only thing. My 2 cents, -- Thierry Laronde http://www.kergis.com/ http://kertex.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: fsync error reporting
On Fri, Feb 19, 2021 at 01:43:07AM +, David Holland wrote:
> [...]
>
> (9) We need a model for what happens to the unwritten data. Throwing
> it away is clearly wrong (some may recall a furor a couple years ago
> when it was discovered that Linux did this) but retrying and likely
> failing on every subsequent fsync attempt isn't that useful either.
> My suggestion is to allow retrying up to some arbitrary fixed number
> of times and then mark the buffer broken, and provide some out-of-band
> way to either discard everything (umount -f?) or start retrying again,
> e.g. after manually reinserting accidentally ejected media.
>

FWIW, perhaps the concept of a dedicated, separate recovery data storage (not specified as a physical local disk; it could be remote direct or indirect storage, battery-backed memory, etc.) could be envisioned for high reliability: writing the unwritten blocks there together with the information needed to know what, where and when, so as to fix or replay them later.

From a superficial point of view, the problems all seem very complicated at the kernel level. It would be far simpler to have a kernel that only allows exclusive write access to one process, and to let multiplexing be handled by a file server in user space, this file server being, actually, the only one to write and read, acting as the proxy for other processes, delivering failure messages to whoever is interested, and allowing partial file locks too.

This is probably not worth more than 2 cts [and don't expect any code in any reasonable future ;-)].

-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
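
The retry model suggested in point (9) quoted above can be illustrated with a small userland state machine. This is a sketch under assumptions only: `struct model_buf`, `MAX_WRITE_RETRIES` and both functions are hypothetical names invented for the illustration and do not correspond to anything in the NetBSD buffer cache, and the `write_ok` argument stands in for the outcome of the real I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary fixed retry limit, as in the quoted proposal (value assumed). */
enum { MAX_WRITE_RETRIES = 3 };

enum buf_status { BUF_CLEAN, BUF_DIRTY, BUF_BROKEN };

/* Hypothetical per-buffer bookkeeping. */
struct model_buf {
	enum buf_status status;
	int failed_attempts;
};

/*
 * One fsync attempt on the buffer. A broken buffer is neither discarded
 * nor retried: it stays broken until an out-of-band reset.
 */
enum buf_status
model_fsync(struct model_buf *b, bool write_ok)
{
	assert(b != NULL);
	if (b->status != BUF_DIRTY)
		return b->status;
	if (write_ok) {
		b->status = BUF_CLEAN;
		b->failed_attempts = 0;
	} else if (++b->failed_attempts >= MAX_WRITE_RETRIES) {
		b->status = BUF_BROKEN;
	}
	return b->status;
}

/* Out-of-band "start retrying again", e.g. after media is reinserted. */
void
model_retry_again(struct model_buf *b)
{
	assert(b != NULL);
	if (b->status == BUF_BROKEN) {
		b->status = BUF_DIRTY;
		b->failed_attempts = 0;
	}
}
```

The data is never thrown away implicitly: the only ways out of the broken state are an explicit reset (modelled here) or an explicit discard (not modelled).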
Re: [FOUND] kernel 8.2 and 9.1 crashes
Hello,

On Fri, Nov 13, 2020 at 08:42:03AM +0100, Martin Husemann wrote:
> On Fri, Nov 13, 2020 at 07:35:24AM +0100, tlaro...@polynum.com wrote:
> > I tried to recompile a kernel, with 8.2 and with 9.1 and both
> > crash, 9.1 with:
> >
> > unable to execute instruction 0x18 (SMEP)
> >
> > (from memory)
>
> This is (I guess) the kernel jumping through a NULL function pointer.

The problem is with:

    options PCKBD_CNATTACH_MAY_FAIL

The option was commented out in my config. That, without the option, I have no keyboard during the boot process (my keyboard is USB): OK. But that it crashes...

Obviously this "option" is not optional anymore, so it should always be on and not settable, unless someone can find why it crashes now, from 8.2 on, while it didn't before (the framebuffer and related support now seems to be a lot of code, so the culprit is probably to be found there).

Best,
-- 
Thierry Laronde
http://www.kergis.com/
http://kertex.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
kernel 8.2 and 9.1 crashes
I tried to recompile a kernel, with 8.2 and with 9.1, and both crash, 9.1 with:

unable to execute instruction 0x18 (SMEP)

(from memory)

The kernel enters the debugger but, the keyboard being unusable (no key does anything), I have to hard reboot.

The last message (via dmesg) from 9.1 is:

[ 1.7964198] ahcisata0 port 3: device present, speed: 6.0Gb/s

It works with 8.0. Here is the dmesg from 8.0:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993 The Regents of the University of California. All rights reserved.
NetBSD 8.0 (CONFIG) #0: Thu Apr 16 18:47:07 CEST 2020
tlaronde@cauchy.polynum.local:/usr/obj/polynum.NODECONF-cauchy.polynum.local_netbsd-8.0-amd64_netbsd-amd64/obj/sys/arch/amd64/compile/CONFIG
total memory = 8120 MB
avail memory = 7868 MB
cpu_rng: RDRAND
rnd: seeded with 128 bits
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
MSI MS-7823 (1.0)
mainbus0 (root)
ACPI: RSDP 0x000F04A0 24 (v02 ALASKA)
ACPI: XSDT 0xDDF9A078 74 (v01 ALASKA A M I01072009 AMI 00010013)
ACPI: FACP 0xDDFA7AC8 00010C (v05 ALASKA A M I01072009 AMI 00010013)
ACPI: DSDT 0xDDF9A188 00D940 (v02 ALASKA A M I0034 INTL 20120711)
ACPI: FACS 0xDDFC7F80 40
ACPI: APIC 0xDDFA7BD8 62 (v03 ALASKA A M I01072009 AMI 00010013)
ACPI: FPDT 0xDDFA7C40 44 (v01 ALASKA A M I01072009 AMI 00010013)
ACPI: SSDT 0xDDFA7C88 000539 (v01 PmRef Cpu0Ist 3000 INTL 20120711)
ACPI: SSDT 0xDDFA81C8 000AD8 (v01 PmRef CpuPm3000 INTL 20120711)
ACPI: MCFG 0xDDFA8CA0 3C (v01 ALASKA A M I01072009 MSFT 0097)
ACPI: HPET 0xDDFA8CE0 38 (v01 ALASKA A M I01072009 AMI. 0005)
ACPI: SSDT 0xDDFA8D18 00036D (v01 SataRe SataTabl 1000 INTL 20120711)
ACPI: SSDT 0xDDFA9088 0034E1 (v01 SaSsdt SaSsdt 3000 INTL 20091112)
ACPI: ASF! 0xDDFAC570 A5 (v32 INTEL HCG 0001 TFSM 000F4240)
ACPI: Executed 1 blocks of module-level executable AML code
ACPI: 5 ACPI AML tables successfully acquired and loaded
ioapic0 at mainbus0 apid 8: pa 0xfec0, version 0x20, 24 pins
cpu0 at mainbus0 apid 0
cpu0: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu0: package 0, core 0, smt 0
cpu1 at mainbus0 apid 2
cpu1: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu1: package 0, core 1, smt 0
acpi0 at mainbus0: Intel ACPICA 20170303
acpi0: X/RSDT: OemId , AslId
acpi0: MCFG: segment 0, bus 0-63, address 0xf800
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0xFE821BD9E010 0003D3 (v01 PmRef Cpu0Cst 3001 INTL 20120711)
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0xFE810E813810 0005AA (v01 PmRef ApIst3000 INTL 20120711)
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0xFE821BCFB1D0 000119 (v01 PmRef ApCst3000 INTL 20120711)
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed0-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
acpiec0 at acpi0 (H_EC, PNP0C09-1)acpiec0: unable to evaluate _GPE: AE_NOT_FOUND
TPMX (PNP0C01) at acpi0 not configured
FWHD (INT0800) at acpi0 not configured
LDRC (PNP0C02) at acpi0 not configured
attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 irq 0
CWDT (INT3F0D) at acpi0 not configured
SIO1 (PNP0C02) at acpi0 not configured
com2 at acpi0 (UAR1, PNP0501-1): io 0x3f8-0x3ff irq 4
com2: ns16550a, working fifo
lpt2 at acpi0 (LPTE, PNP0400): io 0x378-0x37f irq 5
RMSC (PNP0C02) at acpi0 not configured
acpiwmi0 at acpi0 (WMI1, PNP0C14-MXM2): ACPI WMI Interface
acpiwmibus at acpiwmi0 not configured
PDRC (PNP0C02) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
acpiwmi1 at acpi0 (WMIO, PNP0C14-0): ACPI WMI Interface
acpiwmibus at acpiwmi1 not configured
PTMD (INT3394) at acpi0 not configured
acpifan0 at acpi0 (FAN0, PNP0C0B-0): ACPI Fan
acpifan1 at acpi0 (FAN1, PNP0C0B-1): ACPI Fan
acpifan2 at acpi0 (FAN2, PNP0C0B-2): ACPI Fan
acpifan3 at acpi0 (FAN3, PNP0C0B-3): ACPI Fan
acpifan4 at acpi0 (FAN4, PNP0C0B-4): ACPI Fan
acpitz0 at acpi0 (TZ00)
acpitz0: active cooling level 0: 80.0C
acpitz0: active cooling level 1: 55.0C
acpitz0: active cooling level 2: 0.0C
acpitz0: active cooling level 3: 0.0C
acpitz0: active cooling level 4: 0.0C
acpitz0: levels: critical 105.0 C
acpitz1 at acpi0 (TZ01): cpu0 cpu1
acpitz1: levels: critical 105.0 C, passive 108.0 C, passive cooling
ACPI: Enabled 6 GPEs in block 00 to 3F
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0
Re: [FOUND] kernel 9.0 crash on amd64
Hello,

On Sun, Apr 19, 2020 at 08:17:06PM +0200, tlaro...@polynum.com wrote:
> Hello,
>
> On Sun, Apr 19, 2020 at 05:10:37PM +, m...@netbsd.org wrote:
> > On Sun, Apr 19, 2020 at 05:29:40PM +0200, tlaro...@polynum.com wrote:
> > > Hello,
> > >
> > > Mainly in order to be able to test wine, I'm compiling a NetBSD kernel
> > > from netbsd-9-0-RELEASE sources on an amd64 (Intel bicore).
> > >
> > > My config has very minimal changes from NetBSD 8.* config, the only
> > > important modification being USER_LDT (and I'm not putting option SVS).
> > >
> > > When it crashes, keyboard is unavailable and the information repeated is:
> > >
> > > prevented execution of 0x18 (SMEP)
> > > fatal page fault in supervisor mode
> > > trap type 6 code 0x10 rip 0x18 cs 0x8 rflags 0x10246 cr2 0x18 ilevel 0x8
> > > rsp 0xcc00ae0cbb40
> > >
> > > The last bit registered and shown by dmesg (when rebooting with the 8.0
> > > kernel) is about enumerating ahcisata0.
> > >
> > > Does this ring some bell to somebody?
> > >
> > > TIA,
> >
> > SMEP is a hardware method to stop execution user-memory.
> > It tried to execute a function pointer that isn't initialized, most
> > likely.
> >
> > It would be interesting to see what the backtrace is
> > sysctl -w ddb.onpanic=2 will print the backtrace and reboot, which
> > should make it visible in the back of the dmesg.
> >
> > Also, if a kernel core dump is done, it should be in /var/crash, gunzip
> > and crash -M netbsd.12 -N netbsd.12.core
> > crash> bt
> >
> > should print a backtrace.

It is the option DIAGNOSTIC that crashes the kernel.

Since the console is frozen and I have no core dump in /var/crash, I set up DDB so that it prints a backtrace and reboots. It goes by too quickly for me to get a full view of the backtrace, but it chokes while wandering in usb_event.

In case it has something to do with it (the only USB-attached devices are the keyboard and the mouse), PCKBD_CNATTACH_MAY_FAIL is not set.

Best,
-- 
Thierry Laronde
http://www.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: kernel 9.0 crash on amd64
Hello, On Sun, Apr 19, 2020 at 05:10:37PM +, m...@netbsd.org wrote: > On Sun, Apr 19, 2020 at 05:29:40PM +0200, tlaro...@polynum.com wrote: > > Hello, > > > > Mainly in order to be able to test wine, I'm compiling a NetBSD kernel > > from netbsd-9-0-RELEASE sources on an amd64 (Intel bicore). > > > > My config has very minimal changes from NetBSD 8.* config, the only > > important > > modification being USER_LDT (and I'm not putting option SVS). > > > > When it crashes, keyboard is unavailable and the information repeated is: > > > > prevented execution of 0x18 (SMEP) > > fatal page fault in supervisor mode > > trap type 6 code 0x10 rip 0x18 cs 0x8 rflags 0x10246 cr2 0x18 ilevel 0x8 > > rsp 0xcc00ae0cbb40 > > > > The last bit registered and shown by dmesg (when rebooting with the 8.0 > > kernel) is about enumerating ahcisata0. > > > > Does this ring some bell to somebody? > > > > TIA, > > SMEP is a hardware method to stop execution user-memory. > It tried to execute a function pointer that isn't initialized, most > likely. > > It would be interesting to see what the backtrace is > sysctl -w ddb.onpanic=2 will print the backtrace and reboot, which > should make it visible in the back of the dmesg. > > Also, if a kernel core dump is done, it should be in /var/crash, gunzip > and crash -M netbsd.12 -N netbsd.12.core > crash> bt > > should print a backtrace. Since there was no dump in /var/crash and the messages were frozen with keyboard not responding, I commented out all the pckbd* isa stuff, keeping only the ws* stuff related to USB keyboard and mouse. And this time, the kernel boots... I will have (later this week) to try to re-establish some options to pin-point what the offending bit is (but the pckbd* is the more likely culprit; I had also a COMPAT_BSDPTY left but I doubt it could have any effect if the related stuff doesn't exist anymore in the 9.x branch...). 

FWIW, here is the diff between my 8.0 config and the 9.0 _booting_ one:

Index: node.mdec
===
RCS file: /data/cvs/priv/2/4/cauchy/node.mdec,v
retrieving revision 1.11
diff -u -r1.11 node.mdec
--- node.mdec 17 Aug 2019 12:49:01 - 1.11
+++ node.mdec 19 Apr 2020 17:47:50 -
@@ -55,6 +55,10 @@
 options INSECURE # disable kernel security levels - X needs this
+##9
+options AUDIO_BLK_MS=4 # make software with low latency needs performant
+ # no substantial CPU overhead on this platform
+
 options RTC_OFFSET=0 # hardware clock is this many mins. west of GMT
 options NTP # NTP phase/frequency locked loop
@@ -73,6 +77,11 @@
 #options PIPE_SOCKETPAIR # smaller, but slower pipe(2)
 options SYSCTL_INCLUDE_DESCR # Include sysctl descriptions in kernel
+# CPU-related options.
+options USER_LDT # user-settable LDT; used by WINE
+##9
+#no options SVS
+
 # CPU features
 acpicpu* at cpu? # ACPI CPU (including frequency scaling)
 coretemp* at cpu? # Intel on-die thermal sensor
@@ -94,12 +103,13 @@
 # makeoptions COPTS="-O2 -fno-omit-frame-pointer"
 options DDB # in-kernel debugger
-options DIAGNOSTIC # inexpensive kernel consistency checks
-#options DDB_ONPANIC=1 # see also sysctl(8): `ddb.onpanic'
+options DDB_COMMANDONENTER="bt" # execute command when ddb is entered
+#options DIAGNOSTIC # inexpensive kernel consistency checks
+options DDB_ONPANIC=0 # see also sysctl(7): `ddb.onpanic'
 options DDB_HISTORY_SIZE=512 # enable history editing in DDB
 #options KGDB # remote debugger
 #options KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x3f8,KGDB_DEVRATE=9600
-#makeoptions DEBUG="-g" # compile full symbol table
+makeoptions DEBUG="-g" # compile full symbol table
 #options SYSCALL_STATS # per syscall counts
 #options SYSCALL_TIMES # per syscall times
 #options SYSCALL_TIMES_HASCOUNTER # use 'broken' rdtsc (soekris)
@@ -108,6 +118,7 @@
 options COMPAT_50 # NetBSD 5.0 compatibility,
 options COMPAT_60 # NetBSD 6.0 compatibility.
 options COMPAT_70 # NetBSD 7.0 binary compatibility.
+options COMPAT_80 # de dicto
 options COMPAT_OSSAUDIO
 options COMPAT_NETBSD32
@@ -115,7 +126,7 @@
 options COMPAT_LINUX32 # req. COMPAT_LINUX and COMPAT_NETBSD32
 options EXEC_ELF32
 # this one needed by xterm(1) which uses for now BSD ptys.
-options COMPAT_BSDPTY # /dev/[pt]ty?? ptys.
+#options COMPAT_BSDPTY # /dev/[pt]ty?? ptys.
 # Wedge support
 options DKWEDGE_AUTODISCOVER # Automatically add dk(4) instances
@@ -252,6 +263,9 @@
 acpiout* at acpivga? # ACPI Display Output Device
 acpiwdrt* at acpi? # ACPI Watchdog Resource Table
kernel 9.0 crash on amd64
Hello,

Mainly in order to be able to test wine, I'm compiling a NetBSD kernel from netbsd-9-0-RELEASE sources on an amd64 (Intel dual-core).

My config has very minimal changes from my NetBSD 8.* config, the only important modification being USER_LDT (and I'm not including option SVS).

When it crashes, the keyboard is unavailable and the information repeated is:

prevented execution of 0x18 (SMEP)
fatal page fault in supervisor mode
trap type 6 code 0x10 rip 0x18 cs 0x8 rflags 0x10246 cr2 0x18 ilevel 0x8
rsp 0xcc00ae0cbb40

The last thing logged and shown by dmesg (when rebooting with the 8.0 kernel) is about enumerating ahcisata0.

Does this ring a bell for anybody?

TIA,
-- 
Thierry Laronde
http://www.kergis.com/
http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: NULL pointer arithmetic issues
On Mon, Feb 24, 2020 at 05:35:22PM -0500, Mouse wrote: > > Unless I remember wrong, older C standards explicitly say that the > > integer 0 can be converted to a pointer, and that will be the NULL > > pointer, and a NULL pointer cast as an integer shall give the value > > 0. > > The only one I have anything close to a copy of is C99, for which I > have a very late draft. > > Based on that: > > You are not quite correct. Any integer may be converted to a pointer, > and any pointer may be converted to an integer - but the mapping is > entirely implementation-dependent, except in the integer->pointer > direction when the integer is a "null pointer constant", defined as > "[a]n integer constant expression with the value 0" (or such an > expression cast to void *, though not if we're talking specifically > about integers), in which case "the resulting pointer, called a null > pointer, is guaranteed to compare unequal to a pointer to any object or > function". You could have meant that, but what you wrote could also be > taken as applying to the _run-time_ integer value 0, which C99's > promise does not apply to. (Quotes are from 6.3.2.3.) > > I don't think there is any promise that converting a null pointer of > any type back to an integer will necessarily produce a zero integer. > The wording was the same for C89 and there is this paragraph in K&R (second edition, p. 102): "Pointers and integers are not interchangeable. Zero is the sole exception: the constant zero may be assigned to a pointer, and a pointer may be compared with the constant zero. The symbolic constant NULL is often used in place of zero, as a mnemonic to indicate more clearly that this is a special value for a pointer. 
[...]" I interpret this (the paragraph above and the standard) as follows: when a pointer is compared to the constant zero, the constant zero is converted to a null pointer, so the comparison is pointer to pointer and not integer value (the integer value of the pointer) to integer value (0). So defining NULL as a cast of 0 is (was?) in the C standard, while the actual value of the expression, i.e. the representation of an invalid (null) pointer, is implementation-defined. FWIW, -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
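To make the distinction in this thread concrete, here is a minimal C sketch. It is only an illustration: the function names (is_null, null_is_all_bits_zero, pointer_from_runtime_int) are hypothetical, and it assumes a C99 compiler that provides uintptr_t (which the standard makes optional). It separates three things: comparing a pointer with the constant 0, looking at how a null pointer is actually stored, and converting a run-time integer to a pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Returns 1 when p is a null pointer. The literal 0 below is a
 * "null pointer constant": it is converted to a null pointer before
 * the comparison, so two pointers are compared, not two integers.
 */
int is_null(const void *p)
{
	return p == 0;
}

/*
 * Returns 1 when a null pointer happens to be stored as all-bits-zero
 * on this implementation. The C standard does not require that, so
 * the result is a property of the platform, not of the language.
 */
int null_is_all_bits_zero(void)
{
	void *p = 0;
	unsigned char bytes[sizeof p];
	size_t i;

	memcpy(bytes, &p, sizeof p);
	for (i = 0; i < sizeof bytes; i++)
		if (bytes[i] != 0)
			return 0;
	return 1;
}

/*
 * Converts a run-time integer to a pointer. The mapping is
 * implementation-defined, even when value happens to be 0: only an
 * integer *constant expression* with value 0 is guaranteed to give
 * a null pointer.
 */
void *pointer_from_runtime_int(uintptr_t value)
{
	return (void *)value;
}
```

Calling is_null(NULL) or is_null(0) returns 1 everywhere; whether is_null(pointer_from_runtime_int(n)) returns 1 for a variable n holding zero is left to the implementation, even though it does on the common platforms.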
Re: Fonts for console/fb for various locales: a proposal
On Mon, Sep 30, 2019 at 02:23:02PM +0200, Piotr Meyer wrote: > On Mon, Sep 30, 2019 at 11:01:51AM +0200, tlaro...@polynum.com wrote: > > On Mon, Sep 30, 2019 at 10:32:40AM +0200, Martin Husemann wrote: > > > I guess noone would object a metafont2wsfont converter tool. > > > Look at the true type tool Michael mentioned in xsrc/local and do > > > something > > > similar for metafont. > > > > I have already planed to re-start with the Hershey fonts, for reasons > > explained in my initial mail and for others and this will be combined > > with TeX (kerTeX). So there will probably be something in this > > line, at the end, even if it is only for my own use. > > Sorry for late comment but I would like to suggest mlterm-fb as > - probably - easiest solution for Your case (if I understood problem > correctly, of course). mlterm running in framebuffer console is > capable to use wide range of standard X fonts[1] without hassle. > > If You want to convert fonts to wsfont You may take a look at some > additional resources. In addition to already mentioned there is also > my small tool[2], created for my own work for bitmap terminus fonts > (see [3] for gallery) - it isn't useful for converting vectors, but > may provide a hints about your own methods of mapping from UTF codes > or Adobe names to particular code pages (wide range of definitions is > provided by original terminus package, for my case I made only one, > for cp437). > > 1 - https://www.mail-archive.com/netbsd-users@netbsd.org/msg10136.html > 2 - https://github.com/aniou/bdf2wsfont > 3 - http://smutek.pl/netbsd/wsfont/terminus/ > 4 - http://terminus-font.sourceforge.net/ > Thank you for the links. 
When I tackle the task, I will also provide short explanations of what the different pieces achieve and comparisons between solutions (for example, I guess, since this has been totally lost in the huge haystack of TeXLive, that very few people know about virtual fonts, and even fewer know how they work; few people can make a link between METAFONT, freetype or Cairo; few people know how DVI compares to PDF, or that one can compare METAFONT/TeX/DVI with PS, doing in three what is done in one with the full-fledged PS programming language---leading later to dropping a part of PS to keep only PDF in a wide range of cases; etc.). -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Fonts for console/fb for various locales: a proposal
On Mon, Sep 30, 2019 at 10:32:40AM +0200, Martin Husemann wrote: > I guess no one would object to a metafont2wsfont converter tool. > Look at the true type tool Michael mentioned in xsrc/local and do something > similar for metafont. I have already planned to restart with the Hershey fonts, for reasons explained in my initial mail and for others, and this will be combined with TeX (kerTeX). So there will probably be something along these lines in the end, even if it is only for my own use. The next visible step will be on the users mailing list, to hopefully find Japanese-speaking users able to "sort" the oriental glyphs once I have produced the whole rendering of the Hershey fonts. -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Proposal, again: Disable autoload of compat_xyz modules
On Fri, Sep 27, 2019 at 08:30:40AM +0200, Martin Husemann wrote: > On Thu, Sep 26, 2019 at 09:40:22PM +0200, tlaro...@polynum.com wrote: > > If the vulnerabilities can only be exploited by running Linux binaries, > > IMHO, the point is moot: the ones that don't run Linux binaries are not > > affected; the ones that do need to run some Linux binaries will have to > > add the feature so this adds a user's intervention for the very same > > result at the end. > > I guess the main fear is that the attacker can put a malicious (and likely > explicitly crafted for a certain bug in NetBSD's linux compat) binary on > your machine and execute it. If you have no untrusted local users > and no admin installed linux binaries, the risk should be quite small. Well, I don't think "trusted local users" exist anymore, because they bring with them (or is it the reverse? the device brings them) iPhones or whatever, connect them, and download applications... Slightly related: does NetBSD provide build services so that someone who does not want to open his sources could at least build his program for NetBSD without installing it? Because the best way to avoid the compatibility layer is to have native NetBSD binaries. -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Proposal, again: Disable autoload of compat_xyz modules
On Thu, Sep 26, 2019 at 10:17:51AM +0200, Maxime Villard wrote: > I recently made a big set of changes to fix many bugs and vulnerabilities in > compat_linux and compat_linux32, the majority of which have a security impact > bigger than the Intel CPU bugs we hear about so much. These compat layers are > enabled by default, so everybody is affected. > I'm just a user, so I only have a question about the scope of the problem: do the bugs and vulnerabilities in compat_linux*, due to the added compat glue, open code paths that can be exploited by a non-Linux program, or are the vulnerabilities a problem only if a Linux binary is run---and perhaps other (SCO) binaries? Because, as I see it, if this opens security problems even for those who do _not_ use Linux (or other alien) binaries, then not including the compat_linux* modules by default (you are not speaking about the NetBSD ABI compatibility, right?) seems reasonable, as long as the features can still easily be added (even by a post-install fix for pkgsrc programs) by loading a module, for those who have to run alien programs. If the vulnerabilities can only be exploited by running Linux binaries, IMHO, the point is moot: the ones that don't run Linux binaries are not affected; the ones that do need to run some Linux binaries will have to add the feature, so this adds a user's intervention for the very same result in the end. Furthermore, if the compatibility code is adding/opening problems that do not exist in the emulated Linux system, it means that the security problem can only be exploited by someone creating a special version of a Linux program in the hope that it will be run under compat on NetBSD... Security is a matter of probabilities for me, and it seems that bugs (crashes), that is, unintentional "infelicities", are more probable than malice in this case... Once more, I'm just a user. 
So the only thing I'm looking for is a clarification of the scope of the problem---I will obviously cope with whatever decision is reached since I'm definitely not prepared to fork :-^ Best, -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: bounty for fsck_udf(8) for shared disks
Hello Reinoud, On Thu, Jun 20, 2019 at 03:12:12PM +0200, Reinoud Zandijk wrote: > Hi Thierry, > > On Fri, Jun 14, 2019 at 12:19:11PM +0200, tlaro...@polynum.com wrote: > > So I'd like to see the good work made by Reinoud Zandijk put a step > > further with a robust fsck_udf(8) for using indeed UDF with non optical > > disks. > > I've started fsck_udf by first refactoring newfs_udf, makefs -t udf to use a > common core that i would like to use with fsck_udf for its patch-up work. > > I presume you format the discs for UDF v2.01? UDF v2.50 doesn't have the > support for resizing the metadata partition yet, so better use UDF v2.01 for > now, the default. > > I'll try to tackle fsck_udf for UDF v2.01 on discs first then, next to > recordable media. > Thanks for the very good work you have already done! (It is the most advanced support amongst the BSDs, if I'm not mistaken.) Yes, I use v2.01 by default (Windows uses this too); it works quite well, and IMHO this is the best "portable" format and should be encouraged. Thanks for tackling this! And if the NetBSD Foundation wants to solicit an extra donation from me to allow you to devote some time to it, I'm OK with that. Best regards, -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
bounty for fsck_udf(8) for shared disks
Hello, Context: I have a NetBSD fileserver serving files mainly to various MS/Windows nodes and some NetBSD ones. The fileserver also makes various backups; to plan for disaster, one of them is made on removable USB disks that have to be directly readable by Windows nodes, so that work could continue even if the fileserver were totally inaccessible for some time. I have been using UDF for removable USB disks shared with MS/Windows and, if the GPT partitioning is done according to MS/Windows expectations (or simply done initially under MS/Windows), it works more satisfactorily than trying to use ntfs-3g (which is not even available for NetBSD 8.x since the modification of fuse or refuse or whatever it depends upon), the latter being particularly slow. And since UDF is an open specification, it should be preferred. The principal gap is the lack of fsck_udf(8). I had a problem (due more to USB, I think, than to UDF) and I had to recover the disk with MS/Windows chkdsk on the command line; NetBSD was unable to recover it. So I'd like to see the good work done by Reinoud Zandijk taken a step further with a robust fsck_udf(8), so that UDF can really be used with non-optical disks. I'm willing to donate some money to support the effort. Best, -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Reboot resistant USB bug
Hello, On Sat, Oct 13, 2018 at 08:31:43AM +0100, Iain Hibbert wrote: > On Thu, 11 Oct 2018, Emmanuel Dreyfus wrote: > > > Hello > > > > On both netbsd-8 and -current, I have a problem with USB devices that > > get stuck in a non-functioning state even after a reboot. > > > > This happens after interrupting transfer with different NFC readers > > from different vendors, and the only way to recover the device is > > to power-cycle it. I wonder if there could be a missing step in the > > way we initialize USB devices that could explain that situation. > > This is a 'state' issue which does not change unless the device is power > cycled, which we do not generally do as part of the init AFAIK. I noticed > this with Bluetooth adaptors many years ago and we issue a reset because > of that but it doesn't affect the USB part of the device and adaptors > sometimes do fail to restart on reboot. > > What do other OSs do in this way? It seems difficult to guess the state > and we just assume that it is in post-cold boot when we attach which may > not always be optimal. > > iain FWIW, my main workstation is multi-booted. I mainly use NetBSD but occasionally have to go to MS Windows (I also use Plan9). When rebooting from Windows (so no power off), there are numerous issues with USB-attached devices, obviously because of a persistent state established by Windows that is not cleared and that confuses NetBSD. The reverse is not true (rebooting from NetBSD to Windows), either because NetBSD "cleans" things even when rebooting or because Windows always re-establishes a known (to it) state. When exiting Windows, I have to power down in order for NetBSD to restart correctly with USB devices. -- Thierry Laronde http://www.kergis.com/ http://www.sbfa.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
USB, NetBSD 7/amd64: crashes
On Thu, Jul 02, 2015 at 10:18:23AM +0100, Nick Hudson wrote: On 07/02/15 10:07, tlaro...@polynum.com wrote: Hello, On a NetBSD 6.1.5/amd64, when I connect a second USB connected disk to the machine, NetBSD freezes. Unable to connect remotely; hard reboot required. Can you try netbsd-7 or better still -current? I have tried a netbsd-7 kernel and it crashes as well and the problem is still with locking. Here is the bt: mutex_oncpu.part.0() at netbsd:mutex_oncpu.part.0+0x8 mutex_vector_enter() at netbsd:mutex_vector_enter+0x93 sdopen() at netbsd:sdopen+0x87 cdev_open() at netbsd:cdev_open+0xb2 spec_open() at netbsd:spec_open+0x250 VOP_OPEN() at netbsd:VOP_OPEN+0x33 vn_open() at netbsd:vn_open+0x1ea do_open() at netbsd:do_open+0x112 do_sys_openat() at netbsd:do_sys_openat+0x68 sys_open() at netbsd:sys_open+0x24 syscall() at netbsd:syscall+0x9c ---syscall (number 5)--- There is no problem if the two disks are connected when booting (How can concurrency be achieved when the numbering of devices depends on the number of devices connected? How can two concurrent devices be named when they have the same right to claim the very same name---sd0 for example? If the obviously sequential, unproblematic enumeration when both are connected at boot does not lead to problems, how can a dynamic concurrent attachment be managed if one needs to remember how many are already connected, since the number depends on that, while the already connected ones may be concurrently detached---not the case here? Would it not be simpler to assign a fixed name to a USB port? No pun intended: I'm just trying to understand how it works). Disaster occurs when one disk is added while another one is already connected. 
FWIW, when rebooting after the crash, the two disks being then connected, the second one (the added one) is detected as sd0 while the first one is then sd1 (in case the variable enumeration has something to do with the resulting havoc). 
For reference, on 6.1.5 this was the same: ---8--- umass0: at uhub3 port 1 (addr 3) disconnected umass0 at uhub3 port 1 configuration 1 interface 0 umass0: Western Digital Elements 10A2, rev 2.10/10.42, addr 3 umass0: using SCSI over Bulk-Only scsibus0 at umass0: 2 targets, 1 lun per target sd0 at scsibus0 target 0 lun 0: WD, Elements 10A2, 1042 disk fixed sd0: fabricating a geometry sd0: 931 GB, 953837 cyl, 64 head, 32 sec, 512 bytes/sect x 1953458176 sectors sd0: fabricating a geometry sd0: GPT GUID: 960d762c-1cf3-11e5-b5f3-448a5b9b9f0f dk0 at sd0: Basic data partition dk0: 1953454080 blocks at 2048, type: umass1 at uhub2 port 6 configuration 1 interface 0 umass1: Western Digital Elements 10A8, rev 2.10/10.42, addr 3 umass1: using SCSI over Bulk-Only scsibus1 at umass1: 2 targets, 1 lun per target sd1 at scsibus1 target 0 lun 0: WD, Elements 10A8, 1042 disk fixed sd1(umass1:0:0:0): Check Condition on CDB: 0x00 00 00 00 00 00 SENSE KEY: Not Ready ASC/ASCQ: Logical Unit Is in Process Of Becoming Ready sd1: drive offline sd1: fabricating a geometry sd1: GPT GUID: f3d6ceb3-2183-11e5-8a35-448a5b9b9f0f sd1: detached uvm_fault(0x80771320, 0x0, 1) - e fatal page fault in supervisor mode trap type 6 code 0 rip 80238c1f cs 8 rflags 10287 cr2 8 cpl 0 rsp fe8976b0 panic: trap cpu1: Begin traceback... printf_nolog() at netbsd:printf_nolog startlwp() at netbsd:startlwp alltraps() at netbsd:alltraps+0x96 dkwedge_add() at netbsd:dkwedge_add+0x1d1 dkwedge_discover_gpt() at netbsd:dkwedge_discover_gpt+0x492 dkwedge_discover() at netbsd:dkwedge_discover+0x128 sdattach() at netbsd:sdattach+0x1cb config_attach_loc() at netbsd:config_attach_loc+0x1bb scsi_probe_bus() at netbsd:scsi_probe_bus+0x537 scsibus_config() at netbsd:scsibus_config+0x74 scsipi_completion_thread() at netbsd:scsipi_completion_thread+0x23 cpu1: End traceback... 
---8--- Dropping in ddb on panic, more precisely there is: Stopped in pid 1.57 (system) at netbsd:mutex_vector_enter+0x80: movq 18(%r15),%rax This has nothing to do with MBR or GPT since I have tested with both. It is systematic whenever one disk is first connected and then a second is added. Once rebooted, the two disks being connected, they are both correctly accessible. Note: FWIW, the first (and sole) disk is sd0. When rebooting, the device nodes are reversed, the second one being sd0 and the first one being sd1. Question: is there some way to name partitions independently of the random hardware enumeration (via wedge names? But this would imply storing the name persistently, so I guess in the GPT? Is there such a thing?) -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: USB, NetBSD 7/amd64: crashes
On Tue, Jul 07, 2015 at 01:44:52PM +0200, Edgar Fuss wrote: It's not clear why sd1 is detaching so early. Because there's insufficient current to power two drives at once? But in my case, the first disk is idle, not even mounted, when the second one is connected. Furthermore, the two disks are connected to two distinct root usb* and, when testing, not even X was running, only the bare minimum (and the problem happens on two distinct machines). And when the disks are both connected when booting, there is no crash. So it seems to me that this cannot be a power problem. -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: USB, NetBSD 7/amd64: crashes
On Tue, Jul 07, 2015 at 10:32:12AM +0200, Manuel Bouyer wrote: On Tue, Jul 07, 2015 at 08:40:11AM +0200, tlaro...@polynum.com wrote: I have tried a netbsd-7 kernel and it crashes as well and the problem is still with locking. I'm not sure it is a locking problem. In the dmesg you provide there is sd1: detached; so it looks like the device is gone while still trying to access it. It could be either. Could the kernel release the faulting device before faulting? Furthermore, this is a dual-core machine: if the fault is on cpu1, are messages from cpu0 and cpu1 guaranteed to be ordered in dmesg? I.e., can one be sure that if "sd1: detached" appears before uvm_fault, the two events happened in that order in time? And a cause/consequence link is not guaranteed either with two cores? Unfortunately, I don't have a single-CPU node on which I could test whether it happens in the non-concurrent case. This could give an additional indication of the level at which the problem lies. (Can one instruct NetBSD to use only one CPU without an ad-hoc kernel?) -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
[Was USB; is] dkwedge_add(), 6.1.5/amd64: freezes when 2 umass connected
On Thu, Jul 02, 2015 at 11:07:19AM +0200, tlaronde wrote: On an NetBSD 6.1.5/amd64, when I connect a second USB connected disk to the machine, NetBSD freezes. Unable to connect remotely; hard reboot required. Indeed, the system doesn't freeze but crashes (some long time without response is caused by backtracing but this can only be seen on the console). When a first USB disk is connected (umass0), adding another USB disk crashes everything. Here are the excerpts from dmesg for the crash: ---8--- umass0: at uhub3 port 1 (addr 3) disconnected umass0 at uhub3 port 1 configuration 1 interface 0 umass0: Western Digital Elements 10A2, rev 2.10/10.42, addr 3 umass0: using SCSI over Bulk-Only scsibus0 at umass0: 2 targets, 1 lun per target sd0 at scsibus0 target 0 lun 0: WD, Elements 10A2, 1042 disk fixed sd0: fabricating a geometry sd0: 931 GB, 953837 cyl, 64 head, 32 sec, 512 bytes/sect x 1953458176 sectors sd0: fabricating a geometry sd0: GPT GUID: 960d762c-1cf3-11e5-b5f3-448a5b9b9f0f dk0 at sd0: Basic data partition dk0: 1953454080 blocks at 2048, type: umass1 at uhub2 port 6 configuration 1 interface 0 umass1: Western Digital Elements 10A8, rev 2.10/10.42, addr 3 umass1: using SCSI over Bulk-Only scsibus1 at umass1: 2 targets, 1 lun per target sd1 at scsibus1 target 0 lun 0: WD, Elements 10A8, 1042 disk fixed sd1(umass1:0:0:0): Check Condition on CDB: 0x00 00 00 00 00 00 SENSE KEY: Not Ready ASC/ASCQ: Logical Unit Is in Process Of Becoming Ready sd1: drive offline sd1: fabricating a geometry sd1: GPT GUID: f3d6ceb3-2183-11e5-8a35-448a5b9b9f0f sd1: detached uvm_fault(0x80771320, 0x0, 1) - e fatal page fault in supervisor mode trap type 6 code 0 rip 80238c1f cs 8 rflags 10287 cr2 8 cpl 0 rsp fe8976b0 panic: trap cpu1: Begin traceback... 
printf_nolog() at netbsd:printf_nolog startlwp() at netbsd:startlwp alltraps() at netbsd:alltraps+0x96 dkwedge_add() at netbsd:dkwedge_add+0x1d1 dkwedge_discover_gpt() at netbsd:dkwedge_discover_gpt+0x492 dkwedge_discover() at netbsd:dkwedge_discover+0x128 sdattach() at netbsd:sdattach+0x1cb config_attach_loc() at netbsd:config_attach_loc+0x1bb scsi_probe_bus() at netbsd:scsi_probe_bus+0x537 scsibus_config() at netbsd:scsibus_config+0x74 scsipi_completion_thread() at netbsd:scsipi_completion_thread+0x23 cpu1: End traceback... ---8--- Dropping in ddb on panic, more precisely there is: Stopped in pid 1.57 (system) at netbsd:mutex_vector_enter+0x80: movq 18(%r15),%rax This has nothing to do with MBR or GPT since I have tested with both. It is systematic whenever one disk is first connected and then a second is added. Once rebooted, the two disks being connected, they are both correctly accessible. Note: FWIW, the first (and sole) disk is sd0. When rebooting, the device nodes are reversed, the second one being sd0 and the first one being sd1. Question: is there some way to name partitions independently of the random hardware enumeration (via wedge names? But this would imply storing the name persistently, so I guess in the GPT? Is there such a thing?) -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
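As a side check on the dmesg excerpt above: the "fabricated" geometry reported for sd0 is consistent with its sector count, and the dk0 wedge lies inside the disk. The small C sketch below redoes that arithmetic; the helper names (chs_sectors, disk_bytes, bytes_to_gib, wedge_fits) are hypothetical, made up for this illustration, and are not kernel functions.

```c
#include <stdint.h>

/* Total sectors implied by a cylinders/heads/sectors geometry. */
uint64_t chs_sectors(uint64_t cyl, uint64_t heads, uint64_t spt)
{
	return cyl * heads * spt;
}

/* Capacity in bytes for a given sector count and sector size. */
uint64_t disk_bytes(uint64_t sectors, uint64_t bytes_per_sector)
{
	return sectors * bytes_per_sector;
}

/* Whole binary gigabytes (2^30 bytes), rounded down. */
uint64_t bytes_to_gib(uint64_t bytes)
{
	return bytes >> 30;
}

/* Returns 1 when a wedge of nblocks starting at start fits in the disk. */
int wedge_fits(uint64_t start, uint64_t nblocks, uint64_t disk_sectors)
{
	return start + nblocks <= disk_sectors;
}
```

With the values from the excerpt, chs_sectors(953837, 64, 32) gives 1953458176, the sector count printed for sd0; disk_bytes(1953458176, 512) gives 1000170586112 bytes, which bytes_to_gib() rounds down to 931, matching the "931 GB" in the log; and wedge_fits(2048, 1953454080, 1953458176) holds for dk0. So nothing in these numbers points at a bad size or an out-of-range wedge.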
Re: USB, NetBSD 6.1.5/amd64: freezes when 2 umass connected
On Thu, Jul 02, 2015 at 10:18:23AM +0100, Nick Hudson wrote: On 07/02/15 10:07, tlaro...@polynum.com wrote: Hello, On a NetBSD 6.1.5/amd64, when I connect a second USB connected disk to the machine, NetBSD freezes. Unable to connect remotely; hard reboot required. Questions: 1) The machine has two usb ports, with uhub0 and uhub1 first attached resp. to these ones; the uhub2 cascading from uhub0 and uhub3 from uhub1. uhub2 has 6 ports removable; uhub3 has 8 ports removable; Since in /dev/ there are only 8 devices (from usb0 to usb7) could this be the problem? (6 + 8 = 14, even if I have only one USB device---first disk---and the second disk is only the second device; but how are the device nodes assigned to one USB port?) 2) The two USB disks are from the same vendor (Western Digital) but not exactly the same model (not the same capacity). Could the USB driver be confused by two similar devices connected to the same(?) USB tree? 3) Physically, on the machine, there are USB ports on the rear, and USB ports on the front. Does somebody know if front ports could be duplicating rear ports, that is slots on the front be in fact connected to the same ports as the rear ones causing conflict? I'm trying to find what is causing this misbehavior. And a freeze is rather annoying for a node that is mainly supposed to be administered remotely... TIA, Can you try netbsd-7 or better still -current? This will be difficult on this node since, during the hours when I have access to it, it is serving files (Samba). I will try to get the offending USB disk and run tests on my personal machine, which runs 6.1.5 too (on amd64), and if the same behavior happens, I will first try to get a clue about what is going on, and then try netbsd-7 or -current. But has anything been changed concerning USB and umass in post-6.1.x kernels that could give a clue about what the problem is/was? 
-- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
USB, NetBSD 6.1.5/amd64: freezes when 2 umass connected
Hello, On a NetBSD 6.1.5/amd64, when I connect a second USB disk to the machine, NetBSD freezes. Unable to connect remotely; hard reboot required. Questions: 1) The machine has two usb ports, with uhub0 and uhub1 first attached resp. to these; uhub2 cascades from uhub0 and uhub3 from uhub1. uhub2 has 6 removable ports; uhub3 has 8 removable ports. Since in /dev/ there are only 8 devices (from usb0 to usb7), could this be the problem? (6 + 8 = 14, even if I have only one USB device---the first disk---and the second disk is only the second device; but how are the device nodes assigned to one USB port?) 2) The two USB disks are from the same vendor (Western Digital) but not exactly the same model (not the same capacity). Could the USB driver be confused by two similar devices connected to the same(?) USB tree? 3) Physically, on the machine, there are USB ports on the rear and USB ports on the front. Does somebody know if front ports could be duplicating rear ports, that is, whether slots on the front could in fact be connected to the same ports as the rear ones, causing a conflict? I'm trying to find what is causing this misbehavior. And a freeze is rather annoying for a node that is mainly supposed to be administered remotely... TIA, -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: USB, NetBSD 6.1.5/amd64: freezes when 2 umass connected
On Thu, Jul 02, 2015 at 05:22:27PM +0800, Paul Goyette wrote: On Thu, 2 Jul 2015, tlaro...@polynum.com wrote: Hello, On a NetBSD 6.1.5/amd64, when I connect a second USB connected disk to the machine, NetBSD freezes. Unable to connect remotely; hard reboot required. Questions: 1) The machine has two usb ports, with uhub0 and uhub1 first attached resp. to these ones; the uhub2 cascading from uhub0 and uhub3 from uhub1. uhub2 has 6 ports removable; uhub3 has 8 ports removable; Since in /dev/ there are only 8 devices (from usb0 to usb7) could this be the problem? (6 + 8 = 14, even if I have only one USB device---first disk---and the second disk is only the second device; but how are the device nodes assigned to one USB port?) 2) The two USB disks are from the same vendor (Western Digital) but not exactly the same model (not the same capacity). Could the USB driver be confused by two similar devices connected to the same(?) USB tree? 3) Physically, on the machine, there are USB ports on the rear, and USB ports on the front. Does somebody know if front ports could be duplicating rear ports, that is slots on the front be in fact connected to the same ports as the rear ones causing conflict? Unlikely. All of the motherboards I've played with have the rear ports hard-wired internally, while the front-panel ports are connected via a riser cable to sockets on the motherboard. I'm trying to find what is causing this misbehavior. And a freeze is rather annoying for a node that is mainly supposed to be administered remotely... I've had problems in the past with only a single umass hard-drive being connected. I use the external WesternDigital hard drive for backups, and as long as only a single process is writing heavily to the drive, all is well. But if I try to have two different backups running from two different filesystems (whether or not on the same wdn physical drive), the external umass/scsi drive hangs the entire system and needs a hard-boot. 
I have a gut feeling (without any hard evidence, FWIW!) that there's something not quite MP-safe with umass/scsi Well, in my case (the USB disks are used for backup too), the first disk is not even mounted and is not in use when I try to connect the second one. So no write, nor even read, operation is attempted on either disk. And it freezes the whole system (and I have nothing in the messages after rebooting that indicates anything...) -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
NetBSD 6.1.5/amd64 and USB poor performance
Hello, I have NetBSD 6.1.5 on an amd64 with USB 3.0 ports. When writing files to an external USB (3.0) disk, using ntfs-3g, the write performance is abysmal: it is only USB 1.0 speed (12 Mbps or 1.5MB/s). From the manual pages (ehci(4)), NetBSD 6.x supports only USB 2.0 via ehci(4). The ehci controllers also have companion controllers (ohci(4) and uhci(4)) that support USB 1.0. My kernel config has only ehci support. Nonetheless, the write performance is only that of USB 1.0. The disk is attached with umass. Is the problem with umass? With ntfs-3g? (But why should the performance of a filesystem driver depend on the way the device is connected?) A problem with librefuse? Any clue would be welcome. TIA, -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
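For reference, the "12 Mbps or 1.5MB/s" figure above is just the nominal signalling rate divided by 8. The C sketch below does that conversion for the nominal rates of the USB generations (12 Mbit/s full speed, 480 Mbit/s high speed, 5000 Mbit/s SuperSpeed). The function and macro names are made up for this illustration, and the results are upper bounds that ignore protocol and encoding overhead, so real transfers are always slower.

```c
/* Nominal USB signalling rates, in Mbit/s. */
#define USB_FULL_SPEED_MBIT   12.0    /* USB 1.x */
#define USB_HIGH_SPEED_MBIT   480.0   /* USB 2.0 */
#define USB_SUPER_SPEED_MBIT  5000.0  /* USB 3.0 */

/*
 * Converts a rate in Mbit/s to MB/s (8 bits per byte). This is a raw
 * ceiling: no allowance is made for protocol or encoding overhead.
 */
double mbit_to_mbyte(double mbit_per_s)
{
	return mbit_per_s / 8.0;
}

/*
 * Returns 1 when a measured throughput (in MB/s) is within the raw
 * ceiling of a bus running at the given nominal rate (in Mbit/s).
 */
int fits_bus(double measured_mbyte_per_s, double bus_mbit_per_s)
{
	return measured_mbyte_per_s <= mbit_to_mbyte(bus_mbit_per_s);
}
```

mbit_to_mbyte(USB_FULL_SPEED_MBIT) is 1.5 and mbit_to_mbyte(USB_HIGH_SPEED_MBIT) is 60, so a disk that never exceeds about 1.5 MB/s behaves as if it were limited to full speed, whatever the actual cause is.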
Re: Groff
On Thu, Jun 04, 2015 at 08:05:16AM +0300, Aleksej Saushev wrote: tlaro...@polynum.com writes: [plea for TeX---not TeXLive] There's a lot better approach that beats all the above on all accounts. Import libxml2, libxslt, w3m that are all readily available, convert man pages to a human-readable and human-writable format, which is XML, and stop using archaic formats. This has a number of significant benefits over TeX or roff: 1. XML is well-known, the syntax doesn't require anything special to learn. 2. There's an abundance of software to process it. 3. XML can be used immediately, without preprocessing step (just point web browser at it, and it will load stylesheet and perform XSL transformation for you). 4. Desktop users will have really good rendering as provided by Firefox or Webkit. That bloatware rather than software (Firefox et al.) succeeds, more or less, in providing a rendering does not appeal to me at all. That this bloated format has to be processed by tools that depend on gigabytes of software, needing a C++ compiler et al. to ---try to--- be compiled, is definitely not what I call system typesetting. Needing gigs of memory to try to run firefox or chrome or whatever does not appeal to me either; not to mention that the last time I tried to compile chromium it retrieved half of the Google cache as dependencies, took hours of compilation (on a rather decent computer) and finally failed to _link_ the objects because 4 gigs of memory was not enough! Furthermore, all the text tools provided by the system (and even only the POSIX.2 text tools) can be readily used on a TeX file. Finally, my idea would be the reverse: use the lean TeX engine (and even the METAFONT engine) to format and rasterize for a hypertext page viewer (a browser) to display. A hypertext page viewer, able to render state-of-the-art mathematical typesetting and figures, with a small pure C program under an unencumbered licence. 
But I guess that I'm one of the few who still use Plan9 or NetBSD (or *BSD) because small is beautiful and because, for me, freedom means depending as little as possible on things that are not maintainable (that do not fit in one's---my---hand). And the irony is that I'm convinced that the current undercover Third World War will become an open Third World War, that anything depending on external and worldwide interconnection will simply cease to exist, and that my line of choice is more sustainable than others. I'm out of the trend now; but the trend changes, and changes independently of the ones who follow it... (That's why, for the very same reason as stated above, I do not follow trends; I simply do what I feel is correct. I may be wrong, but my error is caused neither by wanting to be in sync with fashion nor by wanting systematically to be out of fashion: I simply ignore fashion.) I gather that I will not convince you; but you can surely conclude that you will never convince me ;) -- Thierry Laronde tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ http://www.arts-po.fr/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Groff
On Thu, Jun 04, 2015 at 11:44:25AM +0100, Robert Swindells wrote:
> Johnny Billquist b...@softjar.se wrote:
> > What happened to the original roff? I mean, groff is just a gnu replacement for roff. Maybe switch back to the original?
>
> The sources to all of DWB are available from ATT:
>
> http://www2.research.att.com/~astopen/download/
>
> It needs a bit of work to get it to build on NetBSD though.

FWIW, to show that I'm not sectarian: John Hobby derived MetaPost from METAFONT for drawing pictures. For text, it uses TeX but can also use roff. The roff support is still there in kerTeX, so MetaPost can also be used to generate PS figures with roff-formatted text.
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
http://www.kergis.com/
http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Groff (was: Removing ARCNET stuffs)
On Mon, Jun 01, 2015 at 05:50:07PM +, David Holland wrote:
> On Sun, May 31, 2015 at 09:24:48PM -0400, Andrew Cagney wrote:
> > (oh and please delete C++ groff, just replace it with that AWK script)
>
> which awk script? :-)
>
> (quite seriously, I've been looking for a while for an alternative to groff for typesetting the miscellaneous articles in base.

(Delenda Carthago...) Once more, I will re-advertise that the complete Donald E. Knuth typesetting system is available. It can even be restricted to strictly D.E.K.'s own work (even with the fonts, this is a matter of far less than 10 MB). It is pure C89 (some auxiliaries invoke POSIX.2 utilities, mainly sh(1), but these are just auxiliaries). It comes with the fonts; the ability to design fonts; the formatter (TeX); a format, DVI, à la PDF, that can be used to generate a formatted text version; the means to typeset mathematics; and the means to draw figures rasterized with METAFONT (more general figures can be drawn with MetaPost, as a supplement, but this generates PS). And it comes with a compiling framework that is not under GPLn but under a BSD licence.

Since, for a system written in C, the main human language is "CEE", a kind of technical English, the limitation to 8 bits is not an immediate problem (and it could be lifted by dealing with font directories instead of font files, i.e. a directory of sub-fonts of 256 glyphs each).

The conversion from roff to TeX should be easier than the reverse, and I expect it to be relatively simple for 95% of the work (the man pages).
IMHO, the main tasks remaining are (could be GSoC by the way):

- provide a DVI viewer (starting from scratch);
- extend TeX, with minimal changes, to be able to use UTF-8 (meaning, as with UTF-8 itself, that ASCII can be fed as is, but that the input---the mouth---still deals with just 8 bits);
- either develop a C SmallScript able to interpret the limited PostScript generated by MetaPost, or extend DVI and METAFONT to handle MetaPost capabilities and rasterize the figures, so that the system is totally self-sufficient (no PDF viewer or PostScript interpreter needed to render the pages).

It is here: http://www.kergis.com/en/kertex.html

It is not orphaned but stalled for the moment due to ETIME.

Best,
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
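A note on the UTF-8 item in the task list above: the remark that ASCII can be fed as is, while the input still only sees 8-bit units, can be illustrated with a minimal sketch. This is plain Python, not kerTeX code; the function name utf8_units is hypothetical, and nothing here claims to show how TeX's input stage would actually be extended.

```python
def utf8_units(text):
    """Return the 8-bit values a byte-oriented reader sees for UTF-8 input."""
    return list(text.encode("utf-8"))


# ASCII input: one byte per character, same values as the ASCII codes.
ascii_units = utf8_units("TeX")
print(ascii_units)   # [84, 101, 88]

# Non-ASCII input: several bytes per character, each one still 8 bits wide.
accent_units = utf8_units("à")
print(accent_units)  # [195, 160]
```

An engine that reads 8-bit units therefore needs no change for pure ASCIInput files; only the multi-byte sequences require extra handling further down the line.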
USB (ehci) mouse attachment problem
Hello,

This is probably related to PR #44706. In that PR, the offending mouse is a Logitech one, and this is also the case here. But on my NetBSD amd64 6.1.5 system, I also have a Logitech USB keyboard and, at least every other time, if this USB keyboard is connected directly via USB (and not through a USB/PS2 converter), the Logitech _mouse_ is recognized as a keyboard, leading to the loss of the real one.

The kernel is compiled with ehci, uhci and ohci, since some USB ports are supposed to be for low speed devices (keyboard and mouse), so I expected USB 1.0 support to be necessary, which, according to ehci(4), seems to require uhci or ohci.

Concerning both keyboard and mouse (both appearing as a Logitech keyboard, the only difference being the second number of the iclass), here is an excerpt of dmesg:

Intel product 0x8c31 (USB serial bus, interface 0x30, revision 0x05) at pci0 dev 20 function 0 not configured
Intel product 0x8c3a (miscellaneous communications, revision 0x04) at pci0 dev 22 function 0 not configured
ehci0 at pci0 dev 26 function 0: Intel product 0x8c2d (rev. 0x05)
ehci0: interrupting at ioapic0 pin 16
ehci0: EHCI version 1.0
usb0 at ehci0: USB revision 2.0
usb1 at ehci1: USB revision 2.0
isa0 at pcib0
pckbc0 at isa0 port 0x60-0x64
uhub0 at usb0: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb1: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at uhub1 port 1: vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2
uhub2: single transaction translator
uhub3 at uhub0 port 1: vendor 0x8087 product 0x8008, class 9/0, rev 2.00/0.05, addr 2
uhub3: single transaction translator
uhub2: 6 ports with 6 removable, self powered
uhub3: 6 ports with 6 removable, self powered
uhidev0 at uhub2 port 3 configuration 1 interface 0
uhidev0: Logitech Logitech USB Keyboard, rev 1.10/23.00, addr 3, iclass 3/1
ukbd0 at uhidev0
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub2 port 3 configuration 1 interface 1
uhidev1: Logitech Logitech USB Keyboard, rev 1.10/23.00, addr 3, iclass 3/0
uhidev1: 2 report ids
uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0
uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
uhub2: device problem, disabling port 4
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
http://www.kergis.com/
http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
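To compare the entries of such a dmesg excerpt side by side, the uhidev description lines can be pulled out mechanically. The following is only a sketch in Python: the sample text is copied from the excerpt above, while the line shape assumed by the pattern and the names PATTERN and hid_interfaces are illustrative assumptions, not part of any NetBSD tool.

```python
import re

# Sample lines copied from the dmesg excerpt above, one message per line.
DMESG = """\
uhidev0 at uhub2 port 3 configuration 1 interface 0
uhidev0: Logitech Logitech USB Keyboard, rev 1.10/23.00, addr 3, iclass 3/1
uhidev1 at uhub2 port 3 configuration 1 interface 1
uhidev1: Logitech Logitech USB Keyboard, rev 1.10/23.00, addr 3, iclass 3/0
uhub2: device problem, disabling port 4
"""

# Assumed shape: "<dev>: <description>, rev <x>/<y>, addr <n>, iclass <a>/<b>"
PATTERN = re.compile(
    r"^(?P<dev>uhidev\d+): (?P<desc>.+?), rev \S+, addr (?P<addr>\d+), "
    r"iclass (?P<first>\d+)/(?P<second>\d+)$"
)


def hid_interfaces(text):
    """Return (device, description, addr, iclass first, iclass second) tuples."""
    found = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            found.append((
                match["dev"],
                match["desc"],
                int(match["addr"]),
                int(match["first"]),
                int(match["second"]),
            ))
    return found


for entry in hid_interfaces(DMESG):
    print(entry)
```

Feeding it the full output of dmesg from a good boot and from a bad one would make it easier to see which description, address and iclass pair each uhidev entry reports.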
Re: ahcisata0: BSY never cleared, TD 0x80
On Tue, Jun 25, 2013 at 03:48:08PM +0200, Manuel Bouyer wrote:
> > ahcisata0: BSY never cleared, TD 0x80
> > [...]
> > messages too. (Furthermore, there are, when trying to get smart informations via atactl(8):
> >
> > wd1: dos partition I/O error
>
> at this point it's only trying to read the MBR, and fails.
> any other message before this ?

No. Only that it fails to read the very first sector when I finally manage to kill the reading process (takes minutes).
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Atheros Ethernet product 0x1091
Hello,

This Ethernet device is embedded in a Gigabyte motherboard. The pcidb says:

AR8161/8165 PCI-E Gigabit Ethernet Controller

It is not recognized by any of age(4), alc(4), ale(4) or lii(4) (dealing with L1, L2 or others).

Does anybody know if support for this is planned, or if there is a driver for it on some *BSD flavor that could be ported to NetBSD?

TIA
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: Atheros Ethernet product 0x1091
On Wed, Jun 05, 2013 at 06:50:52PM +0200, tlaro...@polynum.com wrote:
> AR8161/8165 PCI-E Gigabit Ethernet Controller

So it is a new chipset, and there are sources (the licence is not GPL) from a collaborative effort for Linux and the FreeBSD family. The name is alx:

http://www.linuxfoundation.org/collaborate/workgroups/networking/alx

Does NetBSD participate in this too?
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C