Re: how do I preset ddb's LINES to zero

2023-12-15 Thread tlaronde
On Sat, Dec 16, 2023 at 02:58:59PM +1100, matthew green wrote:
> Andrew Cagney writes:
> > > > thanks, I'll add that (it won't help with my immediate problem of a
> > > > panic during boot though)
> > >
> > > From DDB command prompt "set $lines = 0" ...
> >
> > Um, the test framework's VM is stuck waiting for someone to hit the
> > space bar :-)
> >
> > I guess I could modify my pexpect script to do just that, but I was
> > kind of hoping I could do something like add ddb.lines=0 to the boot
> > line.
> 
> try "options DB_MAX_LINE=0" in your kernel?
> 
> we have poor boot-command line support if you compare against
> say what linux can do.
> 

I have added to userconf(4) (this has not been merged into NetBSD) support
for "aliases" (variables that can be macros), patterns, etc. Support has
been added to config(1) to generate "commands" to be interpreted by
userconf(4) at start-up time (userconf(4) interprets whatever has been
added by config(1); then whatever is passed by the bootloader; and then,
perhaps, enters an interactive session if the -c flag was given; what is
added via config(1) is always interpreted).
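
As an illustration only, the interpretation order described above can be sketched in userland C. All the names here (uc_startup, uc_run_list, uc_log) are hypothetical and are not taken from the modified userconf(4); the "interpreter" merely logs each command so that the order can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the interpreter: it only logs each command. */
#define LOG_MAX 16
static const char *uc_log[LOG_MAX];
static size_t uc_nlog;

static void
uc_interpret(const char *cmd)
{
	assert(cmd != NULL);
	if (uc_nlog < LOG_MAX)
		uc_log[uc_nlog++] = cmd;
}

/* Interpret a NULL-terminated list of commands, in order. */
static void
uc_run_list(const char *const *cmds)
{
	if (cmds == NULL)
		return;
	for (; *cmds != NULL; cmds++)
		uc_interpret(*cmds);
}

/*
 * Startup order as described in the message:
 *   1. commands compiled in by config(1): always interpreted;
 *   2. commands passed by the bootloader;
 *   3. commands typed interactively, only if the -c flag was given.
 * Returns the number of commands interpreted.
 */
size_t
uc_startup(const char *const *kconf, const char *const *bootinfo,
    const char *const *typed, int cflag)
{
	uc_nlog = 0;
	uc_run_list(kconf);
	uc_run_list(bootinfo);
	if (cflag)
		uc_run_list(typed);
	return uc_nlog;
}
```

The point of the ordering is that a later source can override an earlier one: what the bootloader passes is seen after the compiled-in defaults, and what is typed interactively is seen last.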

It wouldn't be difficult to add to userconf(4) a command to set such
parameters, with the possibility to add, at the user's will, "commands"
to be interpreted at start-up time via config(1), passed by the
bootloader, or typed in a userconf(4) interactive session.

userconf(4), M.I., is the correct place to add these. And the majority
of the work has already been done to allow such extensions (see
https://github.com/tlaronde/BeSiDe for the code).
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: __futex(2): use outside Linux compat

2023-12-11 Thread tlaronde
On Mon, Dec 11, 2023 at 01:00:38PM +, Robert Swindells wrote:
> 
> tlaro...@kergis.com wrote:
> > In Mesa code implementations for futex_wake() and futex_wait() are
> > provided for Linux, Windows, FreeBSD and OpenBSD.
> >
> > There is a __futex(2) syscall in NetBSD, used only for now, if I'm not
> > mistaken, to implement Linux compat.
> 
> The Linux emulation of futexes in NetBSD does not work correctly.
> 
> > Is it OK to use for NetBSD "native" code since it is not "advertised"
> > by a man page?
> 
> No.

OK, thanks for the clarification.

To state the problem: NetBSD userland is probably the sole user of the
non-futex code in Mesa. Since userland doesn't follow the same code path
as the same apps on other OSes, and since this code
(!UTIL_FUTEX_SUPPORTED) is a second-class citizen given that the main
development (Linux) takes another path, it could be that the apps (the
various Mesa library components) are exercising bugs in this part. The
"tearing" or "threaded" display (incorrect lines in a window) that can be
observed on NetBSD in certain circumstances would then be caused by
user-level concurrent accesses, and not by kernel cache problems. (There
have been reports that these defects decrease under heavy load; this is
perhaps only because, under heavy load, fewer threads run concurrently
for the X clients, so they have no occasion to trash shared zones that
should normally be protected by futexes.)

So 3 options:
1) Fix the futex support on NetBSD (the "ideal" solution, but
quite involved, at least for me);
2) Debug the non-futex code in Mesa (meaning only finding out whether
the problems seen can come from there);
3) Let it be for now...

I will probably opt for 3) since I wanted to debug Mesa for other more
disastrous infelicities (crashes with xine(1) or vlc(1)---and probably
others since this comes from Mesa libs and probably not from the way
the API is used in the clients).
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


__futex(2): use outside Linux compat

2023-12-11 Thread tlaronde
In Mesa code implementations for futex_wake() and futex_wait() are
provided for Linux, Windows, FreeBSD and OpenBSD.

There is a __futex(2) syscall in NetBSD, used only for now, if I'm not
mistaken, to implement Linux compat.

Is it OK to use for NetBSD "native" code since it is not "advertised"
by a man page?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Puzzling crash and strange reporting Re: ATI video card not recognized

2023-12-07 Thread tlaronde
On Thu, Dec 07, 2023 at 03:36:41PM +0100, Reinoud Zandijk wrote:
> Hi,
> 
> On Wed, Dec 06, 2023 at 09:52:54AM +0100, Reinoud Zandijk wrote:
> > On Mon, Dec 04, 2023 at 04:15:39PM +0100, Reinoud Zandijk wrote:
> > > On Tue, Jun 23, 2020 at 01:26:21PM +0200, Reinoud Zandijk wrote:
> > > > my old videocard died and I replaced it with a slightly newer one but it isn't
> > > > recognized and nothing other than vga0 attaches. It's a Gigabyte Radeon RX460
> > > > with 2 GB ram.
> > > > 
> > > > 002:00:0: ATI Technologies Radeon RX460 (VGA display, revision 0xcf)
> > > > 002:00:1: ATI Technologies Radeon RX 460/550/640SP, RX 560/560X HD Audio
> > > > Controller (mixed mode multimedia)
> > > > 
> > > 
> > > Back again :) I tried out the videocard again in 10.0 i(beta) and got a lot
> > > further. However I still stumble on a panic when starting X :
> 
> A puzzling report and a worrisome crash occurred while resizing a Firefox
> window:
> 
> ...
> [ 1.00] NetBSD 10.99.10 (GENERIC) #0: Mon Dec  4 16:01:51 CET 2023
> [ 1.00]   rein...@gorilla.13thmonkey.org:/usr/sources/cvs.netbsd.org/src-clean/sys/arch/amd64/compile/obj/GENERIC
> [ 1.00] total memory = 65456 MB
> [ 1.00] avail memory = 63301 MB
> ...
> [ 4.627885] kern.module.path=/stand/amd64/10.99.10/modules
> [ 4.640006] [drm] initializing kernel modesetting (POLARIS11 0x1002:0x67EF 0x1458:0x22D6 0xCF).
> [ 4.640006] [drm] register mmio base: 0xFCE0
> [ 4.640006] [drm] register mmio size: 262144
> [ 4.640006] [drm] PCIE atomic ops is not supported
> [ 4.640006] [drm] add ip block number 0 
> [ 4.640006] [drm] add ip block number 1 
> [ 4.640006] [drm] add ip block number 2 
> [ 4.640006] [drm] add ip block number 3 
> [ 4.640006] [drm] add ip block number 4 
> [ 4.640006] [drm] add ip block number 5 
> [ 4.640006] [drm] add ip block number 6 
> [ 4.648106] [drm] add ip block number 7 
> [ 4.648106] [drm] add ip block number 8 
> [ 4.807888] ATOM BIOS: 113-TIC15322-X01
> [ 4.807888] [drm] UVD is enabled in VM mode
> [ 4.807888] [drm] UVD ENC is enabled in VM mode
> [ 4.807888] [drm] VCE enabled in VM mode
> [ 4.807888] [drm] vm size is 256 GB, 2 levels, block size is 10-bit, fragment size is 9-bit
> [ 4.818504] amdgpu0: VRAM: 2048M 0x00F4 - 0x00F47FFF (2048M used)
> [ 4.818504] amdgpu0: GART: 256M 0x00FF - 0x00FF0FFF
> [ 4.818504] [drm] Detected VRAM RAM=2048M, BAR=256M
> [ 4.818504] [drm] RAM width 128bits GDDR5
> [ 4.818504] Zone  kernel: Available graphics memory: 9007199252279140 KiB
> ?

For this one, see my message on the list (today), subject:

DRMKMS: bug in pseudo linux si_meminfo
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


DRMKMS: bug in pseudo linux si_meminfo

2023-12-07 Thread tlaronde
When initializing drmkms, the kernel prints bogus things like:

[ 4.193896] Zone  kernel: Available graphics memory: 9007199254113272 KiB
[ 4.193896] Zone   dma32: Available graphics memory: 2097152 KiB

The reason is to be found in

sys/external/bsd/drm2/include/linux/mm.h

which fills a pseudo Linux sysinfo struct (limited to members used).

But:
- Linux sysinfo(2) specifies that totalram is in bytes, while
totalhigh is in pages. In mm.h, totalram is initialized in
pages (not bytes) and totalhigh is defined with kernel_map->size,
that is a virtual address (?), converted to pages;

- then in:

sys/external/bsd/drm2/dist/drm/ttm/ttm_memory.c:320

mem = si->totalram - si->totalhigh;

The problem is that this is subtracting apples from oranges. On
my node I have these (added aprint_*):

[ 4.224447] si_meminfo: totalram: 1756268; totalhigh: 8479211520; memunit: 4096

It's clear that totalram (pages) - totalhigh (changed to pages
but virtual memory) leads to a negative result, which is then cast
to unsigned long long, yielding the bogus number seen.

Furthermore, when setting zone->max_mem, the memory is divided by
two (>> 1)? But why? Is it to force reserving at most only half of
what is available to graphics? A comment would be welcome explaining the
reason why.

This explains the number found for dma32: since the available memory
exceeds 2^32, 2^32 is taken as the max but, once more, divided by 2.

Does somebody know the Linux guts well enough to clarify what totalhigh
refers to? (certainly not a virtual address)

Isn't it dangerous to change the "units" of totalram (bytes in Linux,
but pages here)? I have not traced the use of the pseudo structure
in the remaining code, but if the values are used elsewhere in the
drivers, this is likely to wreak havoc in the Linux code.
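
The arithmetic described in this message can be checked with a small userland C sketch. It only illustrates two things: that an unsigned subtraction wraps instead of going negative, and that clamping to 2^32 bytes and then halving gives exactly the 2097152 KiB printed for dma32. The function names are hypothetical and the code is not taken from ttm_memory.c; the exact value printed for the kernel zone depends on further scaling done there, which is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Subtraction on unsigned 64-bit values: it wraps modulo 2^64. */
uint64_t
unsigned_diff(uint64_t totalram, uint64_t totalhigh)
{
	return totalram - totalhigh;
}

/* Non-zero when the subtraction above would wrap around. */
int
diff_wraps(uint64_t totalram, uint64_t totalhigh)
{
	return totalhigh > totalram;
}

/*
 * dma32 zone limit in KiB as described in the message: the memory is
 * clamped to 2^32 bytes, divided by two (>> 1), then expressed in KiB.
 */
uint64_t
dma32_max_kib(uint64_t mem_bytes)
{
	const uint64_t four_gib = (uint64_t)1 << 32;
	uint64_t mem = mem_bytes > four_gib ? four_gib : mem_bytes;

	assert(mem <= four_gib);
	return (mem >> 1) >> 10;
}
```

With the values logged above (totalram 1756268, totalhigh 8479211520), diff_wraps() is true: the difference is a huge positive number rather than a negative one, which is consistent with the bogus figure reported for the kernel zone.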
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Kernel startup: the blob (animal) approach

2023-11-30 Thread tlaronde
When the kernel panics, it can enter ddb or reboot.

There could be another mode implementing a blob (animal) approach this
way:

At startup, userconf(4) [1] parses an array of instructions compiled
in by config(1), then processes, for archs supporting this,
instructions passed by bootinfo, and possibly enters an interactive
session.

For this interactive session, userconf(4) could register the modifying
commands in a memory zone accessible by the other routines (at the
moment, a history is recorded, but in a static array with no use
at all).

When the kernel continues, before climbing down a dev node, a "disable
this" (shortened to "D this") could be added in this shared zone. If
everything goes well, the next node will replace the instruction with
its own. If something goes wrong, the panic will add a "print " (shortened
to "P ") and there could be a third mode: instead of entering ddb or
rebooting, the kernel restarts: it is not reloaded, it restarts from the
beginning.

The userconf replays: instructions compiled in by config(1), bootinfo
ones, no interactive session but replaying the instructions in the
shared zone, thus disabling the offending device.

Then the kernel continues and will worm its way around this
panicking point until, eventually, reaching userland, where remote
connections can be made and the instructions (and debugging
information) stored in the shared zone can be displayed.
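
A toy simulation may make the idea concrete. The sketch below is an assumption-laden model and not kernel code: struct shared_zone, boot_pass() and boot_until_userland() are hypothetical names, a "panic" is simulated by a flag per device, and the shared zone is simply a structure that survives from one pass to the next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NDEV 8

/* Hypothetical shared zone that survives a restart. */
struct shared_zone {
	int  pending;		/* device marked "D this", -1 if none */
	bool disabled[NDEV];	/* devices disabled by earlier replays */
};

/*
 * One pass over the devices. faulty[i] simulates a device whose
 * attachment panics. Returns true if "userland" is reached.
 */
static bool
boot_pass(struct shared_zone *z, const bool *faulty, size_t ndev)
{
	for (size_t i = 0; i < ndev; i++) {
		if (z->disabled[i])
			continue;
		z->pending = (int)i;	/* overwrites the previous mark */
		if (faulty[i])
			return false;	/* panic: the mark stays in the zone */
	}
	z->pending = -1;
	return true;
}

/* Restart until userland is reached; returns the number of restarts. */
int
boot_until_userland(struct shared_zone *z, const bool *faulty, size_t ndev)
{
	int restarts = 0;

	assert(ndev <= NDEV);
	z->pending = -1;
	for (size_t i = 0; i < NDEV; i++)
		z->disabled[i] = false;

	while (!boot_pass(z, faulty, ndev)) {
		z->disabled[z->pending] = true;	/* replay the pending "D" */
		z->pending = -1;
		restarts++;
	}
	return restarts;
}
```

Each failed pass disables exactly one more device, so the loop ends after at most as many restarts as there are devices; what remains in the zone at the end is the list of devices that had to be avoided.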

So: trial/error but if error, trying another path.

Or, instead of trying artificial intelligence, trying natural one.

[1] This is the userconf(4) I have modified:

https://github.com/tlaronde/netbsd-src/tree/tsjl
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


[PATCHES] config(1) / cons(9) / userconf(4) : Extensions

2023-11-27 Thread tlaronde
Code is at:

https://github.com/tlaronde/netbsd-src/tree/tsjl

in 3 commits:

- config(1): it now also accepts context-neutral "userconf"
directives. These add the double-quoted string given as
argument to the userconf_kconf[] array. This array is
interpreted by userconf(4) during startup (see below).
Typically, MI userconf instructions should go in sys/conf/std
like these ones:
#
# Userconf MI aliases.
#
userconf "alias azerty qaQAwzWZaqAQ;m:MzwZWm,M?,;<..:>/"
# start to define an executable macro "fr"
#   - printing a message
userconf "alias -c fr print changing to pseudo-fr kbd mapping"
#   - mapping from def of azerty (-a azerty) and mapping * and -
userconf "alias -c fr kmap -a azerty `*~-"
#   - printing a hint: * and - are mapped at the upper left key
userconf "alias -c fr p * and - are mapped to upper left ^2"
# Here, another macro: the drmkms alias is defined in MD code
userconf "alias -c nodrmkms disable -a drmkms"

and then in the kernel config, MD directives can be added for
example to define drmkms (an alias; each instruction creates
or adds to the definition):

# DRMKMS drivers
i915drmkms* at pci? dev ? function ?
intelfb* at intelfbbus?
userconf "alias drmkms i915drmkms*"

radeon* at pci? dev ? function ?
radeondrmkmsfb* at radeonfbbus?
userconf "alias drmkms radeon*"

#amdgpu*   at pci? dev ? function ?
#amdgpufb* at amdgpufbbus?
 
nouveau*   at pci? dev ? function ?
nouveaufb* at nouveaufbbus?
userconf "alias drmkms nouveau*"

- cons(9): two new routines: cnmapreset() and cnmap() allow
a "late" mapping of chars in startup console (works only with
cnget*()), allowing a kind of keyboard mapping for use during
this step;

- userconf(4): in order for interaction and for the config(1)
generated userconf_kconf[] array of instructions to be more
useful, a lot of things have been added to userconf(4):

o At init time, userconf interprets instructions
(cmdlines) in userconf_kconf[] (generated by
config(1)) before processing bootinfo directives and,
perhaps, entering interactive session if the "-c" flag
was passed to the kernel;

o aliases: one can create aliases, including
executable ones (macros). Userconf does its own
alloc/free stuff for this;

=> userconf_parse() thus handles taking the definition of
aliases and recursing for macros;

o patterns: one can select devices using patterns.
This works for change, disable, enable, find and list;

o new built-ins:

* aliases: create or add definition to an
alias (that can be executable); allocated;

* kmap: maps characters on the console
(calling cons(9) added routines) allowing a
kind of keyboard mapping for not US ASCII
keyboards;

* print: echoes tokens, including dereferencing
of aliases;

* unalias: delete an alias; freed;

* vis: visualize (show) the definition of an
alias (uninterpreted)---show and 'S' were not
chosen to keep 'S' for "set" in the future; see
FUTURE DIRECTIONS;

* debug0: display config(1) added instructions
parsed at startup time;

* debug1: display debugging information about
userconf memory and structures allocations;

* debug2: display debugging information about
userconf defined aliases;

o Ergonomics: in order to limit the number of characters
that have to be typed:

* input is case insensitive;

* built-ins can be given with a single letter
key (in all cases but one, this is the
initial); a macro is at least two chars,
starting with a letter. Single letters are
reserved for built-ins;

* no special character is needed for pattern
or alias: a flag has to be given with a hyphen
and a letter to change the interpretation of
the next token (this was proposed by RVP).

o FUTURE DIRECTIONS: I have reserved 'S' for set: a
lot of things presently in MD

userconf_parse() return status

2023-11-21 Thread tlaronde
With some delay, I'm finishing modification of cons/userconf/config
(having implemented more in userconf than initially projected):

* aliases hence local malloc/free;
* executable aliases (macros without parameters but multiple
lines possible meaning that one can define drmkms
as alias with a list of devices and one can define
"nodrmkms" as "disable -a drmkms" for example);
* multiple arguments (instead of only one), aliases being 
replaced and their definition parsed;
* patterns;
* kmap (char mapping for console input);
* single letter support for built-ins (case insensitive:
'e'|'E' for enable and so on).

Questions about current usage:

It is not obvious, but userconf_parse() was a function returning
a value: generally 0 (including on error) and (-1) when
quitting, which is used (again, not obviously) to quit interactive mode
(kernel -c).

There are two outside usages of userconf_parse():

sys/arch/x86/x86/x86_userconf.c
sys/dev/fdt/fdt_userconf.c

where the return status is not tested (it should be, since one could
imagine adding a "quit" or 'Q' in the series of instructions to
"comment out" the remaining instructions). So I will correct these.

I have modified userconf_parse() to return negative on error, 0 if OK,
and 1 if quitting. Is this OK? (Even if this is not of great use,
returning different values---here: negative ones---on error documents
the code).
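
The proposed convention (negative on error, 0 if OK, 1 when quitting) can be sketched with a caller that actually tests the status. This is only an illustration: toy_parse() is a hypothetical stand-in for userconf_parse(), and the UC_* names are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return convention proposed in the message (names are hypothetical). */
enum { UC_ERR = -1, UC_OK = 0, UC_QUIT = 1 };

/* Toy stand-in for userconf_parse(): empty input is an error. */
static int
toy_parse(const char *cmd)
{
	if (cmd == NULL || *cmd == '\0')
		return UC_ERR;
	if (strcmp(cmd, "quit") == 0 || strcmp(cmd, "q") == 0 ||
	    strcmp(cmd, "Q") == 0)
		return UC_QUIT;
	return UC_OK;
}

/*
 * Run a NULL-terminated series of instructions, stopping at "quit" so
 * that the remaining instructions are effectively commented out.
 * Returns the number of instructions that succeeded; *nerr receives
 * the number of instructions that failed.
 */
int
run_commands(const char *const *cmds, int *nerr)
{
	int done = 0;

	assert(cmds != NULL && nerr != NULL);
	*nerr = 0;
	for (; *cmds != NULL; cmds++) {
		int rv = toy_parse(*cmds);

		if (rv == UC_QUIT)
			break;
		if (rv < 0) {
			(*nerr)++;
			continue;
		}
		done++;
	}
	return done;
}
```

Keeping errors negative and "quit" positive lets a caller distinguish the three cases with two comparisons, which is the documentation value mentioned above.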

The other question concerns the "history" in userconf.

I have corrected a blunder (a 'd' as "command" where a
'e'---enable---was expected) and I have added single letter support.

But if the history is now correct and could be executed (with support
for single letters), it is not accessible to the user. So was this intended
to record what was done at boot time for post-mortem or debugging
purposes (which it seems), or was this intended for interactive user
comfort? I don't think so, because I fail to see a benefit: the user
will not repeat the same command again and again...

Does someone know the history of... "history"?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


cons(9): char mapping

2023-11-18 Thread tlaronde
For userconf at least, this adds a means to define a char mapping for
the console during startup (the userconf char-to-char mapping command
will be "kmap", key 'k'; and a series of instructions, assembled by
config(1), will be processed during userconf_init() before
userconf_bootinfo(), allowing one to add a pseudo keyboard mapping
in the kernel config for early interaction).
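
The mapping itself is a plain 256-entry table. A userland sketch of the same technique follows; the names cmap_reset(), cmap_set() and cmap_apply() are hypothetical stand-ins chosen for this illustration, and the code is not the kernel code.

```c
#include <assert.h>
#include <limits.h>

static unsigned char cmap[UCHAR_MAX + 1];	/* char mapping table */

/* Reset to the identity mapping. */
void
cmap_reset(void)
{
	/*
	 * The counter is an int: an unsigned char counter can never
	 * exceed UCHAR_MAX, so the loop would not terminate.
	 */
	for (int c = 0; c <= UCHAR_MAX; c++)
		cmap[c] = (unsigned char)c;
	assert(cmap[UCHAR_MAX] == UCHAR_MAX);
}

/* Map one char to another; nul is never mapped. */
void
cmap_set(unsigned char from, unsigned char to)
{
	if (from != 0)
		cmap[from] = to;
}

/* Apply the mapping to a value read from the console. */
int
cmap_apply(int rv)
{
	/* Nul, negative values and values above UCHAR_MAX pass through. */
	return (rv > 0 && rv <= UCHAR_MAX) ? (int)cmap[rv] : rv;
}
```

For example, after cmap_reset() and cmap_set('q', 'a'), a 'q' read from the console is returned as 'a' while every other key is unchanged.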

Attached is the diff.

Comments?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
diff --git a/share/man/man9/cons.9 b/share/man/man9/cons.9
index 42db38b25d5b..af96362d8f32 100644
--- a/share/man/man9/cons.9
+++ b/share/man/man9/cons.9
@@ -24,11 +24,13 @@
 .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 .\" POSSIBILITY OF SUCH DAMAGE.
 .\"
-.Dd June 8, 2010
+.Dd November 16, 2023
 .Dt CONS 9
 .Os
 .Sh NAME
 .Nm cnbell ,
+.Nm cncmap ,
+.Nm cncmapreset ,
 .Nm cnflush ,
 .Nm cngetc ,
 .Nm cngetsn ,
@@ -41,6 +43,10 @@
 .Ft void
 .Fn cnbell "u_int pitch" "u_int period" "u_int volume"
 .Ft void
+.Fn cncmap "u_char from" "u_char to"
+.Ft void
+.Fn cncmapreset "void"
+.Ft void
 .Fn cnflush "void"
 .Ft int
 .Fn cngetc "void"
@@ -80,10 +86,17 @@ milliseconds at given
 Note that the
 .Fa volume
 value is ignored commonly.
+.It Fn cncmap
+Maps a char to another char. The mapping is only used inside
+.Fn cngetc
+and the facility exists to allow a kind of keyboard mapping during
+startup interaction. The nul char is never mapped.
+.It Fn cncmapreset
+Resets the char mapping to the identity mapping.
 .It Fn cnflush
 Waits for all pending output to finish.
 .It Fn cngetc
-Poll (busy wait) for an input and return the input key.
+Poll (busy wait) for an input and return the mapped input key.
 Returns 0 if there is no console input device.
 .Fn cnpollc
 .Em must
@@ -154,6 +167,7 @@ cnpollc(0);
 .Xr pckbd 4 ,
 .Xr pcppi 4 ,
 .Xr tty 4 ,
+.Xr userconf 4 ,
 .Xr wscons 4 ,
 .Xr wskbd 4 ,
 .Xr printf 9 ,
diff --git a/sys/dev/cons.c b/sys/dev/cons.c
index f3a2387fbceb..cc8db394b339 100644
--- a/sys/dev/cons.c
+++ b/sys/dev/cons.c
@@ -95,6 +95,8 @@ struct tty *volatile constty;  /* virtual console output device */
 struct consdev *cn_tab;/* physical console device info */
 struct vnode *cn_devvp[2]; /* vnode for underlying device. */
 
+static unsigned char cn_cmap[UCHAR_MAX+1]; /* char mapping for cngetc() */
+
 void
 cn_set_tab(struct consdev *tab)
 {
@@ -109,6 +111,15 @@ cn_set_tab(struct consdev *tab)
 * cn_tab updates.
 */
cn_tab = tab;
+
+   /*
+* Char mapping is only done in cngetc() i.e. in kernel
+* startup when the console is not a tty. Assuming here that
+* if there were more than one console, there would be a
+* different terminal, that is a different keyboard attached
+* to the console so a different mapping.
+*/
+   cncmapreset();
 }
 
 int
@@ -315,6 +326,29 @@ cnkqfilter(dev_t dev, struct knote *kn)
return error;
 }
 
+void
+cncmapreset(void)
+{
+   unsigned int c; /* not u_char: c <= UCHAR_MAX would always be true */
+
+   /* Consistency, a keyboard is supposed attached to a cons */
+   if (cn_tab == NULL)
+   return;
+
+   for (c = 0; c <= UCHAR_MAX; c++)
+   cn_cmap[c] = c;
+}
+
+void
+cncmap(unsigned char from, unsigned char to)
+{
+   if (cn_tab == NULL)
+   return;
+
+   if (from)   /* Nul is never mapped */
+   cn_cmap[from] = to;
+}
+
 int
 cngetc(void)
 {
@@ -325,7 +359,9 @@ cngetc(void)
const int rv = (*cn_tab->cn_getc)(cn_tab->cn_dev);
if (rv >= 0) {
splx(s);
-   return rv;
+   /* Nul is never mapped */
+   return (rv && rv <= UCHAR_MAX)?
+   (int) cn_cmap[(unsigned char)rv] : rv;
}
docritpollhooks();
}
diff --git a/sys/dev/cons.h b/sys/dev/cons.h
index 9fed7cb0eb00..aba8def6a743 100644
--- a/sys/dev/cons.h
+++ b/sys/dev/cons.h
@@ -79,6 +79,8 @@ extern struct consdev *cn_tab;
 void   cn_set_tab(struct consdev *);
 
 void   cninit(void);
+void   cncmapreset(void);
+void   cncmap(unsigned char, unsigned char);
 int    cngetc(void);
 int    cngetsn(char *, int);
 void   cnputc(int);


Re: [RFC 2] userconf(4): 2nd proposal

2023-11-05 Thread tlaronde
FWIW, various things I have modified can be seen here:

https://github.com/tlaronde/netbsd-src/tree/tsjl

The userconf version present at the moment on the published branch was
my first attempt (patterns introduced by slashes---working but not
solving the problem about drmkms).

There are other bits listed at the root in CHANGES.tsjl (other files:
WIP.tsjl and GOALS.tsjl are self-explanatory, but GOALS is empty 
at the moment except for the title and the date and will have to
be filled when the documentation is ready).

Next step will be to implement patterns and groups for
userconf/config(1), but with a revised syntax as proposed by RVP (but
for now, I will go with groups defined by config(1), and not allowing
variables aka aliases, first to not implement custom allocation,
second to allow the feature to be of use for all archs, and not
only the ones using boot.cfg).

I will probably not report on the list when this is done, or even
about what I will be doing next: this will be explained in the files
mentioned above and the sources will be published in the branch. If
something is found useful by somebody, just cherry-pick (my work
published here is under a 2-clause BSD licence).
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC 2] userconf(4): 2nd proposal

2023-11-05 Thread tlaronde
On Sun, Nov 05, 2023 at 11:17:02AM +, RVP wrote:
> 
> Oh, I like the idea (I've always wanted a mechanism to list drivers
> etc. using patterns); it's just the syntax that sticks in the craw.
> Too many meta-chars. there.
> 
> OTOH, `cmd -p xyz* *abc' doesn't need much thought. And, aliases
> are pretty standard too. But, this is your show, n'est pas...?
> Don't let me stop you!

I like this more: flags introduced by '-' since if a flag is not a
number there is no ambiguity with negative numbers (allowed for the
more builtin facility). So -p would mean pattern, -g use groups instead
of driver name and -pg apply a pattern to group names and even -s
meaning STARred (-pgs, letters in whatever order, meaning apply a
pattern to group names for STARred devices)...

And without flags, this is the present syntax untouched.
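
A flag token of this kind is easy to recognize. The sketch below is an illustration of the idea under assumptions, not the userconf code: uc_parse_flags() and the UC_F_* names are hypothetical, and only the three letters discussed above (p, g, s) are accepted.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum {
	UC_F_PATTERN = 1 << 0,	/* -p: next token is a pattern */
	UC_F_GROUP   = 1 << 1,	/* -g: use group names */
	UC_F_STAR    = 1 << 2	/* -s: STARred devices only */
};

/*
 * Returns a bitmask of flags, 0 if the token is not a flag token
 * (this includes negative numbers such as "-12"), or -1 if it
 * contains an unknown flag letter. Letters may come in any order.
 */
int
uc_parse_flags(const char *tok)
{
	int flags = 0;

	if (tok == NULL || tok[0] != '-' || tok[1] == '\0')
		return 0;
	if (isdigit((unsigned char)tok[1]))
		return 0;	/* a negative number, not a flag */

	for (const char *p = tok + 1; *p != '\0'; p++) {
		switch (*p) {
		case 'p':
			flags |= UC_F_PATTERN;
			break;
		case 'g':
			flags |= UC_F_GROUP;
			break;
		case 's':
			flags |= UC_F_STAR;
			break;
		default:
			return -1;
		}
	}
	assert(flags != 0);	/* at least one letter was consumed */
	return flags;
}
```

Since a token that is not a flag yields 0, a command line without flags is parsed exactly as before, which is the property wanted here.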

This will also address the legitimate concern of Staffan Thomén about
keyboard mapping: this adds fewer characters, and no special ones.

For variables, I will refrain for the moment because this would require
adding a fraction of a page (1/4, 1/2 or 1/1) as scratch "memory" to
allocate from, and a simple allocation scheme (a la Unix version 6, for
example: see Lions' Commentary) in order not to allocate with kernel
facilities. Not difficult, not adding much, but if not necessary...
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC 2] userconf(4): 2nd proposal

2023-11-05 Thread tlaronde
On Sun, Nov 05, 2023 at 01:53:31PM +0200, Staffan Thomén wrote:
> One thing I'd like to point out is that I often find I don't have the
> right keyboard layout or am restricted in some way in from typing in the
> bootloader (glitchy serial connection or really fast repeating keyboard
> or something), so keeping the syntax brief and with as few non-
> alphabetical characters as possible would probably be a good idea.
> 
> Just throwing some cents on the pile,

I have been annoyed by that too (a GENERIC kernel has a US qwerty
default compiled in) and I wondered whether a supplementary short command
to switch the mapping, in userconf, would not be convenient too (no
need to deal with accented characters or whatever: just providing the
ASCII chars where the engraving of a different keyboard puts them).

For the extra characters, I think what can be accessible on the numpad
is handy (I even had * not accessible with some USB keyboards...).
This leaves the braces (for the groups) more problematic.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread tlaronde
On Sat, Nov 04, 2023 at 11:59:00AM +0100, Martin Husemann wrote:
> On Sat, Nov 04, 2023 at 11:25:01AM +0100, tlaro...@kergis.com wrote:
> > I think that my second proposal is the simplest, allowing not breaking
> > existing and introducing extensions without much typing.
> 
> This whole thing still makes no sense to me. You can do what you want
> with userconf already and this is not a common operation so any simplification
> for something that only makes sense (1) for ad hoc testing or (2) encoded
> in boot.cfg does not gain us anything for real.
> 
> For the real world issue at hand (bugs in kernel drivers that claim the
> console but then do not work) either a boot flag (like RB_MD4 on x86)
> or what you call "ad hoc mechanism" makes a lot more sense to me.

An "ad hoc mechanism" would be to construct a list of drivers in order to
disable them as a block, i.e. exactly the same as what can already be
done with userconf if you know the driver names. The only
advantage of this ad hoc solution would be to require only one
generic instruction, instead of several commands to disable them all or
the need to know exactly which one to disable. This is what my
proposal is about but, instead of polluting the sources with an
"ad hoc" solution, it adds a feature that can be of more
general use in other cases, for debugging or for disabling a collection
of devices (a group).

So how can you discard my proposal as "no sense", when your ad hoc
solution is only a variation around the same thing?

Secondly, a more fine grained solution to disable a portion of the
drivers dealing with the console is more involved---because if it was
not, it would have already been done, no?

And this is the problem: the drm2/ source is 206 MB (!!!). Our drmkms
sources are already not in sync with the Linux ones (I'm watching
them and there have already been major changes, for i915 and
particularly for amdgpu). So the NetBSD turtle may beat the Linux
hare, but only in the end; certainly not in a speed race...

And there is the NetBSD 10 release. A definitive or even only correct
solution will not be found if 10 has to be released soon.

I'm just proposing something simple enough to improve the "crude
solution"---on par with the Linux/GRUB feature. That's the best that
can be done for the time being due to the size (that's the word...) of
the problem...
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread tlaronde
On Sat, Nov 04, 2023 at 09:30:53AM +, RVP wrote:
> On Sat, 4 Nov 2023, tlaro...@kergis.com wrote:
> 
> > > > No...: this is a break of existing. Trailing `*' selects STARred devices
> > > > (I'm not the inventor of this). So `*' can not be used as a joker ;-)
> > > > 
> > > 
> > > You can allow escapes for those:
> > > 
> > > uc> disable i915drmkms\*  # exact match STARred
> > > uc> disable *kms\*# only STARed `*kms'
> > > 
> > 
> > But this breaks existing...
> > 
> 
> Fine. You can introduce the notion of flags.
> For example `-p' for pattern:
> 
> uc> disable i915drmkms*   # std. starred device
> uc> disable -p *drm*  # disable using pattern
> 
> You can also add, let's say, a `-g' group flag:
> 
> uc> list -g   # list all "groups"
> uc> list -g drmkms# list devices in group drmkms
> uc> disable -g drmkms # disable group drmkms
> 

Yep, but now you see what became of the simplifications ;-)

I covered the same ground as you to end up with my first proposal:
using the '=' to leave the present syntax alone and to introduce a
new, distinct one, and imposing double quoting of strings in order
to be able, if needed later, to have unquoted variable names,
precisely for use in boot.cfg.

I think that my second proposal is the simplest, allowing not breaking
existing and introducing extensions without much typing.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread tlaronde
On Sat, Nov 04, 2023 at 08:31:09AM +, RVP wrote:
> On Sat, 4 Nov 2023, tlaro...@kergis.com wrote:
> 
> > > 1) Allowing shell-like patterns (not hard to implement):
> > > 
> > > uc> disable drm*  # all starting with `drm'
> > 
> > No...: this is a break of existing. Trailing `*' selects STARred devices
> > (I'm not the inventor of this). So `*' can not be used as a joker ;-)
> > 
> 
> You can allow escapes for those:
> 
> uc> disable i915drmkms\*  # exact match STARred
> uc> disable *kms\*# only STARed `*kms'
> 

But this breaks existing...

> > I have contemplated, too, adding for example "variables" to userconf and
> > rejected it because this would be only useful for arch supporting
> > boot.cfg,
> > 
> 
> Definition in boot.cfg was the intent.

Yes, this could be useful, but only with boot.cfg; and boot.cfg is not
supported by all archs.

Hence the "grouping" proposal, which is independent of boot.cfg and
which can have, IMO, a usage not limited to drmkms: a group is
defined so that the devices enabled/disabled by group can, indeed,
work, or not impede the behavior of the kernel if disabled. I made
experiments disabling devices with pattern matching for drmkms,
and ended up disabling a child, with the simple result that I painted
myself into a corner: there was no more display... So grouping is also
supposed to be a "safe" definition of variables.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread tlaronde
On Sat, Nov 04, 2023 at 08:20:43AM +, Michael van Elst wrote:
> tlaro...@kergis.com writes:
> 
> >disable {drmkms} # NEW: disable devices belonging to group "drmkms"
> 
> Almost noone would need to turn off all drmkms drivers. What you may
> want to control is that a GPU isn't used as a console. Disabling a driver
> is just our crude workaround to achieve this.

The problem is, at the moment, that we can not separate the GPU
handling from the drmkms stuff, meaning that one can not modify "at
run time" because, in some cases, one never gets to "run time": it
crashes.

The drmkms code (drm2/) has increased the size of the kernel sources
by... 50% (!). A "correct" solution can not be found now by diving in
the drmkms code.

So the crude workaround has to be achieved in a simpler way than
listing all the drmkms related drivers: a user trying GENERIC
does not necessarily know what is present on his hardware and does not
have to find what particular drivers he has to disable/enable.

> 
> I don't think that autoconf is the right place for such a control,
> it should be a boot parameter, maybe even something that can be
> changed at runtime later.
> 
> The current system of boot parameters is limited and differs a lot
> between platforms. We need a common way to set boot parameters and
> these should be mostly defined in a platform-agnostic way.
> 

For the moment, putting the definition of groups in config(1) and their
handling in userconf achieves this goal of arch independence.

And since the problems with drmkms are mainly on x86 machines, and
x86 has boot.cfg, we could disable drmkms there by default and
simply instruct the user to enable it (try once) at the userconf console
with "enable {drmkms}" and, if this works, to comment out the
"disable {drmkms}" in boot.cfg.

> 
> >Hint: Linuces distributions "work" as proposed images on servers,
> >where NetBSD fails.
> 
> Servers usually do not have drmkms capable hardware, and if they have,
> you probably want to use that hardware.

Been there and seen this (I mean: didn't see anything...): to use the
hardware, you have to know it is here; when drmkms makes the kernel
crash, on a remote node without remote boot administration/console,
you will never know what it has and you will think that NetBSD
simply doesn't work...

So, disabling drmkms to verify that NetBSD works without it allows
you to know what the hardware is and, after that, you can try
to enable drmkms at least knowing that if it crashes (if you don't have
access anymore...), this does not mean that NetBSD can not drive it,
simply that this has to be without drmkms (we need to have a boot-once
feature too, so that if a remote node crashes, rebooting restores a
working boot sequence).
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread tlaronde
On Sat, Nov 04, 2023 at 07:41:19AM +, RVP wrote:
> On Sat, 4 Nov 2023, tlaro...@kergis.com wrote:
> 
> > - 1) No change to the general form of current syntax;
> > 
> > - 2) Selection can be as presently: by number (index in cfdata), by
> > name (driver name), but also (NEW) by pattern: a pattern is
> > between slashes, it is a fixed substring that can be optionally
> > anchored at the beginning with `^' and at the end with `$';
> > 
> > - 3) (NEW) If the selector (will this word do?) in 2) is surrounded by
> > braces `{' `}', the selector is for a group of devices;
> > 
> > - 4) The STAR (existing) is still handled as a suffix.
> > 
> > Examples:
> > 
> > disable i915drmkms  # existing syntax
> > 
> > disable {drmkms}# NEW: disable devices belonging to group "drmkms"
> > 
> > disable {/^drm/}*   # NEW: disable devices belonging to groups
> > # whose name begins with the substr "drm" if
> > # they are STARred ones.
> > 
> 
> I think you can simplify things a bit by:
> 
> 1) Allowing shell-like patterns (not hard to implement):
> 
> uc> disable drm*  # all starting with `drm'

No...: this would break the existing syntax. A trailing `*' selects STARred
devices (I'm not the inventor of this). So `*' can not be used as a
wildcard ;-)

> uc> disable *drm* *usb$   # all with `drm' anywhere and those ending in `usb'
> uc> disable foo   # exact match `foo'
> uc> disable 1 # exact match 1 (index)
> 
> 2) Having an alias facility:
> 
> uc> alias drm_disable=disable i915*; disable *radeon*; ...
> uc> drm_disable   # executes: RHS text (no recursive expansion)
> uc> alias drm_disable=# remove alias `drm_disable'

I have contemplated, too, adding for example "variables" to userconf and
rejected it because this would be only useful for arch supporting
boot.cfg, and useless in userconf per se.

It is useless in userconf per se, because it is not persistent: the
time one will spend defining the aliases would be longer than the time
to type directly the disabling of several devices at userconf prompt ;-)

The goal, for me, is to have something generic, available on all
archs (hence put it in kern/subr_userconf.c and config(1)), and not an
ad hoc trick for drmkms, so that there is not something we have to
remember to update when something changes (groups will be set for the
benefits of userconf by config(1) with a macro added for the purpose).
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


[RFC 2] userconf(4): 2nd proposal

2023-11-04 Thread tlaronde
Revised proposition:

- 1) No change to the general form of current syntax;

- 2) Selection can be as presently: by number (index in cfdata), by
name (driver name), but also (NEW) by pattern: a pattern is
between slashes, it is a fixed substring that can be optionally
anchored at the beginning with `^' and at the end with `$';

- 3) (NEW) If the selector (will this word do?) in 2) is surrounded by
braces `{' `}', the selector is for a group of devices;

- 4) The STAR (existing) is still handled as a suffix.

Examples:

disable i915drmkms  # existing syntax

disable {drmkms}# NEW: disable devices belonging to group "drmkms"

disable {/^drm/}*   # NEW: disable devices belonging to groups
# whose name begins with the substr "drm" if
# they are STARred ones.

This works for all actions: change, enable, disable, find and list.
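
The pattern rule in item 2 can be sketched in a few lines of C. This is
only an illustration under stated assumptions, not the code from the
proposal: the function name pattern_match is hypothetical, it takes the
text between the slashes, and it compares case-sensitively (the proposal
above does not say which is intended).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the code from the proposal): match a device
 * name against a pattern body, i.e. the text between the slashes.
 * A leading '^' anchors at the start, a trailing '$' at the end;
 * everything else is a fixed substring, not a regular expression.
 */
static bool
pattern_match(const char *pat, const char *name)
{
	bool at_start = false, at_end = false;
	size_t plen = strlen(pat);
	size_t nlen = strlen(name);

	if (plen > 0 && pat[0] == '^') {
		at_start = true;
		pat++;
		plen--;
	}
	if (plen > 0 && pat[plen - 1] == '$') {
		at_end = true;
		plen--;
	}
	if (plen > nlen)
		return false;
	if (at_start && at_end)
		return plen == nlen && strncmp(name, pat, plen) == 0;
	if (at_start)
		return strncmp(name, pat, plen) == 0;
	if (at_end)
		return strncmp(name + nlen - plen, pat, plen) == 0;
	/* Unanchored: the substring may appear anywhere in the name. */
	for (size_t i = 0; i + plen <= nlen; i++)
		if (strncmp(name + i, pat, plen) == 0)
			return true;
	return false;
}
```

With this sketch, pattern_match("^drm", "i915drmkms") is false while
pattern_match("drm", "i915drmkms") is true, which is why a pattern alone
cannot catch every drmkms driver.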

Reminder:

Drmkms is crashing the kernel in various configurations. The drivers
can not be modloaded; they have to be compiled into the kernel. Hence a
way to disable them at boot time is needed.

Hint: Linuces distributions "work" as proposed images on servers,
where NetBSD fails. But this is because GRUB has a switch to disable
drmkms. And the switch is on. Even Linux does not try to use drmkms in
server configurations...
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC] userconf(4) modification

2023-11-03 Thread tlaronde
On Fri, Nov 03, 2023 at 09:41:14AM +0100, Martin Husemann wrote:
> On Thu, Nov 02, 2023 at 05:32:20PM +0100, tlaro...@kergis.com wrote:
> > On Thu, Nov 02, 2023 at 05:05:53PM +0100, Martin Husemann wrote:
> [..]
> > > Something like:
> > > 
> > >  uc> drm off
> > > 
> > > and then have the drm command use a fixed built-in table of driver names
> > > to disable individual drivers.
> > 
> > This is precisely what I dislike: an ad hoc addition with the
> > necessity to be careful about what objects have to be regenerated
> > whenever something is touched or changed.
> 
> Well, there are two parts to it:
> 
>  1) the user interface: for a user following hints from the internet
> because their new machine blanks the screen at boot time the command
> has to be as simple as possible. We may work around that by adding
> the required magic to the standard boot menu on install media.
> 
>  2) the implementation: a very simple and scalable implementation
> (instead of the static list of known DRI devices, which IMO is not
> that hard to maintain either) is a global kernel variable like
> "drm_enabled" and all DRM related drivers checking for that in their
> probe function.
> 
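
For reference, the "global variable checked in probe" idea quoted above
could look roughly like the sketch below. Everything in it is
hypothetical: drm_enabled is the name used in the quoted suggestion, not
an existing NetBSD variable, and exampledrm_match stands in for a real
driver match routine, whose actual signature is different.

```c
#include <stdbool.h>

/*
 * Hypothetical global, named as in the quoted suggestion; it is not an
 * existing NetBSD variable. A boot flag or a userconf command would
 * clear it before autoconfiguration runs.
 */
static bool drm_enabled = true;

/*
 * Sketch of a probe (match) routine: a non-zero return claims the
 * device. The bool argument replaces the real hardware identification.
 */
static int
exampledrm_match(bool hardware_present)
{
	if (!drm_enabled)
		return 0;	/* globally switched off: never attach */
	return hardware_present ? 1 : 0;
}
```

The cost of this design is that every DRM related driver has to carry the
check in its own probe function.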

When booting (boot(8)), there are switches to disable multiprocessor
(-1), ACPI (-2), SVS (-3) and some MD (-4).

Do you mean adding a -5 for example? That is, this would have nothing to
do with userconf?

Alternatively, since my proposed syntax and my proposed explanations
failed to find any support, I could go with a simpler one (from some
off-list input): do not change the current syntax, but accept a pattern
between slashes '/^?pattern$?/', and a group between braces '{drmkms}',
allowing:

disable {drmkms}
list /^usb/
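
A rough sketch of how an argument could be told apart under this simpler
syntax: digits only is a device number, slashes delimit a pattern, braces
delimit a group, and anything else is a driver name. The enum and the
function name are assumptions made for illustration, and the trailing `*'
suffix is deliberately left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum sel_kind { SEL_INVALID, SEL_DEVNO, SEL_NAME, SEL_PATTERN, SEL_GROUP };

/*
 * Hypothetical classifier for a userconf argument; the names are
 * illustrative only. On success, *body and *len delimit the text
 * without its surrounding slashes or braces.
 */
static enum sel_kind
classify_selector(const char *arg, const char **body, size_t *len)
{
	size_t n = strlen(arg);

	if (n == 0)
		return SEL_INVALID;
	if (arg[0] == '{' || arg[0] == '/') {
		char close = (arg[0] == '{') ? '}' : '/';

		/* Need both delimiters and a non-empty body. */
		if (n < 3 || arg[n - 1] != close)
			return SEL_INVALID;
		*body = arg + 1;
		*len = n - 2;
		return (arg[0] == '{') ? SEL_GROUP : SEL_PATTERN;
	}
	*body = arg;
	*len = n;
	/* Digits only: an index in cfdata; anything else: a driver name. */
	for (size_t i = 0; i < n; i++)
		if (!isdigit((unsigned char)arg[i]))
			return SEL_NAME;
	return SEL_DEVNO;
}
```

Because the delimiters are distinct first characters, the existing forms
(number and driver name) keep their meaning unchanged.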

Note: this will be drmkms and not drm because there is still something
"different": drm, the old drivers.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC] userconf(4) modification

2023-11-02 Thread tlaronde
On Thu, Nov 02, 2023 at 04:12:43PM +, Robert Swindells wrote:
> 
> tlaro...@kergis.com wrote:
> > As stated in a message before, disabling, via userconf(4), all the
> > drmkms drivers can not rely on a pattern matching since, for historical
> > reasons (several versions of DRM), the namespace of the drivers is not
> > "ruled".
> 
> I don't see the need for this, it would be unusual for more than one
> drm driver to attach.
> 
> You can also build a custom kernel.

The problem is when installing a GENERIC kernel.

You don't know, a priori, what will be encountered.

Been there: I installed a "generic" distribution on a lent bare-metal
remote server (without any access except ssh when it comes up), and
NetBSD failed to work, because of drmkms.

And there can be more than one drmkms driver. It is not unusual to
have an integrated GPU (on board) and a discrete one (on an expansion card).

The goal is to provide a way to disable drmkms entirely without
knowing what the kernel will actually encounter.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC] userconf(4) modification

2023-11-02 Thread tlaronde
On Thu, Nov 02, 2023 at 05:05:53PM +0100, Martin Husemann wrote:
> I would prefer to have a special new command that does all the magic
> internally, and don't waste code and complexity on pattern matching
> and generalizations.
> 
> Something like:
> 
>  uc> drm off
> 
> and then have the drm command use a fixed built-in table of driver names
> to disable individual drivers.

This is precisely what I dislike: an ad hoc addition with the
necessity to be careful about what objects have to be regenerated
whenever something is touched or changed.

The pattern matching (already implemented in a previous attempt)
doesn't cost much in code (it is not a regex implementation) and
also allows narrowing a listing of devices, in order not to have literally
hundreds of entries to browse.

userconf is, for me, a debugging/developing tool too. So it can be
useful for more than drmkms to disable a whole range of devices
(whether by group name or by pattern matching if the namespace is
"ruled").
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [RFC] userconf(4) modification

2023-11-02 Thread tlaronde
On Thu, Nov 02, 2023 at 06:59:50PM +0300, Valery Ushakov wrote:
> On Thu, Nov 02, 2023 at 16:29:42 +0100, tlaro...@kergis.com wrote:
> 
> > You will find attached the man page in order to be able to comment
> > about the proposed new syntax---supplementary syntax: it does not
> > replace the "legacy" one.
> 
> The man page is super-confusing.  Someone who needs to use userconf to
> get their system to boot needs a clear reference, but the proposed
> version tries to be overly formal and ends up a bit opaque.
> 
> I also don't understand why it is necessary to call the old syntax -
> "legacy".  From the man page my impression is that the command can be
> either
> 
> command dev
> 
> or
> 
> command property = value

I called it "legacy" because (I'm not a native English speaker) I
didn't find (or didn't know) how to call it differently.

In the present syntax (what I call "legacy"), you can give as a device
specification either a number or a driver name.

If I want to introduce something else: a group name, I have to change
the syntax if I don't want to introduce something extra fancy to
stipulate it's a group name and not a driver name.

Hence the '=', which makes it possible to clearly distinguish the "new"
syntax from the old one; specifying which property we are matching against
allows further extensions without syntax modifications if needed (not
proposed here).

> 
> both are in a sense a kind of device selector, why do you have to
> declare one of them "legacy"?  The user probably doesn't care much
> either way, they need to get the kernel booting and are not interested
> in the lore.
> 
> Why the thing after = is called "expression"?  That position only
> accepts two kinds of literals, one of which is a shorthand for the
> other (but I had to re-read that paragraph several times and I'm still
> not quite sure it actually clearly says that).

It's an expression because it depends. It can be a number (positive
integer) for devno; it can be a string literal (exact match) or a
pattern (substring match).

I retained the shorthand (literal string) because of the present
syntax. But it could be discarded in favor of the only /^drmkms$/
syntax i.e. a special case of pattern matching: matching against whole
string.

Since I'm not a native English speaker, I tend to put in text a
pseudo KNF. This is why it is "formal". It seems my attempt to be
"boring but clear" failed...

The current (not mine) man page is not formal. But it doesn't
tell the true story either. The STARred devices are not explained.
The devno is not explained either---and its range is not checked
in the code, which allows indexing the cfdata vector with a negative
number.
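
The missing range check amounts to something like the sketch below. It is
a userland illustration, not the kernel code: parse_devno and its ndevs
parameter (standing for the number of entries in the cfdata vector) are
hypothetical names, and the kernel would use its own number parsing
rather than strtol(3).

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical validation of a device number typed at the prompt.
 * `ndevs` stands for the number of entries in the cfdata vector.
 * On success, stores the index in *out and returns true.
 */
static bool
parse_devno(const char *s, long ndevs, long *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(s, &end, 10);
	if (end == s || *end != '\0' || errno == ERANGE)
		return false;	/* not a number, trailing junk, or overflow */
	if (v < 0 || v >= ndevs)
		return false;	/* outside the vector */
	*out = v;
	return true;
}
```

Rejecting the value before it is used as an index is the whole point: a
negative number never reaches the vector.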

I will be grateful to some native English speaker, or someone
comfortable enough with English, to fix the man page and/or propose a
syntax that will not require more acrobatics to "understand" that what
is wanted is neither a device by index (number), nor a device by
driver name, but something else.

I like strong typing of variables... Awk(1), perl(1) and other
loosely typed languages are not my cup of tea.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


[RFC] userconf(4) modification

2023-11-02 Thread tlaronde
As stated in a message before, disabling, via userconf(4), all the
drmkms drivers can not rely on a pattern matching since, for historical
reasons (several versions of DRM), the namespace of the drivers is not
"ruled".

So I want to add a "group" member to the cfdata structure, with
modifications to config(1) to set it, so that devices can be
disabled by group name.

Additionally, because I had already implemented it, there is a pattern
matching feature too.
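
To make the "group member" idea concrete, here is a minimal sketch. The
structure is a simplified stand-in for the kernel's struct cfdata, with
only the fields needed for the illustration; the group and disabled
fields and the disable_group function are assumptions, not the actual
patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for the kernel's struct cfdata: only the fields
 * needed here. "group" is the proposed (assumed) addition, which
 * config(1) would fill in.
 */
struct cfdata_sketch {
	const char *name;	/* driver name */
	const char *group;	/* group name, or NULL if ungrouped */
	int disabled;		/* stand-in for the real state field */
};

/* Disable every entry belonging to `group`; return how many matched. */
static int
disable_group(struct cfdata_sketch *cf, size_t n, const char *group)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (cf[i].group != NULL && strcmp(cf[i].group, group) == 0) {
			cf[i].disabled = 1;
			count++;
		}
	}
	return count;
}
```

The user only has to know the group name; which drivers belong to it is
decided once, at config time.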

You will find attached the man page in order to be able to comment
about the proposed new syntax---supplementary syntax: it does not
replace the "legacy" one.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
.\" $NetBSD: userconf.4,v 1.14 2019/05/27 21:19:55 wiz Exp $
.\"
.\" Copyright (c) 2001 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Gregory McGarry.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.Dd November 1, 2023
.Dt USERCONF 4
.Os
.Sh NAME
.Nm userconf
.Nd in-kernel device configuration manager
.Sh SYNOPSIS
.Cd options USERCONF
.Sh DESCRIPTION
.Nm
is the in-kernel device configuration manager.
It is used to alter the kernel autoconfiguration framework at runtime.
.Nm
is activated from the boot loader by passing the
.Fl c
option to the kernel.
.Sh COMMAND SYNTAX
There is a subset of meta-commands, described immediately below, and
action commands, described afterwards in separate sections, in two syntaxes:
legacy and new, the new syntax extending the possibilities offered by
the legacy one.
.Pp
.Nm
has a
.Xr more 1 Ns -like
functionality; if a number of lines in a command's output exceeds the
number defined in the lines variable, then
.Nm
displays
.Dq "-- more --"
and waits for a response, which may be one of:
.Bl -tag -offset indent -width "<return>"
.It <return>
one more line.
.It <space>
one more page.
.It Ic q
abort the current command, and return to the command input mode.
.El
.Pp
The common meta-commands are the following:
.Bl -tag -width 5n
.It Ic lines Ar count
Specify the number of lines before more. A negative number suppresses
the paging.
.It Ic base Ar 8 | 10 | 16
Base for displaying large numbers.
.It Ic exit
A synonym for
.Ic quit .
.It Ic help
Display online help, including the range of device numbers and the lists of
device names and groups.
.It Ic quit
Leave userconf.
.It Ic \&?
A synonym for
.Ic help .
.El
.Sh LEGACY SYNTAX AND COMMANDS
.Nm
supports the legacy syntax:
.Bd -ragged -offset indent
.Ic command Op Ar option
.Ed
.Pp
and offers the following commands:
.Bl -tag -width 5n
.It Ic change Ar devno | dev
Change devices.
.It Ic disable Ar devno | dev
Disable devices.
.It Ic enable Ar devno | dev
Enable devices.
.It Ic find Ar devno | dev
Find devices.
.It Ic list
List current configuration.
.El
.Sh NEW SYNTAX AND COMMANDS
.Nm
supports the new syntax:
.Bd -ragged -offset indent
.Ic command Ar property Li \&= Ar expression
.Ed
.Pp
The
.Li \&=
is to be read as: apply the command to the collection of devices
whose stated property matches the expression.
.Pp
The commands are the following (same as with legacy syntax):
.Bl -tag -width 7n
.It Ic change
Change devices.
.It Ic disable
Disable devices.
.It Ic enable
Enable devices.
.It Ic find
Find devices.
.It Ic list
List devices.
.El
.Pp
A
.Ar property
is one of the literals
.Bl -tag -width 5n
.It Li devno
the index number of the device in the cfdata vector. The expression
shall be a non-negative integer, less than the number of
devices in the cfdata vector.
.It 

[PATCHES] Adding Xorg libdrm rst2man(1) translated man pages

2023-10-23 Thread tlaronde
[Note: no need to Cc me anymore. Culprit (me...) being found; and
problem solved.]

I have added the translated man pages in xsrc/local/man/man[37] and a
UPDATING file at the root of xsrc.

Can be pulled from:

https://github.com/tlaronde/xsrc

commit b24a2c96577617a6297efde04ce5628985291eb4 (HEAD -> trunk,
origin/trunk, origin/HEAD)
Author: Thierry LARONDE 
Date:   Mon Oct 23 11:45:48 2023 +0200

Adding rst2man translated libdrm man pages.


and updated src/ as well to proceed the supplementary man pages, these
man pages (in whatever format) being added to the xcomp set.

Can be pulled from:

https://github.com/tlaronde/src
commit 0645b8ce539a57e81ea8ee1e8102e66bff1d9c15 (HEAD -> trunk,
origin/trunk, origin/HEAD)
Author: Thierry LARONDE 
Date:   Mon Oct 23 11:49:07 2023 +0200

Adding the Xorg libdrm rst2man(1) generated man pages.

-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes

2023-10-19 Thread tlaronde
On Thu, Oct 19, 2023 at 10:41:53AM -0400, Mouse wrote:
> >>> [...DV_DRMKMS...userconf...]
> >> [...devices in multiple classes...maybe use a separate namespace,
> >> used by only config(1) and userconf?...]
> > This is precisely why I ask for comment ;-)
> 
> :-)
> 
> > I have two requirements:
> 
> > - that the solution is not ad hoc i.e. that it can provide, in
> > userconf, facilities not limited to drmkms (I don't want to implement
> > a special case to recognize "drmkms" and to expand to all the STARred
> > driver names implied);
> 
> I agree with this; special-casing drmkms would be...suboptimal.
> 
> > - that it will not imply to have to maintain special data for
> > userconf to recognize some "magic" strings.
> 
> You already need that, in that userconf has to have some way to
> recognize the string "drmkms" as a device category (hinted by the
> "class =" syntax, but it still needs error-checking) and map it into
> the corresponding DV_ value.  I don't see it as significantly worse for
> config(1) to generate some data structure mapping device class names
> into whatever userconf would need to affect all devices of that class.
> 
> Though it occurs to me that there are too many things called "class"
> here.  "Group"?  "Category"?  "Collection"?

I concluded too that config(1) can do the generation of the tables during
the translation so there should not be a need to "manually" keep
up-to-date data files.

I think it would make sense to use "Group" and that this should be in
fact special to userconf: ability to handle, with userconf, a group of
devices, the list of groups being defined at config time, with some
USERCONF(USERCONF_GROUP_DRIVER, string) macro.
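
As an illustration of "config(1) generates the tables", the generated
data could be as simple as the sketch below. The structure, the table and
the lookup function are all hypothetical, and the driver list is an
illustrative subset using names that appear elsewhere in this thread;
whatever config(1) would really emit may look quite different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical shape of a group table that config(1) could emit. */
struct uc_group {
	const char *name;
	const char *const *drivers;	/* NULL-terminated list */
};

static const char *const drmkms_drivers[] = {
	"i915drmkms", "viadrmums", NULL	/* illustrative subset */
};

static const struct uc_group uc_groups[] = {
	{ "drmkms", drmkms_drivers },
};

/* Return the driver list for `group`, or NULL if it is unknown. */
static const char *const *
uc_group_lookup(const char *group)
{
	size_t n = sizeof(uc_groups) / sizeof(uc_groups[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(uc_groups[i].name, group) == 0)
			return uc_groups[i].drivers;
	return NULL;
}
```

Since the table would be regenerated on every config run, nothing has to
be kept up to date by hand.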

And adding the command in userconf to "set" a variable to a list, so
that for example: "disable name in $var" or "disable group in $var"
works (but for drmkms it will be defined at config time, so this would
be: 'disable group = "drmkms"').

This will allow customization both for a developer in source, and
for an end user to set, for userconf, a group of devices he wants to
enable or disable. (In this case, when the group is composed of devices
not necessarily related in some way, "collection" would be a better
term than "group"; I'm with von Neumann when it comes to Set theory,
but let's not be pedantic.)
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes

2023-10-19 Thread tlaronde
On Thu, Oct 19, 2023 at 09:10:36AM -0400, Mouse wrote:
> > I propose to add a DV_DRMKMS class to sys/device.h:enum_devclass; to
> > augment cfdata with a devclass member [...]
> 
> > Comments?
> 
> This is not intended as criticism; I am just trying to examine all
> sides of this question.
> 
> Why use the sys/sys/device.h kind of device class for userconf?  Is
> there some reason to think it will be useful to userconf other device
> classes, or do you expect other device-class machinery to have a use
> for DV_DRMKMS, or is it a question of just reusing the existing device
> class rather than creating a new kind of device class, or what?

I'm just trying to stay in the vicinity of cfdata, for the headers and
for the benefit (consumption) of config(1) uphill and userconf
downhill.

For the moment, the drivers are given the DV_DULL class, while for
modules several classes are given. But userconf doesn't deal with
modules...

The other reason is that with drmkms multiple module classes are
provided. It seems to me that, even if it would be useful to disable
specific children (if only for debugging purposes), at the moment there
should be a "main" class to disable everything uphill.

So the DV_DRMKMS is not exactly the "drmkms" class of modules...

> 
> I'm also thinking it could be useful for a device to fall into multiple
> classes for userconf, but I _think_ DV_* classes don't support a device
> being in multiple classes.

Yes: the DV_* are exclusive: a device can not appear in several classes.
This is emphasized in the man page and in the source.
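
The difference matters for the question asked above: an exclusive
enumeration gives each device exactly one class, whereas a bit mask lets
one device belong to several groups. A tiny sketch, with made-up group
bits that have nothing to do with the real DV_* values:

```c
#include <stdint.h>

/* Hypothetical group bits, unrelated to the real DV_* enum values. */
#define UCG_DRMKMS	(UINT32_C(1) << 0)
#define UCG_DISPLAY	(UINT32_C(1) << 1)
#define UCG_USB		(UINT32_C(1) << 2)

/* Non-zero if the device's group mask contains `group`. */
static int
in_group(uint32_t dev_groups, uint32_t group)
{
	return (dev_groups & group) != 0;
}
```

A device tagged UCG_DRMKMS | UCG_DISPLAY would then be caught by a command
aimed at either group, which an enum member cannot express.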

>  It also could be useful for custom kernels
> to have custom modifications to device classification.  So I'm
> wondering if it would be better for this to be a namespace specific to
> config(1) and userconf rather than having anything to do with DV_*
> values.

This is precisely why I ask for comment ;-) I have two requirements: 

- that the solution is not ad hoc i.e. that it can provide, in userconf,
facilities not limited to drmkms (I don't want to implement a special
case to recognize "drmkms" and to expand to all the STARred driver names
implied);
- that it will not imply to have to maintain special data
for userconf to recognize some "magic" strings.

But as to the second item: generating data according to the configuration
is the task of config(1). So config(1) should do the job.

Indeed a good question: devclass, or module classes, or something else? The
usr.bin/config/TODO already lists the problem of the two kinds of
classes.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


DRM-KMS: add a devclass DV_DRMKMS and allow userconf to deal with classes

2023-10-19 Thread tlaronde
[Please do CC me on reply since I _am_ subscribed to the list but don't
get the messages...]

Note: code can be seen on https://github.com/tlaronde/src .

I have implemented "patterns" in sys/kern/subr_userconf.c, in order to
manipulate (change, disable, enable, find, list) devices
matching a possibly anchored substring.

But this doesn't solve the problem for drmkms (to be able to disable all
with a single well-known instruction) since the names don't match a
regular pattern.

I propose to add a DV_DRMKMS class to sys/device.h:enum_devclass, to
augment cfdata with a devclass member, and to modify config(1) accordingly,
so that a new syntax (supplementary for now; not replacing the existing
one) can be introduced in sys/kern/subr_userconf.c:

exp: number | string | magic | pattern

string: '"' alpha alphanum* '"' /* case insensitive */
magic: alpha alphanum*  /* case insensitive */
pattern: '/' ['^'] alphanum+ ['$'] '/'  /* case insensitive */

{change, disable, enable, find, list} name = exp
{change, disable, enable, find, list} class = magic

so that:

disable class = drmkms

does the trick.
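
The expression grammar above can be checked with a small classifier like
the following sketch. It is not the implementation: the names are
hypothetical, a magic word is read as a letter followed by any number of
alphanumerics, and the body of a pattern is read as one or more
alphanumerics, both of which are assumptions about the intended grammar.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum exp_kind { EXP_BAD, EXP_NUMBER, EXP_STRING, EXP_MAGIC, EXP_PATTERN };

/* True if s[0..n-1] is non-empty and made only of alphanumerics. */
static int
is_alnum_run(const char *s, size_t n)
{
	if (n == 0)
		return 0;
	for (size_t i = 0; i < n; i++)
		if (!isalnum((unsigned char)s[i]))
			return 0;
	return 1;
}

/* True if s[0..n-1] is non-empty and made only of decimal digits. */
static int
is_digit_run(const char *s, size_t n)
{
	if (n == 0)
		return 0;
	for (size_t i = 0; i < n; i++)
		if (!isdigit((unsigned char)s[i]))
			return 0;
	return 1;
}

/* Hypothetical classifier for the right-hand side of "property = exp". */
static enum exp_kind
classify_exp(const char *e)
{
	size_t n = strlen(e);

	if (n >= 3 && e[0] == '"' && e[n - 1] == '"')
		return (isalpha((unsigned char)e[1]) &&
		    is_alnum_run(e + 1, n - 2)) ? EXP_STRING : EXP_BAD;
	if (n >= 3 && e[0] == '/' && e[n - 1] == '/') {
		const char *b = e + 1;
		size_t len = n - 2;

		if (len > 0 && b[0] == '^') {	/* optional start anchor */
			b++;
			len--;
		}
		if (len > 0 && b[len - 1] == '$')	/* optional end anchor */
			len--;
		return is_alnum_run(b, len) ? EXP_PATTERN : EXP_BAD;
	}
	if (is_digit_run(e, n))
		return EXP_NUMBER;
	if (n > 0 && isalpha((unsigned char)e[0]) && is_alnum_run(e, n))
		return EXP_MAGIC;
	return EXP_BAD;
}
```

Here "drmkms" classifies as a magic word and "/^usb/" as a pattern, while
an anchor with nothing after it, such as "/^/", is rejected.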

There is already in usr.bin/config/TODO a paragraph about classes, so
it seems this proposal leans towards what was expected.

Comments?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: drm.4 man page and import of X11 drm-kms.7 and al.

2023-10-18 Thread tlaronde
On Wed, Oct 18, 2023 at 10:35:26AM +, Taylor R Campbell wrote:
> > Date: Tue, 17 Oct 2023 14:39:57 +0200
> > From: tlaro...@kergis.com
> > 
> > I have modified drm.4 to state that the drivers are obsolete and 
> > to suppress a mention of viadrm that was removed long ago (now superseded by
> > viadrmums, provided in drm2/ ---drmkms--- part).
> > 
> > Patch can be retrieved from https://github.com/tlaronde/src
> 
> Thanks, I took the opportunity to update the whole man page.  Didn't
> realize until now that our drm(4) man page was a local creation
> requiring local maintenance.
> 

It sure is an improvement! Thanks for doing so!
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: drm.4 man page and import of X11 drm-kms.7 and al.

2023-10-18 Thread tlaronde
On Wed, Oct 18, 2023 at 10:35:26AM +, Taylor R Campbell wrote:
> 
> > There is no man page for drmkms (the kernel part), but there are man
> > pages in the X sources, in the rst format
> > (external/mit/libdrm/dist/man/drm-kms.7.rst) with a bunch of related
> > resources that provide a view of the DRI thing (from the X POV).
> > 
> > There is rst2man-3.10 (pkgsrc py310-docutils) to convert these to man
> > pages.
> > 
> > Should this be done (it is the X11/DRI interface, not the kernel one, so
> > should reside in the X11R7 realm)?
> 
> It might be reasonable to ship libdrm man pages in /usr/X11R7/man but
> we would need to import the pregenerated rst2man output into
> xsrc/external.  Not hard in principle but somewhat annoying to deal
> with.  That said, a cursory skim suggests there's a lot missing here.
> I see a lot of API functions cross-referenced, but I don't see their
> documentation here?  So I'm not sure how useful this would be.

What exists is probably better than nothing and, at the very least,
drm-kms.7 gives (part of) the view. Unfortunately, a comment in a
header of the old version gave a view of what was wanted on the
kernel side, which is not mentioned in drm-kms.7, and I didn't find the
equivalent in the new sources (and the documentation provided on the Web by
the Linux team is not up-to-date either---there are for example
mentions of drmP.h, which doesn't exist anymore).

So I'm for providing what exists, once more for programmers writing X11
clients (the kernel part is another problem).
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: drm.4 man page and import of X11 drm-kms.7 and al.

2023-10-17 Thread tlaronde
On Tue, Oct 17, 2023 at 02:33:43PM +0100, Robert Swindells wrote:
> 
> tlaro...@kergis.com wrote:
> > So to clarify: I'm proposing to convert the rst doc pages to man
> > pages (with for example the utility I cite), and to add the man pages,
> > in man format, to the sources (in order for the sources to not depend
> > on a supplementary external tool) and to install the man pages in 
> > /usr/X11R7/man/.
> 
> I wouldn't bother installing man pages for this, someone working on the
> kernel already has the source tree.
> 
> Maybe the drm.4 manpage could be extended to describe the current
> status.

But someone writing an X11 client should have the information: NetBSD is
also a system for development. The man pages should be at least in
the xcomp set.

There is still a big part in user space. And it will also help
those who want to have a clue about what it is---not to mention that
this will make clear that this is heavily linked to X11, which
is part of the problem: how could a GPU be used not for "rendering"
but as a General Purpose Graphics Processor, if it's not the kernel
that is arbitrating but the X11 server? This means that an
arbitrary application could not use it without being converted to
the X11 interface.

(In my view, the kernel should detect all the resources (it's its role:
a kernel is a resource manager allowing a policy to resources access)
including auxiliary processors like GPUs, and the rendering is only a
specialized usage of these resources.)
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: drm.4 man page and import of X11 drm-kms.7 and al.

2023-10-17 Thread tlaronde
[I meant to reply to Mouse, but I have a hell of a time with majordomo or
this mailing list: I'm subscribed but I don't get the messages! Can
someone look at this, please? TIA.]

So to clarify: I'm proposing to convert the rst doc pages to man
pages (with for example the utility I cite), and to add the man pages,
in man format, to the sources (in order for the sources to not depend
on a supplementary external tool) and to install the man pages in 
/usr/X11R7/man/.

The X11 part changed its interface in 2012; the interface has seemed
stable since, but the implementation keeps changing, and the Linux
kernel implementation is still changing frequently and heavily (the
drm2/ sources are already significantly behind the Linux sources, with
non-trivial changes; it's, for me, a lost race...).

On Tue, Oct 17, 2023 at 02:39:58PM +0200, tlaronde wrote:
> I have modified drm.4 to state that the drivers are obsolete and 
> to suppress a mention of viadrm that was removed long ago (now superseded by
> viadrmums, provided in drm2/ ---drmkms--- part).
> 
> Patch can be retrieved from https://github.com/tlaronde/src
> 
> There is no man page for drmkms (the kernel part), but there are man
> pages in the X sources, in the rst format
> (external/mit/libdrm/dist/man/drm-kms.7.rst) with a bunch of related
> resources that provide a view of the DRI thing (from the X POV).
> 
> There is rst2man-3.10 (pkgsrc py310-docutils) to convert these to man
> pages.
> 
> Should this be done (it is the X11/DRI interface, not the kernel one, so
> should reside in the X11R7 realm)?
> -- 
> Thierry Laronde 
>  http://www.kergis.com/
> http://kertex.kergis.com/
> Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


drm.4 man page and import of X11 drm-kms.7 and al.

2023-10-17 Thread tlaronde
I have modified drm.4 to state that the drivers are obsolete and 
to suppress a mention of viadrm that was removed long ago (now superseded by
viadrmums, provided in drm2/ ---drmkms--- part).

Patch can be retrieved from https://github.com/tlaronde/src

There is no man page for drmkms (the kernel part), but there are man
pages in the X sources, in the rst format
(external/mit/libdrm/dist/man/drm-kms.7.rst) with a bunch of related
resources that provide a view of the DRI thing (from the X POV).

There is rst2man-3.10 (pkgsrc py310-docutils) to convert these to man
pages.

Should this be done (it is the X11/DRI interface, not the kernel one, so
should reside in the X11R7 realm)?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: DRM/KMS: report

2023-10-15 Thread tlaronde
On Sun, Oct 15, 2023 at 10:13:18AM +, Taylor R Campbell wrote:
> 
> > DRM or now DRM2 (aka DRM/KMS) are inherently and heavily linked to
> > X11 and to Linux.  Due to the size of the thing, NetBSD is deriving
> > a version from the one FreeBSD tries to derive.
> 
> Not sure what you mean about FreeBSD, but our drm2 code base was
> developed largely independent of whatever is in FreeBSD, and as far as
> I know was started well before FreeBSD adopted the same approach of
> writing Linux API shims.

Then this is a difference from the first version, according to drm(4),
which refers exclusively to DRM (first version) and is partial
(not mentioning DRM2 / DRMKMS) or obsolete.

For the record, I first tried to review _all_ the commits
starting from 2007-03-20 (the first import of the first---for NetBSD---
version)... But I realised after some time, considering the pace at
which I was going through them, that it was hopeless... So I have a
view, but one far from complete or even accurate.

> 
> > To make things even worse, the abuse of acronyms is blurring things that
> > didn't need to be made even less clear. Not to mention the fact that DRM
> > is also used for Digital Rights Management---that has strictly nothing
> > to do with the thing---, DRI (a part of the X11 stuff) is also used
> > instead of DRM for the X11 part, and DRM2 is also referred to as
> > DRM/KMS.
> 
> sys/external/bsd/drm is the previous generation of the drm code base,
> from before it did any kernel mode-setting (KMS).  Display
> configuration was done by peeking and poking device registers from
> userland through /dev/mem and /dev/pci -- the legacy user mode-setting
> approach (UMS).  The /dev/dri/ nodes were used by userland only to map
> some registers and manage graphics buffers bound into the GPU address
> space.
> 
> sys/external/bsd/drm2 is the current generation of the drm code base,
> including both UMS and KMS.  With KMS, display configuration is done
> by a set of structured ioctls on /dev/dri/ nodes, with all device
> register access done by the kernel.  (The /dev/dri/ nodes are also
> used to manage graphics buffers.)
> 
> When I more or less started over from scratch, I called it drm2 just
> so it would have a distinct place in the source tree while people
> still relied on the previous generation of the code.
> 
> By now I think we should just delete sys/external/bsd/drm; it has been
> unmaintained for so long it is unlikely to work.  If there's interest
> in the legacy UMS drivers, they should all still be in the drm2 tree
> and can be adapted like I did with viadrmums.  But I have no hardware
> for most of them.

I will put all the documentation bits together some place for reference.

Thanks for the clarifications!

> 
> > The drivers using the new API have sometimes "kms" in the name (for
> > i915, I guess to make a difference with the previous "legacy"
> > i915drm), but generally not, or if this is the case, this is not the
> > device attaching early:
> > 
> > # DRMKMS drivers 
> > i915drmkms* at pci? dev ? function ?
> > intelfb*    at intelfbbus?
> 
> `i915drmkms' happened because `i915' is not allowed (ends with a
> digit) and `i915drm' was already taken.
> 
> > To illustrate the namespace problem, take "radeon":
> > 
> > radeondrm* is the legacy DRM driver and:
> > 
> > radeon* is the DRM2 and this is its child, the fb, that has the "kms"
> > substring:
> > 
> > radeondrmkmsfb* at radeonfbbus?
> 
> `radeondrmkms' happened because `radeonfb' was already taken.
> 
> I'm not attached to these names, but they've been around for long
> enough they are probably named in existing boot.cfg files, so changing
> them is likely to break people's bootloaders.
> 
> Not hard to imagine creating a new way to tag drivers that can be
> referenced by userconf so that renaming isn't necessary.
> 

If the driver names followed a rule, this would already be possible: I
have implemented in sys/kern/subr_userconf.c (on my git fork at
https://github.com/tlaronde/src) the use of "patterns" to change,
disable, enable, find and list matching driver names.

I could add specifiers to the "patterns" to match the parent device or
a child device.

I could also extend cfdata so that a devclass is taken into account
and can be matched against.

Modules set a class, and it would be simplest to be able to use such
a tag in userconf to disable the devices without having to resort to
ad hoc lists---and, even worse, to expand a magic name in MD bootinfo
stuff, with the obligation to update lists and the risk of having to
increase the size of bootinfo data.

I wanted and still want to implement something gene

DRM/KMS: report

2023-10-14 Thread tlaronde
[I'm sending this to the tech-kern since the previous message on
tech-userlevel is only: the list seems dead?]

[CAVEATS: Please remember that I'm not a native English speaker, and
that what follows is not a "lecture" or a judgement about what is done,
but a home-made translation into some English of some of the notes---there
is more documentation to come later.
If I wanted to look at the DRM/KMS stuff, it was because I felt (and
still feel...) that I would never have embarked on such an appalling
task as trying to tame a thing like that ;-) I'm not "blaming" or
"naming and shaming"---or whatever the term is---or despising work
or people.]

Three months ago, I undertook to take a look at the DRM/KMS object, with
the goal of ensuring that the NetBSD kernel could be severed at will from
it.

Here is the report.

I will start with code for the impatient, continue with
documentation / comments, and end with future directions (for me).

Note: I finally have an optical fiber Internet connection again
(after infelicities with a previous provider), so I have been able to
pull and push on a fork that is here:

https://github.com/tlaronde/src


 WHAT IS IN THESE SOURCES

commit 6d715506703ed9f0bec6a39fec8794b5b8eb
Author: Thierry LARONDE 
Date:   Fri Oct 13 18:39:03 2023 +0200

To allow changing, disabling, enabling, finding or listing devices
according to a pattern (specified between slashes; can be anchored at
the beginning with '^'; at the end with '$'; but no wildcard dot, count
or range...), the userconf parsing is modified.

It works... but not for what I wanted. Giving /drm/ for example as a
pattern will actually disable all matching devices, but since
"radeondrmkmsfb" matches, you end up with no display at all because the
drm is nonetheless attempted.

"/kms$/" and "/drm$/" could work. But this is more a debugging feature
(except for find or list) than something to use bluntly for the moment.

Should we have /pattern/@/parent_pattern/? Or enforce a namespace
policy?

At least, one should use "list /pattern/" or "find /pattern/" before
modifying blindly.
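
To make the matching rule concrete, here is a minimal sketch in Python
of the pattern semantics described above. It is an illustration only,
not the code in sys/kern/subr_userconf.c: the function names and the
list of driver names (taken from the examples in this thread, without
the trailing '*') are assumptions made for the example.

```python
def userconf_match(pattern: str, name: str) -> bool:
    """Return True if `name` matches a /pattern/.

    Only '^' (anchor at the beginning) and '$' (anchor at the end) are
    special; the rest is compared literally (no wildcard, count or range).
    """
    if len(pattern) < 2 or pattern[0] != "/" or pattern[-1] != "/":
        raise ValueError("a pattern is written between slashes")
    body = pattern[1:-1]
    at_start = body.startswith("^")
    if at_start:
        body = body[1:]
    at_end = body.endswith("$")
    if at_end:
        body = body[:-1]
    if at_start and at_end:
        return name == body
    if at_start:
        return name.startswith(body)
    if at_end:
        return name.endswith(body)
    return body in name


# Hypothetical device table: driver names quoted in this thread.
DRIVERS = ["i915drm", "i915drmkms", "intelfb", "radeondrm",
           "radeon", "radeondrmkmsfb", "viadrmums"]


def list_matching(pattern: str) -> list[str]:
    """Rough equivalent of "list /pattern/" on the table above."""
    return [d for d in DRIVERS if userconf_match(pattern, d)]
```

With this table, list_matching("/drm/") also returns "radeondrmkmsfb"
(the problem described above), while list_matching("/drm$/") returns
only "i915drm" and "radeondrm".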

commit e62e0b293986bfb3a749ab499d8367b5c6a161a2
Author: Thierry LARONDE 
Date:   Thu Oct 12 18:07:13 2023 +0200

Just add the clarification that the pmap_pv_untrack() users are DRM2
aka DRMKMS drivers (not "legacy" DRM ones).

commit 930cf9cd86c51551b7731777df2882a64ba655b7
Author: Thierry LARONDE 
Date:   Thu Oct 12 09:00:56 2023 +0200

For consistency, what is related to monitors is not taken from
XFree86 but taken from the latest VESA DMT (v 1.0, Rev. 13).  So
modelines are removed, and dmt added, and the code fixed to work
with this with no user visible change for the moment. And some
modes not defined in the VESA DMT are put in an extradmt file, with
fixes for Mac monitors (taken from parameters in the Linux framebuffer
code).

For consistency too, published strings like "800x600x60" are replaced
by "800x600@60Hz" to avoid multiplying apples by oranges and
ambiguity about exactly what the last number describes.

The double scan entries were not used and are not generated.

DRM, DRM2 aka DRM/KMS: SOME NOTES

DRM or now DRM2 (aka DRM/KMS) are inherently and heavily linked to
X11 and to Linux.  Due to the size of the thing, NetBSD is deriving
a version from the one FreeBSD tries to derive. To make things worse,
the API is changing significantly. So we can only adapt late; and, de
facto, we always drag behind.

The important thing to keep in mind is that this is heavily linked to
X11. It's not something independent.

To make things even worse, the abuse of acronyms is blurring things that
didn't need to be made even less clear. Not to mention the fact that DRM
is also used for Digital Rights Management---that has strictly nothing
to do with the thing---, DRI (a part of the X11 stuff) is also used
instead of DRM for the X11 part, and DRM2 is also referred to as
DRM/KMS.

The "legacy" ("first" version, at least in NetBSD) DRM drivers are the
following (the x86 ones):

#i915drm*   at drm? # Intel i915, i945 DRM driver
#mach64drm* at drm? # mach64 (3D Rage Pro, Rage) DRM driver
#mgadrm*    at drm? # Matrox G[24]00, G[45]50 DRM driver
#r128drm*   at drm? # ATI Rage 128 DRM driver
#radeondrm* at drm? # ATI Radeon DRM driver
#savagedrm* at drm? # S3 Savage DRM driver
#sisdrm*    at drm? # SiS DRM driver
#tdfxdrm*   at drm? # 3dfx (voodoo) DRM driver

The drivers using the new API have sometimes "kms" in the name (for
i915, I guess to make a difference with the previous "legacy"
i915drm), but generally not, or if this is the case, this is not the
devi

ISA: a book

2023-07-31 Thread tlaronde
FWIW---and this is probably already known by most---I found:

"The RISC-V reader: an open architecture atlas", by David Patterson
and Andrew Waterman, Strawberry Canyon LLC, ISBN 9780999249116

to be a great help to "put things together"---I mean it is a short book
(about a hundred pages excluding appendices) giving a kind of "root tree"
of hardware/software considerations, with comparisons with other
architectures; a root tree on which one can "mount" the various pieces of
information picked up here and there (elsewhere), so that the
whole picture can take shape. (It is not a textbook or a high-level
overview: you can start programming for RISC-V with this.)

If some people read this list to try to get into the kernel, this is
perhaps a possible reference to add to the books you could or should read.

FWIW,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: How to submit patches?

2023-05-07 Thread tlaronde
On Sun, May 07, 2023 at 06:14:35PM +0200, Martin Husemann wrote:
> On Sun, May 07, 2023 at 04:56:33PM +0200, tlaro...@polynum.com wrote:
> > I'm a bit reluctant to put all the platform lists in copy, since this
> > is typically generic: it deals with the monitor capacities, updating
> > the VESA DMT specs...
> 
> I pointed a few people at your mail, but maybe you could describe the
> motivation of the changes a bit more verbosely - at first it all looked
> like a lot of churn for no particular reason (but that is probably because
> I don't know anything about that part of the source).

With NetBSD 10 beta, there was a change in the resolution picked up for
the framebuffer compared to 9.3: I have a 16:9 ratio LCD; 9.3 picked up
this ratio and a correct resolution and font for the framebuffer, while
10 beta does not (on the very same hardware).

Trying to investigate the reason for this difference, I found nothing
obvious.

So I started to track back, starting from the end: the monitor.

Since I didn't know anything about this stuff, I started to download the
published specs by VESA and read the specs.

So the first step was to update to the latest VESA DMT and to fix
some things that were wrong: there are discrepancies in the Mac
monitors' "modelines"; some historical modes were missing; and Xorg
(the source used in NetBSD at least) is not up to date either, yet
was used as the reference for the VESA DMT modes, while the VESA DMT
itself is, de facto and de jure, the reference.

Since I can work on this only in very scarce hours, instead of waiting
(how long?) to finish everything, I prefer to commit a step that is an
independent unit by itself and is finished, without breaking anything
(it adds modes; it corrects: printing "800x600x60" for a mode is
multiplying apples by bananas, since 60 is a frequency, so it now
appears as "800x600@60Hz" for example; it removes unused things that
just complicate matters for someone who tries to update the code; etc.)
so that at least _this_ will not have to be done by someone else.

Of course, some of the fields added from the VESA DMT will be needed
in the future when updating the EDID code. So it is not gratuitous.

The next step will be to review the EDID code, and I will continue
backwards (praying not to have to deal with drm...) until I find why it
doesn't work correctly and understand the framebuffer stuff.

Is this clearer?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: How to submit patches?

2023-05-07 Thread tlaronde
On Sun, May 07, 2023 at 09:40:56AM -0400, Thor Lancelot Simon wrote:
> On Sat, May 06, 2023 at 12:12:54PM +0200, tlaro...@polynum.com wrote:
> > 
> > How to submit patches without wasting time? (mine included)
> 
> It might be that you get quicker response on one of the mailing lists
> for platforms where the patches are particularly useful.  It might not,
> too - but the set of people with the knowledge to review work in this area
> is not so large, and copying the per-port lists might help get their
> attention.

I'm a bit reluctant to copy all the platform lists, since this
is typically generic: it deals with the monitor capabilities, updating
the VESA DMT specs...
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


PING sys/dev/videomode: latest DMT and complete Established Timings

2023-05-06 Thread tlaronde

Since there are some infelicities in the handling of the resolution of
the framebuffer (10.0_BETA doesn't behave as 9.3), I have started to
review the code, starting from the end: the monitor.

The monitor being the reference, I have replaced the modelines, derived
from XFree86, with the reference: the latest VESA DMT (v 1.0, Rev. 13),
which is ahead of:
/usr/xsrc/external/mit/xorg-server/dist/hw/xfree86/common/vesamodes.

This file is: "dmt".

I have also put modes not found in VESA DMT, but referenced in the
Established Timings, so in VESA EDID, in a file "extradmt".

XFree86 modelines can be easily computed from the DMT. The reverse is
not true. Furthermore there are various VESA identifiers (one, two or
three bytes) that will be used in the future.
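
To illustrate the first point, here is a minimal sketch in Python (not
the dmt2c.awk script) deriving an XFree86-style modeline from one record
of the "dmt" file. It assumes, as the 640x480@60Hz and 800x600@60Hz
records suggest, that the horizontal timings are expressed in character
cells and the vertical ones in lines; the function name is hypothetical.

```python
def dmt_to_modeline(record: str) -> tuple:
    """Turn one blank-separated DMT record into modeline numbers.

    Assumption: horizontal borders, porches and sync time are counted
    in character cells (multiply by Character_Width to get pixels);
    vertical values are counted in lines.
    """
    f = record.split()
    if len(f) != 19:
        raise ValueError("a DMT record has 19 fields")
    hpix, vpix = int(f[2]), int(f[3])
    clock_mhz = float(f[4])
    cw = int(f[5])
    # Order in the file: right border, front porch, sync, back porch,
    # left border (then the same for bottom/top in the vertical group).
    h_rb, h_fp, h_sync, h_bp, h_lb = (int(x) * cw for x in f[9:14])
    v_bb, v_fp, v_sync, v_bp, v_tb = (int(x) for x in f[14:19])
    # A modeline has no slot for borders: they are folded into the
    # position of the sync pulse and into the total.
    hsync_start = hpix + h_rb + h_fp
    hsync_end = hsync_start + h_sync
    htotal = hsync_end + h_bp + h_lb
    vsync_start = vpix + v_bb + v_fp
    vsync_end = vsync_start + v_sync
    vtotal = vsync_end + v_bp + v_tb
    return (clock_mhz,
            hpix, hsync_start, hsync_end, htotal,
            vpix, vsync_start, vsync_end, vtotal,
            f[7] + "hsync", f[8] + "vsync")
```

Going the other way is lossy: from the modeline numbers alone, neither
the split between border and porch nor the identifiers can be recovered.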

It is interesting to note, too, that there are discrepancies between
what is found in the XFree86 modelines and what can be found in the
modelines in the Linux framebuffer code---for one Established Timing
mode, I had to resort to the Linux parameters since what is found in
the XFree86 modelines (at least in the 10.0 xsrc) is not accurate.

"dmt" replaces "modelines"
"extradmt" is new.
"dmt2c.awk" replaces "modelines2c.awk"

"videomode.c" has to be regenerated using Makefile.videomode.

The remaining diff is adjustments for the new parameters.

For ergonomics and consistency, I have replaced strings like "800x600x60"
with "800x600@60Hz".

There are now 93 modes instead of 46 (the double scan entries and the
related code weren't used; and this is not used in the present code
either).

For safety, not knowing if this has hardware implications, the new
"reduced blanking" entries are skipped.

This is only a first step and does not solve the problem I see.

The next step will be reviewing and perhaps updating the edid code.
And I will follow the track until I find why the preferences are
not handled correctly from what is passed by the monitor.

Note: this one infelicity, for me, is not severe enough to hinder, per
se, the release of 10.0.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
# $NetBSD$
# These values were typed by Thierry Laronde ,
# 2023-02-27, from:
# 
# --
# VESA and Industry Standards and Guidelines
# for Computer Display Monitor Timing (DMT)
# Version 1.0, Rev. 13
# February 8, 2013
# Copyright 1994--2013 Video Electronics Standards Association. All
# other rights reserved.
# --
# 
# In brief the document above states: USE AT YOUR OWN RISK.
# 
# This master file has only values as given in the specification
# identified above.
# 
# The values should have been taken as is. From these values, others can
# be derived and there is even some redundancy (see the processing
# script for the computations). The records are in the same order as in
# the document: first line corresponds to page 18; last line to page
# 105. There hence should be 88 different records here.
# 
# In this file, empty or blank lines or lines beginning with a '#' are
# ignored.
#
# Remaining are a sequence of line terminated records, with the
# following blank separated fields:
# 
# Timing_Name /* Hor_Pixels 'x' Ver_Pixels '@' Refresh_Rate 'Hz' suffix */
# Ids /* DMT_Id ',' STD_Id ',' CVT_Id (1 hexabyte, [2h] , [3h]) */
# Hor_Pixels
# Ver_Pixels
# Pixel_Clock /* MHz */
# Character_Width
# Flags /* Scan_Type ('I' | 'N') ',' Reduced_Blanking ('RB' | 'N') */
# Hor_Sync_Polarity /* '+' | '-' */
# Ver_Sync_Polarity /* '+' | '-' */
# H_Right_Border
# H_Front_Porch
# Hor_Sync_Time
# H_Back_Porch
# H_Left_Border
# V_Bottom_Border
# V_Front_Porch
# Ver_Sync_Time
# V_Back_Porch
# V_Top_Border
# 
640x350@85Hz 01,, 640 350 31.500 8 N,N + - 0 4 8 12 0 0 32 3 60 0
640x400@85Hz 02,3119, 640 400 31.500 8 N,N - + 0 4 8 12 0 0 1 3 41 0
720x400@85Hz 03,, 720 400 35.500 9 N,N - + 0 4 8 12 0 0 1 3 42 0
640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8
640x480@72Hz 05,314C, 640 480 31.500 8 N,N - - 1 2 5 15 1 8 1 3 20 8
640x480@75Hz 06,314F, 640 480 31.500 8 N,N - - 0 2 8 15 0 0 1 3 16 0
640x480@85Hz 07,3159, 640 480 36.000 8 N,N - - 0 7 7 10 0 0 1 3 25 0
800x600@56Hz 08,, 800 600 36.000 8 N,N + + 0 3 9 16 0 0 1 2 22 0
800x600@60Hz 09,4540, 800 600 40.000 8 N,N + + 0 5 16 11 0 0 1 4 23 0
800x600@72Hz 0A,454C, 800 600 50.000 8 N,N + + 0 7 15 8 0 0 37 6 23 0
800x600@75Hz 0B,454F, 800 600 49.500 8 N,N + + 0 2 10 20 0 0 1 3 21 0
800x600@85Hz 0C,4559, 800 600 56.250 8 N,N + + 0 4 8 19 0 0 1 3 27 0
800x600@120Hz_rb 0D,, 800 600 73.25 8 N,RB + - 0 6 4 10 0 0 3 4 29 0
848x480@60Hz 0E,, 848 480 33.750 8 N,N + + 0 2 14 14 0 0 6 8 23 0
1024x768@43Hz_i 0F,, 1024 768 44.900 8 I,N + + 0 1 22 7 0 0 0 4 20 0
1024x768@60Hz 10,6140, 1024 768 65.000 8 N,N - - 0 3 17 20 0 0 3 6 29 0
1024x768@70Hz 11,614A, 1024 768 75.000 8 N,N - - 0 3 17 18 0 0 3 6 29 0
1024x768@75Hz 12,614F, 1024 768 78.750 8 N,N + + 0 2 12 22 0 0 1 3 28 0
1024x768@85Hz 13,6159, 1024 768 
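
The record format described in the header of this file can be read with
a few lines of code. What follows is a minimal sketch in Python, not the
processing script itself: the function names are hypothetical, and it
assumes that horizontal timings are in character cells and vertical ones
in lines. The refresh rate computed from the pixel clock and the totals
gives a consistency check against the refresh rate in the timing name
(for non-interlaced records only).

```python
FIELDS = (
    "Timing_Name", "Ids", "Hor_Pixels", "Ver_Pixels", "Pixel_Clock",
    "Character_Width", "Flags", "Hor_Sync_Polarity", "Ver_Sync_Polarity",
    "H_Right_Border", "H_Front_Porch", "Hor_Sync_Time", "H_Back_Porch",
    "H_Left_Border", "V_Bottom_Border", "V_Front_Porch", "Ver_Sync_Time",
    "V_Back_Porch", "V_Top_Border",
)
H_BLANK = FIELDS[9:14]   # horizontal borders, porches and sync
V_BLANK = FIELDS[14:19]  # vertical borders, porches and sync


def parse_dmt(text: str) -> list[dict]:
    """Return one dict per complete record; skip comments and blanks."""
    records = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        values = line.split()
        if len(values) != len(FIELDS):
            # Incomplete record: skipped here; a real tool should
            # rather report an error.
            continue
        records.append(dict(zip(FIELDS, values)))
    return records


def refresh_hz(rec: dict) -> float:
    """Pixel clock divided by total pixels per frame (non-interlaced)."""
    cw = int(rec["Character_Width"])
    htotal = int(rec["Hor_Pixels"]) + cw * sum(int(rec[k]) for k in H_BLANK)
    vtotal = int(rec["Ver_Pixels"]) + sum(int(rec[k]) for k in V_BLANK)
    return float(rec["Pixel_Clock"]) * 1e6 / (htotal * vtotal)


SAMPLE = """\
# a comment line, then a blank line

640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8
800x600@60Hz 09,4540, 800 600 40.000 8 N,N + + 0 5 16 11 0 0 1 4 23 0
1024x768@85Hz 13,6159, 1024 768
"""

for rec in parse_dmt(SAMPLE):
    print(rec["Timing_Name"], round(refresh_hz(rec), 2))
```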

Re: How to submit patches?

2023-05-06 Thread tlaronde
On Sat, May 06, 2023 at 02:13:58PM +0200, Martin Husemann wrote:
> On Sat, May 06, 2023 at 12:12:54PM +0200, tlaro...@polynum.com wrote:
> > Hello,
> > 
> > On Mon, 27 Feb 2023 12:33:32 +0100, I sent to this list a collection of
> > patches for sys/dev/videomode/, starting by updating the DMT to the
> > latest, and planning to review further the code (sending patches
> > when I have achieved a complete step in the course, because I'm having
> > a hard time finding some spare hours to work on this).
> > 
> > There has been no comment; no reaction.
> 
> Sorry, this happens sometimes - e.g. when topics are slightly special
> and no one who is familiar with that code has time to review immediately.
> 
> Just ping after some reasonable time of no reaction (I'd say min one max
> two weeks or so) by resending the patches.

OK, I will resend the patches.

Best,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


How to submit patches?

2023-05-06 Thread tlaronde
Hello,

On Mon, 27 Feb 2023 12:33:32 +0100, I sent to this list a collection of
patches for sys/dev/videomode/, starting by updating the DMT to the
latest, and planning to review further the code (sending patches
when I have achieved a complete step in the course, because I'm having
a hard time finding some spare hours to work on this).

There has been no comment; no reaction.

How to submit patches without wasting time? (mine included)

TIA
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


[PATCH] sys/dev/videomode: latest DMT and complete Established Timings

2023-02-27 Thread tlaronde
Since there are some infelicities in the handling of the resolution of
the framebuffer (10.0_BETA doesn't behave as 9.3), I have started to
review the code, starting from the end: the monitor.

The monitor being the reference, I have replaced the modelines, derived
from XFree86, with the reference: the latest VESA DMT (v 1.0, Rev. 13),
which is ahead of:
/usr/xsrc/external/mit/xorg-server/dist/hw/xfree86/common/vesamodes.

This file is: "dmt".

I have also put modes not found in VESA DMT, but referenced in the
Established Timings, so in VESA EDID, in a file "extradmt".

XFree86 modelines can be easily computed from the DMT. The reverse is
not true. Furthermore there are various VESA identifiers (one, two or
three bytes) that will be used in the future.
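
As an example of what these identifiers carry, here is a minimal sketch
in Python decoding the two-byte identifier (the STD_Id of the "dmt"
file). It assumes that this identifier follows the EDID Standard Timing
encoding (EDID 1.3 and later): first byte = horizontal pixels / 8 - 31;
second byte = aspect ratio in the two high bits and refresh rate - 60
in the six low bits. The function name is hypothetical.

```python
# Aspect ratio codes, as (horizontal, vertical), for EDID 1.3 and later.
ASPECT = {0: (16, 10), 1: (4, 3), 2: (5, 4), 3: (16, 9)}


def decode_std_id(std_id: str) -> tuple[int, int, int]:
    """Decode a 2-byte hex STD id into (hor_pixels, ver_pixels, hz)."""
    raw = bytes.fromhex(std_id)
    if len(raw) != 2:
        raise ValueError("a STD id is exactly two bytes")
    hor_pixels = (raw[0] + 31) * 8
    num, den = ASPECT[raw[1] >> 6]
    ver_pixels = hor_pixels * den // num
    refresh = (raw[1] & 0x3F) + 60
    return hor_pixels, ver_pixels, refresh
```

For the record "640x480@60Hz 04,3140," this gives back 640, 480 and 60,
so the timing name and the identifier can be checked against each other.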

It is interesting to note, too, that there are discrepancies between
what is found in the XFree86 modelines and what can be found in the
modelines in the Linux framebuffer code---for one Established Timing
mode, I had to resort to the Linux parameters since what is found in
the XFree86 modelines (at least in the 10.0 xsrc) is not accurate.

"dmt" replaces "modelines"
"extradmt" is new.
"dmt2c.awk" replaces "modelines2c.awk"

"videomode.c" has to be regenerated using Makefile.videomode.

The remaining diff is adjustments for the new parameters.

For ergonomics and consistency, I have replaced strings like "800x600x60"
with "800x600@60Hz".
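
The renaming itself is mechanical. A minimal sketch in Python, for
illustration only (the function name is hypothetical, and only the plain
WIDTHxHEIGHTxRATE form is handled):

```python
import re

OLD_NAME = re.compile(r"^(\d+)x(\d+)x(\d+)$")


def new_mode_name(name: str) -> str:
    """Rewrite "800x600x60" as "800x600@60Hz"; leave other names alone."""
    m = OLD_NAME.match(name)
    if m is None:
        return name
    width, height, rate = m.groups()
    # The last number is a frequency, not a third dimension.
    return f"{width}x{height}@{rate}Hz"
```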

There are now 93 modes instead of 46 (the double scan entries and the
related code weren't used; and this is not used in the present code
either).

For safety, not knowing if this has hardware implications, the new
"reduced blanking" entries are skipped.

This is only a first step and does not solve the problem I see.

The next step will be reviewing and perhaps updating the edid code.
And I will follow the track until I find why the preferences are
not handled correctly from what is passed by the monitor.

Note: this one infelicity, for me, is not severe enough to hinder, per
se, the release of 10.0.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
# $NetBSD$
# These values were typed by Thierry Laronde ,
# 2023-02-27, from:
# 
# --
# VESA and Industry Standards and Guidelines
# for Computer Display Monitor Timing (DMT)
# Version 1.0, Rev. 13
# February 8, 2013
# Copyright 1994--2013 Video Electronics Standards Association. All
# other rights reserved.
# --
# 
# In brief the document above states: USE AT YOUR OWN RISK.
# 
# This master file has only values as given in the specification
# identified above.
# 
# The values should have been taken as is. From these values, others can
# be derived and there is even some redundancy (see the processing
# script for the computations). The records are in the same order as in
# the document: first line corresponds to page 18; last line to page
# 105. There hence should be 88 different records here.
# 
# In this file, empty or blank lines or lines beginning with a '#' are
# ignored.
#
# Remaining are a sequence of line terminated records, with the
# following blank separated fields:
# 
# Timing_Name /* Hor_Pixels 'x' Ver_Pixels '@' Refresh_Rate 'Hz' suffix */
# Ids /* DMT_Id ',' STD_Id ',' CVT_Id (1 hexabyte, [2h] , [3h]) */
# Hor_Pixels
# Ver_Pixels
# Pixel_Clock /* MHz */
# Character_Width
# Flags /* Scan_Type ('I' | 'N') ',' Reduced_Blanking ('RB' | 'N') */
# Hor_Sync_Polarity /* '+' | '-' */
# Ver_Sync_Polarity /* '+' | '-' */
# H_Right_Border
# H_Front_Porch
# Hor_Sync_Time
# H_Back_Porch
# H_Left_Border
# V_Bottom_Border
# V_Front_Porch
# Ver_Sync_Time
# V_Back_Porch
# V_Top_Border
# 
640x350@85Hz 01,, 640 350 31.500 8 N,N + - 0 4 8 12 0 0 32 3 60 0
640x400@85Hz 02,3119, 640 400 31.500 8 N,N - + 0 4 8 12 0 0 1 3 41 0
720x400@85Hz 03,, 720 400 35.500 9 N,N - + 0 4 8 12 0 0 1 3 42 0
640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8
640x480@72Hz 05,314C, 640 480 31.500 8 N,N - - 1 2 5 15 1 8 1 3 20 8
640x480@75Hz 06,314F, 640 480 31.500 8 N,N - - 0 2 8 15 0 0 1 3 16 0
640x480@85Hz 07,3159, 640 480 36.000 8 N,N - - 0 7 7 10 0 0 1 3 25 0
800x600@56Hz 08,, 800 600 36.000 8 N,N + + 0 3 9 16 0 0 1 2 22 0
800x600@60Hz 09,4540, 800 600 40.000 8 N,N + + 0 5 16 11 0 0 1 4 23 0
800x600@72Hz 0A,454C, 800 600 50.000 8 N,N + + 0 7 15 8 0 0 37 6 23 0
800x600@75Hz 0B,454F, 800 600 49.500 8 N,N + + 0 2 10 20 0 0 1 3 21 0
800x600@85Hz 0C,4559, 800 600 56.250 8 N,N + + 0 4 8 19 0 0 1 3 27 0
800x600@120Hz_rb 0D,, 800 600 73.25 8 N,RB + - 0 6 4 10 0 0 3 4 29 0
848x480@60Hz 0E,, 848 480 33.750 8 N,N + + 0 2 14 14 0 0 6 8 23 0
1024x768@43Hz_i 0F,, 1024 768 44.900 8 I,N + + 0 1 22 7 0 0 0 4 20 0
1024x768@60Hz 10,6140, 1024 768 65.000 8 N,N - - 0 3 17 20 0 0 3 6 29 0
1024x768@70Hz 11,614A, 1024 768 75.000 8 N,N - - 0 3 17 18 0 0 3 6 29 0
1024x768@75Hz 12,614F, 1024 768 78.750 8 N,N + + 0 2 12 22 0 0 1 3 28 0
1024x768@85Hz 13,6159, 1024 768 

Re: kernel goes dark on boot

2023-02-21 Thread tlaronde
On Tue, Feb 21, 2023 at 10:00:10AM -0400, Jared McNeill wrote:
> Yeah sorry you can't just not exit boot services and boot the OS. UEFI code 
> has certain expectations around the execution environment (MMU on, 1:1 PA to 
> VA for example) that starting the kernel is going to interfere with. The 
> moment the kernel touches the MMU, all of the resident UEFI code will cease 
> to function. This includes code that may be running asynchronously (timers 
> etc) that are not stopped properly due to the missing ExitBootServices call.
> 

FWIW, having followed EDK II development list for a while, there are
further modifications at the moment made because a lot of people are
focusing on VMs (there is quite a market on this) and want to use
EDK II UEFI code as emulated BIOS. And it must be noted that qemu seems
to be the main target.

Just a caveat for the brave who want to follow this...

T. Laronde

> 
> > On Feb 21, 2023, at 9:46 AM, Emmanuel Dreyfus  wrote:
> > 
> > On Tue, Feb 21, 2023 at 08:05:00AM -0400, Jared McNeill wrote:
> >> After calling ExitBootServices(), the only things that work are UEFI
> >> runtime services. You'll have to find another way to print to the console.
> > 
> > I can skip the ExitBootServices call and keep printing, I have already
> > done that, at least in C functions. I have no experience of doing that
> > from assembly code.
> > 
> > The same bug exists with a XEN3_DOM0 kernel. Xen starts up, and the 
> > kernel crashes without displaying anything. I wonder if there are tools
> > to trace the dom0 with help from Xen.
> > 
> > -- 
> > Emmanuel Dreyfus
> > m...@netbsd.org
> 

-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


[PATCHES] sys/dev/videomode

2023-02-21 Thread tlaronde
Since the choice of the resolution (with 10.0 BETA) is not optimal, I
have started to review sys/dev/videomode in order to fix the
preferences.

The first step was to update the timings.

Since, with whatever choice for the resolution, a monitor will not do
what it is not able to do, I replaced the modelines, derived from
XFree86, with the specifications taken directly from the latest VESA DMT
specification (this is an update even compared to the current
xsrc/external/mit/xorg-server/dist/hw/xfree86/common/vesamodes).

One of the main differences is that I do not put the specification in the
XFree86 modeline format, but I take all the relevant information from
the spec, from which the modelines can also---obviously---be derived
(the reverse is not true: a modeline doesn't distinguish between
back/front porch and borders; and the VESA identifiers are not
present even for VESA DMT modes).
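
To illustrate the derivation, here is a minimal Python sketch (it is not the
awk script mentioned in this message) that turns one record of the dmt file
into XFree86 modeline numbers. It assumes that the horizontal fields are
expressed in character cells (to be multiplied by Character_Width) and the
vertical ones in lines; this assumption reproduces the well-known 800x525
totals of 640x480@60Hz. The function and type names are illustrative only.

```python
from typing import NamedTuple


class Modeline(NamedTuple):
    name: str
    clock_mhz: float
    h: tuple  # (display, sync start, sync end, total), in pixels
    v: tuple  # (display, sync start, sync end, total), in lines
    flags: str


def dmt_to_modeline(record: str) -> Modeline:
    """Convert one blank-separated dmt record into modeline values."""
    f = record.split()
    if len(f) != 19:
        raise ValueError("expected 19 fields, got %d" % len(f))
    name, hpix, vpix = f[0], int(f[2]), int(f[3])
    clock, char_width = float(f[4]), int(f[5])
    scan = f[6].split(",")[0]
    hpol, vpol = f[7], f[8]
    # Assumption: horizontal values are given in character cells.
    h_rb, h_fp, h_sync, h_bp, h_lb = (int(x) * char_width for x in f[9:14])
    v_bb, v_fp, v_sync, v_bp, v_tb = (int(x) for x in f[14:19])

    # A modeline has no notion of border: fold it into the porches.
    hss = hpix + h_rb + h_fp
    hse = hss + h_sync
    htot = hse + h_bp + h_lb
    vss = vpix + v_bb + v_fp
    vse = vss + v_sync
    vtot = vse + v_bp + v_tb

    flags = "%chsync %cvsync" % (hpol, vpol)
    if scan == "I":
        flags += " interlace"  # interlaced timings need more care than this
    return Modeline(name, clock, (hpix, hss, hse, htot),
                    (vpix, vss, vse, vtot), flags)


def refresh_hz(m: Modeline) -> float:
    """Vertical refresh from pixel clock and totals (progressive scan)."""
    return m.clock_mhz * 1e6 / (m.h[3] * m.v[3])


rec = "640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8"
m = dmt_to_modeline(rec)
print(m.name, m.clock_mhz, *m.h, *m.v, m.flags)
```

Going the other way is lossy: from the modeline alone, the 16 pixels between
640 and 656 cannot be split back into an 8-pixel border and an 8-pixel front
porch.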

I attach the "dmt" file for reference (the awk script is updated and
some modifications to other files are made in order to fit this in the
present code without modifying its behavior for now; so it is useless
alone).

To my surprise, some of the "Established timings" (there is a bitmap in
the EDID for these) are not specified in the VESA DMT.
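
For readers unfamiliar with that bitmap, here is a small Python sketch of how
it can be decoded. The bit layout is the one I recall from the EDID 1.3
structure (bytes 35 to 37 of the 128-byte base block) and should be checked
against the standard; the mode names follow the WIDTHxHEIGHTxRATE style of the
kernel message, and the function name is hypothetical.

```python
# Established timings bitmap, listed from bit 7 down to bit 0.
EST_BYTE_35 = ["720x400x70", "720x400x88", "640x480x60", "640x480x67",
               "640x480x72", "640x480x75", "800x600x56", "800x600x60"]
EST_BYTE_36 = ["800x600x72", "800x600x75", "832x624x75", "1024x768x87i",
               "1024x768x60", "1024x768x70", "1024x768x75", "1280x1024x75"]
# Only bit 7 of byte 37 is defined (a Mac II mode); the remaining bits are
# manufacturer specific and are ignored here.
EST_BYTE_37 = ["1152x870x75"]


def established_timings(edid: bytes) -> list:
    """Return the established timing names flagged in an EDID base block."""
    if len(edid) < 128:
        raise ValueError("an EDID base block is 128 bytes long")
    modes = []
    for offset, names in ((35, EST_BYTE_35), (36, EST_BYTE_36),
                          (37, EST_BYTE_37)):
        for bit, name in enumerate(names):
            if edid[offset] & (0x80 >> bit):
                modes.append(name)
    return modes


# Fake block with only 720x400x70 and 640x480x60 flagged.
sample = bytearray(128)
sample[35] = 0xA0
print(established_timings(bytes(sample)))
```

Each name returned this way needs a matching record in the timing tables,
otherwise the "no data for est. mode" diagnostic is what one gets.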

So I added an "extradmt". I had to derive the pseudo DMT timings from
the XFree86 extramodes modelines, the problem being that, as said above,
the distinction between porch and border is not made. So some values are
fake ones.

A supplementary problem is that some of the Mac II modes are
not in the XFree86 modelines; but I found them in
linux/drivers/video/macmodes.c i.e. in a Linux source file, allowing
me to describe all the Established timings, thus getting rid of the
disturbing DIAGNOSTIC "no data for est. mode %s\n".

The Linux source file is GPL 2.

The question is: when it comes to parameters/hardware specs (I'm not
taking code, I'm taking numbers), what is the license? Is it considered
as "public" information or is the license binding? Or is an acknowledgment
of the source enough without being tied to the license of the file where
the information (not code) was found? Note: these go to an extradmt
file, i.e. they are kept separate from the VESA DMT.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
$NetBSD$
# These values were typed by Thierry Laronde ,
# 2023-02-07, from:
# 
# --
# VESA and Industry Standards and Guidelines
# for Computer Display Monitor Timing (DMT)
# Version 1.0, Rev. 12
# November 17, 2008
# Copyright 1994--2008 Video Electronics Standards Association. All
# other rights reserved.
# --
# 
# In brief the document above states: USE AT YOUR OWN RISKS.
# 
# This master file has only values as given in the specification
# identified above.
# 
# The values should have been taken as is. From these values, others can
# be derived and there is even some redundancy (see the processing
# script for the computations). The records are in the same order as in
# the document: first line corresponds to page 15; last line to page
# 100. There hence should be 86 different records here.
# 
# In this file, empty lines or lines beginning with a '#' are ignored.
# Remaining are a sequence of line terminated records, with the
# following blank separated fields:
# 
# Timing_Name /* Hor_Pixels 'x' Ver_Pixels '@' Refresh_Rate 'Hz' suffix */
# Ids /* DMT_Id ',' STD_Id ',' CVT_Id (1 hexabyte, [2h] , [3h]) */
# Hor_Pixels
# Ver_Pixels
# Pixel_Clock /* MHz */
# Character_Width
# Flags /* Scan_Type ('I' | 'N') ',' Reduced_Blanking ('RB' | 'N') */
# Hor_Sync_Polarity /* '+' | '-' */
# Ver_Sync_Polarity /* '+' | '-' */
# H_Right_Border
# H_Front_Porch
# Hor_Sync_Time
# H_Back_Porch
# H_Left_Border
# V_Bottom_Border
# V_Front_Porch
# Ver_Sync_Time
# V_Back_Porch
# V_Top_Border
# 
640x350@85Hz 01,, 640 350 31.500 8 N,N + - 0 4 8 12 0 0 32 3 60 0
640x400@85Hz 02,3119, 640 400 31.500 8 N,N - + 0 4 8 12 0 0 1 3 41 0
720x400@85Hz 03,, 720 400 35.500 9 N,N - + 0 4 8 12 0 0 1 3 42 0
640x480@60Hz 04,3140, 640 480 25.175 8 N,N - - 1 1 12 5 1 8 2 2 25 8
640x480@72Hz 05,314C, 640 480 31.500 8 N,N - - 1 2 5 15 1 8 1 3 20 8
640x480@75Hz 06,314F, 640 480 31.500 8 N,N - - 0 2 8 15 0 0 1 3 16 0
640x480@85Hz 07,3159, 640 480 36.000 8 N,N - - 0 7 7 10 0 0 1 3 25 0
800x600@56Hz 08,, 800 600 36.000 8 N,N + + 0 3 9 16 0 0 1 2 22 0
800x600@60Hz 09,4540, 800 600 40.000 8 N,N + + 0 5 16 11 0 0 1 4 23 0
800x600@72Hz 0A,454C, 800 600 50.000 8 N,N + + 0 7 15 8 0 0 37 6 23 0
800x600@75Hz 0B,454F, 800 600 49.500 8 N,N + + 0 2 10 20 0 0 1 3 21 0
800x600@85Hz 0C,4559, 800 600 56.250 8 N,N + + 0 4 8 19 0 0 1 3 27 0
800x600@120Hz_rb 0D,, 800 600 73.25 8 N,RB + - 0 6 4 10 0 0 3 4 29 0
848x480@60Hz 0E,, 848 480 33.750 8 N,N + + 0 2 14 14 0 0 6 8 23 0
1024x768@43Hz_i 0F,, 1024 768 44.900 8 I,N + + 0 1 22 7 0 0 0 4 20 0
1024x768@60Hz 10,6140, 1024 768 65.000 8 N,N - - 0 3 17 20 0 0 3 6 29 

Re: NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-29 Thread tlaronde
Le Sun, Jan 29, 2023 at 05:23:00PM +, Taylor R Campbell a écrit :
> > Date: Sun, 29 Jan 2023 16:44:08 +0100
> > From: tlaro...@polynum.com
> > 
> > I will look (silently) into dev/pci/radeonfb.c to better understand the
> > logic and try to find if there is a way to obtain a better console
> > display.
> 
> FYI, dev/pci/radeonfb.c is the legacy radeon framebuffer driver only
> for very old (~20-year-old) devices, not the modern drm driver.
> 

Yep. Realized that when adding debugging information in this file that
did not show up...

> > BTW, the problem is with VGA and DVI(-D) connections. With another monitor
> > connected with HDMI (so more recent than this present 16:9 monitor, which
> > has only VGA and DVI-D connectors and was manufactured in
> > 2012 according to the EDID), the framebuffer has a better resolution.
> 
> Comparing dmesg output from `boot -vx' with the two connectors may
> help to diagnose what's happening.
> 
> (If you already sent it, sorry -- haven't had time to look closely
> yet.)

Yes: I have already sent the various dmesg'es to you :-)

In the meantime, I will try to worm my way into the sources. Even if I
don't succeed in finding a cure, I will undoubtedly learn things along
the way...
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-29 Thread tlaronde
Le Sun, Jan 29, 2023 at 03:59:45PM +0100, tlaro...@polynum.com a écrit :
> Le Sun, Jan 29, 2023 at 02:54:39PM +0100, tlaro...@polynum.com a écrit :
> > Le Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com a écrit :
> > > 
> > > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> > > production). Only kernel and modules (not userland); and kernel is not
> > > GENERIC but a special config one matching the previous 9.2 config
> > > running on the node.
> > > 
> > > No problem so far. As a user (and as advertised), I had simply to use
> > > audiocfg(1) to set the new correct default for audio in order to have
> > > sound back where I used to expect it.
> > > 
> > > The main difference is about the framebuffer: previous kernel version
> > > picked the correct mode. NetBSD 10.0 does not and uses an "entry level"
> > > mode, 640x480x67, resulting in stretched, fat, big characters; message:
> > > 
> > > no data for est. mode 640x480x67
> > 
> > I think we are looking at the wrong place. The problem is the depth
> > in the mode looked for: 67! The only depths the cards knew about are
> > multiples of 2^3.
> > 
> > So where does this come from?
> 
> Replying to myself: it is not the depth, but the frequency and it comes
> from sys/dev/videomode/edid.c.
> 
> Now trying to find why, at least, it does not find 640x480x60, which
> exists---and 720x400x70 that exists also.

I have it backward: the failure is displayed, for DIAGNOSTIC, for one
mode that is not found, but this does not mean that others are not
found.

The monitor EDID advertises only two modes: 640x480x60 and 720x400x70
(while it can do others).
The screen being 16:9 (nominal resolution is 1600x900), the VESA mode
chosen leads to this "ugly" rendering with stretched, fat
characters---which was not the case with 9.2. But it is correct given the
logic implemented, if I'm not (this time) mistaken.

I will look (silently) into dev/pci/radeonfb.c to better understand the
logic and try to find if there is a way to obtain a better console
display.

BTW, the problem is with VGA and DVI(-D) connections. With another monitor
connected with HDMI (so more recent than this present 16:9 monitor, which
has only VGA and DVI-D connectors and was manufactured in
2012 according to the EDID), the framebuffer has a better resolution.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-29 Thread tlaronde
Le Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com a écrit :
> 
> Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> production). Only kernel and modules (not userland); and kernel is not
> GENERIC but a special config one matching the previous 9.2 config
> running on the node.
> 
> No problem so far. As a user (and as advertised), I had simply to use
> audiocfg(1) to set the new correct default for audio in order to have
> sound back where I used to expect it.
> 
> The main difference is about the framebuffer: previous kernel version
> picked the correct mode. NetBSD 10.0 does not and uses an "entry level"
> mode, 640x480x67, resulting in stretched, fat, big characters; message:
> 
> no data for est. mode 640x480x67

I think we are looking at the wrong place. The problem is the depth
in the mode looked for: 67! The only depths the cards knew about are
multiples of 2^3.

So where does this come from?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-29 Thread tlaronde
Le Sun, Jan 29, 2023 at 02:54:39PM +0100, tlaro...@polynum.com a écrit :
> Le Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com a écrit :
> > 
> > Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> > production). Only kernel and modules (not userland); and kernel is not
> > GENERIC but a special config one matching the previous 9.2 config
> > running on the node.
> > 
> > No problem so far. As a user (and as advertised), I had simply to use
> > audiocfg(1) to set the new correct default for audio in order to have
> > sound back where I used to expect it.
> > 
> > The main difference is about the framebuffer: previous kernel version
> > picked the correct mode. NetBSD 10.0 does not and uses an "entry level"
> > mode, 640x480x67, resulting in stretched, fat, big characters; message:
> > 
> > no data for est. mode 640x480x67
> 
> I think we are looking at the wrong place. The problem is the depth
> > in the mode looked for: 67! The only depths the cards knew about are
> > multiples of 2^3.
> 
> So where does this come from?

Replying to myself: it is not the depth, but the frequency and it comes
from sys/dev/videomode/edid.c.

Now trying to find why, at least, it does not find 640x480x60, which
exists---and 720x400x70 that exists also.
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [VGA connector] NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-23 Thread tlaronde
Le Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com a écrit :
> [Please feel free to redirect me to another list if this is not the
> correct one for kernel beta testing]
> 
> Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
> production). Only kernel and modules (not userland); and kernel is not
> GENERIC but a special config one matching the previous 9.2 config
> running on the node.
> 
> No problem so far. As a user (and as advertised), I had simply to use
> audiocfg(1) to set the new correct default for audio in order to have
> sound back where I used to expect it.
> 
> The main difference is about the framebuffer: previous kernel version
> picked the correct mode. NetBSD 10.0 does not and uses an "entry level"
> mode, 640x480x67, resulting in stretched, fat, big characters; message:
> 
> no data for est. mode 640x480x67
> 
> while in dmesg the framebuffer has the same dimensions as with the
> 9.2 kernel:
> 
> 9.2:
> -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, 
> stride 6400
> 
> 10.0:
> +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 
> 6400
> 
> I have not updated the boot blocks. Is the 10.0 kernel expecting to have
> hints about the modes from the bootloader i.e. a new install would
> have updated the boot blocks and I would not have seen this?

I wondered if the problem was linked to the connector between the
graphics card and the monitor.

My monitor is an "old" one with VGA and DVI connectors (DVI is DVI-D not
DVI-I). I'm using a VGA cable.

Since I don't have a DVI cable, I tested connecting a small monitor
(originally for a Raspberry) for which I have a cable with a DVI connector for
the graphics card and an HDMI connector for the monitor.

With only this monitor, nothing is displayed, but what is "interesting" is
that if I connect both monitors on the same graphics card, one with the
VGA the other with the DVI (on card)--HDMI (on monitor), the kernel gets
the information from the DVI connected monitor, and displays
"correctly" (for the size of the fonts)... on the VGA connected monitor.

And I do not get the message about the mode not found.

If I try to connect the DVI-D (on the old monitor) to the HDMI (on
card), the monitor works, but the problem is the same as with the VGA
(but it is not that surprising since it is DVI-D with a cable
translating DVI to HDMI; but DVI-D is not the same as DVI-I and
something is probably lost in translation).

So it has something to do with the connection, apparently the VGA one
(and DVI-D).

For VGA, there was a change about the EDID (an enhanced version, E-EDID,
having been designed in 2007). So was there a change in the VGA related code,
expecting E-EDID while old monitors "speak" only EDID (for VGA
connection)?
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-23 Thread tlaronde
Le Mon, Jan 23, 2023 at 05:17:29AM +0700, Robert Elz a écrit :
> Date:Sun, 22 Jan 2023 20:27:24 +0100
> From:tlaro...@polynum.com
> Message-ID:  
> 
> 
>   | +Zone  kernel: Available graphics memory: 9007199254079374 KiB
> 
> I see something like that too, but while it is obviously absurd,
> I'm not sure that it actually does any harm (maybe) - my system
> mostly works -- though I am still using wsfb - the last time I
> tried to start X with nouveau and no X server config at all
> (a week or so ago) the kernel crashed very soon after.
> 
> In every case I have looked that big number has been (when converted
> to bytes, which the actual value being printed is - the output simply
> divides by 2^10 (ie: >>10) for our convenience, a value of the same
> general form, in your case
> 
>9007199254079374 KiB == 9223372036177278976 bytes == 0x7FFFFFFFD79E3800
> 
> To me that suggests that probably something has a 64 bit value set to
> MAXINT, and then writes a 32 bit value on top of it (and then treats that
> as a 64 bit value).   The top 32 bits being 0x7FFFFFFF seems always there.
> [...]

Another possibility is a pointer difference that gave the correct value
previously and is not pertinent anymore because the memory address has
changed:

9.2:
-radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, 
stride 6400

while 10.0 is:
+radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 
6400
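
For what it is worth, the arithmetic on the reported figure can be checked
with a few lines of Python. This is only a sketch: it reproduces the
conversion described in the quoted message and tests the "64-bit maximum with
the low 32 bits overwritten" hypothesis; it says nothing about where in the
kernel the value actually comes from.

```python
INT64_MAX = (1 << 63) - 1

reported_kib = 9007199254079374        # "Zone kernel" figure printed by 10.0
# The kernel prints bytes >> 10, so shifting back recovers the byte count
# (the low 10 bits are lost and come back as zeros).
reported_bytes = reported_kib << 10

high32 = reported_bytes >> 32
low32 = reported_bytes & 0xFFFFFFFF

# Hypothesis from the quoted message: a 64-bit field holding the maximum
# signed value, whose low 32 bits were then overwritten.
overwritten = (INT64_MAX & ~0xFFFFFFFF) | low32

print(reported_bytes, hex(reported_bytes))
print(hex(high32), hex(low32), overwritten == reported_bytes)
```

The figure is therefore consistent with that hypothesis, which does not rule
out the address-related explanation suggested above.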

FWIW,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-22 Thread tlaronde
Hello,

Le Sun, Jan 22, 2023 at 04:59:19PM +0100, Martin Husemann a écrit :
> On Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com wrote:
> > no data for est. mode 640x480x67
> 
> [..]
> 
> > I have not updated the boot blocks. Is the 10.0 kernel expecting to have
> > hints about the modes from the bootloader i.e. a new install would
> > have updated the boot blocks and I would not have seen this?
> 
> Boot blocks should be unrelated to this, but boot method (UEFI or BIOS)
> may play a role (that is not fully analyzed).
> 
> We need more details, like full dmesg.
> 
> Does the kernel probe the correct display connection?
> 
> There are a few i915 PRs open that are caused by the wrong connector being
> used or the proper connector not responding, so the display capabilities
> can not be read, but there may be other reasons why the kernel can not
> read the EDID data.

Please find attached the 10.0 dmesg and the diff from 9.2 dmesg to 10.0
dmesg (not edited, although the vast majority of the differences are PCI
ids being translated to strings about vendor and product).

Best,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 10.0_BETA (CONFIG) #0: Sun Jan 22 11:01:04 CET 2023

tlaronde@cauchy.polynum.local:/usr/obj/polynum.NODECONF-cauchy.polynum.local_netbsd-9.2-amd64_netbsd-amd64/netbsd/obj/sys/arch/amd64/compile/CONFIG
total memory = 8120 MB
avail memory = 7834 MB
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
mainbus0 (root)
ACPI: RSDP 0x000F04A0 24 (v02 ALASKA)
ACPI: XSDT 0xDDF9A078 74 (v01 ALASKA A M I01072009 AMI  
00010013)
ACPI: FACP 0xDDFA7AC8 00010C (v05 ALASKA A M I01072009 AMI  
00010013)
ACPI: DSDT 0xDDF9A188 00D940 (v02 ALASKA A M I0034 INTL 
20120711)
ACPI: FACS 0xDDFC7F80 40
ACPI: APIC 0xDDFA7BD8 62 (v03 ALASKA A M I01072009 AMI  
00010013)
ACPI: FPDT 0xDDFA7C40 44 (v01 ALASKA A M I01072009 AMI  
00010013)
ACPI: SSDT 0xDDFA7C88 000539 (v01 PmRef  Cpu0Ist  3000 INTL 
20120711)
ACPI: SSDT 0xDDFA81C8 000AD8 (v01 PmRef  CpuPm3000 INTL 
20120711)
ACPI: MCFG 0xDDFA8CA0 3C (v01 ALASKA A M I01072009 MSFT 
0097)
ACPI: HPET 0xDDFA8CE0 38 (v01 ALASKA A M I01072009 AMI. 
0005)
ACPI: SSDT 0xDDFA8D18 00036D (v01 SataRe SataTabl 1000 INTL 
20120711)
ACPI: SSDT 0xDDFA9088 0034E1 (v01 SaSsdt SaSsdt   3000 INTL 
20091112)
ACPI: ASF! 0xDDFAC570 A5 (v32 INTEL   HCG 0001 TFSM 
000F4240)
ACPI: 5 ACPI AML tables successfully acquired and loaded
ioapic0 at mainbus0 apid 8: pa 0xfec0, version 0x20, 24 pins
cpu0 at mainbus0 apid 0
cpu0: Use lfence to serialize rdtsc
cpu0: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu0: node 0, package 0, core 0, smt 0
cpu1 at mainbus0 apid 2
cpu1: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu1: node 0, package 0, core 1, smt 0
acpi0 at mainbus0: Intel ACPICA 20221020
acpi0: X/RSDT: OemId , AslId 
acpi0: MCFG: segment 0, bus 0-63, address 0xf800
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0x8E7E9B90F808 0005AA (v01 PmRef  ApIst3000 INTL 
20120711)
acpi0: SCI interrupting at int 9
acpi0: fixed power button present
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed0-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
acpiec0 at acpi0 (H_EC, PNP0C09-1): not present
TPMX (PNP0C01) at acpi0 not configured
FWHD (INT0800) at acpi0 not configured
attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 irq 0
com0 at acpi0 (UAR1, PNP0501-1): io 0x3f8-0x3ff irq 4
com0: ns16550a, 16-byte FIFO
lpt0 at acpi0 (LPTE, PNP0400): io 0x378-0x37f irq 5
acpiwmi0 at acpi0 (WMI1, PNP0C14-MXM2): ACPI WMI Interface
acpiwmibus at acpiwmi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
acpiwmi1 at acpi0 (WMIO, PNP0C14-0): ACPI WMI Interface
acpiwmibus at acpiwmi1 not configured
acpifan0 at acpi0 (FAN0, PNP0C0B-0): ACPI Fan
acpifan1 at acpi0 (FAN1, PNP0C0B-1): ACPI Fan
acpifan2 at acpi0 (FAN2, PNP0C0B-2): ACPI Fan
acpifan3 at acpi0 (FAN3, PNP0C0B-3): ACPI Fan
acpifan4 at acpi0 (FAN4, PNP0C0B-4): ACPI Fan
acpitz0 at acpi0 (TZ00)
acpitz0: active cooling level 0: 80.0C
acpitz

Re: NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-22 Thread tlaronde
Le Sun, Jan 22, 2023 at 02:56:47PM +0100, tlaro...@polynum.com a écrit :
> [...]
> 
> The main difference is about the framebuffer: previous kernel version
> picked the correct mode. NetBSD 10.0 does not and uses an "entry level"
> mode, 640x480x67, resulting in stretched, fat, big characters; message:
> 
> no data for est. mode 640x480x67
> 
> while in dmesg the framebuffer has the same dimensions as with the
> 9.2 kernel:
> 
> 9.2:
> -radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, 
> stride 6400
> 
> 10.0:
> +radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 
> 6400
> 

The differences between 9.2 (/^-/) and 10.0 (/^+/) extracted:

-kern info: [drm] initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164).
+initializing kernel modesetting (CEDAR 0x1002:0x68F9 0x174B:0xE164 0x00).
-Zone  kernel: Available graphics memory: 2601178 kiB
-Zone   dma32: Available graphics memory: 2097152 kiB
+Zone  kernel: Available graphics memory: 9007199254079374 KiB
+Zone   dma32: Available graphics memory: 2097152 KiB

Note the value on 10.0 for the "Zone kernel" and compare it with the correct
(9.2) one.

In PR #56847, this is mentioned about "nouveau" (and I have "radeon"),
with the problem being attributed to UEFI and not BIOS: this is incorrect,
since my node is in legacy boot: it uses BIOS and the value is
incorrect. So the problem is not UEFI vs. BIOS.

There is also a third argument about CEDAR in 10.0 not existing in
9.2.  Maybe it is the same as for the sound: 10.0 is not enumerating in
the same order, and what succeeded previously because the first
entry was fortunately the correct one, is now failing.

Note: I stumbled upon PR #56847, previously, while searching
something else and had quite a time, now, remembering it, finding
it back with the PR search tools. And then, trying to find a way to find
it back... I stumbled on a page by D. Holland stating that the bug
report system should be revamped. It's difficult not to concur...

May I suggest that a future system should send candidate PRs to a
mailing list so that keywords and sorting are done by knowledgeable
people in order to put in their vicinity PRs based on the moon they
are (probably) pointing to, instead of the finger of the reporter?
(It's not a derision against the reporter---me included; the
reporter reports what he sees: symptoms.)
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


NetBSD 10.0 BETA kernel testing: framebuffer

2023-01-22 Thread tlaronde
[Please feel free to redirect me to another list if this is not the
correct one for kernel beta testing]

Context: I'm testing NetBSD 10.0 BETA on an isolated node (not
production). Only kernel and modules (not userland); and kernel is not
GENERIC but a special config one matching the previous 9.2 config
running on the node.

No problem so far. As a user (and as advertised), I had simply to use
audiocfg(1) to set the new correct default for audio in order to have
sound back where I used to expect it.

The main difference is about the framebuffer: previous kernel version
picked the correct mode. NetBSD 10.0 does not and uses an "entry level"
mode, 640x480x67, resulting in stretched, fat, big characters; message:

no data for est. mode 640x480x67

while in dmesg the framebuffer has the same dimensions as with the
9.2 kernel:

9.2:
-radeondrmkmsfb0: framebuffer at 0xb000aec89000, size 1600x900, depth 32, 
stride 6400

10.0:
+radeondrmkmsfb0: framebuffer at 0xe034d000, size 1600x900, depth 32, stride 
6400

I have not updated the boot blocks. Is the 10.0 kernel expecting to have
hints about the modes from the bootloader i.e. a new install would
have updated the boot blocks and I would not have seen this?

Best,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Pulling to netbsd-9 branch fixes for #54977

2023-01-19 Thread tlaronde
Hello,

I have experienced a USB failure with an excessive amount of file cache,
while the mounted filesystems shouldn't have this many blocks in
cache: this was likely due to a rsync(1) failure on a USB connected
disk. The USB was detached ("file system full") while rsync(1) was
operating but the files stayed in cache and the umass0 was not
reattachable when trying a "drvctl -r umass0".

Very likely PR #54977 (in my case: an ARM SoC with only 1GB of memory).

There are fixes in current (mentioned in the PR).

Could they be pulled to the 9 branch?

Best,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


UEFI: caveats about not utf-8 dir entries

2023-01-12 Thread tlaronde
I don't know if this is for tech-kern or tech-userlevel (perhaps the
two).

I just read today, on the devel UEFI edk2 devel list, from patches for
ext4, a comment on the problem of the encoding of dir entries.

The problem is that, generally in fs, no encoding is specified: dir
entries are just a sequence of bytes, whether nul byte terminated or
with the length of the entry given (the latter for ext4).

UEFI (edk2) deals, internally, with UCS-2 strings.

With ext4 (and I expect this is the same for other fs drivers),
conversion is attempted from utf-8. Here, if the "from utf-8" conversion
errors (not utf-8), the dir entry is skipped, meaning that not everything
on a readable fs can be reached by the UEFI code.
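
The effect can be mimicked with a few lines of Python. This is only a sketch
of the behaviour described above, not the edk2 code: directory entries are
modelled as raw byte strings, and the function name is hypothetical.

```python
def reachable_entries(raw_names):
    """Return the names that survive a strict "from utf-8" conversion."""
    reachable = []
    for raw in raw_names:
        try:
            reachable.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Not valid utf-8: the entry is silently skipped.
            continue
    return reachable


entries = [
    b"boot.cfg",                       # plain ASCII is valid utf-8
    "caf\u00e9.txt".encode("utf-8"),   # utf-8 encoded name
    b"caf\xe9.txt",                    # same name, ISO 8859-1 encoded
]
print(reachable_entries(entries))
```

The third entry exists on the filesystem but never shows up in the listing.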

This has to be kept in mind when populating a msdos partition for
booting and for people wandering in a filesystem using the UEFI shell:
even if the fs is readable, perhaps not everything will be accessible.

FWIW,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


UEFI edk2 NetBSD support

2022-11-07 Thread tlaronde
I'm about to start to commit modifications to the UEFI edk2 sources in
order to allow to build and test it under NetBSD.

Why is it related to the kernel? Because UEFI is not limited to one
arch (so it's not linked to some port); because the edk can be
compiled and used on non-UEFI hardware in order to provide some
UEFI support and it could be an alternative to U-Boot; because on
an only remotely accessible machine, a persistent runtime UEFI
network driver could allow to explore the machine and, if the kernel
supports it, could allow to remotely debug the kernel on a machine
where there is no other direct means to know what is going on
particularly in the early stages of booting.

Am I stepping on somebody else's toes for the UEFI edk2? (I don't speak
about UEFI kernel support: I'm not working on that.)
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: debugging a kernel that doesn't start

2022-09-13 Thread tlaronde
Le Mon, Sep 12, 2022 at 09:17:52PM +0200, Edgar Fuß a écrit :
> I'm trying to run NetBSD on a Dell PowerEdge R6515, and the kernel is being 
> loaded (PXE or USB) but then the machine hangs hard.
> 
> What's the way to debug a kernel that hangs so early that you can't printf 
> or drop into ddb? I guess that's a phenomenon quite common for a new port 
> or changes to locore.s (or whatever that's called today), but it's completely 
> new to me.
> 
> I have virtually no clue about PeCee hardware. At the point the kernel is 
> started, are BIOS routines still available?

Start by trying to boot without the KMS. I had the problem of a kernel
not reaching init, on a remote server, without any other access (no
serial, no IPMI). See:

http://notes.kergis.com/netbsd_on_OVH_baremetal.html

-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Notes about booting/debugging NetBSD on an OVH baremetal server

2022-08-29 Thread tlaronde
FWIW, I have put there notes about the installation, booting and dual
booting of NetBSD on an OVH baremetal server:

https://notes.kergis.com/netbsd_on_OVH_baremetal.html

The part that could be of interest to kernel developers is at the end:
what I found handy or could be handy in trying to get information about
what was going on and failing with very limited means to get
information.

It concerns UEFI, boot and the kernel. I'd like to know if some
suggestions make sense (or not) and if there is already work in progress
in some of these points.

When I have some free time, I plan to tackle UEFI, but first to see
if it could be installed to allow remote hardware exploration.

FWIW,
-- 
Thierry Laronde 
 http://www.kergis.com/
http://kertex.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: [SUCCESS] Debugging/fixing a kernel stalled not crashing

2022-08-21 Thread tlaronde
Le Sun, Aug 21, 2022 at 03:25:36PM +, Emmanuel Dreyfus a écrit :
> On Sun, Aug 21, 2022 at 02:16:58PM +0200, tlaro...@polynum.com wrote:
> > Addition (asked by Taylor R Campbell): a current GENERIC boots only
> > with i915drmkms disabled.
> > 
> > With the framebuffer stuff enabled, it does not boot, and does not even
> > panic and reboot. It freezes somewhere. The same as the 9.x series.
> 
> I have a machine that randomly crashes during boot since we had the Linux 5.x
> DRM import. The feature is still an asset, since it supports the GPU
> that was not supported before, but it suggests booting with DRM based
> framebuffer is more fragile than booting without. Perhaps we need a boot
> flag to disable framebuffer?

I feel the same: a generic flag to disable it via userconf would be
better than explicitly listing all the drivers.
At the very least, we should advise people installing on a server to
try with the framebuffer disabled first, to see whether NetBSD boots,
and to enable it only afterwards. When one installs on a remote server
without seeing anything of the boot process[*], it is quite
frustrating.

*: I plan to play a little with UEFI EDKII to see whether installing it and
using an EFI Runtime driver for the ethernet card (one that persists after
exiting boot services) could be a solution for remote debugging. But there
is no schedule, so don't hold your breath; it's vaporware for the moment.
Another idea: write messages to memory in a place left untouched by both
UEFI and NetBSD so that, after rebooting (in case of a crash) into UEFI,
a UEFI application could dump that memory somewhere on the disk, in the
EFI partition, for post-mortem inspection.
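
To make the second idea a bit more concrete, here is a minimal sketch
(Python, user space only) of what a self-describing record in such a
reserved memory area could look like, so that a post-mortem tool can tell
a valid log from leftover garbage. The magic value KLOG, the header layout
and the function names are assumptions made up for illustration; they are
not taken from NetBSD or from UEFI.

```python
import struct
import zlib

MAGIC = b"KLOG"                     # hypothetical marker, not an existing format
HEADER = struct.Struct("<4sII")     # magic, payload length, CRC32 of payload


def pack_log(messages):
    """Serialize boot messages into one self-describing record."""
    payload = "\n".join(messages).encode("utf-8")
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def unpack_log(region):
    """Return the messages found in a raw memory region, or None if the
    region holds no valid record (wrong magic, truncated, bad checksum)."""
    if len(region) < HEADER.size:
        return None
    magic, length, crc = HEADER.unpack_from(region, 0)
    payload = region[HEADER.size:HEADER.size + length]
    if magic != MAGIC or len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return payload.decode("utf-8").split("\n")
```

The checksum is there because memory surviving a warm reboot is not
guaranteed to be intact: the dumping application should refuse a record
rather than write garbage to the EFI partition.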


Re: [SUCCESS] Debugging/fixing a kernel stalled not crashing

2022-08-21 Thread tlaronde
Addition (asked by Taylor R Campbell): a current GENERIC boots only
with i915drmkms disabled.

With the framebuffer stuff enabled, it does not boot, and does not even
panic and reboot. It freezes somewhere. The same as the 9.x series.



Re: [SUCCESS] Debugging/fixing a kernel stalled not crashing

2022-08-20 Thread tlaronde
A final point:

Context: I rent a baremetal server (OVH) that has a quad-core Intel Xeon
(IvyBridge), 16GB of RAM, three 2TB disks and an Intel PRO 1000
ethernet card (but the bandwidth is limited to 100Mb/s). It is an
entry-level offer that I wanted only for its IPv4 address (there is an
IPv6 address too).

The installable images include no BSD, only Linux/Debian variants.

Following instructions from a helpful wiki page, I tried to install using
a Linux rescue disk (provided by OVH), running entirely in memory and
providing qemu-system-x86_64, which allows using a CDROM install image.

Nothing booted.

Since it was unclear from the web interface whether the boot process
depended on the information that an image had been installed (to
allow booting from the disk), I then installed Linux/Debian on only
one disk (one can select 1, 2 or 3 disks, but with multiple disks this
means software RAID).

Using the rescue system, I then resized the Debian partition and
installed NetBSD on another partition (dual booting). To bypass a
possible limitation in the boot process (booting only GRUB and
accessing GRUB directly), I chainloaded the NetBSD stage1 from the GRUB2
menu and verified, under qemu, that this would boot. I used the GRUB2
boot-once feature so that, if NetBSD crashed and rebooted, I could go
back to Debian to try something else.

Still no success.

It was almost certain there was a problem with the kernel.

So I wrote a special /boot.cfg to test various things, compiled a custom
kernel (since the GENERIC installation one was not running), and tried
to validate the boot procedure step by step. The plan was then to
insert a cpu_reboot() call in the kernel to see where the problem
occurred (since, after a reboot, I would be able to connect to Debian,
I would know that everything before the call was OK).

In order to limit the work, I used qemu once more, but to install NetBSD
on another disk (so that I could use qemu not from the rescue system
but directly under Debian, without trashing the very disk Debian
runs from).

The first test was to see if, indeed, NetBSD stage2 was loaded. The
menu in /boot.cfg was simple: the instruction "quit".

=> First lesson: this does not work, because the reboot is not a
total one: with the drives mapped (in GRUB2) to ensure that booting
succeeds, the stage2 reboots but ends up back at itself, so the machine
was endlessly rebooting and I had no connection.

It took me various modifications before realizing this was the case (under
qemu), so I abandoned the idea and tried to boot a custom kernel,
without SMP and without framebuffer (i915drmkms).

This succeeded.

I then went back to testing with the framebuffer enabled. It didn't work.
I then disabled the framebuffer for everything, and tried with SMP. It
worked.
Then I tried 9.2 GENERIC and 9.3 GENERIC without framebuffer. Both
worked.

So the final lesson: NetBSD can be installed on such a machine, but the
framebuffer is a problem. And NetBSD is not far behind Linux, because
the Debian distribution is a recent one, and the main clue was in the
Linux dmesg:

Command line: BOOT_IMAGE=/boot/vmlinuz-5.10.0-14-amd64 root=UUID=eea6d0a4-03b6-44e6-8588-ff6c4eba2095 ro nomodeset iommu=pt

The clue: nomodeset.

Linux doesn't work with the embedded graphics (HD 4000) either.

So it is partly a kernel problem (the kernel stalling during framebuffer
initialization) but mainly an install problem (the framebuffer should
be disabled in such cases).

If someone thinks there is interest in how I set up dual booting,
chainloaded NetBSD from GRUB2 and configured the boot procedure, I
can write a mini-page about it.

For the rest: problem solved. NetBSD can be installed on an OVH baremetal
server (at least this kind of machine).


[PARTIAL SUCCESS] Debugging/fixing a kernel stalled not crashing

2022-08-20 Thread tlaronde
On Thu, Aug 18, 2022 at 04:33:04PM +0200, tlaro...@polynum.com wrote:
> Context: I rent a baremetal server and try to install NetBSD on it. I
> finally installed a Linux (Debian) and installed NetBSD as a dual boot.
> But NetBSD doesn't come up (in case there was a
> network misconfiguration, I verified that no log, no dmesg was written),
> nor does it crash and reboot (I use the GRUB2 boot-once feature, so
> if that were the case, the server would go back to Debian, and
> it doesn't).
> 

So:

- I have installed a Linux/Debian and I'm using GRUB2 to chainload
the stage1 block in order to load the NetBSD kernel, using the booting
once feature of GRUB2 so that if something goes wrong, I can go back
to the Linux/Debian;

- I have set up (since I can see nothing of the boot process) a /boot.cfg
with several choices, and I change the default so that each boot
chainloaded by GRUB2 tries something different (since I haven't found
a way to mount ffs read-write under Linux, I use qemu-system-x86_64,
under Debian, to write and modify the NetBSD partitions);

- The machine is an Intel Xeon, quadcore, IvyBridge. Since the GENERIC
kernel does not boot, I have compiled a custom 9.3, stripping everything
unneeded, and adding this feature (commented out in the GENERIC config):

acpismbus*  at acpi?  # ACPI SMBus CMI (experimental)

since, from x86/pci/imcsmb/imc.c, there are some peculiarities about
the (Sandy,Ivy)bridge with the Xeon.

With the framebuffer (i915drmkms) disabled via userconf, and with SMP
disabled, NetBSD boots on the machine. The dmesg is here:

http://downloads.kergis.com/misc/rpt_netbsd9.3_monocore_no-fb.dmesg
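
For the record, a /boot.cfg with several entries and a selectable default,
as described above, can be generated rather than edited by hand inside
qemu each time. This is only a sketch: the "menu=label:cmd;cmd", "default="
and "timeout=" lines follow boot.cfg(5) and the "userconf" command is the
one from boot(8), as far as I understand them, but the kernel file name
netbsd.custom and the labels are hypothetical, and this is not the exact
file I used.

```python
def render_boot_cfg(entries, default=1, timeout=5):
    """Render boot.cfg text from (label, [commands]) pairs.

    Each menu line has the form "menu=label:cmd1;cmd2"; default is the
    1-based number of the entry booted when the timeout expires.
    """
    if not 1 <= default <= len(entries):
        raise ValueError("default must name an existing entry")
    lines = ["menu=%s:%s" % (label, ";".join(cmds)) for label, cmds in entries]
    lines.append("timeout=%d" % timeout)
    lines.append("default=%d" % default)
    return "\n".join(lines) + "\n"


# Hypothetical entries: kernel names and labels are placeholders.
entries = [
    ("Custom kernel, no framebuffer",
     ["userconf disable i915drmkms*", "boot netbsd.custom"]),
    ("GENERIC, no framebuffer",
     ["userconf disable i915drmkms*", "boot netbsd"]),
    ("GENERIC", ["boot netbsd"]),
]
print(render_boot_cfg(entries, default=1), end="")
```

Changing only the default= value between two attempts is what makes the
"one try per chainloaded boot" approach workable without a console.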

Since I fought quite a lot with Debian, GRUB2 and so on for the
installation and the boot process, I still have to verify whether an SMP
version of the same kernel boots or not.

If an SMP kernel does not boot, I will come back to the list for tips
about how best to gain information about what is going wrong, in order
to try to fix it or help to fix it.


Re: Debugging/fixing a kernel stalled not crashing

2022-08-19 Thread tlaronde
Hello,

On Fri, Aug 19, 2022 at 02:36:33PM +0100, David Brownlee wrote:
> Tangentially...
> 
> If it's an issue picking up the root filesystem, you could boot an
> INSTALL type kernel with a built in ramdisk with dhcpcd and sshd
> enabled, and see if you can ssh into the box (I think someone had
> pre-built arm images which did just that, so the code should be out
> there :)

Yes, I plan to test this also, depending on the stage at which my reboot
tactics indicate the problem is. The aim is to be able to connect to a
running kernel. Once that is achieved, the hardest part will be done.

I have already built a custom kernel (with acpismbus* added, since the
machine is an IvyBridge and it is related, and it's not in GENERIC) and
will start to debug tomorrow.

Best,


Debugging/fixing a kernel stalled not crashing

2022-08-18 Thread tlaronde
Context: I rent a baremetal server and am trying to install NetBSD on it.
I finally installed a Linux (Debian) and installed NetBSD as a dual boot.
But NetBSD doesn't come up (in case there was a
network misconfiguration, I verified that no log, no dmesg was written),
nor does it crash and reboot (I use the GRUB2 boot-once feature, so
if that were the case, the server would go back to Debian, and
it doesn't).

I can't "see" the boot process (no IPMI for this entry level offer), but
I have at least the dmesg from Linux for the description of the machine,
and I'd like to give it a try to see if I can find the culprit and,
this being identified, manage to correct it.

In order to bisect the problem, it seems that the simplest would be
to place a cpu_reboot() at various steps to identify the culprit since,
if it reboots, I will be back to Debian and hence will know that "until
this" it is OK.
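
The bisection described above amounts to a binary search over an ordered
list of initialization steps, where one "probe" is a full cycle: build a
kernel with the forced reboot placed before a given step, boot it once,
and see whether the machine comes back to Debian. A minimal sketch of the
bookkeeping, in Python; the step names are placeholders and the probe
function stands for that manual cycle, so nothing here is actual NetBSD
code.

```python
def first_bad_step(steps, reboots_when_placed_before):
    """Binary search for the first init step that never completes.

    steps: ordered names of the initialization steps under test.
    reboots_when_placed_before(i): True if a kernel with a forced reboot
    placed just before steps[i] came back, i.e. steps[0..i-1] completed.
    """
    # Invariant: a reboot placed before steps[lo] is reached, while one
    # placed before position hi (hi == len(steps) means "after everything")
    # is not.
    lo, hi = 0, len(steps)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reboots_when_placed_before(mid):
            lo = mid                # everything before mid is fine
        else:
            hi = mid                # the hang is somewhere before mid
    return steps[lo]


# Simulated run: ten placeholder steps, the kernel hangs inside step 6.
steps = ["step%d" % i for i in range(10)]
print(first_bad_step(steps, lambda i: i <= 6))   # step6
```

With n candidate places this needs about log2(n) kernel builds and boots
instead of n, which matters when each probe costs a compile plus a reboot
of a remote machine.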

Questions:

1) Is src/sys/kern/init_main.c the correct file to start the bisection
with?

2) From what stage on would a problem almost certainly cause a
reboot (DDB_ONPANIC being unset), so that I can know that the problem
very likely lies before it? I would then perhaps work backwards from
this point;

3) Are there places where cpu_reboot() may leave the hardware in such a
state that a soft reset will perhaps not bring the machine back and
allow the boot sequence to succeed (or is cpu_reboot() immune
to this)?

TIA,


Re: pcc [was Re: valgrind]

2022-03-21 Thread tlaronde
On Mon, Mar 21, 2022 at 08:54:43AM -0400, Mouse wrote:
> >> I've been making very-spare-time progress on building my own
> >> compiler on and off for some years now; perhaps I'll eventually get
> >> somewhere.  [...]
> > Have you looked at pcc?  http://pcc.ludd.ltu.se/ and in our source
> > tree in src/external/bsd/pcc .
> 
> No, I haven't.  I should - it may well end up being quicker to move an
> existing compiler in the directions I want to go than to write my own.
> 

And FWIW, there is also the collection of compilers in Plan9, that has
now been released under the MIT license: https://p9f.org/ (Plan9
foundation).


Re: Kernel 9.1 panic with azalia

2021-06-26 Thread tlaronde
On Sat, Jun 26, 2021 at 06:49:17AM +0200, Martin Husemann wrote:
> Also any reason to use 9.1 instead of 9.2 or 9.2_STABLE?
> (Not that I think it would make a difference for azalia)

Practical reason: I first update the node on which I do my main
programming/development work. Then, once I have verified that things
are running, and after some delay (especially if the node is a remote
production server that cannot be updated easily and that, for safety, I
update only when I have physical access to it in case of problems; this
time there was one), I bring the other nodes in sync so as not to have
to cross-compile between NetBSD versions.

When I updated the development node, NetBSD was at 9.1.

Since, as far as I know (not much), virtualization always(?) presents a
defined, common pseudo-hardware interface, I imagine that there is no
virtualization that would allow testing a kernel in a VM
with access to an image of the real hardware present, so that one
could verify that a tentative kernel will run on the actual hardware
before switching kernels?

I still have to verify whether a UEFI bootloader allows implementing,
by scripting, a "boot once", so that if a new kernel (on a remote host)
crashes, the machine reboots with a kernel that is known to work. It is
probably possible to implement this thanks to the persistent storage of
UEFI variables.
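
The logic I have in mind is simply "read the request and clear it in the
same step", so that a crash during the trial boot cannot cause a second
trial. A minimal sketch, with a Python dict standing in for the persistent
variable store; the variable name TryOnce and the kernel file names are
hypothetical. (The UEFI specification does define a BootNext variable
with one-shot semantics for boot options; whether it can be used for
selecting a kernel here is exactly what I have not verified.)

```python
def choose_kernel(nvram, default="netbsd.ok"):
    """Pick the kernel for this boot and consume the one-shot request.

    nvram stands in for persistent variable storage; "TryOnce" is a
    hypothetical variable name, not something UEFI or NetBSD defines.
    """
    candidate = nvram.pop("TryOnce", None)   # read and clear in one step
    return candidate if candidate is not None else default


nvram = {"TryOnce": "netbsd.new"}
print(choose_kernel(nvram))   # netbsd.new  (the trial boot)
print(choose_kernel(nvram))   # netbsd.ok   (any later boot, e.g. after a crash)
```

The important design point is that the request is cleared before the new
kernel is started, not after it has booted successfully.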


Re: Kernel 9.1 panic with azalia

2021-06-26 Thread tlaronde
On Fri, Jun 25, 2021 at 09:32:40PM +, RVP wrote:
> On Fri, 25 Jun 2021, RVP wrote:
> 
> >On Fri, 25 Jun 2021, tlaro...@polynum.com wrote:
> >
> >>But if azalia is not supported anymore because it crashes the
> >>kernel, shouldn't it be removed and not simply be commented out?
> >>
> >
> >I think that your message is the first indication that azalia(4)
> >is slowly bit-rotting...
> >
> 
> Just checked, and azalia no longer exists in the 9.99[.82] tree.

Thanks for checking!


Re: Kernel 9.1 panic with azalia

2021-06-25 Thread tlaronde
Hello,

On Fri, Jun 25, 2021 at 08:47:30PM +, RVP wrote:
> On Fri, 25 Jun 2021, tlaro...@polynum.com wrote:
> 
> >The new kernel panics at boot time with azalia (it is not crucial since
> >it is a server and I have no use with it but I have added the support
> >since it's here and 7.1.1 has no problem with it).
> >
> 
> You must've compiled a custom kernel. 9.1 GENERIC has the `azalia'
> driver commented out; hdaudio(4) is used instead. Try the same.

Sure. But if azalia is not supported anymore because it crashes the
kernel, shouldn't it be removed rather than simply commented out? (To
give some context: when I build a new kernel, I just adjust the
previous config for things that have been removed or changed, so
I'm mainly diffing GENERIC against GENERIC to see the changes, while my
configs are not GENERIC; I remove support for whatever hardware is
not present or whatever filesystems the node will never use, for
example.)

Thanks for the tip though.

Best,


Kernel 9.1 panic with azalia

2021-06-25 Thread tlaronde
Hello,

I was trying to update a server running NetBSD 7.1.1 (amd64) to
NetBSD 9.1.

The new kernel panics at boot time with azalia (this is not crucial, since
it is a server and I have no use for it, but I added the support
since the hardware is there and 7.1.1 has no problem with it).

It's a production server, so I cannot easily run tests. Here is the
message (reconstructed by hand from written notes; it may contain
blunders):

azalia0: codec[2]: 0x1106/0x0441 (rev. 1.0), HDA rev. 1.0
panic: kmem_free(0xce000801,11) != allocated size 1844660333743030159360
vpanic() at netbsd:vpanic +0x143
snprintf() at netbsd:snprintf
kmem_alloc() at netbsd:kmem_alloc
generic_mixer_ensure_capacity() at netbsd:generic_mixer_ensure_capacity +0x7b
generic_mixer_init() at netbsd:generic_mixer_init +0x1143
azalia_attach_intr() at netbsd:azalia_attach_intr +0xbf8
config_interrupts_thread() at netbsd:config_interrupts_thread +0x7e
cpu0: End Traceback
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x8021cc2d cs 0x8 rflags 0x202 0 ilevel 0 rsp 0xcc006741
curlwp 0x060092cacd80 pid 0.37 lowest kstack 0xcc00674192e0
stopped in pid 0.37 (system) at netbsd:breakpoint: 0x5: leave

Note: with the 7.1.1 kernel, for azalia I have:

azalia0: codec[2]: 0x1106/0x0441 (rev. 1.0), HDA rev. 1.0
azalia0: codec[3]: 0x8086/0x2806 (rev. 0.0), HDA rev. 1.0

The size in the panic is nonsense.
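
A quick arithmetic check of why: as transcribed, the value does not even
fit in a 64-bit size_t, so either my hand copy is wrong or the kernel read
garbage where the allocation size should have been. The sketch below only
checks the number as written above; the helper name is mine.

```python
REPORTED = 1844660333743030159360      # as transcribed in the panic message
SIZE_MAX_64 = 2**64 - 1                # largest value a 64-bit size_t can hold


def fits_in_size_t(value, bits=64):
    """True if value is representable as an unsigned integer of that width."""
    return 0 <= value < 2**bits


print(fits_in_size_t(REPORTED))                      # False
print(len(str(REPORTED)), len(str(SIZE_MAX_64)))     # 22 20
```

The transcription has 22 decimal digits where a 64-bit value has at most
20, so at least two digits of my copy are spurious.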

Hoping this gives enough clues to debug.

TIA,


Re: Ext4 support

2021-04-30 Thread tlaronde
On Fri, Apr 30, 2021 at 06:51:10AM -0500, Jonathan A. Kollasch wrote:
> On Fri, Apr 30, 2021 at 12:56:04PM +0200, tlaro...@polynum.com wrote:
> > There is excellent support, thanks to Reinoud Zandijk, in NetBSD for
> > UDF. And this is cross-system (I use it to share---not distribute: it's
> > not a NFS or a Samba---back-ups between NetBSD and MS Windows).
> > 
> 
> It's only excellent if you have access to a functional UDF fsck
> program.  NetBSD and Linux do not have a functional UDF fsck.

Yes, this is what is lacking. Some time ago I offered to contribute some
money (not thousands of euros, but at least a few hundred) so that
someone(TM), maybe Reinoud Zandijk, could work on this.

IMHO, it's something worth adding, since a cross-system FS is the
solution for sharing (once more: not serving distributed data, but at
least sharing).


Re: Ext4 support

2021-04-30 Thread tlaronde
On Thu, Apr 29, 2021 at 10:06:05PM +0200, Vincent DEFERT wrote:
> 
> On 29/04/2021 20:34, Christos Zoulas wrote:
> >Some ext4 features were implemented as part of GSoC 2016 (extents,
> >htrees).
> >I am sure that there are other unimplemented features. What are you looking
> >for?
> >
> >christos
> >
> 
> I'd like to have full ext4 support so an ext4-formatted disk could be used
> to exchange data between Linux and NetBSD, for instance.
> 
> If some features have already been implemented, I guess it has been decided
> to put them in /usr/src/sys/ufs/ext2fs and to keep that name.
> So now, I know where to start. :)
> 
> There is also the question of the specifications: for now, I just have the
> Linux kernel sources and the wiki (https://ext4.wiki.kernel.org/).
> I'm not aware of a more formal specification, but if one exists it would
> help avoid the risk of being too influenced by GPL'd source code.

There is excellent support for UDF in NetBSD, thanks to Reinoud Zandijk.
And it is cross-system (I use it to share back-ups between NetBSD and
MS Windows; share, not distribute: it's not NFS or Samba).

So you might try this instead of a Linux-only thing.

My 2 cents,


Re: fsync error reporting

2021-02-19 Thread tlaronde
On Fri, Feb 19, 2021 at 01:43:07AM +, David Holland wrote:
> [...]
> 
> (9) We need a model for what happens to the unwritten data. Throwing
> it away is clearly wrong (some may recall a furor a couple years ago
> when it was discovered that Linux did this) but retrying and likely
> failing on every subsequent fsync attempt isn't that useful either.
> My suggestion is to allow retrying up to some arbitrary fixed number
> of times and then mark the buffer broken, and provide some out-of-band
> way to either discard everything (umount -f?) or start retrying again,
> e.g. after manually reinserting accidentally ejected media.
> 

FWIW, perhaps the concept of a dedicated, separate recovery data store
(not necessarily a physical local disk; it could be remote direct or
indirect storage, memory with backup power, etc.) could be envisioned for
high reliability: the unwritten blocks would be written there together
with information recording what, where and when, so that they can be
fixed or replayed later.

From a superficial point of view, the problems all seem very complicated
at the kernel level. It would be far simpler to have the kernel allow
exclusive write access to only one process and let multiplexing be
handled by a file server in user space; this file server would actually
be the only one to write and read, acting as the proxy for other
processes, delivering failure messages to whoever is interested, and
allowing partial file locks too.

This is probably not worth more than 2 cts [and don't expect any code in
any reasonable future ;-)].


Re: [FOUND] kernel 8.2 and 9.1 crashes

2020-11-13 Thread tlaronde
Hello,

On Fri, Nov 13, 2020 at 08:42:03AM +0100, Martin Husemann wrote:
> On Fri, Nov 13, 2020 at 07:35:24AM +0100, tlaro...@polynum.com wrote:
> > I tried to recompile a kernel, with 8.2 and with 9.1 and both
> > crash, 9.1 with:
> > 
> > unable to execute instruction 0x18 (SMEP)
> > 
> > (from memory)
> 
> This is (I guess) the kernel jumping through a NULL function pointer.

The problem is with:

options PCKBD_CNATTACH_MAY_FAIL

The option was commented out in my config.

That I have no keyboard during the boot process without the option (my
keyboard is USB) is OK. But that it crashes...

Obviously this "option" is not an option anymore, so it should always be
on and not settable, unless someone can find why it crashes now, from 8.2
on, while it didn't before (the framebuffer and related support now seems
to be a lot of code, so the culprit is probably to be found there).

Best,


kernel 8.2 and 9.1 crashes

2020-11-12 Thread tlaronde
I tried to recompile a kernel, with 8.2 and with 9.1 and both
crash, 9.1 with:

unable to execute instruction 0x18 (SMEP)

(from memory)

The kernel enters the debugger but, the keyboard being unusable (no key
does anything), I have to hard reboot.

The last message (via dmesg) from 9.1 is:

[   1.7964198] ahcisata0 port 3: device present, speed: 6.0Gb/s

It works with 8.0.

Here is the dmesg from 8.0:

Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
2018 The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 8.0 (CONFIG) #0: Thu Apr 16 18:47:07 CEST 2020

tlaronde@cauchy.polynum.local:/usr/obj/polynum.NODECONF-cauchy.polynum.local_netbsd-8.0-amd64_netbsd-amd64/obj/sys/arch/amd64/compile/CONFIG
total memory = 8120 MB
avail memory = 7868 MB
cpu_rng: RDRAND
rnd: seeded with 128 bits
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
MSI MS-7823 (1.0)
mainbus0 (root)
ACPI: RSDP 0x000F04A0 24 (v02 ALASKA)
ACPI: XSDT 0xDDF9A078 74 (v01 ALASKA A M I01072009 AMI  00010013)
ACPI: FACP 0xDDFA7AC8 00010C (v05 ALASKA A M I01072009 AMI  00010013)
ACPI: DSDT 0xDDF9A188 00D940 (v02 ALASKA A M I0034 INTL 20120711)
ACPI: FACS 0xDDFC7F80 40
ACPI: APIC 0xDDFA7BD8 62 (v03 ALASKA A M I01072009 AMI  00010013)
ACPI: FPDT 0xDDFA7C40 44 (v01 ALASKA A M I01072009 AMI  00010013)
ACPI: SSDT 0xDDFA7C88 000539 (v01 PmRef  Cpu0Ist  3000 INTL 20120711)
ACPI: SSDT 0xDDFA81C8 000AD8 (v01 PmRef  CpuPm3000 INTL 20120711)
ACPI: MCFG 0xDDFA8CA0 3C (v01 ALASKA A M I01072009 MSFT 0097)
ACPI: HPET 0xDDFA8CE0 38 (v01 ALASKA A M I01072009 AMI. 0005)
ACPI: SSDT 0xDDFA8D18 00036D (v01 SataRe SataTabl 1000 INTL 20120711)
ACPI: SSDT 0xDDFA9088 0034E1 (v01 SaSsdt SaSsdt   3000 INTL 20091112)
ACPI: ASF! 0xDDFAC570 A5 (v32 INTEL   HCG 0001 TFSM 000F4240)
ACPI: Executed 1 blocks of module-level executable AML code
ACPI: 5 ACPI AML tables successfully acquired and loaded
ioapic0 at mainbus0 apid 8: pa 0xfec0, version 0x20, 24 pins
cpu0 at mainbus0 apid 0
cpu0: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu0: package 0, core 0, smt 0
cpu1 at mainbus0 apid 2
cpu1: Intel(R) Pentium(R) CPU G3220 @ 3.00GHz, id 0x306c3
cpu1: package 0, core 1, smt 0
acpi0 at mainbus0: Intel ACPICA 20170303
acpi0: X/RSDT: OemId , AslId 
acpi0: MCFG: segment 0, bus 0-63, address 0xf800
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0xFE821BD9E010 0003D3 (v01 PmRef  Cpu0Cst  3001 INTL 20120711)
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0xFE810E813810 0005AA (v01 PmRef  ApIst3000 INTL 20120711)
ACPI: Dynamic OEM Table Load:
ACPI: SSDT 0xFE821BCFB1D0 000119 (v01 PmRef  ApCst3000 INTL 20120711)
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed0-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
acpiec0 at acpi0 (H_EC, PNP0C09-1)acpiec0: unable to evaluate _GPE: AE_NOT_FOUND
TPMX (PNP0C01) at acpi0 not configured
FWHD (INT0800) at acpi0 not configured
LDRC (PNP0C02) at acpi0 not configured
attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 irq 0
CWDT (INT3F0D) at acpi0 not configured
SIO1 (PNP0C02) at acpi0 not configured
com2 at acpi0 (UAR1, PNP0501-1): io 0x3f8-0x3ff irq 4
com2: ns16550a, working fifo
lpt2 at acpi0 (LPTE, PNP0400): io 0x378-0x37f irq 5
RMSC (PNP0C02) at acpi0 not configured
acpiwmi0 at acpi0 (WMI1, PNP0C14-MXM2): ACPI WMI Interface
acpiwmibus at acpiwmi0 not configured
PDRC (PNP0C02) at acpi0 not configured
acpibut0 at acpi0 (PWRB, PNP0C0C-170): ACPI Power Button
acpiwmi1 at acpi0 (WMIO, PNP0C14-0): ACPI WMI Interface
acpiwmibus at acpiwmi1 not configured
PTMD (INT3394) at acpi0 not configured
acpifan0 at acpi0 (FAN0, PNP0C0B-0): ACPI Fan
acpifan1 at acpi0 (FAN1, PNP0C0B-1): ACPI Fan
acpifan2 at acpi0 (FAN2, PNP0C0B-2): ACPI Fan
acpifan3 at acpi0 (FAN3, PNP0C0B-3): ACPI Fan
acpifan4 at acpi0 (FAN4, PNP0C0B-4): ACPI Fan
acpitz0 at acpi0 (TZ00)
acpitz0: active cooling level 0: 80.0C
acpitz0: active cooling level 1: 55.0C
acpitz0: active cooling level 2: 0.0C
acpitz0: active cooling level 3: 0.0C
acpitz0: active cooling level 4: 0.0C
acpitz0: levels: critical 105.0 C
acpitz1 at acpi0 (TZ01): cpu0 cpu1
acpitz1: levels: critical 105.0 C, passive 108.0 C, passive cooling
ACPI: Enabled 6 GPEs in block 00 to 3F
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 

Re: [FOUND] kernel 9.0 crash on amd64

2020-04-20 Thread tlaronde
Hello,

On Sun, Apr 19, 2020 at 08:17:06PM +0200, tlaro...@polynum.com wrote:
> Hello,
> 
> On Sun, Apr 19, 2020 at 05:10:37PM +, m...@netbsd.org wrote:
> > On Sun, Apr 19, 2020 at 05:29:40PM +0200, tlaro...@polynum.com wrote:
> > > Hello,
> > > 
> > > Mainly in order to be able to test wine, I'm compiling a NetBSD kernel
> > > from netbsd-9-0-RELEASE sources on an amd64 (Intel bicore).
> > > 
> > > > My config has very minimal changes from NetBSD 8.* config, the only
> > > > important modification being USER_LDT (and I'm not putting option
> > > > SVS).
> > > 
> > > When it crashes, keyboard is unavailable and the information repeated is:
> > > 
> > > prevented execution of 0x18 (SMEP)
> > > fatal page fault in supervisor mode
> > > trap type 6 code 0x10 rip 0x18 cs 0x8 rflags 0x10246 cr2 0x18 ilevel 0x8
> > > rsp 0xcc00ae0cbb40
> > > 
> > > The last bit registered and shown by dmesg (when rebooting with the 8.0
> > > kernel) is about enumerating ahcisata0.
> > > 
> > > Does this ring some bell to somebody?
> > > 
> > > TIA,
> > 
> > SMEP is a hardware method to stop execution user-memory.
> > It tried to execute a function pointer that isn't initialized, most
> > likely.
> > 
> > It would be interesting to see what the backtrace is
> > sysctl -w ddb.onpanic=2 will print the backtrace and reboot, which
> > should make it visible in the back of the dmesg.
> > 
> > Also, if a kernel core dump is done, it should be in /var/crash, gunzip
> > and crash -M netbsd.12 -N netbsd.12.core
> > crash> bt
> > 
> > should print a backtrace.

It is the option DIAGNOSTIC that crashes the kernel.

Since the console is frozen and I have no core dump in /var/crash, I set
up DDB so that it prints a backtrace and reboots.

It scrolls by too quickly for me to read the whole backtrace, but it
chokes while wandering in usb_event.

In case it is related (all the USB-attached devices are keyboard and
mouse): PCKBD_CNATTACH_MAY_FAIL is not set.

Best,
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: kernel 9.0 crash on amd64

2020-04-19 Thread tlaronde
Hello,

On Sun, Apr 19, 2020 at 05:10:37PM +, m...@netbsd.org wrote:
> On Sun, Apr 19, 2020 at 05:29:40PM +0200, tlaro...@polynum.com wrote:
> > Hello,
> > 
> > Mainly in order to be able to test wine, I'm compiling a NetBSD kernel
> > from netbsd-9-0-RELEASE sources on an amd64 (Intel bicore).
> > 
> > My config has very minimal changes from NetBSD 8.* config, the only 
> > important
> > modification being USER_LDT (and I'm not putting option SVS).
> > 
> > When it crashes, keyboard is unavailable and the information repeated is:
> > 
> > prevented execution of 0x18 (SMEP)
> > fatal page fault in supervisor mode
> > trap type 6 code 0x10 rip 0x18 cs 0x8 rflags 0x10246 cr2 0x18 ilevel 0x8
> > rsp 0xcc00ae0cbb40
> > 
> > The last bit registered and shown by dmesg (when rebooting with the 8.0
> > kernel) is about enumerating ahcisata0.
> > 
> > Does this ring some bell to somebody?
> > 
> > TIA,
> 
> SMEP is a hardware method to stop execution user-memory.
> It tried to execute a function pointer that isn't initialized, most
> likely.
> 
> It would be interesting to see what the backtrace is
> sysctl -w ddb.onpanic=2 will print the backtrace and reboot, which
> should make it visible in the back of the dmesg.
> 
> Also, if a kernel core dump is done, it should be in /var/crash, gunzip
> and crash -M netbsd.12 -N netbsd.12.core
> crash> bt
> 
> should print a backtrace.

Since there was no dump in /var/crash and the messages were frozen with
keyboard not responding, I commented out all the pckbd* isa stuff,
keeping only the ws* stuff related to USB keyboard and mouse.

And this time, the kernel boots...

I will have (later this week) to try to re-enable some options to
pinpoint what the offending bit is (but pckbd* is the most likely
culprit; I also had a COMPAT_BSDPTY left, but I doubt it could have
any effect if the related code no longer exists in the 9.x
branch...).

FWIW, here is the diff between my 8.0 config and the 9.0 _booting_
one:

Index: node.mdec
===
RCS file: /data/cvs/priv/2/4/cauchy/node.mdec,v
retrieving revision 1.11
diff -u -r1.11 node.mdec
--- node.mdec   17 Aug 2019 12:49:01 -  1.11
+++ node.mdec   19 Apr 2020 17:47:50 -
@@ -55,6 +55,10 @@
 
 options   INSECURE   # disable kernel security levels - X needs this
 
+##9
+options   AUDIO_BLK_MS=4   # make software with low latency needs performant
+                           # no substantial CPU overhead on this platform
+
 options   RTC_OFFSET=0   # hardware clock is this many mins. west of GMT
 options   NTP   # NTP phase/frequency locked loop
 
@@ -73,6 +77,11 @@
 #options   PIPE_SOCKETPAIR   # smaller, but slower pipe(2)
 options   SYSCTL_INCLUDE_DESCR   # Include sysctl descriptions in kernel
 
+# CPU-related options.
+options   USER_LDT   # user-settable LDT; used by WINE
+##9
+#no options SVS
+
 # CPU features
 acpicpu*   at cpu?   # ACPI CPU (including frequency scaling)
 coretemp*  at cpu?   # Intel on-die thermal sensor
@@ -94,12 +103,13 @@
 #
 makeoptions   COPTS="-O2 -fno-omit-frame-pointer"
 options   DDB   # in-kernel debugger
-options   DIAGNOSTIC   # inexpensive kernel consistency checks
-#options   DDB_ONPANIC=1   # see also sysctl(8): `ddb.onpanic'
+options   DDB_COMMANDONENTER="bt"   # execute command when ddb is entered
+#options   DIAGNOSTIC   # inexpensive kernel consistency checks
+options   DDB_ONPANIC=0   # see also sysctl(7): `ddb.onpanic'
 options   DDB_HISTORY_SIZE=512   # enable history editing in DDB
 #options   KGDB   # remote debugger
 #options   KGDB_DEVNAME="\"com\"",KGDB_DEVADDR=0x3f8,KGDB_DEVRATE=9600
-#makeoptions   DEBUG="-g"   # compile full symbol table
+makeoptions   DEBUG="-g"   # compile full symbol table
 #options   SYSCALL_STATS   # per syscall counts
 #options   SYSCALL_TIMES   # per syscall times
 #options   SYSCALL_TIMES_HASCOUNTER   # use 'broken' rdtsc (soekris)
@@ -108,6 +118,7 @@
 options   COMPAT_50   # NetBSD 5.0 compatibility,
 options   COMPAT_60   # NetBSD 6.0 compatibility.
 options   COMPAT_70   # NetBSD 7.0 binary compatibility.
+options   COMPAT_80   # de dicto
 
 options   COMPAT_OSSAUDIO
 options   COMPAT_NETBSD32
@@ -115,7 +126,7 @@
 options   COMPAT_LINUX32   # req. COMPAT_LINUX and COMPAT_NETBSD32
 options   EXEC_ELF32
 # this one needed by xterm(1) which uses for now BSD ptys.
-options   COMPAT_BSDPTY   # /dev/[pt]ty?? ptys.
+#options   COMPAT_BSDPTY   # /dev/[pt]ty?? ptys.
 
 # Wedge support
 options   DKWEDGE_AUTODISCOVER   # Automatically add dk(4) instances
@@ -252,6 +263,9 @@
 acpiout*   at acpivga?   # ACPI Display Output Device
 acpiwdrt*  at acpi?   # ACPI Watchdog Resource Table
 

kernel 9.0 crash on amd64

2020-04-19 Thread tlaronde
Hello,

Mainly in order to be able to test wine, I'm compiling a NetBSD kernel
from netbsd-9-0-RELEASE sources on an amd64 (Intel bicore).

My config has very minimal changes from NetBSD 8.* config, the only important
modification being USER_LDT (and I'm not putting option SVS).

When it crashes, keyboard is unavailable and the information repeated is:

prevented execution of 0x18 (SMEP)
fatal page fault in supervisor mode
trap type 6 code 0x10 rip 0x18 cs 0x8 rflags 0x10246 cr2 0x18 ilevel 0x8
rsp 0xcc00ae0cbb40

The last bit registered and shown by dmesg (when rebooting with the 8.0
kernel) is about enumerating ahcisata0.

Does this ring some bell to somebody?

TIA,
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: NULL pointer arithmetic issues

2020-02-24 Thread tlaronde
On Mon, Feb 24, 2020 at 05:35:22PM -0500, Mouse wrote:
> > Unless I remember wrong, older C standards explicitly say that the
> > integer 0 can be converted to a pointer, and that will be the NULL
> > pointer, and a NULL pointer cast as an integer shall give the value
> > 0.
> 
> The only one I have anything close to a copy of is C99, for which I
> have a very late draft.
> 
> Based on that:
> 
> You are not quite correct.  Any integer may be converted to a pointer,
> and any pointer may be converted to an integer - but the mapping is
> entirely implementation-dependent, except in the integer->pointer
> direction when the integer is a "null pointer constant", defined as
> "[a]n integer constant expression with the value 0" (or such an
> expression cast to void *, though not if we're talking specifically
> about integers), in which case "the resulting pointer, called a null
> pointer, is guaranteed to compare unequal to a pointer to any object or
> function".  You could have meant that, but what you wrote could also be
> taken as applying to the _run-time_ integer value 0, which C99's
> promise does not apply to.  (Quotes are from 6.3.2.3.)
> 
> I don't think there is any promise that converting a null pointer of
> any type back to an integer will necessarily produce a zero integer.
> 

The wording was the same for C89 and there is this paragraph in K&R
(second edition, p. 102):

"Pointers and integers are not interchangeable. Zero is the sole
exception: the constant zero may be assigned to a pointer, and a pointer
may be compared with the constant zero. The symbolic constant NULL is
often used in place of zero, as a mnemonic to indicate more clearly that
this is a special value for a pointer. [...]"

I interpret this (the paragraph above and the standard) as follows: when
comparing a pointer to the constant zero, the constant zero is converted
to a null pointer, so the comparison is pointer to pointer and not
integer value (the integer value of the pointer) to integer value (0).

So defining NULL as a cast of 0 is (was?) in the C standard, the
actual value of the expression, i.e. of an invalid (null) pointer,
being implementation-defined.

FWIW,
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Fonts for console/fb for various locales: a proposal

2019-09-30 Thread tlaronde
On Mon, Sep 30, 2019 at 02:23:02PM +0200, Piotr Meyer wrote:
> On Mon, Sep 30, 2019 at 11:01:51AM +0200, tlaro...@polynum.com wrote:
> > On Mon, Sep 30, 2019 at 10:32:40AM +0200, Martin Husemann wrote:
> > > I guess noone would object a metafont2wsfont converter tool.
> > > Look at the true type tool Michael mentioned in xsrc/local and do 
> > > something
> > > similar for metafont.
> > 
> > I have already planed to re-start with the Hershey fonts, for reasons
> > explained in my initial mail and for others and this will be combined
> > with TeX (kerTeX). So there will probably be something in this
> > line, at the end, even if it is only for my own use.
> 
> Sorry for late comment but I would like to suggest mlterm-fb as
> - probably - easiest solution for Your case (if I understood problem
> correctly, of course). mlterm running in framebuffer console is
> capable to use wide range of standard X fonts[1] without hassle.
> 
> If You want to convert fonts to wsfont You may take a look at some
> additional resources. In addition to already mentioned there is also
> my small tool[2], created for my own work for bitmap terminus fonts
> (see [3] for gallery) - it isn't useful for converting vectors, but
> may provide a hints about your own methods of mapping from UTF codes
> or Adobe names to particular code pages (wide range of definitions is
> provided by original terminus package, for my case I made only one,
> for cp437).
> 
> 1 - https://www.mail-archive.com/netbsd-users@netbsd.org/msg10136.html
> 2 - https://github.com/aniou/bdf2wsfont
> 3 - http://smutek.pl/netbsd/wsfont/terminus/
> 4 - http://terminus-font.sourceforge.net/
> 

Thank you for the links. When I tackle the task I will also provide
short explanations of what the different pieces achieve, and comparisons
between solutions. For example, since this has been totally lost in the
huge haystack of TeX Live, I guess that very few people know about
virtual fonts, and even fewer know how they work; few people can make
the link between METAFONT, FreeType or Cairo; few people know how DVI
compares to PDF, or that one can compare METAFONT/TeX/DVI with PS, doing
in three what is done in one with the full-fledged PS programming
language (which later led to dropping a part of PS to keep only PDF in
a wide range of cases); etc.
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Fonts for console/fb for various locales: a proposal

2019-09-30 Thread tlaronde
On Mon, Sep 30, 2019 at 10:32:40AM +0200, Martin Husemann wrote:
> I guess noone would object a metafont2wsfont converter tool.
> Look at the true type tool Michael mentioned in xsrc/local and do something
> similar for metafont.

I have already planned to restart with the Hershey fonts, for reasons
explained in my initial mail and for others, and this will be combined
with TeX (kerTeX). So there will probably be something along these
lines in the end, even if it is only for my own use.

The next visible step will be on the users mailing list, to hopefully
find Japanese-speaking users able to "sort" the oriental glyphs once I
have produced the whole rendering of the Hershey fonts.
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Proposal, again: Disable autoload of compat_xyz modules

2019-09-27 Thread tlaronde
On Fri, Sep 27, 2019 at 08:30:40AM +0200, Martin Husemann wrote:
> On Thu, Sep 26, 2019 at 09:40:22PM +0200, tlaro...@polynum.com wrote:
> > If the vulnerabilities can only be exploited by running Linux binaries,
> > IMHO, the point is moot: the ones that don't run Linux binaries are not
> > affected; the ones that do need to run some Linux binaries will have to
> > add the feature so this adds a user's intervention for the very same
> > result at the end.
> 
> I guess the main fear is that the attacker can put a malicious (and likely
> explicitly crafted for a certain bug in NetBSD's linux compat) binary on
> your machine and exectue it. If you have no untrusted local users
> and no admin installed linux binaries, the risc should be quite small.

Well, I don't think "trusted local users" exist anymore, because they
bring with them (or is it the reverse? The device brings them)
iPhones or whatever, connect them, and download applications...

Slightly related: is NetBSD providing build services so that someone,
not wanting to open his sources, could at least build his program for
NetBSD without installing it? Because the best way to avoid the
compatibility is to have native NetBSD binaries.
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Proposal, again: Disable autoload of compat_xyz modules

2019-09-26 Thread tlaronde
On Thu, Sep 26, 2019 at 10:17:51AM +0200, Maxime Villard wrote:
> I recently made a big set of changes to fix many bugs and vulnerabilities in
> compat_linux and compat_linux32, the majority of which have a security impact
> bigger than the Intel CPU bugs we hear about so much. These compat layers are
> enabled by default, so everybody is affected.
> 

I'm just a user, so I only have a question about the scope of the
problem:

Do the bugs and vulnerabilities in compat_linux*, due to the compat
glue added, open code paths that a non-Linux program can exploit as
security threats, or are the vulnerabilities only a problem if a Linux
binary is run (and perhaps other, e.g. SCO, binaries)?

Because, as I see it, if this opens security problems even for those
who do _not_ use Linux (or other alien) binaries, then not including
the compat_linux* modules by default seems reasonable (you are not
talking about the NetBSD ABI compatibility, right?), as long as those
who have to run alien programs can still easily add the feature by
loading a module (even through a post-install fix for pkgsrc programs).

If the vulnerabilities can only be exploited by running Linux binaries,
IMHO, the point is moot: the ones that don't run Linux binaries are not
affected; the ones that do need to run some Linux binaries will have to
add the feature so this adds a user's intervention for the very same
result at the end.

Furthermore, if the compatibility code adds or opens problems that do
not exist in the emulated Linux system, it means that the security
problem can only be exploited by someone creating a special version of
a Linux program, hoping it will be run under compat on NetBSD...
Security is a matter of probabilities for me, and it seems that bugs
(crashes), that is, unintentional "infelicities", are more probable
than malice in this case...

Once more, I'm just a user. So the only thing I'm looking for is a
clarification of the scope of the problem; I will obviously cope with
whatever decision is reached since I'm definitely not prepared to fork
:-^

Best,
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: bounty for fsck_udf(8) for shared disks

2019-06-20 Thread tlaronde
Hello Reinoud,

On Thu, Jun 20, 2019 at 03:12:12PM +0200, Reinoud Zandijk wrote:
> Hi Thierry,
> 
> On Fri, Jun 14, 2019 at 12:19:11PM +0200, tlaro...@polynum.com wrote:
> > So I'd like to see the good work made by Reinoud Zandijk put a step 
> > further with a robust fsck_udf(8) for using indeed UDF with non optical
> > disks.
> 
> I've started fsck_udf by first refactoring newfs_udf, makefs -t udf to use a
> common core that i would like to use with fsck_udf for its patch-up work.
> 
> I presume you format the discs for UDF v2.01? UDF v2.50 doesn't have the
> support for resizing the metadata partition yet, so better use UDF v2.01 for
> now, the default.
> 
> I'll try to tackle fsck_udf for UDF v2.01 on discs first then, next to
> recordable media.
> 

Thanks for the very good work you have already done! (It is the most
advanced support amongst the BSDs, if I'm not mistaken.)

Yes, I use v2.01 by default (Windows uses this too) and it works
quite well; IMHO this is the best "portable" format and it should be
encouraged.

Thanks for tackling this! And if the NetBSD Foundation wants to
solicit me for some extra donation to allow you to devote some
time to it, I'm OK.

Best regards,
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


bounty for fsck_udf(8) for shared disks

2019-06-14 Thread tlaronde
Hello,

Context: I have a NetBSD fileserver serving files to mainly various
MS/Windows nodes and some NetBSD ones. The fileserver is making also
various backups among which, in order to plan for disaster, one backup 
is made on USB removable disks that have to be directly readable by 
Windows nodes so that work could continue even if the fileserver
was totally inaccessible for some time.

I have been using UDF for removable USB disks shared with MS/Windows
and, if the GPT partitioning is done according to MS/Windows
expectations (or simply done initially under MS/Windows), it
works more satisfactorily than trying to use ntfs-3g (which is not
even available for NetBSD 8.x since the modification of fuse or
refuse or whatever it depends upon), the latter being particularly
slow.

And since UDF is an open specification, it should be preferred.

The principal gap is the lack of fsck_udf(8). I had a problem (due more
to USB, I think, than to UDF) and I had to recover the disk with
MS/Windows chkdsk on the command line; NetBSD was unable to recover it.

So I'd like to see the good work made by Reinoud Zandijk put a step 
further with a robust fsck_udf(8) for using indeed UDF with non optical
disks.

I'm willing to donate some money to support the effort.

Best,
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Reboot resistant USB bug

2018-10-13 Thread tlaronde
Hello,

On Sat, Oct 13, 2018 at 08:31:43AM +0100, Iain Hibbert wrote:
> On Thu, 11 Oct 2018, Emmanuel Dreyfus wrote:
> 
> > Hello
> > 
> > On both netbsd-8 and -current, I have a problem with USB devices that
> > get stuck in a non-functionning state even after a reboot.
> > 
> > This happens after interrupting transfer with different NFC readers 
> > from different vendors, and the only way to recover the device is 
> > to power-cycle it. I wonder if there could be a missing step in the 
> > way we initialize USB devices that could explain that situation.
> 
> This is a 'state' issue which does not change unless the device is power 
> cycled, which we do not generally do as part of the init AFAIK. I noticed 
> this with Bluetooth adaptors many years ago and we issue a reset because 
> of that but it doesn't affect the USB part of the device and adaptors 
> sometimes do fail to restart on reboot.
> 
> What do other OSs do in this way?  It seems difficult to guess the state 
> and we just assume that it is in post-cold boot when we attach which may 
> not always be optimal.
> 
> iain

FWIW, my main workstation is multi-booted. I mainly use NetBSD but
occasionally have to go to MS Windows (I also use Plan9).

When rebooting from Windows (so no power off), there are numerous issues
with USB attached devices, obviously because of a persistent state
established by Windows and not cleared, which confuses NetBSD.

The reverse is not true (rebooting from NetBSD to Windows), either
because NetBSD "cleans" things up even when rebooting or because Windows
always re-establishes a known (to it) state.

When exiting Windows, I have to power down in order for NetBSD to
restart correctly with USB devices.
-- 
Thierry Laronde 
 http://www.kergis.com/
   http://www.sbfa.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


USB, NetBSD 7/amd64: crashes

2015-07-07 Thread tlaronde
On Thu, Jul 02, 2015 at 10:18:23AM +0100, Nick Hudson wrote:
> On 07/02/15 10:07, tlaro...@polynum.com wrote:
> > Hello,
> >
> > On an NetBSD 6.1.5/amd64, when I connect a second USB connected disk to
> > the machine, NetBSD freezes. Unable to connect remotely; hard reboot
> > required.
> >
>
> Can you try netbsd-7 or better still -current?

I have tried a netbsd-7 kernel and it crashes as well and the problem is
still with locking.

Here is the bt:

mutex_oncpu.part.0() at netbsd:mutex_oncpu.part.0+0x8
mutex_vector_enter() at netbsd:mutex_vector_enter+0x93
sdopen() at netbsd:sdopen+0x87
cdev_open() at netbsd:cdev_open+0xb2
spec_open() at netbsd:spec_open+0x250
VOP_OPEN() at netbsd:VOP_OPEN+0x33
vn_open() at netbsd:vn_open+0x1ea
do_open() at netbsd:do_open+0x112
do_sys_openat() at netbsd:do_sys_openat+0x68
sys_open() at netbsd:sys_open+0x24
syscall() at netbsd:syscall+0x9c
---syscall (number 5)---

There is no problem if the two disks are connected when booting. (How
can concurrency be handled when the numbering of devices depends on the
number of devices connected? How can two concurrent devices be named
when they have the same right to claim the very same name, sd0 for
example? If the obviously sequential, unproblematic enumeration when
both are connected at boot does not lead to problems, how can a dynamic
concurrent attachment be managed if one needs to remember how many are
already connected, since the number depends on that, while the already
connected ones may be concurrently detached (not the case here)? Would
it not be simpler to assign a fixed name to each USB port? No pun
intended: I'm just trying to understand how it works.)

Disaster occurs when one disk is added while another one is already
connected. FWIW, when rebooting after the crash, the two disks then
being connected, the second one (the added one) is detected as sd0 while
the first one is then sd1 (in case the variable enumeration has
something to do with the resulting havoc).

For reference, on 6.1.5 this was the same:

---8---
umass0: at uhub3 port 1 (addr 3) disconnected
umass0 at uhub3 port 1 configuration 1 interface 0
umass0: Western Digital Elements 10A2, rev 2.10/10.42, addr 3
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, 1 lun per target
sd0 at scsibus0 target 0 lun 0: WD, Elements 10A2, 1042 disk fixed
sd0: fabricating a geometry
sd0: 931 GB, 953837 cyl, 64 head, 32 sec, 512 bytes/sect x 1953458176 sectors
sd0: fabricating a geometry
sd0: GPT GUID: 960d762c-1cf3-11e5-b5f3-448a5b9b9f0f
dk0 at sd0: Basic data partition
dk0: 1953454080 blocks at 2048, type: 
umass1 at uhub2 port 6 configuration 1 interface 0
umass1: Western Digital Elements 10A8, rev 2.10/10.42, addr 3
umass1: using SCSI over Bulk-Only
scsibus1 at umass1: 2 targets, 1 lun per target
sd1 at scsibus1 target 0 lun 0: WD, Elements 10A8, 1042 disk fixed
sd1(umass1:0:0:0):  Check Condition on CDB: 0x00 00 00 00 00 00
SENSE KEY:  Not Ready
 ASC/ASCQ:  Logical Unit Is in Process Of Becoming Ready

sd1: drive offline
sd1: fabricating a geometry
sd1: GPT GUID: f3d6ceb3-2183-11e5-8a35-448a5b9b9f0f
sd1: detached
uvm_fault(0x80771320, 0x0, 1) - e
fatal page fault in supervisor mode
trap type 6 code 0 rip 80238c1f cs 8 rflags 10287 cr2  8 cpl 0 rsp fe8976b0
panic: trap
cpu1: Begin traceback...
printf_nolog() at netbsd:printf_nolog
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0x96
dkwedge_add() at netbsd:dkwedge_add+0x1d1
dkwedge_discover_gpt() at netbsd:dkwedge_discover_gpt+0x492
dkwedge_discover() at netbsd:dkwedge_discover+0x128
sdattach() at netbsd:sdattach+0x1cb
config_attach_loc() at netbsd:config_attach_loc+0x1bb
scsi_probe_bus() at netbsd:scsi_probe_bus+0x537
scsibus_config() at netbsd:scsibus_config+0x74
scsipi_completion_thread() at netbsd:scsipi_completion_thread+0x23
cpu1: End traceback...
---8---

Dropping in ddb on panic, more precisely there is:

Stopped in pid 1.57 (system) at netbsd:mutex_vector_enter+0x80: movq 18(%r15),%rax

This has nothing to do with MBR or GPT since I have tested with both. It
is systematic whenever one disk is first connected and then a second is
added.

Once rebooted, the two disks being connected, they are both correctly
accessible.

Note: FWIW, the first (and sole) disk is sd0. When rebooting, the
device nodes are reversed, the second one being sd0 and the first
one being sd1.

Question: is there some way to name partitions independently of the
random hardware enumeration (via wedge names? But this would imply
storing the name persistently, so I guess in the GPT? Is there such
a thing?)

-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

Re: USB, NetBSD 7/amd64: crashes

2015-07-07 Thread tlaronde
On Tue, Jul 07, 2015 at 01:44:52PM +0200, Edgar Fuss wrote:
> > It's not clear why sd1 is detaching so early.
> Because there's insufficiant current to power two drives at once?

But in my case, the first disk is idle, not even mounted, when the second
one is connected.

Furthermore, the two disks are connected to two distinct root usb*, and
when testing, not even X was running, only the bare minimum (and the
problem happens on two distinct machines).

And when the disks are both connected when booting, there is no crash.

So it seems to me that this cannot be a power problem.
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: USB, NetBSD 7/amd64: crashes

2015-07-07 Thread tlaronde
On Tue, Jul 07, 2015 at 10:32:12AM +0200, Manuel Bouyer wrote:
> On Tue, Jul 07, 2015 at 08:40:11AM +0200, tlaro...@polynum.com wrote:
> > I have tried a netbsd-7 kernel and it crashes as well and the problem is
> > still with locking.
>
> I'm not sure it is a locking problem. In the dmesg you provide there
> is sd1: detached; so it looks like the device is gone while still trying
> to access it.

It could be either one. Might the kernel release the faulting device
before faulting? Furthermore, this is a dual-core machine: if the fault
is on cpu1, are messages from cpu0 and cpu1 guaranteed to be ordered in
dmesg? I.e. can one be sure that if "sd1: detached" appears before
uvm_fault, the events happened in that order? And with two cores, is a
cause/consequence link not guaranteed either?

I don't have, unfortunately, a single-CPU node on which I could test
whether it happens or not in this non-concurrent case. This could give
an additional indication about the level at which the problem lies.
(Can one instruct NetBSD to use only one CPU without an ad-hoc kernel?)
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


[Was USB; is] dkwedge_add(), 6.1.5/amd64: freezes when 2 umass connected

2015-07-03 Thread tlaronde
On Thu, Jul 02, 2015 at 11:07:19AM +0200, tlaronde wrote:
 
> On an NetBSD 6.1.5/amd64, when I connect a second USB connected disk to
> the machine, NetBSD freezes. Unable to connect remotely; hard reboot
> required.
 

In fact, the system doesn't freeze but crashes (the long unresponsive
period is caused by the backtrace, but this can only be seen on the
console).

When a first USB disk is connected (umass0), adding another USB disk
crashes everything. Here are the excerpts from dmesg for the crash:

---8---
umass0: at uhub3 port 1 (addr 3) disconnected
umass0 at uhub3 port 1 configuration 1 interface 0
umass0: Western Digital Elements 10A2, rev 2.10/10.42, addr 3
umass0: using SCSI over Bulk-Only
scsibus0 at umass0: 2 targets, 1 lun per target
sd0 at scsibus0 target 0 lun 0: WD, Elements 10A2, 1042 disk fixed
sd0: fabricating a geometry
sd0: 931 GB, 953837 cyl, 64 head, 32 sec, 512 bytes/sect x 1953458176 sectors
sd0: fabricating a geometry
sd0: GPT GUID: 960d762c-1cf3-11e5-b5f3-448a5b9b9f0f
dk0 at sd0: Basic data partition
dk0: 1953454080 blocks at 2048, type: 
umass1 at uhub2 port 6 configuration 1 interface 0
umass1: Western Digital Elements 10A8, rev 2.10/10.42, addr 3
umass1: using SCSI over Bulk-Only
scsibus1 at umass1: 2 targets, 1 lun per target
sd1 at scsibus1 target 0 lun 0: WD, Elements 10A8, 1042 disk fixed
sd1(umass1:0:0:0):  Check Condition on CDB: 0x00 00 00 00 00 00
SENSE KEY:  Not Ready
 ASC/ASCQ:  Logical Unit Is in Process Of Becoming Ready

sd1: drive offline
sd1: fabricating a geometry
sd1: GPT GUID: f3d6ceb3-2183-11e5-8a35-448a5b9b9f0f
sd1: detached
uvm_fault(0x80771320, 0x0, 1) - e
fatal page fault in supervisor mode
trap type 6 code 0 rip 80238c1f cs 8 rflags 10287 cr2  8 cpl 0 rsp fe8976b0
panic: trap
cpu1: Begin traceback...
printf_nolog() at netbsd:printf_nolog
startlwp() at netbsd:startlwp
alltraps() at netbsd:alltraps+0x96
dkwedge_add() at netbsd:dkwedge_add+0x1d1
dkwedge_discover_gpt() at netbsd:dkwedge_discover_gpt+0x492
dkwedge_discover() at netbsd:dkwedge_discover+0x128
sdattach() at netbsd:sdattach+0x1cb
config_attach_loc() at netbsd:config_attach_loc+0x1bb
scsi_probe_bus() at netbsd:scsi_probe_bus+0x537
scsibus_config() at netbsd:scsibus_config+0x74
scsipi_completion_thread() at netbsd:scsipi_completion_thread+0x23
cpu1: End traceback...
---8---

Dropping in ddb on panic, more precisely there is:

Stopped in pid 1.57 (system) at netbsd:mutex_vector_enter+0x80: movq 18(%r15),%rax

This has nothing to do with MBR or GPT since I have tested with both. It
is systematic whenever one disk is first connected and then a second is
added.

Once rebooted, the two disks being connected, they are both correctly
accessible.

Note: FWIW, the first (and sole) disk is sd0. When rebooting, the
device nodes are reversed, the second one being sd0 and the first
one being sd1.

Question: is there some way to name partitions independently of the
random hardware enumeration (via wedge names? But this would imply
storing the name persistently, so I guess in the GPT? Is there such
a thing?)

-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: USB, NetBSD 6.1.5/amd64: freezes when 2 umass connected

2015-07-02 Thread tlaronde
On Thu, Jul 02, 2015 at 10:18:23AM +0100, Nick Hudson wrote:
 On 07/02/15 10:07, tlaro...@polynum.com wrote:
 Hello,
 
 On an NetBSD 6.1.5/amd64, when I connect a second USB connected disk to
 the machine, NetBSD freezes. Unable to connect remotely; hard reboot
 required.
 
 Questions:
 
 1) The machine has two usb ports, with uhub0 and uhub1 first attached
 resp. to these ones; the uhub2 cascading from uhub0 and uhub3 from
 uhub1.
 uhub2 has 6 ports removable;
 uhub3 has 8 ports removable;
 Since in /dev/ there are only 8 devices (from usb0 to usb7) could this
 be the problem? (6 + 8 = 14, even if I have only one USB device---first
 disk---and the second disk is only the second device; but how are the
 device nodes assigned to one USB port?)
 
 2) The two USB disks are from the same vendor (Western Digital) but not
 exactly the same model (not the same capacity). Could the USB driver be
 confused by two similar devices connected to the same(?) USB tree?
 
 3) Physically, on the machine, there are USB ports on the rear, and USB
 ports on the front. Does somebody know if front ports could be
 duplicating rear ports, that is slots on the front be in fact
 connected to the same ports as the rear ones causing conflict?
 
 I'm trying to find what is causing this misbehavior. And a freeze is
 rather annoying for a node that is mainly supposed to be administered
 remotely...
 
 TIA,
 
 Can you try netbsd-7 or better still -current?
 

This will be difficult on this node since, during the time I have
access to it, it is serving files (Samba).

I will try to get the offending USB disk and run tests on my personal
machine, also running 6.1.5 (on amd64). If the same behavior happens,
I will first try to get a clue about what is going on, and then try
netbsd-7 or -current.

But has something been changed concerning USB and umass in post-6.1.x
kernels that could give a clue about what the problem is/was?
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


USB, NetBSD 6.1.5/amd64: freezes when 2 umass connected

2015-07-02 Thread tlaronde
Hello,

On a NetBSD 6.1.5/amd64, when I connect a second USB connected disk to
the machine, NetBSD freezes. Unable to connect remotely; hard reboot
required.

Questions:

1) The machine has two usb ports, with uhub0 and uhub1 first attached
resp. to these ones; the uhub2 cascading from uhub0 and uhub3 from
uhub1.
uhub2 has 6 ports removable;
uhub3 has 8 ports removable;
Since in /dev/ there are only 8 devices (from usb0 to usb7) could this
be the problem? (6 + 8 = 14, even if I have only one USB device---first
disk---and the second disk is only the second device; but how are the
device nodes assigned to one USB port?)

2) The two USB disks are from the same vendor (Western Digital) but not
exactly the same model (not the same capacity). Could the USB driver be
confused by two similar devices connected to the same(?) USB tree?

3) Physically, on the machine, there are USB ports on the rear, and USB
ports on the front. Does somebody know if front ports could be
duplicating rear ports, that is slots on the front be in fact
connected to the same ports as the rear ones causing conflict?

I'm trying to find what is causing this misbehavior. And a freeze is
rather annoying for a node that is mainly supposed to be administered
remotely...

TIA,
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: USB, NetBSD 6.1.5/amd64: freezes when 2 umass connected

2015-07-02 Thread tlaronde
On Thu, Jul 02, 2015 at 05:22:27PM +0800, Paul Goyette wrote:
 On Thu, 2 Jul 2015, tlaro...@polynum.com wrote:
 
 Hello,
 
 On a NetBSD 6.1.5/amd64, when I connect a second USB connected disk to
 the machine, NetBSD freezes. Unable to connect remotely; hard reboot
 required.
 
 Questions:
 
 1) The machine has two usb ports, with uhub0 and uhub1 first attached
 resp. to these ones; the uhub2 cascading from uhub0 and uhub3 from
 uhub1.
 uhub2 has 6 ports removable;
 uhub3 has 8 ports removable;
 Since in /dev/ there are only 8 devices (from usb0 to usb7) could this
 be the problem? (6 + 8 = 14, even if I have only one USB device---first
 disk---and the second disk is only the second device; but how are the
 device nodes assigned to one USB port?)
 
 2) The two USB disks are from the same vendor (Western Digital) but not
 exactly the same model (not the same capacity). Could the USB driver be
 confused by two similar devices connected to the same(?) USB tree?
 
 3) Physically, on the machine, there are USB ports on the rear, and USB
 ports on the front. Does somebody know if front ports could be
 duplicating rear ports, that is slots on the front be in fact
 connected to the same ports as the rear ones causing conflict?
 
 Unlikely.  All of the motherboards I've played with have the rear ports
 hard-wired internally, while the front-panel ports are connected via a 
 riser cable to sockets on the motherboard.
 
 
 I'm trying to find what is causing this misbehavior. And a freeze is
 rather annoying for a node that is mainly supposed to be administered
 remotely...
 
 I've had problems in the past with only a single umass hard-drive being 
 connected.  I use the external WesternDigital hard drive for backups, 
 and as long as only a single process is writing heavily to the drive, 
 all is well.  But if I try to have two different backups running from 
 two different filesystems (whether or not on the same wdn physical 
 drive), the external umass/scsi drive hangs the entire system and needs
 a hard-boot.
 
 I have a gut feeling (without any hard evidence, FWIW!) that there's
 something not quite MP-safe with umass/scsi.

Well, in my case (the USB disks are used for backup too), the first disk
is not even mounted and is not in use when I try to connect the second
one. So no write or even read operation is attempted on _either_ disk.
And it freezes the whole system (and, after rebooting, I find nothing
in the messages indicating anything...)
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


NetBSD 6.1.5/amd64 and USB poor performance

2015-06-24 Thread tlaronde
Hello,

I have NetBSD 6.1.5 on an amd64 machine with USB 3.0 ports.

When writing files to an external USB (3.0) connected disk, using
ntfs-3g, the write performance is abysmal: it is only that of USB 1.0
(12 Mbps or 1.5 MB/s).

From the manual pages (ehci(4)), NetBSD 6.x supports only USB 2.0 via
ehci(4). The ehci connectors have also companion controllers (ohci(4)
and uhci(4)) that support USB 1.0.

My kernel config had only ehci support. Nonetheless, the write
performance is only USB 1.0.

The disk is attached with umass.

Is the problem with umass? With ntfs-3g? (But why would the
performance of a filesystem driver depend on the way the device is
connected?) With librefuse?

Any clue would be welcome.

TIA,
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Groff

2015-06-04 Thread tlaronde
On Thu, Jun 04, 2015 at 08:05:16AM +0300, Aleksej Saushev wrote:
 tlaro...@polynum.com writes:
 
 [pledge for TeX---not TexLive]
 
 There's a lot better approach that beats all the above on all accounts.
 
 Import libxml2, libxslt, w3m that are all readily available, convert man
 pages to a human-readable and human-writable format, which is XML,
 and stop using archaic formats.
 
 This has a number of significant benefits over TeX or roff:
 1. XML is well-known, the syntax doesn't require anything special to learn.
 2. There's abundancy of software to process it.
 3. XML can be used immediately, without preprocessing step (just point
 web browser at it, and it will load stylesheet and perform XSL
 transformation for you).
 4. Desktop users will have really good rendering as provided by Firefox
 or Webkit.

There may be not software but bloatware: that Firefox et al. succeed,
more or less, in providing a rendering holds no appeal for me. That this
bloated format has to be processed by tools that depend on gigabytes of
software, needing a C++ compiler et al. to (try to) be compiled, is
definitely not what I call a system typesetting tool. Needing gigs of
memory to try to run Firefox or Chrome or whatever holds no appeal
either; not to mention that the last time I tried to compile Chromium,
it retrieved half of the Google cache as dependencies and took hours of
compilation (on a rather decent computer), only to finally fail to
_link_ the objects because 4 gigs of memory was not enough!

Furthermore, all the text tools provided by the system (and even only
the POSIX.2 text tools) can be readily used on a TeX file.

Finally, my idea would be the reverse: use the lean TeX engine (and even
the METAFONT engine) to format and rasterize for a hypertext page viewer
(a browser) to display. A hypertext page viewer able to render
state-of-the-art mathematical typesetting and figures, as a small pure
C program with an unencumbered licence.

But I guess that I'm one of the few who still use Plan9 or NetBSD (or
*BSD) because small is beautiful and because, for me, freedom means
depending as little as possible on things that are not maintainable
(that one cannot hold in one's own hands).

And the irony is that I'm convinced that the actual undercover Third
World War will become an open Third World War, that anything
depending on external and worldwide interconnection will simply cease
to exist, and that my line of choice is more sustainable than others.
I'm out of the trend now; but the trend changes, and changes
independently of the ones who follow it... (That's why, for the
very same reason as stated above, I do not follow trends; I simply
do what I feel is correct. I may be wrong, but my error is
caused neither by wanting to be in sync with fashion nor by wanting
systematically to be out of fashion: I simply ignore fashion.)

I gather that I will not convince you; but you can surely conclude that
you will never convince me ;)

-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Groff

2015-06-04 Thread tlaronde
On Thu, Jun 04, 2015 at 11:44:25AM +0100, Robert Swindells wrote:
 
 Johnny Billquist b...@softjar.se wrote:
 
 What happened to the original roff? I mean, groff is just a gnu 
 replacement for roff. Maybe switch back to the original?
 
 The sources to all of DWB are available from ATT:
 
 http://www2.research.att.com/~astopen/download/
 
 It needs a bit of work to get it to build on NetBSD though.
 

FWIW, to show that I'm not sectarian: John Hobby derived MetaPost from
METAFONT for drawing pictures. For text, it uses TeX but can also use
roff. The roff support is still there in kerTeX. So MetaPost can also be
used to generate PS figures with text formatted by roff.
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Groff (was: Removing ARCNET stuffs)

2015-06-01 Thread tlaronde
On Mon, Jun 01, 2015 at 05:50:07PM +, David Holland wrote:
 On Sun, May 31, 2015 at 09:24:48PM -0400, Andrew Cagney wrote:
 
   (oh and please delete C++ groff,  just replace it with that AWK script)
 
 which awk script? :-)
 
 (quite seriously, I've been looking for a while for an alternative to
 groff for typesetting the miscellaneous articles in base.
 

(Delenda Carthago...) Once more, I will re-advertise that the complete
Donald E. Knuth typesetting system is available. It can even be
restricted to strictly D.E.K.'s work (even with the fonts, this is a
matter of far less than 10 MB). It is pure C89 (some auxiliaries invoke
POSIX.2 utilities, mainly sh(1), but these are just auxiliaries). It
comes with the fonts; the ability to design fonts; the formatting (TeX),
with an output format, DVI, à la PDF, that can be used to generate a
formatted text version; the means to typeset mathematics; and the means
to draw figures rasterized with METAFONT (more general figures with
MetaPost, as a supplement, but this generates PS). And its compiling
framework is not under any version of the GPL but under a BSD licence.

Since, for a system written in C, the main human language is CEE,
a kind of technical English, the limitation to 8 bits (which
could be lifted by dealing with font directories instead of font
files, i.e. a directory of 256-glyph sub-fonts) is not an immediate
problem.

The conversion from roff to TeX should be easier than the
reverse, and I expect it to be relatively simple for 95% of the work
(the man pages).

IMHO, the main tasks remaining are (could be GSoC by the way):
- provide a DVI viewer (starting from scratch);
- extend TeX, with minimal changes, to be able to use UTF-8 (meaning,
as with UTF-8, that ASCII can be fed as is, but that this is still just
8 bits at entry, i.e. at the mouth);
- either develop a C SmallScript able to interpret the limited
MetaPost PostScript, or extend DVI and METAFONT to handle MetaPost
capabilities and rasterize the figures, in order for the system to
be totally self-sufficient (no PDF viewer or PostScript interpreter
needed to render the pages).

It is here:

http://www.kergis.com/en/kertex.html

It is not orphaned but stalled for the moment due to ETIME.

Best,
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


USB (ehci) mouse attachment problem

2014-12-21 Thread tlaronde
Hello,

This is probably related to PR #44706. Indeed, in this PR, the
offending mouse is a Logitech one, and this is also the case here.

But on my NetBSD amd64 6.1.5 system, I also have a Logitech USB
keyboard, and at least every other time, if this USB keyboard is connected
directly via USB (and not with a USB/PS2 converter), the Logitech
_mouse_ is recognized as a keyboard, leading to the loss of the real one.

The kernel is compiled with ehci, uhci and ohci, since some USB ports
are supposed to be for low-speed devices (keyboard and mouse), so I
expected USB 1.0 support to be necessary, which, according to
ehci(4), seems to require uhci or ohci.

Concerning both keyboard and mouse (both appearing as Logitech keyboard,
the only difference being the second number of the iclass), here is an 
excerpt of dmesg:

Intel product 0x8c31 (USB serial bus, interface 0x30, revision 0x05) at pci0 dev 20 function 0 not configured
Intel product 0x8c3a (miscellaneous communications, revision 0x04) at pci0 dev 22 function 0 not configured
ehci0 at pci0 dev 26 function 0: Intel product 0x8c2d (rev. 0x05)
ehci0: interrupting at ioapic0 pin 16
ehci0: EHCI version 1.0
usb0 at ehci0: USB revision 2.0
usb1 at ehci1: USB revision 2.0
isa0 at pcib0
pckbc0 at isa0 port 0x60-0x64
uhub0 at usb0: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb1: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at uhub1 port 1: vendor 0x8087 product 0x8000, class 9/0, rev 2.00/0.05, addr 2
uhub2: single transaction translator
uhub3 at uhub0 port 1: vendor 0x8087 product 0x8008, class 9/0, rev 2.00/0.05, addr 2
uhub3: single transaction translator
uhub2: 6 ports with 6 removable, self powered
uhub3: 6 ports with 6 removable, self powered
uhidev0 at uhub2 port 3 configuration 1 interface 0
uhidev0: Logitech Logitech USB Keyboard, rev 1.10/23.00, addr 3, iclass 3/1
ukbd0 at uhidev0
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub2 port 3 configuration 1 interface 1
uhidev1: Logitech Logitech USB Keyboard, rev 1.10/23.00, addr 3, iclass 3/0
uhidev1: 2 report ids
uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0
uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
uhub2: device problem, disabling port 4
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
 http://www.arts-po.fr/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: ahcisata0: BSY never cleared, TD 0x80

2013-06-25 Thread tlaronde
On Tue, Jun 25, 2013 at 03:48:08PM +0200, Manuel Bouyer wrote:
  ahcisata0: BSY never cleared, TD 0x80
[...] 
  
  messages too. (Furthermore, when trying to get SMART
  information via atactl(8), there is:
  
  wd1: dos partition I/O error
 
 at this point it's only trying to read the MBR, and fails. 
 any other message before this ?

No. Only that it fails to read the very first sector when I finally
manage to kill the reading process (takes minutes).

-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Atheros Ethernet product 0x1091

2013-06-05 Thread tlaronde
Hello,

This ethernet device is embedded in a Gigabyte motherboard.

The pcidb says:

AR8161/8165 PCI-E Gigabit Ethernet Controller

It is not recognized by age(4), alc(4), ale(4) or lii(4) (dealing
with L1, L2 or other).

Does anybody know if support for this is planned, or if
there is a driver for this on a *BSD flavor that could be ported to
NetBSD?

TIA
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C


Re: Atheros Ethernet product 0x1091

2013-06-05 Thread tlaronde
On Wed, Jun 05, 2013 at 06:50:52PM +0200, tlaro...@polynum.com wrote:
 
   AR8161/8165 PCI-E Gigabit Ethernet Controller
 

So it is a new chipset, and there are sources (the licence is not GPL)
from a collaborative effort for Linux and the FreeBSD family. The name
is alx:

http://www.linuxfoundation.org/collaborate/workgroups/networking/alx

Does NetBSD participate in this as well?
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

