Re: Boot(8) timeouts take excessively long on OnLogic Helix 500.

2022-06-27 Thread Miod Vallat
> >Synopsis: Boot(8) timeouts take excessively long on OnLogic Helix 500.
> >Category:  boot, amd64

The change in sys/stand/boot/cmd.c r1.44 has been reverted.

Miod



Re: Boot(8) timeouts take excessively long on OnLogic Helix 500.

2022-05-28 Thread Dan Cross
On Tue, Apr 26, 2022 at 8:46 AM Dan Cross  wrote:
> On Mon, Apr 25, 2022 at 5:49 PM Mark Kettenis  wrote:
> > > From: Dan Cross 
> > > Date: Sun, 24 Apr 2022 21:12:29 -0400
> >
> > On a machine of this vintage you probably shouldn't boot using the
> > legacy BIOS.  Try UEFI mode instead.
>
> Sure.  I gave that a go with the same result.

May I ping this again?  The suggestion to use UEFI did not change the
observed behavior: unsurprising, as the functionality for both UEFI
and BIOS is almost certainly built on the same machinery.

The patch I sent does fix the behavior, though. What would it take
to get that applied, or move forward with something else?

Thanks.

- Dan C.



Re: Boot(8) timeouts take excessively long on OnLogic Helix 500.

2022-04-26 Thread Dan Cross
On Mon, Apr 25, 2022 at 5:49 PM Mark Kettenis 
wrote:
> > From: Dan Cross 
> > Date: Sun, 24 Apr 2022 21:12:29 -0400
>
> On a machine of this vintage you probably shouldn't boot using the
> legacy BIOS.  Try UEFI mode instead.

Sure.  I gave that a go with the same result.

- Dan C.


Re: Boot(8) timeouts take excessively long on OnLogic Helix 500.

2022-04-25 Thread Mark Kettenis
> From: Dan Cross 
> Date: Sun, 24 Apr 2022 21:12:29 -0400

On a machine of this vintage you probably shouldn't boot using the
legacy BIOS.  Try UEFI mode instead.

> >Synopsis: Boot(8) timeouts take excessively long on OnLogic Helix 500.
> >Category:  boot, amd64
> >Environment:
> System  : OpenBSD 7.1
> Details : OpenBSD 7.1-current (GENERIC.MP) #9: Thu Apr  7
> 15:59:04 UTC 2022
>  cr...@samudra.gajendra.net:
> /usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> On the OnLogic Helix 500, and possibly other models in
> that series of industrial machines (amd64), the timeout
> at the 'boot>' prompt takes excessively long: on
> the order of 30 *minutes*.
> 
> What is happening is that the code in sys/stand/boot/cmd.c
> has logic to only sample the time source every 1000
> iterations of the keystroke probe loop.  However, on
> these machines, the keystroke probe function (`cnischar`
> defined in /sys/lib/libsa/cons.c) takes a very long
> time: one or two seconds.
> 
> It is not entirely clear why `cnischar` is so slow;
> this function results in a call to `pc_getc` such that
> it makes the BIOS "int 16h" call with `%ah` set to 1,
> which "gets the state of the keyboard buffer".  That
> BIOS call clears the zero flag if a key was pressed and
> `pc_getc` sets %ax if Z is not set (via a `setnz`
> instruction in inline assembler).  The function returns
> this result (actually the low byte of that result,
> but the result is the same).  One must assume that the
> BIOS call is slow on this machine.
> 
> >How-To-Repeat:
> Install OpenBSD/amd64 on an OnLogic Helix 500.  Reboot.
> Observe that the timeout at the 'boot>' prompt takes
> many minutes.  A keystroke will be recognized reasonably
> quickly, however.
> 
> Note: I have not tried all configurations of local PC
> console and serial console to see if there's some
> configuration that is faster.
> 
> >Fix:
> The logic in cmd.c limiting probing the BIOS clock to
> every thousand iterations of the loop was added in 1999
> (CVS commit #1.44 of that file:
> 
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/stand/boot/cmd.c.diff?r1=1.43&r2=1.44&f=h
> ).
> 
> That commit added a comment saying, "check for timeout
> expiration less often (for some very constrained
> archs)".  Sadly, I had no luck trying to track down the
> context around this change.
> 
> However, one wonders how relevant that remains almost a
> quarter century later.  Moreover, this is in
> single-threaded, early boot code.  What else does the
> machine have to do at this point?  It was not clear what
> was wrong with calling the BIOS clock routine so often,
> so my solution was to effectively undo revision 1.44, and
> simply check the timeout on each iteration of the
> loop.  Please see the following patch:
> 
> -->BEGIN PATCH<--
> Index: cmd.c
> ===
> RCS file: /cvs/src/sys/stand/boot/cmd.c,v
> retrieving revision 1.68
> diff -u -p -r1.68 cmd.c
> --- cmd.c   24 Oct 2021 17:49:19 -  1.68
> +++ cmd.c   25 Apr 2022 00:57:24 -
> @@ -248,7 +248,6 @@ readline(char *buf, size_t n, int to)
> 
> /* Only do timeout if greater than 0 */
> if (to > 0) {
> -   u_long i = 0;
> time_t tt = getsecs() + to;
>  #ifdef DEBUG
> if (debug > 2)
> @@ -256,9 +255,8 @@ readline(char *buf, size_t n, int to)
>  #endif
> /* check for timeout expiration less often
>(for some very constrained archs) */
> -   while (!cnischar())
> -   if (!(i++ % 1000) && (getsecs() >= tt))
> -   break;
> +   while (getsecs() < tt && !cnischar())
> +   ;
> 
> if (!cnischar()) {
> strlcpy(buf, "boot", 5);
> -->END PATCH<--

Boot(8) timeouts take excessively long on OnLogic Helix 500.

2022-04-24 Thread Dan Cross
>Synopsis: Boot(8) timeouts take excessively long on OnLogic Helix 500.
>Category:  boot, amd64
>Environment:
System  : OpenBSD 7.1
Details : OpenBSD 7.1-current (GENERIC.MP) #9: Thu Apr  7
15:59:04 UTC 2022
 cr...@samudra.gajendra.net:
/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:
On the OnLogic Helix 500, and possibly other models in
that series of industrial machines (amd64), the timeout
at the 'boot>' prompt takes excessively long: on
the order of 30 *minutes*.

What is happening is that the code in sys/stand/boot/cmd.c
has logic to only sample the time source every 1000
iterations of the keystroke probe loop.  However, on
these machines, the keystroke probe function (`cnischar`
defined in /sys/lib/libsa/cons.c) takes a very long
time: one or two seconds.

It is not entirely clear why `cnischar` is so slow;
this function results in a call to `pc_getc` such that
it makes the BIOS "int 16h" call with `%ah` set to 1,
which "gets the state of the keyboard buffer".  That
BIOS call clears the zero flag if a key was pressed and
`pc_getc` sets %ax if Z is not set (via a `setnz`
instruction in inline assembler).  The function returns
this result (actually the low byte of that result,
but the result is the same).  One must assume that the
BIOS call is slow on this machine.

>How-To-Repeat:
Install OpenBSD/amd64 on an OnLogic Helix 500.  Reboot.
Observe that the timeout at the 'boot>' prompt takes
many minutes.  A keystroke will be recognized reasonably
quickly, however.

Note: I have not tried all configurations of local PC
console and serial console to see if there's some
configuration that is faster.

>Fix:
The logic in cmd.c limiting probing the BIOS clock to
every thousand iterations of the loop was added in 1999
(CVS commit #1.44 of that file:

https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/stand/boot/cmd.c.diff?r1=1.43&r2=1.44&f=h
).

That commit added a comment saying, "check for timeout
expiration less often (for some very constrained
archs)".  Sadly, I had no luck trying to track down the
context around this change.

However, one wonders how relevant that remains almost a
quarter century later.  Moreover, this is in
single-threaded, early boot code.  What else does the
machine have to do at this point?  It was not clear what
was wrong with calling the BIOS clock routine so often,
so my solution was to effectively undo revision 1.44, and
simply check the timeout on each iteration of the
loop.  Please see the following patch:

-->BEGIN PATCH<--
Index: cmd.c
===
RCS file: /cvs/src/sys/stand/boot/cmd.c,v
retrieving revision 1.68
diff -u -p -r1.68 cmd.c
--- cmd.c   24 Oct 2021 17:49:19 -  1.68
+++ cmd.c   25 Apr 2022 00:57:24 -
@@ -248,7 +248,6 @@ readline(char *buf, size_t n, int to)

/* Only do timeout if greater than 0 */
if (to > 0) {
-   u_long i = 0;
time_t tt = getsecs() + to;
 #ifdef DEBUG
if (debug > 2)
@@ -256,9 +255,8 @@ readline(char *buf, size_t n, int to)
 #endif
/* check for timeout expiration less often
   (for some very constrained archs) */
-   while (!cnischar())
-   if (!(i++ % 1000) && (getsecs() >= tt))
-   break;
+   while (getsecs() < tt && !cnischar())
+   ;

if (!cnischar()) {
strlcpy(buf, "boot", 5);
-->END PATCH<--

Of course, there could be other approaches, such as
tracking down why the BIOS call is slow in the first
place, but for such a special case it hardly seemed
worth it, and with this in place, boot time is
acceptably fast again.  Given that the use case might
be rather long in the tooth at this point anyhow, it
seemed useful to send it upstream instead of floating
a patch locally.


dmesg:
OpenBSD 7.1-current (GENERIC.MP) #9: Thu Apr  7 15:59:04 UTC 2022
cr...@samudra.gajendra.net:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 68538777600 (65363MB)
avail mem = 66444283904