Re: vm.conf.5: Use the proper macros

2022-09-23 Thread Jason McIntyre
On Fri, Sep 23, 2022 at 05:41:41PM -0400, Josiah Frentsos wrote:
> Index: vm.conf.5
> ===

hi. some comments, inline:

> RCS file: /cvs/src/usr.sbin/vmd/vm.conf.5,v
> retrieving revision 1.59
> diff -u -p -r1.59 vm.conf.5
> --- vm.conf.5 13 Sep 2022 10:28:19 -  1.59
> +++ vm.conf.5 23 Sep 2022 19:36:01 -
> @@ -25,8 +25,7 @@
>  .Nm
>  is the configuration file to configure the virtual machine monitor
>  (VMM) subsystem.
> -A VMM manages virtual machines (VMs) on a
> -.Ar host .
> +A VMM manages virtual machines (VMs) on a host.
>  The VMM subsystem is responsible for creating, destroying, and
>  executing VMs.
>  .Pp
> @@ -45,8 +44,7 @@ Configuration for each individual virtua
>  Configuration for virtual switches.
>  .El
>  .Pp
> -Within the sections,
> -the
> +Within the sections, the
>  .Ar bytes
>  argument can be specified with a human-readable scale,
>  using the format described in
> @@ -91,7 +89,11 @@ vm "vm1.example.com" {
>  .Sh GLOBAL CONFIGURATION
>  The following setting can be configured globally:
>  .Bl -tag -width Ds
> -.It Ic agentx Oo Ic context Ar context Oc Oo Ic path Ar path Oc
> +.It Xo
> +.Ic agentx
> +.Op Ic context Ar context
> +.Op Ic path Ar path
> +.Xc

in the past, we had to use Xo/Xc when dealing with long argument lists.
that is no longer the case, so Xo/Xc now is used less. in cases like
this, using them is basically adding extra macros, but gaining the
benefit of potentially (depending on where you stand) making the doc
source look tidier.

personally i would make a change like this only if all the other list
items were formatted this way. but here they are not, so i would not.

you could however replace the final Oo/Oc with a single Op.

>  Export vm metrics via an AgentX compatible
>  .Pq snmp
>  daemon by connecting to
> @@ -102,9 +104,9 @@ If
>  .Ar path
>  is omitted it will default to
>  .Pa /var/agentx/master .
> -.Ar Context
> +.Ar context
>  is the SNMPv3 context and can usually be omitted.
> -.It Ic local prefix Ar address Ns Li / Ns Ar prefix
> +.It Ic local prefix Ar address Ns / Ns Ar prefix
>  Set the network prefix that is used to allocate subnets for
>  local interfaces, see
>  .Ic local interface
> @@ -112,22 +114,23 @@ in the
>  .Sx VM CONFIGURATION
>  section below.
>  The default is
> -.Ar 100.64.0.0/10 .
> -.It Ic local inet6 Op Ic prefix Ar address Ns Li / Ns Ar prefix
> +.Cm 100.64.0.0/10 .

i don;t think either Ar or Cm really make sense. i would either quote it
(without using a macro) or not add any punctuation or mark up at all.

> +.It Ic local inet6 Op Ic prefix Ar address Ns / Ns Ar prefix
>  Enable IPv6 on local interfaces and allocate routable subnets.
>  If the prefix is not specified,
>  a random prefix from the
>  .Dq unique local
> -network range
> -.Ar fd00::/8
> -will be generated on startup.
> +network range fd00::/8 will be generated on startup.

so here you have done it without punctuation/mark up ;) that's what i'd
do too, above.

>  The specified prefix length must be /64 or smaller.
> -.It Cm socket owner Ar user : Ns Ar group
> -Set the control socket owner to the specified user and group.
> +.It Ic socket owner Ar user : Ns Ar group

i'd have to look seperately at the Ic/Cm change to try and decide what
makes sense where. to that end, i'd rather a diff for such changes was
mailed seperately.

> +Set the control socket owner to the specified
> +.Ar user
> +and
> +.Ar group .
>  Users with access to the control socket will be allowed to use
> -.Nm vmctl
> +.Xr vmctl 8
>  for restricted access to
> -.Nm vmd .
> +.Xr vmd 8 .
>  If only
>  .Ar user
>  is given,
> @@ -137,7 +140,7 @@ If only
>  is given,
>  only the group is set.
>  The default is
> -.Ar root : Ns Ar wheel .
> +.Cm root:wheel .

again, i would use no mark up, or quote it if you must.

>  .It Ic staggered start parallel Ar parallelism Ic delay Ar seconds
>  Start all configured VMs in a staggered fashion with
>  .Ar parallelism
> @@ -166,28 +169,28 @@ Typically this is a hostname.
>  .Pp
>  Followed by a block of parameters that is enclosed in curly brackets:
>  .Bl -tag -width Ds
> -.It Cm allow instance Brq ...
> +.It Ic allow instance Brq ...
>  Set the permissions to create VM instances.
>  See
>  .Sx VM INSTANCES .
> -.It Cm boot Ar path
> +.It Ic boot Ar path
>  Kernel or BIOS image to load when booting the VM.
>  If not specified, the default is to boot using the BIOS image in
>  .Pa /etc/firmware/vmm-bios .
> -.It Cm boot device Ar device
> +.It Ic boot device Ar device
>  Force VM to boot from
>  .Ar device .
>  Valid values are:
>  .Bl -tag -width "cdrom"
> -.It Ar cdrom
> +.It Cm cdrom
>  Boot the ISO image file specified using the
>  .Ic cdrom
>  parameter.
> -.It Ar disk
> +.It Cm disk
>  Boot from the disk image file specified using the
>  .Ic disk
>  parameter.
> -.It Ar net
> +.It Cm net
>  Boot the kernel specified using the
>  .Ic boot
>  parameter as if the VM was network boote

Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Timo Myyrä
Scott Cheloha  [2022-09-23, 14:51 -0500]:

> On Fri, Sep 23, 2022 at 10:40:19PM +0300, Timo Myyr?? wrote:
>
>> Scott Cheloha  [2022-09-23, 09:16 -0500]:
>> 
>> > [...]
>> >
>> > Test results?  Clues on reading the configuration space?
>> >
>> > [...]
>> 
>> Hi,
>> 
>> Here's a dmesg from thinkpad e485:
>
> Thanks for testing.
>
>> Does these timers affect the booting of kernel? Once I select the kernel
>> to boot by pressing enter on "bsd>" line, the boot process takes about
>> 18s to proceed from the "booting sr0a:/bsd".
>
> The patch reads a couple MSRs and prints ~10 additional lines during
> boot from the primary CPU.  The computed TSC frequency is not used by
> the kernel, only printed so I can check whether my code is correct.
>
> It should have zero impact on the length of the boot.  It should not
> change any runtime behavior whatsoever.
>
> Your boot probably should not be taking that long, but I can't imagine
> how my patch would cause such a dramatic change.
>
> If you reverse the patch, what happens?
>

I haven't been keeping track of boot times but I doubt it is new issue
with this laptop. Current cold boot seemed to pass the "counters" part
in about 5s so it varies a bit. I'll see if I find the time to dig
through the code to see what boot process is actually doing at that
point.

>> OpenBSD 7.2 (GENERIC.MP) #20: Fri Sep 23 22:27:31 EEST 2022
>> t...@asteroid.bittivirhe.fi:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>> [...]
>> cpu0 at mainbus0: apid 0 (boot processor)
>> cpu0: MSR C001_0064: en 1 base 2 mul 100 div 10 freq 20 Hz
>> cpu0: MSR C001_0065: en 1 base 2 mul 102 div 12 freq 17 Hz
>> cpu0: MSR C001_0066: en 1 base 2 mul 96 div 12 freq 16 Hz
>> cpu0: MSR C001_0067: en 0
>> cpu0: MSR C001_0068: en 0
>> cpu0: MSR C001_0069: en 0
>> cpu0: MSR C001_006A: en 0
>> cpu0: MSR C001_006B: en 0
>> cpu0: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx, 1996.30 MHz, 17-11-00
>> cpu0:
>> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
>> cpu0: 32KB 64b/line 8-way D-cache, 64KB 64b/line 4-way I-cache, 512KB 
>> 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
>> tsc: calibrating with acpihpet0: 1996264149 Hz
>
> Your family 17h CPU has a computed P0 frequency of 2000MHz.  The
> calibrated TSC frequency is 1996264149 Hz.
>
> That seems right to me, thank you for testing.



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Theo de Raadt
Scott Cheloha  wrote:

> > And it is the wrong time in the release cycle for this.
> 
> This doesn't need to make release, I'm just gauging interest and
> testing code.

But you didn't say that in your email.

But Worse, you didn't think that you need to say it.



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Mike Larkin
On Sat, Sep 24, 2022 at 11:06:24AM +1000, Jonathan Gray wrote:
> On Fri, Sep 23, 2022 at 09:16:25AM -0500, Scott Cheloha wrote:
> > Hi,
> >
> > TL;DR:
> >
> > I want to compute the TSC frequency on AMD CPUs using the methods laid
> > out in the AMD manuals instead of calibrating the TSC by hand.
> >
> > If you have an AMD CPU with an invariant TSC, please apply this patch,
> > recompile/boot the resulting kernel, and send me the resulting dmesg.
> >
> > Family 10h-16h CPUs are especially interesting.  If you've got one,
> > don't be shy!
> >
> > Long explanation:
> >
> > On AMD CPUs we calibrate the TSC with a separate timer.  This is slow
> > and introduces error.  I also worry about a future where legacy timers
> > are absent or heavily gated (read: useless).
> >
> > This patch adds most of the code needed to compute the TSC frequency
> > on AMD family 10h+ CPUs.  CPUs prior to family 10h did not support an
> > invariant TSC so they are irrelevant.
> >
> > I have riddled the code with printf(9) calls so I can work out what's
> > wrong by hand if a test result makes no sense.
> >
> > The only missing piece is code to read the configuration space on
> > family 10h-16h CPUs to determine how many boosted P-states we need to
> > skip to get to the MSR describing the software P0 state.  I would
> > really appreciate it if someone could explain how to do this at this
> > very early point in boot.  jsg@ pointed me to pci_conf_read(9), but
> > I'm a little confused about how I get the needed pci* inputs at this
> > point in boot.
>
> I also said you shouldn't be looking at pci devices for this.

What you want to look at is section 2.1.4 of this:

https://developer.amd.com/wp-content/resources/56255_3_03.PDF

It describes what you need to do. It's for family 17 but I would guess
that there is an equivalent family 10/12/etc doc, and I'd be surprised
if any of this has changed in a long time.

If you can't figure it out, I'd suggest that we don't do this for
family 10/12/etc and use the old method for CPUs that don't have the
MSRs you need. I also sorta share jsg's opinion below, this feels
like a solution for a problem that really doesn't exist.

-ml

>
> I remain unconvinced that all of this is worth it compared to
> calibrating off a timer with a known rate.  And it is the wrong time in
> the release cycle for this.
>
> Boost could be disabled for the measurement if need by.
>
> AMD64 Architecture Programmer's Manual
> Volume 2: System Programming
> Publication No. 24593
> Revision 3.38
>
> "17.2 Core Performance Boost
> ...
> CPB can be disabled using the CPBDis field of the Hardware Configuration
> Register (HWCR MSR) on the appropriate core. When CPB is disabled,
> hardware limits the frequency and voltage of the core to those defined
> by P0.
>
> Support for core performance boost is indicated by
> CPUID Fn8000_0007_EDX[CPB] = 1."
>
> "3.2.10 Hardware Configuration Register (HWCR)
> ...
> CpbDis. Bit 25. Core performance boost disable. When set to 1, core 
> performance boost is disabled.
> "
>
> Processor Programming Reference (PPR)
> for AMD Family 17h Model 01h, Revision B1 Processors
> 54945 Rev 1.14 - April 15, 2017
>
> "MSRC001_0015 [Hardware Configuration] (HWCR)
>
> 25 CpbDis: core performance boost disable. Read-write.
> Reset: 0.  0=CPB is requested to be enabled.  1=CPB is disabled.
> Specifies whether core performance boost is requested to be enabled or
> disabled. If core performance boost is disabled while a core is in a
> boosted P-state, the core automatically transitions to the highest
> performance non-boosted P-state."
>
> also mentioned in
>
> BIOS and Kernel Developer's Guide (BKDG)
> For AMD Family 10h Processors
> 31116 Rev 3.48 - April 22, 2010
>
> >
> > --
> >
> > Test results?  Clues on reading the configuration space?
> >
> > -Scott
> >
> > Index: tsc.c
> > ===
> > RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> > retrieving revision 1.29
> > diff -u -p -r1.29 tsc.c
> > --- tsc.c   22 Sep 2022 04:57:08 -  1.29
> > +++ tsc.c   23 Sep 2022 14:04:22 -
> > @@ -100,6 +100,253 @@ tsc_freq_cpuid(struct cpu_info *ci)
> > return (0);
> >  }
> >
> > +uint64_t
> > +tsc_freq_msr(struct cpu_info *ci)
> > +{
> > +   uint64_t base, def, did, did_lsd, did_msd, divisor, fid, multiplier;
> > +   uint32_t msr, off = 0;
> > +
> > +   if (strcmp(cpu_vendor, "AuthenticAMD") != 0)
> > +   return 0;
> > +
> > +   /*
> > +* All family 10h+ CPUs have MSR_HWCR and the TscFreqSel bit.
> > +* If TscFreqSel is not set the TSC does not advance at the P0
> > +* frequency, in which case something is wrong and we need to
> > +* calibrate by hand.
> > +*/
> > +#define HWCR_TSCFREQSEL (1 << 24)
> > +   if (!ISSET(rdmsr(MSR_HWCR), HWCR_TSCFREQSEL))   /* XXX specialreg.h */
> > +   return 0;
> > +#undef HWCR_TSCFREQSEL
> > +
> > +   /*
> > +* For families 10h, 12h, 14h, 15h, and 16h, we need to skip 

Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Scott Cheloha
On Fri, Sep 23, 2022 at 07:46:55PM -0600, Theo de Raadt wrote:
> > And it is the wrong time in the release cycle for this.
> 
> No kidding.
> 
> As this makes absolutely no difference for any existing code in 7.2,
> except the strong hazard of accidentally breaking a machine.

It does not need to make release.



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Scott Cheloha
On Sat, Sep 24, 2022 at 11:06:24AM +1000, Jonathan Gray wrote:
> On Fri, Sep 23, 2022 at 09:16:25AM -0500, Scott Cheloha wrote:
> > [...]
> > 
> > The only missing piece is code to read the configuration space on
> > family 10h-16h CPUs to determine how many boosted P-states we need to
> > skip to get to the MSR describing the software P0 state.  I would
> > really appreciate it if someone could explain how to do this at this
> > very early point in boot.  jsg@ pointed me to pci_conf_read(9), but
> > I'm a little confused about how I get the needed pci* inputs at this
> > point in boot.
> 
> I also said you shouldn't be looking at pci devices for this.

Right, but the manual says that's where the information I want is
located.

I might be wrong, of course.  Can't know until I get a test on a CPU
in one of the relevant families.

> I remain unconvinced that all of this is worth it compared to
> calibrating off a timer with a known rate.

For Intel CPUs we use CPUID to determine the TSC frequency where the
leaf is available.  It seems "fair," for lack of a better word, to
make an effort to do the same for AMD CPUs.

The other available timers with known frequencies are not great and
they might be getting worse.  The ISA timer is heavily gated out of
the box on many contemporary machines where it is available.  You can
toggle the gating in the BIOS for now.The PM Timer and the HPET have
been slow to read for years.

I doubt these timers will improve.  At minimum, I think it's safe to
say that they are not a priority.  They are considered "legacy"
hardware, and you know what happens to the legacy stuff.

Calibrating the TSC with one of these other timers introduces error:

1. jmc@'s machine (-0.187% error):

cpu0: MSR C001_0064: en 1 base 2 mul 100 div 10 freq 20 Hz
tsc: calibrating with acpihpet0: 1996260074 Hz

2. robert@'s machine (-0.187% error):

cpu0: MSR C001_0064: en 1 base 2 mul 156 div 8 freq 39 Hz
tsc: calibrating with acpihpet0: 3892696616 Hz

3. Timo Myrra's machine (-0.187% error):

cpu0: MSR C001_0064: en 1 base 2 mul 100 div 10 freq 20 Hz
tsc: calibrating with acpihpet0: 1996264149 Hz

The calibration code can be improved, and I have a patch waiting in
the wings which does so, but you can't beat just *knowing* the
frequency.

... I think we need to make the TSC "just work" in as many contexts as
possible, especially on newer machines.

> And it is the wrong time in the release cycle for this.

This doesn't need to make release, I'm just gauging interest and
testing code.

> Boost could be disabled for the measurement if need by.
> 
> AMD64 Architecture Programmer's Manual
> Volume 2: System Programming
> Publication No. 24593
> Revision 3.38
> 
> "17.2 Core Performance Boost
> ...
> CPB can be disabled using the CPBDis field of the Hardware Configuration
> Register (HWCR MSR) on the appropriate core. When CPB is disabled,
> hardware limits the frequency and voltage of the core to those defined
> by P0.
> 
> Support for core performance boost is indicated by
> CPUID Fn8000_0007_EDX[CPB] = 1."
> 
> "3.2.10 Hardware Configuration Register (HWCR)
> ...
> CpbDis. Bit 25. Core performance boost disable. When set to 1, core 
> performance boost is disabled.
> "
> 
> Processor Programming Reference (PPR)
> for AMD Family 17h Model 01h, Revision B1 Processors
> 54945 Rev 1.14 - April 15, 2017
> 
> "MSRC001_0015 [Hardware Configuration] (HWCR)
> 
> 25 CpbDis: core performance boost disable. Read-write.
> Reset: 0.  0=CPB is requested to be enabled.  1=CPB is disabled.
> Specifies whether core performance boost is requested to be enabled or
> disabled. If core performance boost is disabled while a core is in a
> boosted P-state, the core automatically transitions to the highest
> performance non-boosted P-state."
> 
> [...]

(Caveat: I might be wrong.)

I believe this is only a toggle for whether the CPU can enter or
remain in a boosted P-state.  I do not think that toggling the feature
on or off rewrites the P-state voltage/frequency MSRs on the fly.
Toggling on or toggled off, we will still need a way to
programmatically decide whether a given MSR describes a boosted
P-state or P0.

I have a line on a Sempron machine (family 10h) south of Austin, TX,
$100.  If it works when I pick it up I will probably have it set up to
test within a few days.



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Theo de Raadt
> And it is the wrong time in the release cycle for this.

No kidding.

As this makes absolutely no difference for any existing code in 7.2,
except the strong hazard of accidentally breaking a machine.



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Jonathan Gray
On Fri, Sep 23, 2022 at 09:16:25AM -0500, Scott Cheloha wrote:
> Hi,
> 
> TL;DR:
> 
> I want to compute the TSC frequency on AMD CPUs using the methods laid
> out in the AMD manuals instead of calibrating the TSC by hand.
> 
> If you have an AMD CPU with an invariant TSC, please apply this patch,
> recompile/boot the resulting kernel, and send me the resulting dmesg.
> 
> Family 10h-16h CPUs are especially interesting.  If you've got one,
> don't be shy!
> 
> Long explanation:
> 
> On AMD CPUs we calibrate the TSC with a separate timer.  This is slow
> and introduces error.  I also worry about a future where legacy timers
> are absent or heavily gated (read: useless).
> 
> This patch adds most of the code needed to compute the TSC frequency
> on AMD family 10h+ CPUs.  CPUs prior to family 10h did not support an
> invariant TSC so they are irrelevant.
> 
> I have riddled the code with printf(9) calls so I can work out what's
> wrong by hand if a test result makes no sense.
> 
> The only missing piece is code to read the configuration space on
> family 10h-16h CPUs to determine how many boosted P-states we need to
> skip to get to the MSR describing the software P0 state.  I would
> really appreciate it if someone could explain how to do this at this
> very early point in boot.  jsg@ pointed me to pci_conf_read(9), but
> I'm a little confused about how I get the needed pci* inputs at this
> point in boot.

I also said you shouldn't be looking at pci devices for this.

I remain unconvinced that all of this is worth it compared to
calibrating off a timer with a known rate.  And it is the wrong time in
the release cycle for this.

Boost could be disabled for the measurement if need by.

AMD64 Architecture Programmer's Manual
Volume 2: System Programming
Publication No. 24593
Revision 3.38

"17.2 Core Performance Boost
...
CPB can be disabled using the CPBDis field of the Hardware Configuration
Register (HWCR MSR) on the appropriate core. When CPB is disabled,
hardware limits the frequency and voltage of the core to those defined
by P0.

Support for core performance boost is indicated by
CPUID Fn8000_0007_EDX[CPB] = 1."

"3.2.10 Hardware Configuration Register (HWCR)
...
CpbDis. Bit 25. Core performance boost disable. When set to 1, core performance 
boost is disabled.
"

Processor Programming Reference (PPR)
for AMD Family 17h Model 01h, Revision B1 Processors
54945 Rev 1.14 - April 15, 2017

"MSRC001_0015 [Hardware Configuration] (HWCR)

25 CpbDis: core performance boost disable. Read-write.
Reset: 0.  0=CPB is requested to be enabled.  1=CPB is disabled.
Specifies whether core performance boost is requested to be enabled or
disabled. If core performance boost is disabled while a core is in a
boosted P-state, the core automatically transitions to the highest
performance non-boosted P-state."

also mentioned in

BIOS and Kernel Developer's Guide (BKDG)
For AMD Family 10h Processors
31116 Rev 3.48 - April 22, 2010

> 
> --
> 
> Test results?  Clues on reading the configuration space?
> 
> -Scott
> 
> Index: tsc.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 tsc.c
> --- tsc.c 22 Sep 2022 04:57:08 -  1.29
> +++ tsc.c 23 Sep 2022 14:04:22 -
> @@ -100,6 +100,253 @@ tsc_freq_cpuid(struct cpu_info *ci)
>   return (0);
>  }
>  
> +uint64_t
> +tsc_freq_msr(struct cpu_info *ci)
> +{
> + uint64_t base, def, did, did_lsd, did_msd, divisor, fid, multiplier;
> + uint32_t msr, off = 0;
> +
> + if (strcmp(cpu_vendor, "AuthenticAMD") != 0)
> + return 0;
> +
> + /*
> +  * All family 10h+ CPUs have MSR_HWCR and the TscFreqSel bit.
> +  * If TscFreqSel is not set the TSC does not advance at the P0
> +  * frequency, in which case something is wrong and we need to
> +  * calibrate by hand.
> +  */
> +#define HWCR_TSCFREQSEL (1 << 24)
> + if (!ISSET(rdmsr(MSR_HWCR), HWCR_TSCFREQSEL))   /* XXX specialreg.h */
> + return 0;
> +#undef HWCR_TSCFREQSEL
> +
> + /*
> +  * For families 10h, 12h, 14h, 15h, and 16h, we need to skip past
> +  * the boosted P-states (Pb0, Pb1, etc.) to find the MSR describing
> +  * P0, i.e. the highest performance unboosted P-state.  The number
> +  * of boosted states is kept in the "Core Performance Boost Control"
> +  * configuration space register.
> +  */
> +#ifdef __not_yet__
> + uint32_t reg;
> + switch (ci->ci_family) {
> + case 0x10:
> + /* XXX How do I read config space at this point in boot? */
> + reg = read_config_space(F4x15C);
> + off = (reg >> 2) & 0x1;
> + break;
> + case 0x12:
> + case 0x14:
> + case 0x15:
> + case 0x16:
> + /* XXX How do I read config space at this point in boot? */
> + reg = read_config_space(D18F4x15C);
> + off = (reg >> 2) & 0x7;

Re: memory barrier in counters_zero

2022-09-23 Thread Alexander Bluhm
Anyone?

On Sat, Sep 17, 2022 at 04:28:15PM +0200, Alexander Bluhm wrote:
> Hi,
> 
> Inspired by Taylor's talk at EuroBSDCon I think a memory barrier
> in counters_zero() is missing.  Reading uses two consumer barriers,
> so writing should also have two.
> 
> Following code would have no barrier between writing generation
> number and writing counters.
> 
> counters_leave();
> counters_zero();
> 
> counters_leave() writes to generation number at the end, so
> counters_zero() needs a barrier at the start.
> 
> ok?
> 
> bluhm
> 
> Index: kern/subr_percpu.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/subr_percpu.c,v
> retrieving revision 1.9
> diff -u -p -r1.9 subr_percpu.c
> --- kern/subr_percpu.c10 Mar 2021 10:21:47 -  1.9
> +++ kern/subr_percpu.c17 Sep 2022 14:17:34 -
> @@ -213,6 +213,7 @@ counters_zero(struct cpumem *cm, unsigne
>   unsigned int i;
> 
>   counters = cpumem_first(&cmi, cm);
> + membar_producer();
>   do {
>   for (i = 0; i < n; i++)
>   counters[i] = 0;



Re: protocol attach wait

2022-09-23 Thread Alexander Bluhm
On Sat, Sep 03, 2022 at 03:39:00AM +0300, Vitaliy Makkoveev wrote:
> On Fri, Sep 02, 2022 at 11:56:02AM +0200, Alexander Bluhm wrote:
> I'm not blocking this, may be something other has the different opinion.

I strongly believe that userland should not care about short time
memory shortage in kernel.  This is the idea of the M_WAIT flag.

Is there any reason why I should not commit this and keep unreliable
socket(2) system call?

> 
> > ok?
> > 
> > bluhm
> > 
> > Index: kern/uipc_socket.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket.c,v
> > retrieving revision 1.286
> > diff -u -p -r1.286 uipc_socket.c
> > --- kern/uipc_socket.c  28 Aug 2022 18:43:12 -  1.286
> > +++ kern/uipc_socket.c  1 Sep 2022 18:47:36 -
> > @@ -138,11 +138,12 @@ soinit(void)
> >  }
> >  
> >  struct socket *
> > -soalloc(int prflags)
> > +soalloc(int wait)
> >  {
> > struct socket *so;
> >  
> > -   so = pool_get(&socket_pool, prflags);
> > +   so = pool_get(&socket_pool, (wait == M_WAIT ? PR_WAITOK : PR_NOWAIT) |
> > +   PR_ZERO);
> > if (so == NULL)
> > return (NULL);
> > rw_init_flags(&so->so_lock, "solock", RWL_DUPOK);
> > @@ -174,7 +175,7 @@ socreate(int dom, struct socket **aso, i
> > return (EPROTONOSUPPORT);
> > if (prp->pr_type != type)
> > return (EPROTOTYPE);
> > -   so = soalloc(PR_WAITOK | PR_ZERO);
> > +   so = soalloc(M_WAIT);
> > klist_init(&so->so_rcv.sb_sel.si_note, &socket_klistops, so);
> > klist_init(&so->so_snd.sb_sel.si_note, &socket_klistops, so);
> > sigio_init(&so->so_sigio);
> > @@ -193,7 +194,7 @@ socreate(int dom, struct socket **aso, i
> > so->so_rcv.sb_timeo_nsecs = INFSLP;
> >  
> > solock(so);
> > -   error = pru_attach(so, proto);
> > +   error = pru_attach(so, proto, M_WAIT);
> > if (error) {
> > so->so_state |= SS_NOFDREF;
> > /* sofree() calls sounlock(). */
> > Index: kern/uipc_socket2.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket2.c,v
> > retrieving revision 1.127
> > diff -u -p -r1.127 uipc_socket2.c
> > --- kern/uipc_socket2.c 13 Aug 2022 21:01:46 -  1.127
> > +++ kern/uipc_socket2.c 1 Sep 2022 18:47:36 -
> > @@ -168,7 +168,7 @@ soisdisconnected(struct socket *so)
> >   * Connstatus may be 0 or SS_ISCONNECTED.
> >   */
> >  struct socket *
> > -sonewconn(struct socket *head, int connstatus)
> > +sonewconn(struct socket *head, int connstatus, int wait)
> >  {
> > struct socket *so;
> > int persocket = solock_persocket(head);
> > @@ -185,7 +185,7 @@ sonewconn(struct socket *head, int conns
> > return (NULL);
> > if (head->so_qlen + head->so_q0len > head->so_qlimit * 3)
> > return (NULL);
> > -   so = soalloc(PR_NOWAIT | PR_ZERO);
> > +   so = soalloc(wait);
> > if (so == NULL)
> > return (NULL);
> > so->so_type = head->so_type;
> > @@ -238,7 +238,7 @@ sonewconn(struct socket *head, int conns
> > sounlock(head);
> > }
> >  
> > -   error = pru_attach(so, 0);
> > +   error = pru_attach(so, 0, wait);
> >  
> > if (persocket) {
> > sounlock(so);
> > Index: kern/uipc_usrreq.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_usrreq.c,v
> > retrieving revision 1.181
> > diff -u -p -r1.181 uipc_usrreq.c
> > --- kern/uipc_usrreq.c  31 Aug 2022 21:23:02 -  1.181
> > +++ kern/uipc_usrreq.c  1 Sep 2022 18:47:36 -
> > @@ -302,7 +302,7 @@ const struct sysctl_bounded_args unpdgct
> >  };
> >  
> >  int
> > -uipc_attach(struct socket *so, int proto)
> > +uipc_attach(struct socket *so, int proto, int wait)
> >  {
> > struct unpcb *unp;
> > int error;
> > @@ -330,7 +330,8 @@ uipc_attach(struct socket *so, int proto
> > if (error)
> > return (error);
> > }
> > -   unp = pool_get(&unpcb_pool, PR_NOWAIT|PR_ZERO);
> > +   unp = pool_get(&unpcb_pool, (wait == M_WAIT ? PR_WAITOK : PR_NOWAIT) |
> > +   PR_ZERO);
> > if (unp == NULL)
> > return (ENOBUFS);
> > refcnt_init(&unp->unp_refcnt);
> > @@ -855,7 +856,7 @@ unp_connect(struct socket *so, struct mb
> > solock(so2);
> >  
> > if ((so2->so_options & SO_ACCEPTCONN) == 0 ||
> > -   (so3 = sonewconn(so2, 0)) == NULL) {
> > +   (so3 = sonewconn(so2, 0, M_WAIT)) == NULL) {
> > error = ECONNREFUSED;
> > }
> >  
> > Index: net/pfkeyv2.c
> > ===
> > RCS file: /data/mirror/openbsd/cvs/src/sys/net/pfkeyv2.c,v
> > retrieving revision 1.248
> > diff -u -p -r1.248 pfkeyv2.c
> > --- net/pfkeyv2.c   31 Aug 2022 21:23:02 -  1.248
> > +++ net/pfkeyv2.c   2 Sep 20

vm.conf.5: Use the proper macros

2022-09-23 Thread Josiah Frentsos
Index: vm.conf.5
===
RCS file: /cvs/src/usr.sbin/vmd/vm.conf.5,v
retrieving revision 1.59
diff -u -p -r1.59 vm.conf.5
--- vm.conf.5   13 Sep 2022 10:28:19 -  1.59
+++ vm.conf.5   23 Sep 2022 19:36:01 -
@@ -25,8 +25,7 @@
 .Nm
 is the configuration file to configure the virtual machine monitor
 (VMM) subsystem.
-A VMM manages virtual machines (VMs) on a
-.Ar host .
+A VMM manages virtual machines (VMs) on a host.
 The VMM subsystem is responsible for creating, destroying, and
 executing VMs.
 .Pp
@@ -45,8 +44,7 @@ Configuration for each individual virtua
 Configuration for virtual switches.
 .El
 .Pp
-Within the sections,
-the
+Within the sections, the
 .Ar bytes
 argument can be specified with a human-readable scale,
 using the format described in
@@ -91,7 +89,11 @@ vm "vm1.example.com" {
 .Sh GLOBAL CONFIGURATION
 The following setting can be configured globally:
 .Bl -tag -width Ds
-.It Ic agentx Oo Ic context Ar context Oc Oo Ic path Ar path Oc
+.It Xo
+.Ic agentx
+.Op Ic context Ar context
+.Op Ic path Ar path
+.Xc
 Export vm metrics via an AgentX compatible
 .Pq snmp
 daemon by connecting to
@@ -102,9 +104,9 @@ If
 .Ar path
 is omitted it will default to
 .Pa /var/agentx/master .
-.Ar Context
+.Ar context
 is the SNMPv3 context and can usually be omitted.
-.It Ic local prefix Ar address Ns Li / Ns Ar prefix
+.It Ic local prefix Ar address Ns / Ns Ar prefix
 Set the network prefix that is used to allocate subnets for
 local interfaces, see
 .Ic local interface
@@ -112,22 +114,23 @@ in the
 .Sx VM CONFIGURATION
 section below.
 The default is
-.Ar 100.64.0.0/10 .
-.It Ic local inet6 Op Ic prefix Ar address Ns Li / Ns Ar prefix
+.Cm 100.64.0.0/10 .
+.It Ic local inet6 Op Ic prefix Ar address Ns / Ns Ar prefix
 Enable IPv6 on local interfaces and allocate routable subnets.
 If the prefix is not specified,
 a random prefix from the
 .Dq unique local
-network range
-.Ar fd00::/8
-will be generated on startup.
+network range fd00::/8 will be generated on startup.
 The specified prefix length must be /64 or smaller.
-.It Cm socket owner Ar user : Ns Ar group
-Set the control socket owner to the specified user and group.
+.It Ic socket owner Ar user : Ns Ar group
+Set the control socket owner to the specified
+.Ar user
+and
+.Ar group .
 Users with access to the control socket will be allowed to use
-.Nm vmctl
+.Xr vmctl 8
 for restricted access to
-.Nm vmd .
+.Xr vmd 8 .
 If only
 .Ar user
 is given,
@@ -137,7 +140,7 @@ If only
 is given,
 only the group is set.
 The default is
-.Ar root : Ns Ar wheel .
+.Cm root:wheel .
 .It Ic staggered start parallel Ar parallelism Ic delay Ar seconds
 Start all configured VMs in a staggered fashion with
 .Ar parallelism
@@ -166,28 +169,28 @@ Typically this is a hostname.
 .Pp
 Followed by a block of parameters that is enclosed in curly brackets:
 .Bl -tag -width Ds
-.It Cm allow instance Brq ...
+.It Ic allow instance Brq ...
 Set the permissions to create VM instances.
 See
 .Sx VM INSTANCES .
-.It Cm boot Ar path
+.It Ic boot Ar path
 Kernel or BIOS image to load when booting the VM.
 If not specified, the default is to boot using the BIOS image in
 .Pa /etc/firmware/vmm-bios .
-.It Cm boot device Ar device
+.It Ic boot device Ar device
 Force VM to boot from
 .Ar device .
 Valid values are:
 .Bl -tag -width "cdrom"
-.It Ar cdrom
+.It Cm cdrom
 Boot the ISO image file specified using the
 .Ic cdrom
 parameter.
-.It Ar disk
+.It Cm disk
 Boot from the disk image file specified using the
 .Ic disk
 parameter.
-.It Ar net
+.It Cm net
 Boot the kernel specified using the
 .Ic boot
 parameter as if the VM was network booted.
@@ -201,45 +204,50 @@ but rather a simulated network boot.
 .El
 .Pp
 Currently
-.Ar disk
+.Cm disk
 and
-.Ar cdrom
+.Cm cdrom
 only work with VMs booted using BIOS.
-.It Cm cdrom Ar path
+.It Ic cdrom Ar path
 ISO image file.
-.It Cm enable
+.It Ic enable
 Automatically start the VM.
 This is the default if neither
-.Cm enable
+.Ic enable
 nor
-.Cm disable
+.Ic disable
 is specified.
-.It Cm disable
+.It Ic disable
 Do not start this VM.
-.It Cm disk Ar path Op Cm format Ar fmt
+.It Ic disk Ar path Op Ic format Ar fmt
 Disk image file (may be specified multiple times to add multiple disk images).
 The format may be specified as either
-.Ar qcow2
+.Cm qcow2
 or
-.Ar raw .
+.Cm raw .
 If left unspecified, the format defaults to
-.Pa raw
+.Cm raw
 if it cannot be derived automatically.
-.It Oo Cm local Oc Cm interface Oo name Oc Op Brq ...
+.It Xo
+.Op Ic local
+.Ic interface
+.Op Ar name
+.Op Brq ...
+.Xc
 Network interface to add to the VM.
 The optional
 .Ar name
 can be either
-.Sq tap
+.Cm tap
 to select the next available
 .Xr tap 4
 interface on the VM host side (the default) or
-.Ar tapN
+.Cm tap Ns Ar N
 to select a specific one.
 .Pp
 Valid options are:
 .Bl -tag -width Ds
-.It Cm group Ar group-name
+.It Ic group Ar group-name
 Assign the interface to a specific interface

httpd: default to text/plain for .diff and .patch

2022-09-23 Thread Klemens Nanni
In the few places I run httpd(8) patches usually float around and I do
look at them with Firefox.

But unless httpd.conf(5) contains `types { text/plain diff patch }' the
browser will download the file instead of displaying it as text.

Would it be sensible to add them to the default list so this line is not
needed?

Not sure what implications this could have and/or what other web
browers do by default, that should probably be evaluated first.

At least GitHub delivers .diff and .patch links with this content type,
but it is also a slightly different scenario since GitHub, cgit, etc.
generate and know the actual content type whereas httpd like type merely
matches based on the filename suffix.

$ curl -sI https://github.com/openbsd/src/commit/HEAD.patch |
grep -i content-type
content-type: text/plain; charset=utf-8
x-content-type-options: nosniff

Feedback?

Index: http.h
===
RCS file: /cvs/src/usr.sbin/httpd/http.h,v
retrieving revision 1.16
diff -u -p -r1.16 http.h
--- http.h  12 Sep 2020 07:34:17 -  1.16
+++ http.h  23 Sep 2022 21:26:23 -
@@ -217,6 +217,8 @@ struct http_mediatype {
{ "css","text", "css" },\
{ "html",   "text", "html" },   \
{ "txt","text", "plain" },  \
+   { "diff",   "text", "plain" },  \
+   { "patch",  "text", "plain" },  \
{ "gif","image","gif" },\
{ "jpeg",   "image","jpeg" },   \
{ "jpg","image","jpeg" },   \



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Scott Cheloha
On Fri, Sep 23, 2022 at 10:40:19PM +0300, Timo Myyr?? wrote:
> Scott Cheloha  [2022-09-23, 09:16 -0500]:
> 
> > [...]
> >
> > Test results?  Clues on reading the configuration space?
> >
> > [...]
> 
> Hi,
> 
> Here's a dmesg from thinkpad e485:

Thanks for testing.

> Does these timers affect the booting of kernel? Once I select the kernel
> to boot by pressing enter on "bsd>" line, the boot process takes about
> 18s to proceed from the "booting sr0a:/bsd".

The patch reads a couple MSRs and prints ~10 additional lines during
boot from the primary CPU.  The computed TSC frequency is not used by
the kernel, only printed so I can check whether my code is correct.

It should have zero impact on the length of the boot.  It should not
change any runtime behavior whatsoever.

Your boot probably should not be taking that long, but I can't imagine
how my patch would cause such a dramatic change.

If you reverse the patch, what happens?

> OpenBSD 7.2 (GENERIC.MP) #20: Fri Sep 23 22:27:31 EEST 2022
> t...@asteroid.bittivirhe.fi:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> [...]
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: MSR C001_0064: en 1 base 2 mul 100 div 10 freq 20 Hz
> cpu0: MSR C001_0065: en 1 base 2 mul 102 div 12 freq 17 Hz
> cpu0: MSR C001_0066: en 1 base 2 mul 96 div 12 freq 16 Hz
> cpu0: MSR C001_0067: en 0
> cpu0: MSR C001_0068: en 0
> cpu0: MSR C001_0069: en 0
> cpu0: MSR C001_006A: en 0
> cpu0: MSR C001_006B: en 0
> cpu0: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx, 1996.30 MHz, 17-11-00
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu0: 32KB 64b/line 8-way D-cache, 64KB 64b/line 4-way I-cache, 512KB 
> 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
> tsc: calibrating with acpihpet0: 1996264149 Hz

Your family 17h CPU has a computed P0 frequency of 2000MHz.  The
calibrated TSC frequency is 1996264149 Hz.

That seems right to me, thank you for testing.



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Timo Myyrä
Scott Cheloha  [2022-09-23, 09:16 -0500]:

> Hi,
>
> TL;DR:
>
> I want to compute the TSC frequency on AMD CPUs using the methods laid
> out in the AMD manuals instead of calibrating the TSC by hand.
>
> If you have an AMD CPU with an invariant TSC, please apply this patch,
> recompile/boot the resulting kernel, and send me the resulting dmesg.
>
> Family 10h-16h CPUs are especially interesting.  If you've got one,
> don't be shy!
>
> Long explanation:
>
> On AMD CPUs we calibrate the TSC with a separate timer.  This is slow
> and introduces error.  I also worry about a future where legacy timers
> are absent or heavily gated (read: useless).
>
> This patch adds most of the code needed to compute the TSC frequency
> on AMD family 10h+ CPUs.  CPUs prior to family 10h did not support an
> invariant TSC so they are irrelevant.
>
> I have riddled the code with printf(9) calls so I can work out what's
> wrong by hand if a test result makes no sense.
>
> The only missing piece is code to read the configuration space on
> family 10h-16h CPUs to determine how many boosted P-states we need to
> skip to get to the MSR describing the software P0 state.  I would
> really appreciate it if someone could explain how to do this at this
> very early point in boot.  jsg@ pointed me to pci_conf_read(9), but
> I'm a little confused about how I get the needed pci* inputs at this
> point in boot.
>
> --
>
> Test results?  Clues on reading the configuration space?
>
> -Scott
>
> Index: tsc.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 tsc.c
> --- tsc.c 22 Sep 2022 04:57:08 -  1.29
> +++ tsc.c 23 Sep 2022 14:04:22 -
> @@ -100,6 +100,253 @@ tsc_freq_cpuid(struct cpu_info *ci)
>   return (0);
>  }
>  
> +uint64_t
> +tsc_freq_msr(struct cpu_info *ci)
> +{
> + uint64_t base, def, did, did_lsd, did_msd, divisor, fid, multiplier;
> + uint32_t msr, off = 0;
> +
> + if (strcmp(cpu_vendor, "AuthenticAMD") != 0)
> + return 0;
> +
> + /*
> +  * All family 10h+ CPUs have MSR_HWCR and the TscFreqSel bit.
> +  * If TscFreqSel is not set the TSC does not advance at the P0
> +  * frequency, in which case something is wrong and we need to
> +  * calibrate by hand.
> +  */
> +#define HWCR_TSCFREQSEL (1 << 24)
> + if (!ISSET(rdmsr(MSR_HWCR), HWCR_TSCFREQSEL))   /* XXX specialreg.h */
> + return 0;
> +#undef HWCR_TSCFREQSEL
> +
> + /*
> +  * For families 10h, 12h, 14h, 15h, and 16h, we need to skip past
> +  * the boosted P-states (Pb0, Pb1, etc.) to find the MSR describing
> +  * P0, i.e. the highest performance unboosted P-state.  The number
> +  * of boosted states is kept in the "Core Performance Boost Control"
> +  * configuration space register.
> +  */
> +#ifdef __not_yet__
> + uint32_t reg;
> + switch (ci->ci_family) {
> + case 0x10:
> + /* XXX How do I read config space at this point in boot? */
> + reg = read_config_space(F4x15C);
> + off = (reg >> 2) & 0x1;
> + break;
> + case 0x12:
> + case 0x14:
> + case 0x15:
> + case 0x16:
> + /* XXX How do I read config space at this point in boot? */
> + reg = read_config_space(D18F4x15C);
> + off = (reg >> 2) & 0x7;
> + break;
> + default:
> + break;
> + }
> +#endif
> +
> +/* DEBUG Let's look at all the MSRs to check my math. */
> +for (; off < 8; off++) {
> +
> + /*
> +  * In family 10h+, core P-state voltage/frequency definitions
> +  * are kept in MSRs C001_006[4:B] (eight registers in total).
> +  * All MSRs in the range are readable, but if the EN bit isn't
> +  * set the register doesn't define a valid P-state.
> +  */
> + msr = 0xc0010064 + off; /* XXX specialreg.h */
> + def = rdmsr(msr);
> + printf("%s: MSR %04X_%04X: en %d",
> + ci->ci_dev->dv_xname, msr >> 16, msr & 0x,
> + !!ISSET(def, 1ULL << 63));
> + if (!ISSET(def, 1ULL << 63)) {  /* XXX specialreg.h */
> + printf("\n");
> + continue;
> + }
> + switch (ci->ci_family) {
> + case 0x10:
> + /* AMD Family 10h Processor BKDG, Rev 3.62, p. 429 */
> + base = 1;   /* 100.0 MHz */
> + did = (def >> 6) & 0x7;
> + divisor = 1ULL << did;
> + fid = def & 0x1f;
> + multiplier = fid + 0x10;
> + printf(" base %llu did %llu div %llu fid %llu mul %llu",
> + base, did, divisor, fid, multiplier);
> + break;
> + case 0x11:
> + /* AMD Family 11h Processor BKDG, Rev 3.62, p. 236 */
> + base = 1;   /* 100.0 MHz */
> + did = (def >> 6) & 0x7;
> + divisor = 1ULL << did;
> + fid = def 

Re: grdc: show timezone when TZ is set

2022-09-23 Thread Florian Obser
deraadt objected to the time zone validation. I don't care about the
feature and I agree with the point that I shouldn't do it because there
is no API for it. I don't even know where the time zone files are.

To make this all more symmetric always print tm_zone, even if TZ is not
set.

OK?

diff --git grdc.6 grdc.6
index 16e1758c6d2..5aa6e84a2d2 100644
--- grdc.6
+++ grdc.6
@@ -34,8 +34,11 @@ key exits the program.
 .Bl -tag -width Ds
 .It Ev TZ
 The time zone to use for displaying the time.
-It is specified as a pathname relative to
-.Pa /usr/share/zoneinfo .
+It is normally specified as a pathname relative to
+.Pa /usr/share/zoneinfo ,
+though see
+.Xr tzset 3
+for more information.
 If this variable is not set, the time zone is determined based on
 .Pa /etc/localtime .
 .El
diff --git grdc.c grdc.c
index 66e5eee79e6..01d29b08d64 100644
--- grdc.c
+++ grdc.c
@@ -12,6 +12,7 @@
  */
 
 #include 
+#include 
 
 #include 
 #include 
@@ -20,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -78,9 +80,14 @@ main(int argc, char *argv[])
int xbase;
int ybase;
int wintoosmall;
+   int tz_len = 0;
+   int h, m;
+   int prev_tm_gmtoff;
char *tz;
 
tz = getenv("TZ");
+   if (tz != NULL)
+   tz_len = strlen(tz);
 
scrol = wintoosmall = 0;
while ((i = getopt(argc, argv, "sh")) != -1) {
@@ -135,6 +142,7 @@ main(int argc, char *argv[])
 
curs_set(0);
sigwinched = 1; /* force initial sizing */
+   prev_tm_gmtoff = 24 * 3600; /* force initial header printing */
 
clock_gettime(CLOCK_REALTIME, &now);
if (n) {
@@ -152,9 +160,11 @@ main(int argc, char *argv[])
set(tm->tm_hour / 10, 24);
set(10, 7);
set(10, 17);
-   if (sigwinched) {
+   /* force repaint if window size changed or DST changed */
+   if (sigwinched || prev_tm_gmtoff != tm->tm_gmtoff) {
sigwinched = 0;
wintoosmall = 0;
+   prev_tm_gmtoff = tm->tm_gmtoff;
getwinsize(&i, &j);
if (i >= XLENGTH + 2)
xbase = (i - XLENGTH) / 2;
@@ -184,11 +194,19 @@ main(int argc, char *argv[])
move(ybase, xbase + XLENGTH);
vline(ACS_VLINE, YDEPTH);
 
-   if (tz != NULL) {
-   move(ybase - 1, xbase);
-   printw("[ %s %+d ]", tz,
-   tm->tm_gmtoff / 60 / 60 );
-   }
+   move(ybase - 1, xbase);
+
+   h = tm->tm_gmtoff / 3600;
+   m = abs((int)tm->tm_gmtoff % 3600 / 60);
+
+   if (tz_len > 0 && tz_len <= XLENGTH -
+   strlen("[  () + ]") -
+   strlen(tm->tm_zone))
+   printw("[ %s (%s) %+2.2d%02d ]", tz,
+   tm->tm_zone, h, m);
+   else
+   printw("[ %s %+2.2d%02d ]",
+   tm->tm_zone, h, m);
 
attrset(COLOR_PAIR(2));
}

-- 
I'm not entirely sure you are real.



rad.conf(5) is not rad(8)

2022-09-23 Thread Josiah Frentsos
Index: rad.conf.5
===
RCS file: /cvs/src/usr.sbin/rad/rad.conf.5,v
retrieving revision 1.17
diff -u -p -r1.17 rad.conf.5
--- rad.conf.5  16 May 2020 16:58:12 -  1.17
+++ rad.conf.5  23 Sep 2022 18:52:37 -
@@ -158,7 +158,7 @@ Example configuration file.
 .El
 .Sh EXAMPLES
 With the following example configuration,
-.Nm
+.Xr rad 8
 will pick a prefix from the ix1 interface and send router advertisements on it:
 .Pp
 .Dl interface ix1



Re: grdc: show timezone when TZ is set

2022-09-23 Thread Florian Obser
So, with the tzset(3) restriction in place I'd like to fix grdc, because
what we currently have is wrong:

There are time zones that have minute offsets, display those
correctly. Pointed out by pjanzen@.
To display the offset, use ISO 8601, as suggested by David Goerger.

Take a guess if tzset(3) will accept the time zone specified in TZ as
a relative path and only display it if that is true and it's not too
long. All time zones currently in /usr/share/zoneinfo fit.

Lastly check if tm->tm_gmtoff changed which probably means that we
moved in or out of daylight savings time.

OK?

p.s. I don't know what to do about wintoosmall at this time, the diff is
big enough as it is. And quite frankly I don't care about that.

diff --git grdc.6 grdc.6
index 16e1758c6d2..5aa6e84a2d2 100644
--- grdc.6
+++ grdc.6
@@ -34,8 +34,11 @@ key exits the program.
 .Bl -tag -width Ds
 .It Ev TZ
 The time zone to use for displaying the time.
-It is specified as a pathname relative to
-.Pa /usr/share/zoneinfo .
+It is normally specified as a pathname relative to
+.Pa /usr/share/zoneinfo ,
+though see
+.Xr tzset 3
+for more information.
 If this variable is not set, the time zone is determined based on
 .Pa /etc/localtime .
 .El
diff --git grdc.c grdc.c
index 66e5eee79e6..4bb7fa1b1af 100644
--- grdc.c
+++ grdc.c
@@ -12,6 +12,7 @@
  */
 
 #include 
+#include 
 
 #include 
 #include 
@@ -20,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -45,6 +47,7 @@ void getwinsize(int *, int *);
 void set(int, int);
 void standt(int);
 void __dead usage(void);
+int check_tz(const char *);
 
 void
 sigalrm(int signo)
@@ -64,6 +67,36 @@ sigresize(int signo)
sigwinched = signo;
 }
 
+/* Take a guess if tzset(3) will accept TZ as a relative path */
+int
+check_tz(const char *tz)
+{
+   struct stat  sb;
+   char fullname[PATH_MAX];
+   int  i;
+
+   if (tz == NULL)
+   return 0;
+
+   if (tz[0] == ':')
+   tz++;
+
+   if (tz[0] == '/' || strstr(tz, "../") != NULL)
+   return 0;
+
+   i = snprintf(fullname, sizeof(fullname), "/usr/share/zoneinfo/%s", tz);
+   if (i < 0 || i >= sizeof(fullname))
+   return 0;
+
+   if (stat(fullname, &sb) == -1)
+   return 0;
+
+   if (!S_ISREG(sb.st_mode))
+   return 0;
+
+   return 1;
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -78,9 +111,19 @@ main(int argc, char *argv[])
int xbase;
int ybase;
int wintoosmall;
+   int tz_valid;
+   int tz_len = 0;
+   int prev_tm_gmtoff;
char *tz;
 
tz = getenv("TZ");
+   tz_valid = check_tz(tz);
+
+   if (tz_valid) {
+   if (tz[0] == ':')
+   tz++;
+   tz_len = strlen(tz);
+   }
 
scrol = wintoosmall = 0;
while ((i = getopt(argc, argv, "sh")) != -1) {
@@ -135,6 +178,7 @@ main(int argc, char *argv[])
 
curs_set(0);
sigwinched = 1; /* force initial sizing */
+   prev_tm_gmtoff = 24 * 3600; /* force initial header printing */
 
clock_gettime(CLOCK_REALTIME, &now);
if (n) {
@@ -152,9 +196,11 @@ main(int argc, char *argv[])
set(tm->tm_hour / 10, 24);
set(10, 7);
set(10, 17);
-   if (sigwinched) {
+   /* force repaint if window size changed or DST changed */
+   if (sigwinched || prev_tm_gmtoff != tm->tm_gmtoff) {
sigwinched = 0;
wintoosmall = 0;
+   prev_tm_gmtoff = tm->tm_gmtoff;
getwinsize(&i, &j);
if (i >= XLENGTH + 2)
xbase = (i - XLENGTH) / 2;
@@ -185,9 +231,20 @@ main(int argc, char *argv[])
vline(ACS_VLINE, YDEPTH);
 
if (tz != NULL) {
+   int h, m;
+   h = tm->tm_gmtoff / 3600;
+   m = abs((int)tm->tm_gmtoff % 3600 / 60);
+   if (tz_valid && tz_len > XLENGTH -
+   strlen("[  () + ]") -
+   strlen(tm->tm_zone))
+   tz_valid = 0;
move(ybase - 1, xbase);
-   printw("[ %s %+d ]", tz,
-   tm->tm_gmtoff / 60 / 60 );
+   if (tz_valid)
+   printw("[ %s (%s) %+2.2d%02d ]",
+   tz, tm->tm_zone, h, m);
+   else
+   printw("[ %s %+2.2d%02d ]",
+ 

[please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Scott Cheloha
Hi,

TL;DR:

I want to compute the TSC frequency on AMD CPUs using the methods laid
out in the AMD manuals instead of calibrating the TSC by hand.

If you have an AMD CPU with an invariant TSC, please apply this patch,
recompile/boot the resulting kernel, and send me the resulting dmesg.

Family 10h-16h CPUs are especially interesting.  If you've got one,
don't be shy!

Long explanation:

On AMD CPUs we calibrate the TSC with a separate timer.  This is slow
and introduces error.  I also worry about a future where legacy timers
are absent or heavily gated (read: useless).

This patch adds most of the code needed to compute the TSC frequency
on AMD family 10h+ CPUs.  CPUs prior to family 10h did not support an
invariant TSC so they are irrelevant.

I have riddled the code with printf(9) calls so I can work out what's
wrong by hand if a test result makes no sense.

The only missing piece is code to read the configuration space on
family 10h-16h CPUs to determine how many boosted P-states we need to
skip to get to the MSR describing the software P0 state.  I would
really appreciate it if someone could explain how to do this at this
very early point in boot.  jsg@ pointed me to pci_conf_read(9), but
I'm a little confused about how I get the needed pci* inputs at this
point in boot.

--

Test results?  Clues on reading the configuration space?

-Scott

Index: tsc.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
retrieving revision 1.29
diff -u -p -r1.29 tsc.c
--- tsc.c   22 Sep 2022 04:57:08 -  1.29
+++ tsc.c   23 Sep 2022 14:04:22 -
@@ -100,6 +100,253 @@ tsc_freq_cpuid(struct cpu_info *ci)
return (0);
 }
 
+uint64_t
+tsc_freq_msr(struct cpu_info *ci)
+{
+   uint64_t base, def, did, did_lsd, did_msd, divisor, fid, multiplier;
+   uint32_t msr, off = 0;
+
+   if (strcmp(cpu_vendor, "AuthenticAMD") != 0)
+   return 0;
+
+   /*
+* All family 10h+ CPUs have MSR_HWCR and the TscFreqSel bit.
+* If TscFreqSel is not set the TSC does not advance at the P0
+* frequency, in which case something is wrong and we need to
+* calibrate by hand.
+*/
+#define HWCR_TSCFREQSEL (1 << 24)
+   if (!ISSET(rdmsr(MSR_HWCR), HWCR_TSCFREQSEL))   /* XXX specialreg.h */
+   return 0;
+#undef HWCR_TSCFREQSEL
+
+   /*
+* For families 10h, 12h, 14h, 15h, and 16h, we need to skip past
+* the boosted P-states (Pb0, Pb1, etc.) to find the MSR describing
+* P0, i.e. the highest performance unboosted P-state.  The number
+* of boosted states is kept in the "Core Performance Boost Control"
+* configuration space register.
+*/
+#ifdef __not_yet__
+   uint32_t reg;
+   switch (ci->ci_family) {
+   case 0x10:
+   /* XXX How do I read config space at this point in boot? */
+   reg = read_config_space(F4x15C);
+   off = (reg >> 2) & 0x1;
+   break;
+   case 0x12:
+   case 0x14:
+   case 0x15:
+   case 0x16:
+   /* XXX How do I read config space at this point in boot? */
+   reg = read_config_space(D18F4x15C);
+   off = (reg >> 2) & 0x7;
+   break;
+   default:
+   break;
+   }
+#endif
+
+/* DEBUG Let's look at all the MSRs to check my math. */
+for (; off < 8; off++) {
+
+   /*
+* In family 10h+, core P-state voltage/frequency definitions
+* are kept in MSRs C001_006[4:B] (eight registers in total).
+* All MSRs in the range are readable, but if the EN bit isn't
+* set the register doesn't define a valid P-state.
+*/
+   msr = 0xc0010064 + off; /* XXX specialreg.h */
+   def = rdmsr(msr);
+   printf("%s: MSR %04X_%04X: en %d",
+   ci->ci_dev->dv_xname, msr >> 16, msr & 0x,
+   !!ISSET(def, 1ULL << 63));
+   if (!ISSET(def, 1ULL << 63)) {  /* XXX specialreg.h */
+   printf("\n");
+   continue;
+   }
+   switch (ci->ci_family) {
+   case 0x10:
+   /* AMD Family 10h Processor BKDG, Rev 3.62, p. 429 */
+   base = 1;   /* 100.0 MHz */
+   did = (def >> 6) & 0x7;
+   divisor = 1ULL << did;
+   fid = def & 0x1f;
+   multiplier = fid + 0x10;
+   printf(" base %llu did %llu div %llu fid %llu mul %llu",
+   base, did, divisor, fid, multiplier);
+   break;
+   case 0x11:
+   /* AMD Family 11h Processor BKDG, Rev 3.62, p. 236 */
+   base = 1;   /* 100.0 MHz */
+   did = (def >> 6) & 0x7;
+   divisor = 1ULL << did;
+   fid = def & 0x1f;
+   multiplier = fid + 0x8;
+   printf(" base %llu did %llu div %llu fid %llu mul %llu",
+   base, did, di

execve.2: Use the proper macros for the #! line

2022-09-23 Thread Josiah Frentsos
Index: execve.2
===
RCS file: /cvs/src/lib/libc/sys/execve.2,v
retrieving revision 1.56
diff -u -p -r1.56 execve.2
--- execve.231 Mar 2022 17:27:16 -  1.56
+++ execve.223 Sep 2022 14:01:14 -
@@ -58,22 +58,19 @@ with zero data; see
 .Xr elf 5 .
 .Pp
 An interpreter file begins with a line of the form:
-.Bd -filled -offset indent
-.Sy #!\&
-.Em interpreter
-.Bq Em arg
-.Ed
+.Pp
+.D1 #! Ar interpreter Op Ar arg
 .Pp
 When an interpreter file is passed to
 .Fn execve ,
 the system instead calls
 .Fn execve
 with the specified
-.Em interpreter .
+.Ar interpreter .
 If the optional
-.Em arg
+.Ar arg
 is specified, it becomes the first argument to the
-.Em interpreter ,
+.Ar interpreter ,
 and the original
 .Fa path
 becomes the second argument;
@@ -207,9 +204,8 @@ see
 When a program is executed as a result of an
 .Fn execve
 call, it is entered as follows:
-.Bd -literal -offset indent
-main(int argc, char **argv, char **envp)
-.Ed
+.Pp
+.Dl main(int argc, char **argv, char **envp)
 .Pp
 where
 .Fa argc
@@ -291,7 +287,7 @@ is allowed by the imposed maximum
 The number of bytes in the new process's argument list
 is larger than the system-imposed limit.
 The limit in the system as released is 524288 bytes
-.Pf ( Dv ARG_MAX ) .
+.Pq Dv ARG_MAX .
 .It Bq Er EFAULT
 The new process file is not as long as indicated by
 the size values in its header.
@@ -309,7 +305,7 @@ did not contain at least one element.
 An I/O error occurred while reading from the file system.
 .It Bq Er ENFILE
 During startup of an
-.Em interpreter ,
+.Ar interpreter ,
 the system file table was found to be full.
 .El
 .Sh SEE ALSO



Re: bgpd, kill net/route.h dependency in bgpd.h

2022-09-23 Thread Theo Buehler
On Fri, Sep 23, 2022 at 12:10:45PM +0200, Claudio Jeker wrote:
> Linux is driving me nuts. The mix of net/, netinet/ includes and the need
> to also include some linux/ headers like linux/if.h and linux/in6.h result
> in absolute madness. Try to trim the includes in bgpd.h by defining our
> own label size for route labels.
> 
> With this the net/route.h compat shim can die. Won't probably help my real
> issue but one thing less to juggle.
> 
> Naming is pain, I went with ROUTELABEL_LEN other option is BGPDLABEL_LEN
> or maybe someone else has a good idea

Name seems fine to me, I don't have a better suggestion

ok

> -- 
> :wq Claudio
> 
> Index: bgpd.h
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/bgpd.h,v
> retrieving revision 1.453
> diff -u -p -r1.453 bgpd.h
> --- bgpd.h21 Sep 2022 21:12:03 -  1.453
> +++ bgpd.h23 Sep 2022 09:51:23 -
> @@ -22,7 +22,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -41,6 +40,7 @@
>  #define  PEER_DESCR_LEN  32
>  #define  REASON_LEN  256 /* includes NUL 
> terminator */
>  #define  PFTABLE_LEN 32
> +#define  ROUTELABEL_LEN  32
>  #define  TCP_MD5_KEY_LEN 80
>  #define  IPSEC_ENC_KEY_LEN   32
>  #define  IPSEC_AUTH_KEY_LEN  20
> @@ -705,7 +705,7 @@ struct ktable {
>  struct kroute_full {
>   struct bgpd_addrprefix;
>   struct bgpd_addrnexthop;
> - charlabel[RTLABEL_LEN];
> + charlabel[ROUTELABEL_LEN];
>   uint32_tmplslabel;
>   uint16_tflags;
>   u_short ifindex;
> @@ -1117,7 +1117,7 @@ struct filter_set {
>   struct nexthop  *nh_ref;
>   struct community community;
>   char pftable[PFTABLE_LEN];
> - char rtlabel[RTLABEL_LEN];
> + char rtlabel[ROUTELABEL_LEN];
>   uint8_t  origin;
>   }   action;
>   enum action_types   type;
> 



bgpd, kill net/route.h dependency in bgpd.h

2022-09-23 Thread Claudio Jeker
Linux is driving me nuts. The mix of net/, netinet/ includes and the need
to also include some linux/ headers like linux/if.h and linux/in6.h result
in absolute madness. Try to trim the includes in bgpd.h by defining our
own label size for route labels.

With this the net/route.h compat shim can die. Won't probably help my real
issue but one thing less to juggle.

Naming is pain, I went with ROUTELABEL_LEN other option is BGPDLABEL_LEN
or maybe someone else has a good idea
-- 
:wq Claudio

Index: bgpd.h
===
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.h,v
retrieving revision 1.453
diff -u -p -r1.453 bgpd.h
--- bgpd.h  21 Sep 2022 21:12:03 -  1.453
+++ bgpd.h  23 Sep 2022 09:51:23 -
@@ -22,7 +22,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -41,6 +40,7 @@
 #definePEER_DESCR_LEN  32
 #defineREASON_LEN  256 /* includes NUL 
terminator */
 #definePFTABLE_LEN 32
+#defineROUTELABEL_LEN  32
 #defineTCP_MD5_KEY_LEN 80
 #defineIPSEC_ENC_KEY_LEN   32
 #defineIPSEC_AUTH_KEY_LEN  20
@@ -705,7 +705,7 @@ struct ktable {
 struct kroute_full {
struct bgpd_addrprefix;
struct bgpd_addrnexthop;
-   charlabel[RTLABEL_LEN];
+   charlabel[ROUTELABEL_LEN];
uint32_tmplslabel;
uint16_tflags;
u_short ifindex;
@@ -1117,7 +1117,7 @@ struct filter_set {
struct nexthop  *nh_ref;
struct community community;
char pftable[PFTABLE_LEN];
-   char rtlabel[RTLABEL_LEN];
+   char rtlabel[ROUTELABEL_LEN];
uint8_t  origin;
}   action;
enum action_types   type;



OpenBSD Errata: September 23, 2022 (expat)

2022-09-23 Thread Alexander Bluhm
Errata patches for libexpat have been released for OpenBSD 7.0 and 7.1.

Binary updates for the amd64, i386 and arm64 platform are available
via the syspatch utility.  Source code patches can be found on the
respective errata page:

  https://www.openbsd.org/errata70.html
  https://www.openbsd.org/errata71.html