Re: add pkg-config files for readline, editline, ncurses

2019-08-02 Thread Ingo Schwarze
Hi Stephen,

Stephen Gregoratto wrote on Fri, Jul 26, 2019 at 01:23:21AM +1000:

> mrsh[1], a cross-platform shell, can use readline in interactive mode.
> It's configure script detects the presence of readline using
> pkg-config(1). Thus, this patch adds a pkg-config file for our readline.

It is not obvious to me whether or not that is desirable.  It depends
on which version of readline mrsh requires.  If mrsh builds and is
usable with our base system readline (which is quite old), what you
propose may help.  On the other hand, if it requires a newer readline
or doesn't work well with the base readline, adding a library dependency
on ports readline might be better.

> I just copied over the generate_pkgconfig.sh script/make rules found in
> other libraries and edited to fit.
> 
> I also added pkg-config files for libedit and libcurses,

I think that ought to be a completely separate change because a program
either uses base libedit or base readline or ports readline but never
more than one of these.  I know relatively little about pkg-config(1);
the reason i'm speaking up here is that nobody else spoke up so far and
that i occasionally worked on libedit in the past.  Now, as far as i
understand (take this with a grain of salt), the effect of adding a
pkg-config file for libedit would possibly be that additional ports
might pick up our base libedit that now think that no libedit is
available and consequently compile without it.  That might improve
some ports - and/or break others.  So i suspect that adding libedit.pc
to base would probably require a bulk build before commit to make sure
nothing breaks.  Also - even though it seems less likely that anything
in the base system looks at *.pc files - it would probably require a
complete make build & release of base and xenocara.

For adding readline.pc, the same caveats apply, only even more so.
I suspect that the risk of breaking ports by adding readline.pc to
base is larger than for editline because probaly more ports use readline
than editline and those that do are probably more likely to use
pkg-config(1) and those that do both are probably more likely to be
unhappy with our base readline, whereas ports looking for editline
probably have a better chance to be happy with our base editline.

As far as i understand, adding editline.pc does not necessarily
require also adding ncurses.pc, nor does it require Requires:
lines.  So far, we don't have a single non-empty Requires: line
in base even though libssl depends on libcrypto.  Again, the effect
of adding ncurses.pc to base might be that some ports suddenly
pick up our base curses, for good or ill.  If there is a benefit
to that, i suspect it might make sense to test that change
independently from libedit.pc.  I don't think we want a bulk build
with three combined changes and lots of fallout, because then we
would be forced to figure out which fallout was caused by which
part of the changes, which sounds like gratuitious additional pain.

> since libedit requires linking with libcurses.  I started getting
> build errors before I realised that there are a couple editline
> libraries floating around, each differing slightly with each other:

*If* we want editline.pc, then i expect we probably just want one,
for our own editline.  Additional ones would only make sense to me
in the context of a port of one of the other editline implementations.
But i don't really see the point of such a port.  If some port does
not work with our editline but does work with a different editline,
we should probably either fix our editline or patch the port to
work with our editline, depending on what exactly to port is doing.

> Some Linux systems (e.g. Arch) generate pkg-config files for the
> different "types" of ncurses:
> 
>   /usr/lib/pkgconfig/ncurses++.pc -> ncurses++w.pc
>   /usr/lib/pkgconfig/ncurses++w.pc
>   /usr/lib/pkgconfig/ncurses.pc -> ncursesw.pc
>   /usr/lib/pkgconfig/ncursesw.pc
>   /usr/lib/pkgconfig/tic.pc -> ncursesw.pc
>   /usr/lib/pkgconfig/tinfo.pc -> ncursesw.pc
> 
> Not sure if this is something I should do here.

I doubt it.  In any case, the base system should contain one *curses*.pc
at most, i think, because we only have one curses in base.  Any other
*curses*.pc would be a matter of ports - and like for editline, i doubt
that having a port of a different version of curses makes sense.

That said, even though you are changing the base system, i suspect
discussing this kind of patches may be more fruitful on ports@
because that's where all the ports developers are that will be
affected by such changes.

I suggest that you decide which of the three changes you care about
most, then send a patch to ports@ with only that one change,
explaining what exactly it is supposed to improve (and how you
tested that it indeed does), and what exactly you already tested
in addition to that (for example, building which ports that use
pkg-config(1) and the library in question), asking for feedback.
The feedback you 

Re: arm64: softraid boot

2019-08-02 Thread Patrick Wildt
On Fri, Aug 02, 2019 at 03:16:55PM -0400, sven falempin wrote:
> Dear ARM gurus,
> 
> I m trying to tftp boot last snaphots on pine64+.
> Not sure if it s possible.
> So far I tried loading minitroot like an idiot and then, a boolader,
> now I cant get the efi bootloader to do some tftp ( which I guess is normal  )
> But maybe I am just missing a small pieces.

Hi,

our efiboot/BOOTAA64.EFI checks for the DHCP ack received bit to see
if we should look at the the contents of the DHCP ack for the address
information.

U-Boot unfortunately does not set it, and I have been trying to up-
stream that change on and off a few times:

https://marc.info/?l=u-boot=155488833610034=2

Thus, my bet is that it doesn't work because of the missing bit.

Patrick

> I also created etc/boot.conf in the tftp directory.. just in case.
> 
> getting some bsd code to boot :
> 
> => dhcp
> BOOTP broadcast 1
> DHCP client bound to address 172.16.65.55 (1445 ms)
> Using ethernet@01c3 device
> TFTP from server 172.16.65.1; our IP address is 172.16.65.55
> Filename 'BOOTAA64.EFI'.
> Load address: 0x4200
> Loading: ###
>  5.1 MiB/s
> done
> => bootefi 0x4200
> ## Starting EFI application at 4200 ...
> Scanning disks on usb...
> Disk usb0 not ready
> Disk usb1 not ready
> Disk usb2 not ready
> Disk usb3 not ready
> Scanning disks on mmc...
> MMC Device 1 not found
> MMC Device 2 not found
> MMC Device 3 not found
> Found 3 disks
> WARNING: Invalid device tree, expect boot to fail
> disks: sd0
> >> OpenBSD/arm64 BOOTAA64 0.16
> open(tftp0a:/etc/boot.conf): Operation not permitted
> boot>
> booting tftp0a:/bsd: open tftp0a:/bsd: Operation not permitted
>  failed(1). will try /bsd
> boot>
> booting tftp0a:/bsd: open tftp0a:/bsd: Operation not permitted
>  failed(1). will try /bsd
> Turning timeout off.
> boot> l
> stat(tftp0a:/.): Operation not permitted
> boot> l
> stat(tftp0a:/.): Operation not permitted
> boot> l
> stat(tftp0a:/.): Operation not permitted
> boot> boot tftp:bsd.rd
> cannot open tftp:/etc/random.seed: bad drive number
> booting tftp:bsd.rd: open tftp:bsd.rd: bad drive number
>  failed(98). will try /bsd
> boot> boot 172.16.65.1:/bsd.rd
> boot> boot tftp:/bsd.rd
> cannot open tftp:/etc/random.seed: bad drive number
> booting tftp:/bsd.rd: open tftp:/bsd.rd: bad drive number
>  failed(98). will try /bsd
> boot> boot tftp0a:/bsd.rd
> booting tftp0a:/bsd.rd: open tftp0a:/bsd.rd: Operation not permitted
>  failed(1). will try /bsd
> 
> is there some kind of trick to boot this ?
> 
> 
> On Thu, Jan 31, 2019 at 7:54 AM Mark Kettenis  wrote:
> >
> > > Date: Wed, 30 Jan 2019 22:31:31 +0100
> > > From: Patrick Wildt 
> > >
> > > Hi,
> > >
> > > to boot a pinebook with softraid crypto there are three parts that need
> > > to be implemented, which this diff hopefully does well enough.
> > >
> > > First, our EFI bootloader so far only supports one disk, so we need to
> > > extend this so we can see all connected block devices.  The second part
> > > is adding the softraid code, which probes for softraid partitions on the
> > > block devices and "spawns" its own block device.  At last, we need to
> > > push the softraid uuid and key up to the kernel.
> > >
> > > Pushing the informaton to the kernel is the easiest part.  We can add
> > > another openbsd-specific property in the device tree which we read out
> > > early on, so that we can zero the space in the device tree so it can not
> > > be read by anyone else.
> > >
> > > For the other part there's a bit more to do.  Very early on efiboot
> > > looks for the boot device.  While doing that, we create a list of block
> > > devices.  While there, already try to read the disklabel because after
> > > this the softraid probing needs to know the partition table.
> > >
> > > We then call srprobe() which will go over the disklist to look for a
> > > disk that contains RAID volumes, but calling the disk's diskio and
> > > strategy function to read from the device.  This creates the sr_volumes
> > > list of softraid volumes.  The MI boot code will then look for the boot
> > > device name by calling devboot().  If we don't have the device path for
> > > the boot device, it's a PXE boot.  Otherwise we will find it in either
> > > the disklist or sr_volumes list.
> > >
> > > The next part is the MI code opening the device, looking for a file-
> > > system.  The boot device name now is e.g. sr0a, which will then be
> > > looked up in the devsw[] array.  From there it's straight forward.
> > > Softraid maps the unit number to a volume, the volume already has a
> > > pointer to the diskinfo structure from the actual device and then
> > > calls the IO functions of that device.
> > >
> > > softraid_arm64.[ch] is copied from amd64 and adjusted a little bit
> > > for arm64, including adding the small devsw[] abstraction.
> > >
> > > With this I can successfully boot from softraid crypto on bluhm@'s
> > > Pinebook.
> > >
> > > Feedback?
> >
> > Apart from the property 

Re: arm64: softraid boot

2019-08-02 Thread sven falempin
Dear ARM gurus,

I m trying to tftp boot last snaphots on pine64+.
Not sure if it s possible.
So far I tried loading minitroot like an idiot and then, a boolader,
now I cant get the efi bootloader to do some tftp ( which I guess is normal  )
But maybe I am just missing a small pieces.

I also created etc/boot.conf in the tftp directory.. just in case.

getting some bsd code to boot :

=> dhcp
BOOTP broadcast 1
DHCP client bound to address 172.16.65.55 (1445 ms)
Using ethernet@01c3 device
TFTP from server 172.16.65.1; our IP address is 172.16.65.55
Filename 'BOOTAA64.EFI'.
Load address: 0x4200
Loading: ###
 5.1 MiB/s
done
=> bootefi 0x4200
## Starting EFI application at 4200 ...
Scanning disks on usb...
Disk usb0 not ready
Disk usb1 not ready
Disk usb2 not ready
Disk usb3 not ready
Scanning disks on mmc...
MMC Device 1 not found
MMC Device 2 not found
MMC Device 3 not found
Found 3 disks
WARNING: Invalid device tree, expect boot to fail
disks: sd0
>> OpenBSD/arm64 BOOTAA64 0.16
open(tftp0a:/etc/boot.conf): Operation not permitted
boot>
booting tftp0a:/bsd: open tftp0a:/bsd: Operation not permitted
 failed(1). will try /bsd
boot>
booting tftp0a:/bsd: open tftp0a:/bsd: Operation not permitted
 failed(1). will try /bsd
Turning timeout off.
boot> l
stat(tftp0a:/.): Operation not permitted
boot> l
stat(tftp0a:/.): Operation not permitted
boot> l
stat(tftp0a:/.): Operation not permitted
boot> boot tftp:bsd.rd
cannot open tftp:/etc/random.seed: bad drive number
booting tftp:bsd.rd: open tftp:bsd.rd: bad drive number
 failed(98). will try /bsd
boot> boot 172.16.65.1:/bsd.rd
boot> boot tftp:/bsd.rd
cannot open tftp:/etc/random.seed: bad drive number
booting tftp:/bsd.rd: open tftp:/bsd.rd: bad drive number
 failed(98). will try /bsd
boot> boot tftp0a:/bsd.rd
booting tftp0a:/bsd.rd: open tftp0a:/bsd.rd: Operation not permitted
 failed(1). will try /bsd

is there some kind of trick to boot this ?


On Thu, Jan 31, 2019 at 7:54 AM Mark Kettenis  wrote:
>
> > Date: Wed, 30 Jan 2019 22:31:31 +0100
> > From: Patrick Wildt 
> >
> > Hi,
> >
> > to boot a pinebook with softraid crypto there are three parts that need
> > to be implemented, which this diff hopefully does well enough.
> >
> > First, our EFI bootloader so far only supports one disk, so we need to
> > extend this so we can see all connected block devices.  The second part
> > is adding the softraid code, which probes for softraid partitions on the
> > block devices and "spawns" its own block device.  At last, we need to
> > push the softraid uuid and key up to the kernel.
> >
> > Pushing the informaton to the kernel is the easiest part.  We can add
> > another openbsd-specific property in the device tree which we read out
> > early on, so that we can zero the space in the device tree so it can not
> > be read by anyone else.
> >
> > For the other part there's a bit more to do.  Very early on efiboot
> > looks for the boot device.  While doing that, we create a list of block
> > devices.  While there, already try to read the disklabel because after
> > this the softraid probing needs to know the partition table.
> >
> > We then call srprobe() which will go over the disklist to look for a
> > disk that contains RAID volumes, but calling the disk's diskio and
> > strategy function to read from the device.  This creates the sr_volumes
> > list of softraid volumes.  The MI boot code will then look for the boot
> > device name by calling devboot().  If we don't have the device path for
> > the boot device, it's a PXE boot.  Otherwise we will find it in either
> > the disklist or sr_volumes list.
> >
> > The next part is the MI code opening the device, looking for a file-
> > system.  The boot device name now is e.g. sr0a, which will then be
> > looked up in the devsw[] array.  From there it's straight forward.
> > Softraid maps the unit number to a volume, the volume already has a
> > pointer to the diskinfo structure from the actual device and then
> > calls the IO functions of that device.
> >
> > softraid_arm64.[ch] is copied from amd64 and adjusted a little bit
> > for arm64, including adding the small devsw[] abstraction.
> >
> > With this I can successfully boot from softraid crypto on bluhm@'s
> > Pinebook.
> >
> > Feedback?
>
> Apart from the property names, this looks good to me.  So feel free to
> go ahead once you changed those properties.
>
> > diff --git sys/arch/arm64/arm64/machdep.c sys/arch/arm64/arm64/machdep.c
> > index 45f6451066a..03bc8464f3a 100644
> > --- sys/arch/arm64/arm64/machdep.c
> > +++ sys/arch/arm64/arm64/machdep.c
> > @@ -30,6 +30,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include 
> >  #include 
> > @@ -49,6 +50,11 @@
> >
> >  #include 
> >
> > +#include "softraid.h"
> > +#if NSOFTRAID > 0
> > +#include 
> > +#endif
> > +
> >  char *boot_args = NULL;
> >  char *boot_file = "";
> >
> > @@ -826,6 +832,22 @@ initarm(struct arm64_bootparams *abp)
> >   

Re: iwm(4) WPA2 crypto hardware offload

2019-08-02 Thread Ville Valkonen
On Fri, 2 Aug 2019 at 14:18, Stefan Sperling  wrote:
>
> This diff enables HW offload for WPA2 CCMP (AES) encrypted unicast
> frames in iwm(4). This is in preparation for Tx aggregation support.
>
> WEP and WPA1/TKIP ciphers are still handled in software, which mirrors
> what the older iwn(4) driver is doing. We don't enable 11n at all with
> those ciphers (see ieee80211_ht_negotiate()), and we won't aggregate
> non-encrypted frames (see ieee80211_can_use_ampdu()).
>
> Based on an initial diff by procter@ and some code from iwn(4).
>
> Tested on 7260, 7265, 8260, and 8265 in bsd.rd upgrade and with pkg_add -u.
>
> ok?
>
> diff refs/heads/master refs/heads/iwm-hwcrypt
> blob - 7add1e9e682ef5e22169ec1e89a182cda1af7e2a
> blob + 839c0a0f8b3a62115ba6d5e15adfa63158475c86
> --- sys/dev/pci/if_iwm.c
> +++ sys/dev/pci/if_iwm.c
> @@ -367,6 +367,8 @@ int iwm_get_signal_strength(struct iwm_softc *, struct
>  void   iwm_rx_rx_phy_cmd(struct iwm_softc *, struct iwm_rx_packet *,
> struct iwm_rx_data *);
>  intiwm_get_noise(const struct iwm_statistics_rx_non_phy *);
> +intiwm_ccmp_decap(struct iwm_softc *, struct mbuf *,
> +   struct ieee80211_node *);
>  void   iwm_rx_rx_mpdu(struct iwm_softc *, struct iwm_rx_packet *,
> struct iwm_rx_data *);
>  void   iwm_enable_ht_cck_fallback(struct iwm_softc *, struct iwm_node *);
> @@ -448,6 +450,10 @@ intiwm_disassoc(struct iwm_softc *);
>  intiwm_run(struct iwm_softc *);
>  intiwm_run_stop(struct iwm_softc *);
>  struct ieee80211_node *iwm_node_alloc(struct ieee80211com *);
> +intiwm_set_key(struct ieee80211com *, struct ieee80211_node *,
> +   struct ieee80211_key *);
> +void   iwm_delete_key(struct ieee80211com *,
> +   struct ieee80211_node *, struct ieee80211_key *);
>  void   iwm_calib_timeout(void *);
>  intiwm_media_change(struct ifnet *);
>  void   iwm_newstate_task(void *);
> @@ -3429,11 +3435,91 @@ iwm_get_noise(const struct iwm_statistics_rx_non_phy *
> return (nbant == 0) ? -127 : (total / nbant) - 107;
>  }
>
> +int
> +iwm_ccmp_decap(struct iwm_softc *sc, struct mbuf *m, struct ieee80211_node 
> *ni)
> +{
> +   struct ieee80211com *ic = >sc_ic;
> +   struct ieee80211_key *k = >ni_pairwise_key;
> +   struct ieee80211_frame *wh;
> +   struct ieee80211_rx_ba *ba;
> +   uint64_t pn, *prsc;
> +   uint8_t *ivp;
> +   uint8_t tid;
> +   int hdrlen, hasqos;
> +
> +   wh = mtod(m, struct ieee80211_frame *);
> +   hdrlen = ieee80211_get_hdrlen(wh);
> +   ivp = (uint8_t *)wh + hdrlen;
> +
> +   /* Check that ExtIV bit is be set. */
> +   if (!(ivp[3] & IEEE80211_WEP_EXTIV)) {
> +   DPRINTF(("CCMP decap ExtIV not set\n"));
> +   return 1;
> +   }
> +   hasqos = ieee80211_has_qos(wh);
> +   tid = hasqos ? ieee80211_get_qos(wh) & IEEE80211_QOS_TID : 0;
> +   ba = hasqos ? >ni_rx_ba[tid] : NULL;
> +   prsc = >k_rsc[tid];
> +
> +   /* Extract the 48-bit PN from the CCMP header. */
> +   pn = (uint64_t)ivp[0]   |
> +(uint64_t)ivp[1] <<  8 |
> +(uint64_t)ivp[4] << 16 |
> +(uint64_t)ivp[5] << 24 |
> +(uint64_t)ivp[6] << 32 |
> +(uint64_t)ivp[7] << 40;
> +   if (pn <= *prsc) {
> +   if (hasqos && ba->ba_state == IEEE80211_BA_AGREED) {
> +   /*
> +* This is an A-MPDU subframe.
> +* Such frames may be received out of order due to
> +* legitimate retransmissions of failed subframes
> +* in previous A-MPDUs. Duplicates will be handled
> +* in ieee80211_input() as part of A-MPDU reordering.
> +*/
> +   } else if (ieee80211_has_seq(wh)) {
> +   /*
> +* Not necessarily a replayed frame since we did not
> +* check the sequence number of the 802.11 header yet.
> +*/
> +   int nrxseq, orxseq;
> +
> +   nrxseq = letoh16(*(u_int16_t *)wh->i_seq) >>
> +   IEEE80211_SEQ_SEQ_SHIFT;
> +   if (hasqos)
> +   orxseq = ni->ni_qos_rxseqs[tid];
> +   else
> +   orxseq = ni->ni_rxseq;
> +   if (nrxseq < orxseq) {
> +   DPRINTF(("CCMP replayed (n=%d < o=%d)\n",
> +   nrxseq, orxseq));
> +   ic->ic_stats.is_ccmp_replays++;
> +   return 1;
> +   }
> +   } else {
> +   DPRINTF(("CCMP replayed\n"));
> +   ic->ic_stats.is_ccmp_replays++;
> +   return 1;
> +   }
> +   }
> +   /* Update 

Re: iwm(4) WPA2 crypto hardware offload

2019-08-02 Thread Brian Callahan




On 8/2/19 7:13 AM, Stefan Sperling wrote:

This diff enables HW offload for WPA2 CCMP (AES) encrypted unicast
frames in iwm(4). This is in preparation for Tx aggregation support.

WEP and WPA1/TKIP ciphers are still handled in software, which mirrors
what the older iwn(4) driver is doing. We don't enable 11n at all with
those ciphers (see ieee80211_ht_negotiate()), and we won't aggregate
non-encrypted frames (see ieee80211_can_use_ampdu()).

Based on an initial diff by procter@ and some code from iwn(4).

Tested on 7260, 7265, 8260, and 8265 in bsd.rd upgrade and with pkg_add -u.


Works on my 3165 with pkg_add -u.

~Brian


ok?

diff refs/heads/master refs/heads/iwm-hwcrypt
blob - 7add1e9e682ef5e22169ec1e89a182cda1af7e2a
blob + 839c0a0f8b3a62115ba6d5e15adfa63158475c86
--- sys/dev/pci/if_iwm.c
+++ sys/dev/pci/if_iwm.c
@@ -367,6 +367,8 @@ int iwm_get_signal_strength(struct iwm_softc *, struct
  void  iwm_rx_rx_phy_cmd(struct iwm_softc *, struct iwm_rx_packet *,
struct iwm_rx_data *);
  int   iwm_get_noise(const struct iwm_statistics_rx_non_phy *);
+intiwm_ccmp_decap(struct iwm_softc *, struct mbuf *,
+   struct ieee80211_node *);
  void  iwm_rx_rx_mpdu(struct iwm_softc *, struct iwm_rx_packet *,
struct iwm_rx_data *);
  void  iwm_enable_ht_cck_fallback(struct iwm_softc *, struct iwm_node *);
@@ -448,6 +450,10 @@ intiwm_disassoc(struct iwm_softc *);
  int   iwm_run(struct iwm_softc *);
  int   iwm_run_stop(struct iwm_softc *);
  struct ieee80211_node *iwm_node_alloc(struct ieee80211com *);
+intiwm_set_key(struct ieee80211com *, struct ieee80211_node *,
+   struct ieee80211_key *);
+void   iwm_delete_key(struct ieee80211com *,
+   struct ieee80211_node *, struct ieee80211_key *);
  void  iwm_calib_timeout(void *);
  int   iwm_media_change(struct ifnet *);
  void  iwm_newstate_task(void *);
@@ -3429,11 +3435,91 @@ iwm_get_noise(const struct iwm_statistics_rx_non_phy *
return (nbant == 0) ? -127 : (total / nbant) - 107;
  }
  
+int

+iwm_ccmp_decap(struct iwm_softc *sc, struct mbuf *m, struct ieee80211_node *ni)
+{
+   struct ieee80211com *ic = >sc_ic;
+   struct ieee80211_key *k = >ni_pairwise_key;
+   struct ieee80211_frame *wh;
+   struct ieee80211_rx_ba *ba;
+   uint64_t pn, *prsc;
+   uint8_t *ivp;
+   uint8_t tid;
+   int hdrlen, hasqos;
+
+   wh = mtod(m, struct ieee80211_frame *);
+   hdrlen = ieee80211_get_hdrlen(wh);
+   ivp = (uint8_t *)wh + hdrlen;
+
+   /* Check that ExtIV bit is be set. */
+   if (!(ivp[3] & IEEE80211_WEP_EXTIV)) {
+   DPRINTF(("CCMP decap ExtIV not set\n"));
+   return 1;
+   }
+   hasqos = ieee80211_has_qos(wh);
+   tid = hasqos ? ieee80211_get_qos(wh) & IEEE80211_QOS_TID : 0;
+   ba = hasqos ? >ni_rx_ba[tid] : NULL;
+   prsc = >k_rsc[tid];
+
+   /* Extract the 48-bit PN from the CCMP header. */
+   pn = (uint64_t)ivp[0]   |
+(uint64_t)ivp[1] <<  8 |
+(uint64_t)ivp[4] << 16 |
+(uint64_t)ivp[5] << 24 |
+(uint64_t)ivp[6] << 32 |
+(uint64_t)ivp[7] << 40;
+   if (pn <= *prsc) {
+   if (hasqos && ba->ba_state == IEEE80211_BA_AGREED) {
+   /*
+* This is an A-MPDU subframe.
+* Such frames may be received out of order due to
+* legitimate retransmissions of failed subframes
+* in previous A-MPDUs. Duplicates will be handled
+* in ieee80211_input() as part of A-MPDU reordering.
+*/
+   } else if (ieee80211_has_seq(wh)) {
+   /*
+* Not necessarily a replayed frame since we did not
+* check the sequence number of the 802.11 header yet.
+*/
+   int nrxseq, orxseq;
+
+   nrxseq = letoh16(*(u_int16_t *)wh->i_seq) >>
+   IEEE80211_SEQ_SEQ_SHIFT;
+   if (hasqos)
+   orxseq = ni->ni_qos_rxseqs[tid];
+   else
+   orxseq = ni->ni_rxseq;
+   if (nrxseq < orxseq) {
+   DPRINTF(("CCMP replayed (n=%d < o=%d)\n",
+   nrxseq, orxseq));
+   ic->ic_stats.is_ccmp_replays++;
+   return 1;
+   }
+   } else {
+   DPRINTF(("CCMP replayed\n"));
+   ic->ic_stats.is_ccmp_replays++;
+   return 1;
+   }
+   }
+   /* Update last seen packet number. */
+   *prsc = pn;
+
+   /* Clear Protected bit and strip IV. */
+   wh->i_fc[1] &= ~IEEE80211_FC1_PROTECTED;
+   memmove(mtod(m, 

Re: TSC synchronization on MP machines

2019-08-02 Thread Bryan Steele
On Fri, Aug 02, 2019 at 01:29:37PM +0300, Paul Irofti wrote:
> Hi,
> 
> Here is an updated diff with a few bugs eliminated from the previous and
> with most of the concerns I got in private and from Mark fixed.
> 
> I will do the TSC_ADJUST_MSR dance in another iteration if the current
> incarnation turns out to be correct for machines suffering from TSCs not
> in sync.
> 
> The thing I am mostly worried about now is in the following sum
> 
>  uint
>  tsc_get_timecount(struct timecounter *tc)
>  {
>   return rdtsc() + curcpu()->cpu_cc_skew;
>  }
>  
> can one term be executed on one CPU and the other on another? Is there a
> way to protect this from happening other than locking?
> 
> I see NetBSD is checking for a change in the number of context switches 
> of the current process.
> 
> My plan is to have a fix in the tree before 6.6 is released, so I would
> love to hear your thoughts and reports on this.
> 
> Thanks,
> Paul
> 
> 
> Index: arch/amd64/amd64/cpu.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/cpu.c,v
> retrieving revision 1.137
> diff -u -p -u -p -r1.137 cpu.c
> --- arch/amd64/amd64/cpu.c28 May 2019 18:17:01 -  1.137
> +++ arch/amd64/amd64/cpu.c2 Aug 2019 10:25:04 -
> @@ -754,6 +754,10 @@ cpu_init(struct cpu_info *ci)
>   cr4 = rcr4();
>   lcr4(cr4 & ~CR4_PGE);
>   lcr4(cr4);
> +
> + /* Synchronize TSC */
> + if (!CPU_IS_PRIMARY(ci))
> +   tsc_sync_ap(ci);
>  #endif
>  }
>  
> @@ -808,6 +812,7 @@ void
>  cpu_start_secondary(struct cpu_info *ci)
>  {
>   int i;
> + u_long s;
>  
>   ci->ci_flags |= CPUF_AP;
>  
> @@ -828,6 +833,17 @@ cpu_start_secondary(struct cpu_info *ci)
>   printf("dropping into debugger; continue from here to resume 
> boot\n");
>   db_enter();
>  #endif
> + } else {
> + /*
> +  * Synchronize time stamp counters. Invalidate cache and do
> +  * twice (in tsc_sync_bp) to minimize possible cache effects.
> +  * Disable interrupts to try and rule out any external
> +  * interference.
> +  */
> + s = intr_disable();
> + wbinvd();
> + tsc_sync_bp(ci);
> + intr_restore(s);
>   }
>  
>   if ((ci->ci_flags & CPUF_IDENTIFIED) == 0) {
> @@ -852,6 +868,8 @@ void
>  cpu_boot_secondary(struct cpu_info *ci)
>  {
>   int i;
> + int64_t drift;
> + u_long s;
>  
>   atomic_setbits_int(>ci_flags, CPUF_GO);
>  
> @@ -864,6 +882,17 @@ cpu_boot_secondary(struct cpu_info *ci)
>   printf("dropping into debugger; continue from here to resume 
> boot\n");
>   db_enter();
>  #endif
> + } else {
> + /* Synchronize TSC again, check for drift. */
> + drift = ci->cpu_cc_skew;
> + s = intr_disable();
> + wbinvd();
> + tsc_sync_bp(ci);
> + intr_restore(s);
> + drift -= ci->cpu_cc_skew;
> + printf("TSC skew=%lld drift=%lld\n",
> + (long long)ci->cpu_cc_skew, (long long)drift);
> + tsc_sync_drift(drift);
>   }
>  }
>  
> @@ -888,7 +917,13 @@ cpu_hatch(void *v)
>   panic("%s: already running!?", ci->ci_dev->dv_xname);
>  #endif
>  
> + /*
> +  * Synchronize the TSC for the first time. Note that interrupts are
> +  * off at this point.
> +  */
> + wbinvd();
>   ci->ci_flags |= CPUF_PRESENT;
> + tsc_sync_ap(ci);
>  
>   lapic_enable();
>   lapic_startclock();
> Index: arch/amd64/amd64/tsc.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.11
> diff -u -p -u -p -r1.11 tsc.c
> --- arch/amd64/amd64/tsc.c6 Jun 2019 19:43:35 -   1.11
> +++ arch/amd64/amd64/tsc.c2 Aug 2019 10:25:04 -
> @@ -1,8 +1,10 @@
>  /*   $OpenBSD: tsc.c,v 1.11 2019/06/06 19:43:35 kettenis Exp $   */
>  /*
> + * Copyright (c) 2008 The NetBSD Foundation, Inc.
>   * Copyright (c) 2016,2017 Reyk Floeter 
>   * Copyright (c) 2017 Adam Steen 
>   * Copyright (c) 2017 Mike Belopuhov 
> + * Copyright (c) 2019 Paul Irofti 
>   *
>   * Permission to use, copy, modify, and distribute this software for any
>   * purpose with or without fee is hereby granted, provided that the above
> @@ -20,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -33,6 +36,13 @@ inttsc_recalibrate;
>  uint64_t tsc_frequency;
>  int  tsc_is_invariant;
>  
> +int64_t  tsc_drift_max = 250;/* max cycles */
> +int64_t  tsc_drift_observed;
> +bool tsc_good;
> +
> +volatile int64_t tsc_sync_val;
> +volatile struct cpu_info *tsc_sync_cpu;
> +
>  uint tsc_get_timecount(struct timecounter *tc);
>  
>  struct timecounter tsc_timecounter = {
> @@ -172,10 +182,8 @@ 

socreate(9) M_WAIT

2019-08-02 Thread Alexander Bluhm
Hi,

I am trying to hunt some ENOBUFS bugs reported from the socket
layer.  This may also happen if pool(9) or malloc(9) fails, so this
an easy first step.  A system call should not fail due to temporary
memory shortage.  It is the kernel's job to handle that, usually
by sleeping.

So from socreate(9) I pass M_WAIT to the protocol attach function.
As sonewconn() may be called from softnet, I pass M_DONTWAIT.

The rest of the diff is just passing down the flag to pool(9) and
malloc(9).

ok?

bluhm

Index: kern/uipc_socket.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.234
diff -u -p -r1.234 uipc_socket.c
--- kern/uipc_socket.c  22 Jul 2019 15:34:07 -  1.234
+++ kern/uipc_socket.c  2 Aug 2019 14:12:56 -
@@ -143,7 +143,7 @@ socreate(int dom, struct socket **aso, i
so->so_proto = prp;

s = solock(so);
-   error = (*prp->pr_attach)(so, proto);
+   error = (*prp->pr_attach)(so, proto, M_WAIT);
if (error) {
so->so_state |= SS_NOFDREF;
/* sofree() calls sounlock(). */
Index: kern/uipc_socket2.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.101
diff -u -p -r1.101 uipc_socket2.c
--- kern/uipc_socket2.c 16 Apr 2019 13:15:32 -  1.101
+++ kern/uipc_socket2.c 2 Aug 2019 14:15:15 -
@@ -190,7 +190,7 @@ sonewconn(struct socket *head, int conns
sigio_copy(>so_sigio, >so_sigio);

soqinsque(head, so, soqueue);
-   if ((*so->so_proto->pr_attach)(so, 0)) {
+   if ((*so->so_proto->pr_attach)(so, 0, M_DONTWAIT)) {
(void) soqremque(so, soqueue);
sigio_free(>so_sigio);
pool_put(_pool, so);
Index: kern/uipc_usrreq.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.142
diff -u -p -r1.142 uipc_usrreq.c
--- kern/uipc_usrreq.c  16 Jul 2019 21:41:37 -  1.142
+++ kern/uipc_usrreq.c  2 Aug 2019 15:16:00 -
@@ -354,7 +354,7 @@ u_long  unpdg_recvspace = 4*1024;
 intunp_rights; /* file descriptors in flight */

 int
-uipc_attach(struct socket *so, int proto)
+uipc_attach(struct socket *so, int proto, int wait)
 {
struct unpcb *unp;
int error;
@@ -379,7 +379,8 @@ uipc_attach(struct socket *so, int proto
if (error)
return (error);
}
-   unp = pool_get(_pool, PR_NOWAIT|PR_ZERO);
+   unp = pool_get(_pool, (wait == M_WAIT ? PR_WAITOK : PR_NOWAIT) |
+   PR_ZERO);
if (unp == NULL)
return (ENOBUFS);
unp->unp_socket = so;
Index: net/pfkeyv2.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/pfkeyv2.c,v
retrieving revision 1.198
diff -u -p -r1.198 pfkeyv2.c
--- net/pfkeyv2.c   17 Jul 2019 18:52:46 -  1.198
+++ net/pfkeyv2.c   2 Aug 2019 14:18:15 -
@@ -177,7 +177,7 @@ static int npromisc = 0;

 void pfkey_init(void);

-int pfkeyv2_attach(struct socket *, int);
+int pfkeyv2_attach(struct socket *, int, int);
 int pfkeyv2_detach(struct socket *);
 int pfkeyv2_usrreq(struct socket *, int, struct mbuf *, struct mbuf *,
 struct mbuf *, struct proc *);
@@ -261,7 +261,7 @@ pfkey_init(void)
  * Attach a new PF_KEYv2 socket.
  */
 int
-pfkeyv2_attach(struct socket *so, int proto)
+pfkeyv2_attach(struct socket *so, int proto, int wait)
 {
struct pkpcb *kp;
int error;
@@ -269,6 +269,7 @@ pfkeyv2_attach(struct socket *so, int pr
if ((so->so_state & SS_PRIV) == 0)
return EACCES;

+   KASSERT(wait == M_WAIT);
kp = pool_get(_pool, PR_WAITOK|PR_ZERO);
so->so_pcb = kp;
refcnt_init(>kcb_refcnt);
Index: net/rtsock.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/net/rtsock.c,v
retrieving revision 1.289
diff -u -p -r1.289 rtsock.c
--- net/rtsock.c17 Jul 2019 19:57:32 -  1.289
+++ net/rtsock.c2 Aug 2019 14:19:11 -
@@ -288,7 +288,7 @@ route_usrreq(struct socket *so, int req,
 }

 int
-route_attach(struct socket *so, int proto)
+route_attach(struct socket *so, int proto, int wait)
 {
struct rtpcb*rop;
int  error;
@@ -298,6 +298,7 @@ route_attach(struct socket *so, int prot
 * code does not care about the additional fields
 * and works directly on the raw socket.
 */
+   KASSERT(wait == M_WAIT);
rop = pool_get(_pool, PR_WAITOK|PR_ZERO);
so->so_pcb = rop;
/* Init the timeout structure */
Index: netinet/in_pcb.c
===
RCS file: 

Re: printf(1) man page has a small omission

2019-08-02 Thread Andras Farkas
On Thu, Aug 1, 2019 at 5:54 PM Ingo Schwarze  wrote:
> please do not cross-post on OpenBSD lists, choose whatever list fits
> best.  I trimmed bugs@ for this followup.
Ah, my bad.  I just found it unclear which list to send documentation
issues to.  I've normally sent them to bugs@ in the past, but then I
also sent a diff and I've been told diffs usually go to tech@
Any advice?
I'd totally subscribe to a docs@ if there was one. :D
> I don't quite agree with your patch.  In practice, both \0num
> and \num work; i inspected the code of FreeBSD and NetBSD which
> both appear to support \num, too, even though they don't document
> it, and i tested on Linux, and on Solaris 9, 10, and 11, and both
> forms work everywhere:
>
>$ printf '%b\n' '\0176x'
>   ~x
>$ printf '%b\n' '\176x'
>   ~x
True, that's right.
> So here is an alternative patch.
This patch looks great.
*thumbs-up*



unmount after realpath

2019-08-02 Thread Moritz Buhl
Hi,

the same as to unveil(2) also counts for realpath(3).
https://marc.info/?l=openbsd-bugs=156469573812165=2
Similar diff attached.

tests are here:
https://github.com/moritzbuhl/realpath-unmount-regress

thanks,
mbuhl

Index: sys/kern/vfs_syscalls.c
===
RCS file: /cvs/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.327
diff -u -p -r1.327 vfs_syscalls.c
--- sys/kern/vfs_syscalls.c 25 Jul 2019 01:43:21 -  1.327
+++ sys/kern/vfs_syscalls.c 2 Aug 2019 12:45:12 -
@@ -948,10 +948,10 @@ sys___realpath(struct proc *p, void *v, 
VOP_UNLOCK(nd.ni_vp);
vrele(nd.ni_vp);
}
-   if (nd.ni_dvp && nd.ni_dvp != nd.ni_vp){
+   if (nd.ni_dvp && nd.ni_dvp != nd.ni_vp)
VOP_UNLOCK(nd.ni_dvp);
+   if (nd.ni_dvp)
vrele(nd.ni_dvp);
-   }
 
error = copyoutstr(nd.ni_cnd.cn_rpbuf, SCARG(uap, resolved),
MAXPATHLEN, NULL);



iwm(4) WPA2 crypto hardware offload

2019-08-02 Thread Stefan Sperling
This diff enables HW offload for WPA2 CCMP (AES) encrypted unicast
frames in iwm(4). This is in preparation for Tx aggregation support.

WEP and WPA1/TKIP ciphers are still handled in software, which mirrors
what the older iwn(4) driver is doing. We don't enable 11n at all with
those ciphers (see ieee80211_ht_negotiate()), and we won't aggregate
non-encrypted frames (see ieee80211_can_use_ampdu()).

Based on an initial diff by procter@ and some code from iwn(4).

Tested on 7260, 7265, 8260, and 8265 in bsd.rd upgrade and with pkg_add -u.

ok?

diff refs/heads/master refs/heads/iwm-hwcrypt
blob - 7add1e9e682ef5e22169ec1e89a182cda1af7e2a
blob + 839c0a0f8b3a62115ba6d5e15adfa63158475c86
--- sys/dev/pci/if_iwm.c
+++ sys/dev/pci/if_iwm.c
@@ -367,6 +367,8 @@ int iwm_get_signal_strength(struct iwm_softc *, struct
 void   iwm_rx_rx_phy_cmd(struct iwm_softc *, struct iwm_rx_packet *,
struct iwm_rx_data *);
 intiwm_get_noise(const struct iwm_statistics_rx_non_phy *);
+intiwm_ccmp_decap(struct iwm_softc *, struct mbuf *,
+   struct ieee80211_node *);
 void   iwm_rx_rx_mpdu(struct iwm_softc *, struct iwm_rx_packet *,
struct iwm_rx_data *);
 void   iwm_enable_ht_cck_fallback(struct iwm_softc *, struct iwm_node *);
@@ -448,6 +450,10 @@ intiwm_disassoc(struct iwm_softc *);
 intiwm_run(struct iwm_softc *);
 intiwm_run_stop(struct iwm_softc *);
 struct ieee80211_node *iwm_node_alloc(struct ieee80211com *);
+intiwm_set_key(struct ieee80211com *, struct ieee80211_node *,
+   struct ieee80211_key *);
+void   iwm_delete_key(struct ieee80211com *,
+   struct ieee80211_node *, struct ieee80211_key *);
 void   iwm_calib_timeout(void *);
 intiwm_media_change(struct ifnet *);
 void   iwm_newstate_task(void *);
@@ -3429,11 +3435,91 @@ iwm_get_noise(const struct iwm_statistics_rx_non_phy *
return (nbant == 0) ? -127 : (total / nbant) - 107;
 }
 
+int
+iwm_ccmp_decap(struct iwm_softc *sc, struct mbuf *m, struct ieee80211_node *ni)
+{
+   struct ieee80211com *ic = >sc_ic;
+   struct ieee80211_key *k = >ni_pairwise_key;
+   struct ieee80211_frame *wh;
+   struct ieee80211_rx_ba *ba;
+   uint64_t pn, *prsc;
+   uint8_t *ivp;
+   uint8_t tid;
+   int hdrlen, hasqos;
+
+   wh = mtod(m, struct ieee80211_frame *);
+   hdrlen = ieee80211_get_hdrlen(wh);
+   ivp = (uint8_t *)wh + hdrlen;
+
+   /* Check that ExtIV bit is be set. */
+   if (!(ivp[3] & IEEE80211_WEP_EXTIV)) {
+   DPRINTF(("CCMP decap ExtIV not set\n"));
+   return 1;
+   }
+   hasqos = ieee80211_has_qos(wh);
+   tid = hasqos ? ieee80211_get_qos(wh) & IEEE80211_QOS_TID : 0;
+   ba = hasqos ? >ni_rx_ba[tid] : NULL;
+   prsc = >k_rsc[tid];
+
+   /* Extract the 48-bit PN from the CCMP header. */
+   pn = (uint64_t)ivp[0]   |
+(uint64_t)ivp[1] <<  8 |
+(uint64_t)ivp[4] << 16 |
+(uint64_t)ivp[5] << 24 |
+(uint64_t)ivp[6] << 32 |
+(uint64_t)ivp[7] << 40;
+   if (pn <= *prsc) {
+   if (hasqos && ba->ba_state == IEEE80211_BA_AGREED) {
+   /*
+* This is an A-MPDU subframe.
+* Such frames may be received out of order due to
+* legitimate retransmissions of failed subframes
+* in previous A-MPDUs. Duplicates will be handled
+* in ieee80211_input() as part of A-MPDU reordering.
+*/
+   } else if (ieee80211_has_seq(wh)) {
+   /*
+* Not necessarily a replayed frame since we did not
+* check the sequence number of the 802.11 header yet.
+*/
+   int nrxseq, orxseq;
+
+   nrxseq = letoh16(*(u_int16_t *)wh->i_seq) >>
+   IEEE80211_SEQ_SEQ_SHIFT;
+   if (hasqos)
+   orxseq = ni->ni_qos_rxseqs[tid];
+   else
+   orxseq = ni->ni_rxseq;
+   if (nrxseq < orxseq) {
+   DPRINTF(("CCMP replayed (n=%d < o=%d)\n",
+   nrxseq, orxseq));
+   ic->ic_stats.is_ccmp_replays++;
+   return 1;
+   }
+   } else {
+   DPRINTF(("CCMP replayed\n"));
+   ic->ic_stats.is_ccmp_replays++;
+   return 1;
+   }
+   }
+   /* Update last seen packet number. */
+   *prsc = pn;
+
+   /* Clear Protected bit and strip IV. */
+   wh->i_fc[1] &= ~IEEE80211_FC1_PROTECTED;
+   memmove(mtod(m, caddr_t) + IEEE80211_CCMP_HDRLEN, wh, hdrlen);
+   m_adj(m, IEEE80211_CCMP_HDRLEN);
+   /* 

Re: TSC synchronization on MP machines

2019-08-02 Thread Mark Kettenis
> Date: Fri, 2 Aug 2019 13:29:37 +0300
> From: Paul Irofti 
> 
> On Mon, Jul 01, 2019 at 10:32:51AM +0200, Mark Kettenis wrote:
> > > Date: Thu, 27 Jun 2019 15:08:00 +0300
> > > From: Paul Irofti 
> > > 
> > > Hi,
> > > 
> > > Here is an initial diff, adapted from NetBSD, that synchronizes TSC
> > > clocks across cores.
> > > 
> > > CPU0 is the reference clock and all others are skewed. During CPU
> > > initialization the clocks synchronize by keeping a registry of each CPU
> > > clock skewness and adapting the TSC read routine accordingly.
> > > 
> > > I choose this implementation over what FreeBSD is doing (which is just
> > > copying Linux really), because it is clean and elegant.
> > > 
> > > I would love to hear reports from machines that were broken by this.
> > > Mine, which never exhibited the problem in the first place, run just
> > > fine with the following diff. In fact I am writting this message on one
> > > such machine.
> > > 
> > > Also constructive comments are more than welcomed!
> > > 
> > > Notes:
> > > 
> > > - cpu_counter_serializing() could probably have a better name
> > >   (tsc _read for example)
> > > - the PAUSE instruction is probably not needed
> > > - acpi(4) suspend and resume bits are left out on purpose, but should
> > >   be trivial to add once the current diff settles
> > > 
> > > Paul Irofti
> > 
> > I don't think we want to introduce a  header file.
> > 
> > The code suffers from some NetBSD-isms, so that'll need to be fixed.
> > I pointed some of them out below.
> > 
> > Also, how accurate is your skew detection?  What skew is detected on a
> > machine that (supposedly) has the TSCs in sync?  The result will be
> > that you actually slightly desync the counters on different CPUs.
> > 
> > I think Linux uses the TSC_ADJUST MSR and compares its value across
> > cores.  If the skew is small and the TSC_ADJUST values are the same
> > across cores it skips the TSC adjustments.
> 
> Hi,
> 
> Here is an updated diff with a few bugs eliminated from the previous and
> with most of the concerns I got in private and from Mark fixed.
> 
> I will do the TSC_ADJUST_MSR dance in another iteration if the current
> incarnation turns out to be correct for machines suffering from TSCs not
> in sync.
> 
> The thing I am mostly worried about now is in the following sum
> 
>  uint
>  tsc_get_timecount(struct timecounter *tc)
>  {
>   return rdtsc() + curcpu()->cpu_cc_skew;
>  }
>  
> can one term be executed on one CPU and the other on another? Is there a
> way to protect this from happening other than locking?

Our kernel is non-preemptable so a context switch will only happen if
you sleep.  So that isn't an issue.

> I see NetBSD is checking for a change in the number of context switches 
> of the current process.



Re: TSC synchronization on MP machines

2019-08-02 Thread Paul Irofti
On Mon, Jul 01, 2019 at 10:32:51AM +0200, Mark Kettenis wrote:
> > Date: Thu, 27 Jun 2019 15:08:00 +0300
> > From: Paul Irofti 
> > 
> > Hi,
> > 
> > Here is an initial diff, adapted from NetBSD, that synchronizes TSC
> > clocks across cores.
> > 
> > CPU0 is the reference clock and all others are skewed. During CPU
> > initialization the clocks synchronize by keeping a registry of each CPU
> > clock skewness and adapting the TSC read routine accordingly.
> > 
> > I choose this implementation over what FreeBSD is doing (which is just
> > copying Linux really), because it is clean and elegant.
> > 
> > I would love to hear reports from machines that were broken by this.
> > Mine, which never exhibited the problem in the first place, run just
> > fine with the following diff. In fact I am writting this message on one
> > such machine.
> > 
> > Also constructive comments are more than welcomed!
> > 
> > Notes:
> > 
> > - cpu_counter_serializing() could probably have a better name
> >   (tsc _read for example)
> > - the PAUSE instruction is probably not needed
> > - acpi(4) suspend and resume bits are left out on purpose, but should
> >   be trivial to add once the current diff settles
> > 
> > Paul Irofti
> 
> I don't think we want to introduce a  header file.
> 
> The code suffers from some NetBSD-isms, so that'll need to be fixed.
> I pointed some of them out below.
> 
> Also, how accurate is your skew detection?  What skew is detected on a
> machine that (supposedly) has the TSCs in sync?  The result will be
> that you actually slightly desync the counters on different CPUs.
> 
> I think Linux uses the TSC_ADJUST MSR and compares its value across
> cores.  If the skew is small and the TSC_ADJUST values are the same
> across cores it skips the TSC adjustments.

Hi,

Here is an updated diff with a few bugs eliminated from the previous and
with most of the concerns I got in private and from Mark fixed.

I will do the TSC_ADJUST_MSR dance in another iteration if the current
incarnation turns out to be correct for machines suffering from TSCs not
in sync.

The thing I am mostly worried about now is in the following sum

 uint
 tsc_get_timecount(struct timecounter *tc)
 {
return rdtsc() + curcpu()->cpu_cc_skew;
 }
 
can one term be executed on one CPU and the other on another? Is there a
way to protect this from happening other than locking?

I see NetBSD is checking for a change in the number of context switches 
of the current process.

My plan is to have a fix in the tree before 6.6 is released, so I would
love to hear your thoughts and reports on this.

Thanks,
Paul


Index: arch/amd64/amd64/cpu.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/cpu.c,v
retrieving revision 1.137
diff -u -p -u -p -r1.137 cpu.c
--- arch/amd64/amd64/cpu.c  28 May 2019 18:17:01 -  1.137
+++ arch/amd64/amd64/cpu.c  2 Aug 2019 10:25:04 -
@@ -754,6 +754,10 @@ cpu_init(struct cpu_info *ci)
cr4 = rcr4();
lcr4(cr4 & ~CR4_PGE);
lcr4(cr4);
+
+   /* Synchronize TSC */
+   if (!CPU_IS_PRIMARY(ci))
+ tsc_sync_ap(ci);
 #endif
 }
 
@@ -808,6 +812,7 @@ void
 cpu_start_secondary(struct cpu_info *ci)
 {
int i;
+   u_long s;
 
ci->ci_flags |= CPUF_AP;
 
@@ -828,6 +833,17 @@ cpu_start_secondary(struct cpu_info *ci)
printf("dropping into debugger; continue from here to resume 
boot\n");
db_enter();
 #endif
+   } else {
+   /*
+* Synchronize time stamp counters. Invalidate cache and do
+* twice (in tsc_sync_bp) to minimize possible cache effects.
+* Disable interrupts to try and rule out any external
+* interference.
+*/
+   s = intr_disable();
+   wbinvd();
+   tsc_sync_bp(ci);
+   intr_restore(s);
}
 
if ((ci->ci_flags & CPUF_IDENTIFIED) == 0) {
@@ -852,6 +868,8 @@ void
 cpu_boot_secondary(struct cpu_info *ci)
 {
int i;
+   int64_t drift;
+   u_long s;
 
atomic_setbits_int(>ci_flags, CPUF_GO);
 
@@ -864,6 +882,17 @@ cpu_boot_secondary(struct cpu_info *ci)
printf("dropping into debugger; continue from here to resume 
boot\n");
db_enter();
 #endif
+   } else {
+   /* Synchronize TSC again, check for drift. */
+   drift = ci->cpu_cc_skew;
+   s = intr_disable();
+   wbinvd();
+   tsc_sync_bp(ci);
+   intr_restore(s);
+   drift -= ci->cpu_cc_skew;
+   printf("TSC skew=%lld drift=%lld\n",
+   (long long)ci->cpu_cc_skew, (long long)drift);
+   tsc_sync_drift(drift);
}
 }
 
@@ -888,7 +917,13 @@ cpu_hatch(void *v)
panic("%s: already running!?", ci->ci_dev->dv_xname);
 #endif
 
+   /*
+* Synchronize