Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Timo Myyrä
Scott Cheloha  [2022-09-23, 14:51 -0500]:

> On Fri, Sep 23, 2022 at 10:40:19PM +0300, Timo Myyrä wrote:
>
>> Scott Cheloha  [2022-09-23, 09:16 -0500]:
>> 
>> > [...]
>> >
>> > Test results?  Clues on reading the configuration space?
>> >
>> > [...]
>> 
>> Hi,
>> 
>> Here's a dmesg from thinkpad e485:
>
> Thanks for testing.
>
>> Do these timers affect the booting of the kernel? Once I select the kernel
>> to boot by pressing enter on "bsd>" line, the boot process takes about
>> 18s to proceed from the "booting sr0a:/bsd".
>
> The patch reads a couple MSRs and prints ~10 additional lines during
> boot from the primary CPU.  The computed TSC frequency is not used by
> the kernel, only printed so I can check whether my code is correct.
>
> It should have zero impact on the length of the boot.  It should not
> change any runtime behavior whatsoever.
>
> Your boot probably should not be taking that long, but I can't imagine
> how my patch would cause such a dramatic change.
>
> If you reverse the patch, what happens?
>

I haven't been keeping track of boot times, but I doubt this is a new issue
with this laptop. The current cold boot seemed to pass the "counters" part
in about 5s, so it varies a bit. I'll see if I can find the time to dig
through the code to see what the boot process is actually doing at that
point.

>> OpenBSD 7.2 (GENERIC.MP) #20: Fri Sep 23 22:27:31 EEST 2022
>> t...@asteroid.bittivirhe.fi:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>> [...]
>> cpu0 at mainbus0: apid 0 (boot processor)
>> cpu0: MSR C001_0064: en 1 base 200000000 mul 100 div 10 freq 2000000000 Hz
>> cpu0: MSR C001_0065: en 1 base 200000000 mul 102 div 12 freq 1700000000 Hz
>> cpu0: MSR C001_0066: en 1 base 200000000 mul 96 div 12 freq 1600000000 Hz
>> cpu0: MSR C001_0067: en 0
>> cpu0: MSR C001_0068: en 0
>> cpu0: MSR C001_0069: en 0
>> cpu0: MSR C001_006A: en 0
>> cpu0: MSR C001_006B: en 0
>> cpu0: AMD Ryzen 5 2500U with Radeon Vega Mobile Gfx, 1996.30 MHz, 17-11-00
>> cpu0:
>> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA,IBPB,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
>> cpu0: 32KB 64b/line 8-way D-cache, 64KB 64b/line 4-way I-cache, 512KB 
>> 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
>> tsc: calibrating with acpihpet0: 1996264149 Hz
>
> Your family 17h CPU has a computed P0 frequency of 2000MHz.  The
> calibrated TSC frequency is 1996264149 Hz.
>
> That seems right to me, thank you for testing.
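
To spell out the arithmetic behind that comparison, here is a minimal
sketch. It assumes the debug lines report a base clock, a multiplier and
a divisor, and that the P-state frequency is base * mul / div. The
200 MHz base is my reading of the output (it is the value that makes
mul 100 / div 10 come out to the 2000MHz quoted above), and the function
names are made up for illustration.

```c
#include <stdint.h>

/* P-state frequency in Hz from the three values printed per MSR. */
uint64_t
pstate_freq_hz(uint64_t base_hz, uint64_t mul, uint64_t div)
{
	if (div == 0)
		return 0;	/* not a usable P-state definition */
	return base_hz * mul / div;
}

/* Absolute difference between two frequencies, in ppm of the reference. */
uint64_t
freq_diff_ppm(uint64_t ref_hz, uint64_t other_hz)
{
	uint64_t diff = ref_hz > other_hz ? ref_hz - other_hz : other_hz - ref_hz;

	if (ref_hz == 0)
		return 0;
	return diff * 1000000 / ref_hz;
}
```

With these inputs the computed P0 frequency is 2000000000 Hz, and the
calibrated 1996264149 Hz sits a little under 0.2% below it.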



Re: [please test] tsc: derive frequency on AMD CPUs from MSRs

2022-09-23 Thread Timo Myyrä
Scott Cheloha  [2022-09-23, 09:16 -0500]:

> Hi,
>
> TL;DR:
>
> I want to compute the TSC frequency on AMD CPUs using the methods laid
> out in the AMD manuals instead of calibrating the TSC by hand.
>
> If you have an AMD CPU with an invariant TSC, please apply this patch,
> recompile/boot the resulting kernel, and send me the resulting dmesg.
>
> Family 10h-16h CPUs are especially interesting.  If you've got one,
> don't be shy!
>
> Long explanation:
>
> On AMD CPUs we calibrate the TSC with a separate timer.  This is slow
> and introduces error.  I also worry about a future where legacy timers
> are absent or heavily gated (read: useless).
>
> This patch adds most of the code needed to compute the TSC frequency
> on AMD family 10h+ CPUs.  CPUs prior to family 10h did not support an
> invariant TSC so they are irrelevant.
>
> I have riddled the code with printf(9) calls so I can work out what's
> wrong by hand if a test result makes no sense.
>
> The only missing piece is code to read the configuration space on
> family 10h-16h CPUs to determine how many boosted P-states we need to
> skip to get to the MSR describing the software P0 state.  I would
> really appreciate it if someone could explain how to do this at this
> very early point in boot.  jsg@ pointed me to pci_conf_read(9), but
> I'm a little confused about how I get the needed pci* inputs at this
> point in boot.
>
> --
>
> Test results?  Clues on reading the configuration space?
>
> -Scott
>
> Index: tsc.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.29
> diff -u -p -r1.29 tsc.c
> --- tsc.c 22 Sep 2022 04:57:08 -0000 1.29
> +++ tsc.c 23 Sep 2022 14:04:22 -0000
> @@ -100,6 +100,253 @@ tsc_freq_cpuid(struct cpu_info *ci)
>   return (0);
>  }
>  
> +uint64_t
> +tsc_freq_msr(struct cpu_info *ci)
> +{
> + uint64_t base, def, did, did_lsd, did_msd, divisor, fid, multiplier;
> + uint32_t msr, off = 0;
> +
> + if (strcmp(cpu_vendor, "AuthenticAMD") != 0)
> + return 0;
> +
> + /*
> +  * All family 10h+ CPUs have MSR_HWCR and the TscFreqSel bit.
> +  * If TscFreqSel is not set the TSC does not advance at the P0
> +  * frequency, in which case something is wrong and we need to
> +  * calibrate by hand.
> +  */
> +#define HWCR_TSCFREQSEL (1 << 24)
> + if (!ISSET(rdmsr(MSR_HWCR), HWCR_TSCFREQSEL))   /* XXX specialreg.h */
> + return 0;
> +#undef HWCR_TSCFREQSEL
> +
> + /*
> +  * For families 10h, 12h, 14h, 15h, and 16h, we need to skip past
> +  * the boosted P-states (Pb0, Pb1, etc.) to find the MSR describing
> +  * P0, i.e. the highest performance unboosted P-state.  The number
> +  * of boosted states is kept in the "Core Performance Boost Control"
> +  * configuration space register.
> +  */
> +#ifdef __not_yet__
> + uint32_t reg;
> + switch (ci->ci_family) {
> + case 0x10:
> + /* XXX How do I read config space at this point in boot? */
> + reg = read_config_space(F4x15C);
> + off = (reg >> 2) & 0x1;
> + break;
> + case 0x12:
> + case 0x14:
> + case 0x15:
> + case 0x16:
> + /* XXX How do I read config space at this point in boot? */
> + reg = read_config_space(D18F4x15C);
> + off = (reg >> 2) & 0x7;
> + break;
> + default:
> + break;
> + }
> +#endif
> +
> +/* DEBUG Let's look at all the MSRs to check my math. */
> +for (; off < 8; off++) {
> +
> + /*
> +  * In family 10h+, core P-state voltage/frequency definitions
> +  * are kept in MSRs C001_006[4:B] (eight registers in total).
> +  * All MSRs in the range are readable, but if the EN bit isn't
> +  * set the register doesn't define a valid P-state.
> +  */
> + msr = 0xc0010064 + off; /* XXX specialreg.h */
> + def = rdmsr(msr);
> + printf("%s: MSR %04X_%04X: en %d",
> + ci->ci_dev->dv_xname, msr >> 16, msr & 0xffff,
> + !!ISSET(def, 1ULL << 63));
> + if (!ISSET(def, 1ULL << 63)) {  /* XXX specialreg.h */
> + printf("\n");
> + continue;
> + }
> + switch (ci->ci_family) {
> + case 0x10:
> + /* AMD Family 10h Processor BKDG, Rev 3.62, p. 429 */
> + base = 100000000;   /* 100.0 MHz */
> + did = (def >> 6) & 0x7;
> + divisor = 1ULL << did;
> + fid = def & 0x1f;
> + multiplier = fid + 0x10;
> + printf(" base %llu did %llu div %llu fid %llu mul %llu",
> + base, did, divisor, fid, multiplier);
> + break;
> + case 0x11:
> + /* AMD Family 11h Processor BKDG, Rev 3.62, p. 236 */
> + base = 100000000;   /* 100.0 MHz */
> + did = (def >> 6) & 0x7;
> + divisor = 1ULL << did;
> + fid = def 
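
For reference, the family 10h case from the patch above can be exercised
in userland. This is only a sketch: the field positions and masks are
copied from the quoted diff as it appears here (I have not checked them
against the BKDG), the 100 MHz base comes from the comment in the patch,
and pstate10h_freq_hz() is a made-up name. It takes the raw 64-bit MSR
value as an argument instead of calling rdmsr().

```c
#include <stdint.h>

#define PSTATE_EN	(1ULL << 63)	/* register defines a valid P-state */

/*
 * Decode a family 10h P-state definition (MSR C001_0064 + n).
 * Returns the core frequency in Hz, or 0 if the EN bit is clear.
 */
uint64_t
pstate10h_freq_hz(uint64_t def)
{
	const uint64_t base = 100000000;	/* 100.0 MHz */
	uint64_t did, divisor, fid, multiplier;

	if ((def & PSTATE_EN) == 0)
		return 0;

	did = (def >> 6) & 0x7;		/* divisor ID, as in the patch */
	divisor = 1ULL << did;
	fid = def & 0x1f;		/* frequency ID, mask as in the patch */
	multiplier = fid + 0x10;

	return base * multiplier / divisor;
}
```

A register with EN set, fid 0x0c and did 0 decodes to 2800 MHz; the same
value with EN clear decodes to 0, which matches the "en 0" lines being
skipped in the debug loop.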

Re: [v5] amd64: simplify TSC sync testing

2022-07-31 Thread Timo Myyrä
Scott Cheloha  [2022-07-30, 22:13 -0500]:

> Hi,
>
> At the urging of sthen@ and dv@, here is v5.
>
> Two major changes from v4:
>
> - Add the function tc_reset_quality() to kern_tc.c and use it
>   to lower the quality of the TSC timecounter if we fail the
>   sync test.
>
>   tc_reset_quality() will choose a new active timecounter if,
>   after the quality change, the given timecounter is no longer
>   the best timecounter.
>
>   The upshot is: if you fail the TSC sync test you should boot
>   with the HPET as your active timecounter.  If you don't have
>   an HPET you'll be using something else.
>
> - Drop the SMT accommodation from the hot loop.  It hasn't been
>   necessary since last year when I rewrote the test to run without
>   a mutex.  In the rewritten test, the two CPUs in the hot loop
>   are not competing for any resources so they should not be able
>   to starve one another.
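
To make the first point concrete, here is a rough userland model of what
I understand tc_reset_quality() to do. It is not the code from the
patch: the struct, the array standing in for the timecounter list and
the selection rule (highest quality wins, higher frequency breaks ties,
negative quality is never picked automatically) are my assumptions, and
the numbers used with it are only illustrative.

```c
#include <stddef.h>
#include <stdint.h>

struct tc {				/* stand-in for struct timecounter */
	const char	*name;
	int		 quality;	/* negative: never chosen automatically */
	uint64_t	 frequency;
};

/* Return the best candidate: highest quality, ties broken by frequency. */
struct tc *
tc_pick_best(struct tc *list, size_t n)
{
	struct tc *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (list[i].quality < 0)
			continue;
		if (best == NULL || list[i].quality > best->quality ||
		    (list[i].quality == best->quality &&
		    list[i].frequency > best->frequency))
			best = &list[i];
	}
	return best;
}

/* Change one counter's quality, then re-run the selection. */
struct tc *
tc_model_reset_quality(struct tc *list, size_t n, struct tc *tc, int quality)
{
	tc->quality = quality;
	return tc_pick_best(list, n);
}
```

In this model, lowering the TSC to a negative quality on a machine that
also has an HPET and an ACPI timer at equal quality hands the active
role to the HPET, not to the i8254.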
>
> dv: Could you double-check that this still chooses the right
> timecounter on your machine?  If so, I will ask deraadt@ to
> put this into snaps to replace v4.
>
> Additional test reports are welcome.  Include your dmesg.
>
> --
>
> I do not see much more I can do to improve this patch.
>
> I am seeking patch review and OKs.
>
> I am especially interested in whether my assumptions in tsc_ap_test()
> and tsc_bp_test() are correct.  The whole patch depends on those
> assumptions.  Is this a valid way to test for TSC desync?  Or am I
> missing membar_producer()/membar_consumer() calls?
>
> Here is the long version of "what" and "why" for this patch.
>
> The patch is attached at the end.
>
> - Computing a per-CPU TSC skew value is error-prone, especially
>   on multisocket machines and VMs.  My best guess is that larger
>   latencies appear to the skew measurement test as TSC desync,
>   and so the TSC is demoted to a kernel timecounter on these
>   machines or marked non-monotonic.
>
>   This patch eliminates per-CPU TSC skew values.  Instead of trying
>   to measure and correct for TSC desync we only try to detect desync,
>   which is less error-prone.  This approach should allow a wider
>   variety of machines to use the TSC as a timecounter when running
>   OpenBSD.
>
> - In the new sync test, both CPUs repeatedly try to detect whether
>   their TSC is trailing the other CPU's TSC.  The upside to this
>   approach is that it yields no false positives (if my assumptions
>   about AMD64 memory access and instruction serialization are correct).
>   The downside to this approach is that it takes more time than the
>   current skew measurement test.  Each test round takes 1ms, and
>   we run up to two rounds per CPU, so this patch slows boot down
>   by 2ms per AP.
>
> - If any CPU fails the sync test, the TSC is marked non-monotonic
>   and a different timecounter is activated.  The TC_USER flag
>   remains intact.  There is no "middle ground" where we fall back
>   to only using the TSC in the kernel.
>
> - Because there is no per-CPU skew value, there is also no concept
>   of TSC drift anymore.
>
> - Before running the test, we check for the IA32_TSC_ADJUST
>   register and reset it if necessary.  This is a trivial way
>   to work around firmware bugs that desync the TSC before we
>   reach the kernel.
>
>   Unfortunately, at the moment this register appears to only
>   be available on Intel processors and I cannot find an equivalent
>   but differently-named MSR for AMD processors.
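
The detection idea in the sync test bullet can be modelled without any
hardware. The sketch below is not tsc_ap_test()/tsc_bp_test() from the
patch. It replays a list of TSC readings in the order they became
globally visible and flags the first reading that is behind a value the
other CPU had already published. The struct and function names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct reading {
	int		cpu;	/* 0 or 1 */
	uint64_t	tsc;	/* value that CPU read and published */
};

/*
 * Returns the index of the first reading that trails the latest value
 * published by the other CPU, or -1 if no lag was observed.
 */
long
first_lagging_reading(const struct reading *r, size_t n)
{
	uint64_t latest[2] = { 0, 0 };
	int seen[2] = { 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		int me = r[i].cpu ? 1 : 0;
		int other = !me;

		/* A value older than what the peer already showed us is a lag. */
		if (seen[other] && r[i].tsc < latest[other])
			return (long)i;
		latest[me] = r[i].tsc;
		seen[me] = 1;
	}
	return -1;
}
```

A lag in this model can only mean the counters disagree, provided the
array order really is the order in which the values became visible. On
real hardware that ordering is exactly what the assumptions about memory
access and instruction serialization have to guarantee.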
>
> --
>
> Index: sys/arch/amd64/amd64/tsc.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.24
> diff -u -p -r1.24 tsc.c
> --- sys/arch/amd64/amd64/tsc.c 31 Aug 2021 15:11:54 -0000 1.24
> +++ sys/arch/amd64/amd64/tsc.c 31 Jul 2022 03:06:39 -0000
> @@ -36,13 +36,6 @@ int tsc_recalibrate;
>  uint64_t tsc_frequency;
>  int  tsc_is_invariant;
>  
> -#define  TSC_DRIFT_MAX   250
> -#define TSC_SKEW_MAX 100
> -int64_t  tsc_drift_observed;
> -
> -volatile int64_t tsc_sync_val;
> -volatile struct cpu_info *tsc_sync_cpu;
> -
>  u_int tsc_get_timecount(struct timecounter *tc);
>  void tsc_delay(int usecs);
>  
> @@ -236,22 +229,12 @@ cpu_recalibrate_tsc(struct timecounter *
>  u_int
>  tsc_get_timecount(struct timecounter *tc)
>  {
> - return rdtsc_lfence() + curcpu()->ci_tsc_skew;
> + return rdtsc_lfence();
>  }
>  
>  void
>  tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq)
>  {
> -#ifdef TSC_DEBUG
> - printf("%s: TSC skew=%lld observed drift=%lld\n", ci->ci_dev->dv_xname,
> - (long long)ci->ci_tsc_skew, (long long)tsc_drift_observed);
> -#endif
> - if (ci->ci_tsc_skew < -TSC_SKEW_MAX || ci->ci_tsc_skew > TSC_SKEW_MAX) {
> - printf("%s: disabling user TSC (skew=%lld)\n",
> - ci->ci_dev->dv_xname, (long long)ci->ci_tsc_skew);
> - tsc_timecounter.tc_user = 

Re: [v4] amd64: simplify TSC sync testing

2022-07-28 Thread Timo Myyrä
Scott Cheloha  [2022-07-28, 20:34 -0500]:

> On Thu, Jul 28, 2022 at 04:57:41PM -0400, Dave Voutila wrote:
>
>> 
>> Stuart Henderson  writes:
>> 
>> > On 2022/07/28 12:57, Scott Cheloha wrote:
>> >> On Thu, Jul 28, 2022 at 07:55:40AM -0400, Dave Voutila wrote:
>> >> >
>> >> > This is breaking timecounter selection on my x13 Ryzen 5 Pro laptop
>> >> > running the latest kernel from snaps.
>> >>
>> >> Define "breaking".
>> >
>> > That's clear from the output:
>> >
>> > : On 2022/07/28 07:55, Dave Voutila wrote:
>> > : > $ sysctl -a | grep tsc
>> > : > kern.timecounter.choice=i8254(0) tsc(-1000) acpihpet0(1000)
>> > : > acpitimer0(1000)
>> > : > machdep.tscfreq=2096064730
>> > : > machdep.invarianttsc=1
>> > : >
>> > : > $ sysctl kern.timecounter
>> > : > kern.timecounter.tick=1
>> > : > kern.timecounter.timestepwarnings=0
>> > : > kern.timecounter.hardware=i8254
>> > : > kern.timecounter.choice=i8254(0) tsc(-1000) acpihpet0(1000)
>> > : > acpitimer0(1000)
>> >
>> >> The code detects TSC desync and marks the timecounter non-monotonic.
>> >
>> > That's good (and I think as would have happened before)
>> >
>> >> So it uses the i8254 instead.
>> >
>> > But that's not so good, there are higher prio timecounters available,
>> > acpihpet0 and acpitimer0, which would be better choices than i8254.
>> 
>> Exactly my point. Thanks Stuart.
>
> Okay, please try this patch on the machine in question.
>
> It adds a tc_detach() function to kern_tc.c.  The first time we fail
> the sync test, the BP calls tc_detach(), changes the TSC's tc_quality
> to a negative value to tell everyone "this is not monotonic", then
> reinstalls the TSC timecounter again with tc_init().
>
> Because we are making this call *once*, from one place, I do not think
> the O(n) removal time matters, so I have not switched the tc_list from
> SLIST to TAILQ.
>
> It is possible for a thread to be asleep in sysctl_tc_hardware()
> during resume, but the thread would be done iterating through the list
> if it had reached rw_enter_write(), so removing/adding tsc_timecounter
> to the list during resume cannot break list traversal.
>
> Switching the active timecounter during resume is also fine.  The only
> race is with tc_adjfreq().  If a thread is asleep in adjfreq(2) when
> the system suspends, and we change the active timecounter during
> resume, the frequency change may be applied to the "wrong" timecounter.
>
> ... but this is always a race, because adjfreq(2) only operates on the
> active timecounter, and root can change it at any time via sysctl(2).
> So it's not a new problem.
>
> ...
>
> It might be simpler to just change tc_lock from a rwlock to a mutex.
> Then the MP analysis is much simpler across a suspend/resume.
>
> Index: sys/arch/amd64/amd64/tsc.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.24
> diff -u -p -r1.24 tsc.c
> --- sys/arch/amd64/amd64/tsc.c 31 Aug 2021 15:11:54 -0000 1.24
> +++ sys/arch/amd64/amd64/tsc.c 29 Jul 2022 01:06:17 -0000
> @@ -36,13 +36,6 @@ int tsc_recalibrate;
>  uint64_t tsc_frequency;
>  int  tsc_is_invariant;
>  
> -#define  TSC_DRIFT_MAX   250
> -#define TSC_SKEW_MAX 100
> -int64_t  tsc_drift_observed;
> -
> -volatile int64_t tsc_sync_val;
> -volatile struct cpu_info *tsc_sync_cpu;
> -
>  u_int tsc_get_timecount(struct timecounter *tc);
>  void tsc_delay(int usecs);
>  
> @@ -236,22 +229,12 @@ cpu_recalibrate_tsc(struct timecounter *
>  u_int
>  tsc_get_timecount(struct timecounter *tc)
>  {
> - return rdtsc_lfence() + curcpu()->ci_tsc_skew;
> + return rdtsc_lfence();
>  }
>  
>  void
>  tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq)
>  {
> -#ifdef TSC_DEBUG
> - printf("%s: TSC skew=%lld observed drift=%lld\n", ci->ci_dev->dv_xname,
> - (long long)ci->ci_tsc_skew, (long long)tsc_drift_observed);
> -#endif
> - if (ci->ci_tsc_skew < -TSC_SKEW_MAX || ci->ci_tsc_skew > TSC_SKEW_MAX) {
> - printf("%s: disabling user TSC (skew=%lld)\n",
> - ci->ci_dev->dv_xname, (long long)ci->ci_tsc_skew);
> - tsc_timecounter.tc_user = 0;
> - }
> -
>   if (!(ci->ci_flags & CPUF_PRIMARY) ||
>   !(ci->ci_flags & CPUF_CONST_TSC) ||
>   !(ci->ci_flags & CPUF_INVAR_TSC))
> @@ -268,111 +251,276 @@ tsc_timecounter_init(struct cpu_info *ci
>   calibrate_tsc_freq();
>   }
>  
> - if (tsc_drift_observed > TSC_DRIFT_MAX) {
> - printf("ERROR: %lld cycle TSC drift observed\n",
> - (long long)tsc_drift_observed);
> - tsc_timecounter.tc_quality = -1000;
> - tsc_timecounter.tc_user = 0;
> - tsc_is_invariant = 0;
> - }
> -
>   tc_init(&tsc_timecounter);
>  }
>  
> -/*
> - * Record drift (in clock cycles).  Called during AP startup.
> - */
>  void
> 

Re: Driver and kernel recognition for intel AX210 wifi chip

2021-07-26 Thread Timo Myyrä
Alex Beakes  [2021-07-26, 07:24 +]:

> I figured it would be more appropriate mailing tech than misc, I hope such 
> emails are allowed.
>
> Intel wifi card: AX210 in the T14s gen2 intel
>
> If anything, there is an email in misc with all the details of the problem. 
> The subject: 
>>Device not recognized: T14s gen 2 (intel); soldered intel wifi chip isn't 
>>recognized by openbsd.
>
>
> Can anyone help me with writing the needed software?
>
> Or at least if there are some books or websites I can read and understand, so 
> that I can understand and maybe start the process
>
> And how hard is it to write drivers and dev type programs for openbsd?
>
> How does one start to implement stuff like that?
>
> Just want to start using openbsd as soon as possible, start learning the 
> system and maybe if I get good start patching it.
>
> I have a crude understanding of C, C++. Better of rust. Don't know much about 
> deep kernel and OS stuff. Want to learn more.
>
> Thanks

Check out the following paper to get started:
https://www.openbsd.org/papers/eurobsdcon2017-device-drivers.pdf

Timo



Re: Pump my sched: fewer SCHED_LOCK() & kill p_priority

2019-06-27 Thread Timo Myyrä
Amit Kulkarni  writes:

>> root on sd2a (88532b67c09ce3ee.a) swap on sd2b dump on sd2b
>> TSC skew=-6129185140 drift=170
>> TSC skew=-6129184900 drift=-10
>> TSC skew=-6129184890 drift=-20
>> TSC skew=-6129184910 drift=30
>> TSC skew=-6129184910 drift=10
>> TSC skew=-6129184900 drift=20
>> TSC skew=-6129184910 drift=30
>> iwm0: hw rev 0x230, fw ver 22.361476.0, address 68:ec:c5:ad:9a:cb
>> initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x17AA:0x506F 0xC4).
>> amdgpu0: 1920x1080, 32bpp
>> wsdisplay0 at amdgpu0 mux 1: console (std, vt100 emulation), using wskbd0
>> wsdisplay0: screen 1-5 added (std, vt100 emulation)
>>
>
> It seems that you have Paul's TSC patch also applied. Please apply
> just one patch and test separately, and then report back!
>
> Thanks

Ah, I also tested without the TSC patch and it didn't make any difference.
The only other tweak is that the amdgpu driver is enabled in GENERIC.

timo



Re: Pump my sched: fewer SCHED_LOCK() & kill p_priority

2019-06-27 Thread Timo Myyrä
Martin Pieuchot  writes:

> On 06/06/19(Thu) 15:16, Martin Pieuchot wrote:
>> On 02/06/19(Sun) 16:41, Martin Pieuchot wrote:
>> > On 01/06/19(Sat) 18:55, Martin Pieuchot wrote:
>> > > Diff below exists mainly for documentation and test purposes.  If
>> > > you're not interested in how to break the scheduler internals in
>> > > pieces, don't read further and go straight to testing!
>> > > 
>> > > - First change is to stop calling tsleep(9) at PUSER.  That makes
>> > >   it clear that all "sleeping priorities" are smaller than PUSER.
>> > >   That's important to understand for the diff below.  `p_priority'
>> > >   is currently a placeholder for the "sleeping priority" and the
>> > >   "runqueue priority".  Both fields are separated by this diff.
>> > > 
>> > > - When a thread goes to sleep, the priority argument of tsleep(9) is
>> > >   now recorded in `p_slpprio'.  This argument can be considered as part
>> > >   of the sleep queue.  Its purpose is to place the thread into a higher
>> > >   runqueue when awoken.
>> > > 
>> > > - Currently, for stopped threads, `p_priority' corresponds to `p_usrpri'.
>> > >   So setrunnable() has been untangled to place SSTOP and SSLEEP threads
>> > >   in the preferred queue without having to use `p_priority'.  Note that
>> > >   `p_usrpri' is still recalculated *after* having called setrunqueue().
>> > >   This is currently fine because setrunnable() is called with
>> > >   SCHED_LOCK() held, but it will be racy once we split it.
>> > > 
>> > > - A new field, `p_runprio' has been introduced.  It should be considered
>> > >   as part of the per-CPU runqueues.  It indicates where a current thread
>> > >   is placed.
>> > > 
>> > > - `spc_curpriority' is now updated at every context-switch.  That means
>> > >   need_resched() won't be called after comparing an out-of-date value.
>> > >   At the same time, `p_usrpri' is initialized to the highest possible
>> > >   value for idle threads.
>> > > 
>> > > - resched_proc() was calling need_resched() in the following conditions:
>> > >    - If the SONPROC thread has a higher priority than the current
>> > >      running thread (itself).
>> > >    - Twice in setrunnable() when we know that p_priority <= p_usrpri.
>> > >    - If schedcpu() considered that a thread, after updating its prio,
>> > >      should preempt the one running on the CPU pointed by `p_cpu'.
>> > > 
>> > >   The diff below simplifies all of that by calling need_resched() when:
>> > >    - A thread is inserted in a CPU runqueue at a higher priority than
>> > >      the one SONPROC.
>> > >    - schedcpu() decides that a thread in SRUN state should preempt the
>> > >      one SONPROC.
>> > > 
>> > > - `p_estcpu', `p_usrpri' and `p_slptime', which represent the "priority"
>> > >   of a thread, are now updated while holding a per-thread mutex.  As a
>> > >   result schedclock() and donice() no longer take the SCHED_LOCK(),
>> > >   and schedcpu() almost never takes it.
>> > > 
>> > > - With this diff top(1) and ps(1) will report the "real" `p_usrpri' value
>> > >   when displaying priorities.  This is helpful to understand what's
>> > >   happening:
>> > > 
>> > > load averages:  0.99,  0.56,  0.25   two.lab.grenadille.net 23:42:10
>> > > 70 threads: 68 idle, 2 on processor   up  0:09
>> > > CPU0:  0.0% user,  0.0% nice, 51.0% sys,  2.0% spin,  0.0% intr, 47.1% idle
>> > > CPU1:  2.0% user,  0.0% nice, 51.0% sys,  3.9% spin,  0.0% intr, 43.1% idle
>> > > Memory: Real: 47M/1005M act/tot Free: 2937M Cache: 812M Swap: 0K/4323M
>> > > 
>> > >   PID      TID PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
>> > > 81000   145101  72    0    0K 1664K sleep/1   bored     1:15 36.96% softnet
>> > > 47133   244097  73    0 2984K 4408K sleep/1   netio     1:06 35.06% cvs
>> > > 64749   522184  66    0  176K  148K onproc/1  -         0:55 28.81% nfsd
>> > > 21615   602473 127    0    0K 1664K sleep/0   -         7:22  0.00% idle0
>> > > 12413   606242 127    0    0K 1664K sleep/1   -         7:08  0.00% idle1
>> > > 85778   338258  50    0 4936K 7308K idle      select    0:10  0.00% ssh
>> > > 22771   575513  50    0  176K  148K sleep/0   nfsd      0:02  0.00% nfsd
>> > > 
>> > > 
>> > > 
>> > > - The removal of `p_priority' and the change that makes mi_switch()
>> > >   always update `spc_curpriority' might introduce some changes in
>> > >   behavior, especially with kernel threads that were not going through
>> > >   tsleep(9).  We currently have some situations where the priority of
>> > >   the running thread isn't correctly reflected.  This diff changes that
>> > >   which means we should be able to better understand where the problems
>> > >   are.
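
The simplified need_resched() rule from the list above boils down to a
single comparison. A minimal sketch, assuming the usual BSD convention
that a numerically lower value is a higher priority; both function names
are made up, and the second one is only my reading of how `p_slpprio'
places a woken thread in a higher runqueue, not code from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Request a reschedule when a thread is queued at a better priority
 * than the thread currently SONPROC.  Lower numbers win.
 */
bool
should_resched(uint8_t queued_prio, uint8_t curpriority)
{
	return queued_prio < curpriority;
}

/* Priority to queue a woken thread at: the sleep priority if it is better. */
uint8_t
wakeup_runprio(uint8_t slpprio, uint8_t usrpri)
{
	return slpprio < usrpri ? slpprio : usrpri;
}
```

Because `spc_curpriority' is refreshed at every context switch, the
value passed as curpriority in this model would never be stale.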
>> > > 
>> > > I'd be interested in comments/tests/reviews before continuing in this
>> > > direction.  Note that at least part of this diff is required to split
>> > > the accounting apart from 

Re: TSC synchronization on MP machines

2019-06-27 Thread Timo Myyrä
Paul Irofti  writes:

> Hi,
>
> Here is an initial diff, adapted from NetBSD, that synchronizes TSC
> clocks across cores.
>
> CPU0 is the reference clock and all others are skewed. During CPU
> initialization the clocks synchronize by keeping a registry of each CPU
> clock skewness and adapting the TSC read routine accordingly.
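
The bookkeeping described here is small enough to sketch in userland.
This is not the code from the diff: the array, its size and the function
names are assumptions for illustration. The sign conventions are chosen
so that an adjusted read is "raw counter plus skew" and drift is "old
skew minus new skew".

```c
#include <stdint.h>

#define MAXCPUS	4	/* arbitrary size for the sketch */

static int64_t tsc_skew[MAXCPUS];	/* per-CPU offset relative to CPU0 */

/* Record how far a CPU's counter is from the reference CPU's counter. */
void
skew_record(int cpu, uint64_t ref_tsc, uint64_t cpu_tsc)
{
	tsc_skew[cpu] = (int64_t)(ref_tsc - cpu_tsc);
}

/* Adjusted read: raw counter plus that CPU's recorded skew. */
uint64_t
tsc_read_adjusted(int cpu, uint64_t raw_tsc)
{
	return raw_tsc + (uint64_t)tsc_skew[cpu];
}

/* Drift between two skew measurements of the same CPU. */
int64_t
skew_drift(int64_t old_skew, int64_t new_skew)
{
	return old_skew - new_skew;
}
```

A CPU whose counter runs 250 cycles ahead of CPU0 gets a skew of -250,
so its reads are pulled back onto CPU0's timeline. If a second
measurement gives -260, the drift between the two is 10 cycles.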
>
> I chose this implementation over what FreeBSD is doing (which is just
> copying Linux really), because it is clean and elegant.
>
> I would love to hear reports from machines that were broken by this.
> Mine, which never exhibited the problem in the first place, run just
> fine with the following diff. In fact I am writing this message on one
> such machine.
>
> Also, constructive comments are more than welcome!
>
> Notes:
>
> - cpu_counter_serializing() could probably have a better name
>   (tsc_read for example)
> - the PAUSE instruction is probably not needed
> - acpi(4) suspend and resume bits are left out on purpose, but should
>   be trivial to add once the current diff settles
>
> Paul Irofti
>
> Index: arch/amd64/amd64/cpu.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/cpu.c,v
> retrieving revision 1.137
> diff -u -p -u -p -r1.137 cpu.c
> --- arch/amd64/amd64/cpu.c 28 May 2019 18:17:01 -0000 1.137
> +++ arch/amd64/amd64/cpu.c 27 Jun 2019 11:55:08 -0000
> @@ -96,6 +96,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #if NLAPIC > 0
>  #include 
> @@ -754,6 +755,10 @@ cpu_init(struct cpu_info *ci)
>   cr4 = rcr4();
>   lcr4(cr4 & ~CR4_PGE);
>   lcr4(cr4);
> +
> + /* Synchronize TSC */
> + if (!CPU_IS_PRIMARY(ci))
> +   tsc_sync_ap(ci);
>  #endif
>  }
>  
> @@ -808,6 +813,7 @@ void
>  cpu_start_secondary(struct cpu_info *ci)
>  {
>   int i;
> + u_long s;
>  
>   ci->ci_flags |= CPUF_AP;
>  
> @@ -828,8 +834,20 @@ cpu_start_secondary(struct cpu_info *ci)
>   printf("dropping into debugger; continue from here to resume boot\n");
>   db_enter();
>  #endif
> + } else {
> + /*
> +  * Synchronize time stamp counters. Invalidate cache and do
> +  * twice (in tsc_sync_bp) to minimize possible cache effects.
> +  * Disable interrupts to try and rule out any external
> +  * interference.
> +  */
> + s = intr_disable();
> + wbinvd();
> + tsc_sync_bp(ci);
> + intr_restore(s);
>   }
>  
> +
>   if ((ci->ci_flags & CPUF_IDENTIFIED) == 0) {
>   atomic_setbits_int(&ci->ci_flags, CPUF_IDENTIFY);
>  
> @@ -852,6 +870,8 @@ void
>  cpu_boot_secondary(struct cpu_info *ci)
>  {
>   int i;
> + int64_t drift;
> + u_long s;
>  
>   atomic_setbits_int(&ci->ci_flags, CPUF_GO);
>  
> @@ -864,6 +884,17 @@ cpu_boot_secondary(struct cpu_info *ci)
>   printf("dropping into debugger; continue from here to resume boot\n");
>   db_enter();
>  #endif
> + } else {
> + /* Synchronize TSC again, check for drift. */
> + drift = ci->cpu_cc_skew;
> + s = intr_disable();
> + wbinvd();
> + tsc_sync_bp(ci);
> + intr_restore(s);
> + drift -= ci->cpu_cc_skew;
> + printf("TSC skew=%lld drift=%lld\n",
> + (long long)ci->cpu_cc_skew, (long long)drift);
> + tsc_sync_drift(drift);
>   }
>  }
>  
> @@ -888,7 +919,13 @@ cpu_hatch(void *v)
>   panic("%s: already running!?", ci->ci_dev->dv_xname);
>  #endif
>  
> + /*
> +  * Synchronize the TSC for the first time. Note that interrupts are
> +  * off at this point.
> +  */
> + wbinvd();
>   ci->ci_flags |= CPUF_PRESENT;
> + tsc_sync_ap(ci);
>  
>   lapic_enable();
>   lapic_startclock();
> Index: arch/amd64/amd64/tsc.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.11
> diff -u -p -u -p -r1.11 tsc.c
> --- arch/amd64/amd64/tsc.c 6 Jun 2019 19:43:35 -0000 1.11
> +++ arch/amd64/amd64/tsc.c 27 Jun 2019 11:55:08 -0000
> @@ -1,8 +1,10 @@
>  /*   $OpenBSD: tsc.c,v 1.11 2019/06/06 19:43:35 kettenis Exp $   */
>  /*
> + * Copyright (c) 2008 The NetBSD Foundation, Inc.
>   * Copyright (c) 2016,2017 Reyk Floeter 
>   * Copyright (c) 2017 Adam Steen 
>   * Copyright (c) 2017 Mike Belopuhov 
> + * Copyright (c) 2019 Paul Irofti 
>   *
>   * Permission to use, copy, modify, and distribute this software for any
>   * purpose with or without fee is hereby granted, provided that the above
> @@ -20,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -33,6 +36,13 @@ int tsc_recalibrate;
>  uint64_t tsc_frequency;
>  int  tsc_is_invariant;
>  
> +static int64_t   tsc_drift_max = 250;/* max cycles 

Re: inteldrm(4) diff needs review and testing

2017-07-17 Thread Timo Myyrä
Mark Kettenis  writes:

> Can somebody test the following diff on Ivy Bridge or Haswell (Intel
> HD Graphics 2500/4000/4600/4700/5000/5100/5200)?
>
> When I added support for the command parser, I took a bit of a
> shortcut and implemented the hash tables as a single linked list.
> This diff fixes that.
>
> For the hash function I used a "mod (size-1)" approach that leaves
> one of the hash table entries unused.  Perhaps somebody with a CS
> background has a better idea that isn't too complicated to implement?
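
To show why one entry stays unused, here is a small sketch of the bucket
choice next to one possible alternative. Only the "key % (size - 1)"
form comes from the diff; the masking variant and both function names
are my own illustration, and I have not checked how well low-bit masking
spreads the command parser's keys.

```c
#include <stddef.h>

/*
 * Bucket choice as in the diff: the result is always below
 * nbuckets - 1, so the last bucket is never selected.
 * Requires nbuckets >= 2.
 */
size_t
bucket_mod(unsigned int key, size_t nbuckets)
{
	return key % (nbuckets - 1);
}

/*
 * Possible alternative when nbuckets is a power of two (it is, since
 * the table is declared with 1 << bits entries): mask the low bits,
 * which can reach every bucket.
 */
size_t
bucket_mask(unsigned int key, size_t nbuckets)
{
	return key & (nbuckets - 1);
}
```

With eight buckets, key 7 lands in bucket 0 under the modulo scheme and
in bucket 7 under the mask, which is the slot the modulo scheme never
reaches.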
>
> Paul, Stuart, there is a small chance that this will improve the
> vncviewer performance.
>
>
> Index: dev/pci/drm/drm_linux.h
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/drm_linux.h,v
> retrieving revision 1.56
> diff -u -p -r1.56 drm_linux.h
> --- dev/pci/drm/drm_linux.h 14 Jul 2017 11:18:04 -0000 1.56
> +++ dev/pci/drm/drm_linux.h 16 Jul 2017 12:54:51 -0000
> @@ -40,6 +40,7 @@
>  
>  #include 
>  #include 
> +#include 
>  
>  /* The Linux code doesn't meet our usual standards! */
>  #ifdef __clang__
> @@ -202,16 +203,42 @@ bitmap_weight(void *p, u_int n)
>   return sum;
>  }
>  
> -#define DECLARE_HASHTABLE(x, y) struct hlist_head x;
> +#define DECLARE_HASHTABLE(name, bits) struct hlist_head name[1 << (bits)]
>  
> -#define hash_init(x) INIT_HLIST_HEAD(&(x))
> -#define hash_add(x, y, z)hlist_add_head(y, &(x))
> -#define hash_del(x)  hlist_del_init(x)
> -#define hash_empty(x)hlist_empty(&(x))
> -#define hash_for_each_possible(a, b, c, d) \
> - hlist_for_each_entry(b, &(a), c)
> -#define hash_for_each_safe(a, b, c, d, e) (void)(b); \
> - hlist_for_each_entry_safe(d, c, &(a), e)
> +static inline void
> +__hash_init(struct hlist_head *table, u_int size)
> +{
> + u_int i;
> +
> + for (i = 0; i < size; i++)
> + INIT_HLIST_HEAD(&table[i]);
> +}
> +
> +static inline bool
> +__hash_empty(struct hlist_head *table, u_int size)
> +{
> + u_int i;
> +
> + for (i = 0; i < size; i++) {
> + if (!hlist_empty(&table[i]))
> + return false;
> + }
> +
> + return true;
> +}
> +
> +#define __hash(table, key)   &table[key % (nitems(table) - 1)]
> +
> +#define hash_init(table) __hash_init(table, nitems(table))
> +#define hash_add(table, node, key) \
> + hlist_add_head(node, __hash(table, key))
> +#define hash_del(node)   hlist_del_init(node)
> +#define hash_empty(table)__hash_empty(table, nitems(table))
> +#define hash_for_each_possible(table, obj, member, key) \
> + hlist_for_each_entry(obj, __hash(table, key), member)
> +#define hash_for_each_safe(table, i, tmp, obj, member)   \
> + for (i = 0; i < nitems(table); i++) \
> +hlist_for_each_entry_safe(obj, tmp, &table[i], member)
>  
>  #define ACCESS_ONCE(x)   (x)
>  

Seems to work here on HD4000. I quickly tested browsing, watching videos and
suspend/resume. I didn't notice any regressions.

Though it still prints the following lines:
error: [drm:pid0:cpt_set_fifo_underrun_reporting] *ERROR* uncleared pch fifo underrun on pch transcoder A
error: [drm:pid0:intel_pch_fifo_underrun_irq_handler] *ERROR* PCH transcoder A FIFO underrun

These probably appeared once the intel driver was switched. So far they don't
seem to cause any noticeable issues.

Timo

OpenBSD 6.1-current (GENERIC.MP) #8: Mon Jul 17 08:58:12 EEST 2017
tmy@phobos.TeleWell.gateway:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 16973611008 (16187MB)
avail mem = 16453402624 (15691MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xdae9c000 (68 entries)
bios0: vendor LENOVO version "G7ETA4WW (2.64 )" date 10/08/2015
bios0: LENOVO 2355C16
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SLIC TCPA SSDT SSDT SSDT HPET APIC MCFG ECDT FPDT ASF! UEFI UEFI POAT SSDT SSDT DMAR UEFI DBG2
acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP3(S4) XHCI(S3) EHC1(S3) EHC2(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.56 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC frequency 2594563520 Hz
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz, 2594.12 

Re: 11n support for athn(4)

2017-03-06 Thread Timo Myyrä
Stefan Sperling <s...@stsp.name> writes:

> On Mon, Mar 06, 2017 at 07:36:21AM +0200, Timo Myyrä wrote:
>
>> I did some tcpbench testing and got the following results.
>> Each test was run with: tcpbench -s || tcpbench -t 15  commands.
>> Host AP: apu 2b4 with athn, client = thinkpad t430s with iwn (OpenBSD)
>> 
>> channel 9 running old snapshot etc:
>> 11n client -> server: ~4mbps, server -> client: ~0mbps
>> 11g client -> server: ~16mbps, server -> client: ~0-6mbps
>> ---
>> updated to new snapshot: OpenBSD 6.0-current (GENERIC.MP) #206: Sat Mar  4 
>> 09:22:00 MST 2017
>> = added another antenna, moved the ap to better spot, switched old ap off, 
>> changed channel to 11
>> 11n client -> server: ~6mbps, server -> client ~0.2mbps
>> 11g client -> server: ~7mbps, server -> client ~3mbps
>> 
>> ---
>> switch to channel 3:
>> 11n client -> server: ~7mbps, server -> client: ~0-5mbps
>> 11g client -> server: 16mbps, server -> client: ~5mbps
>> 
>> ---
>> applied dyn rts patch:
>> 11n client -> server: 4-7mbps, server -> client 0.2-5.5mbps
>> 11g client -> server: ~4mbps, server -> client: ~5.5mbps
>
> What made 11n go down from 16 to 4?
> 11g is not affected by this patch so something else affected 11g.
> Could it be other networks on overlapping channels?

Yeah, there are around ten wireless networks nearby. Some seem to be mobile APs
which come and go.

>  
> To tell whether the patch has any effect in your case I would like to
> know which HT protection setting your AP is using.
>
> Find a snapshot dated a bit after 2017/03/04 10:51:20 MST, or make sure
> tcpdump sources are -current and rebuild and install tcpdump. Associate to
> the AP,
> and run: tcpdump -n -i iwn0 -y IEEE802_11_RADIO -s 4096 -v -l | grep beacon
> and in htop=<...> look for 'htprot'. If it shows 'htprot none' then dynamic
> RTS is used in 11n mode (i.e. my patch will switch RTS on/off as needed).
> Otherwise, you have some pre-11n devices in your environment so RTS must
> always be enabled and my patch makes no difference.

06:59:51.809091 802.11 flags=0<>: beacon, 
caps=2061<ESS,PRIVACY,SHORT_PREAMBLE,SHORT_SLOTTIME>, ssid (wifee), rates 1M* 
2M* 5M* 11M* 6M 9M 12M 18M, ds (chan 6), tim 0x0001, xrates 24M 36M 48M 
54M, rsn 0x010fac04010fac04010fac02, htcaps=<20MHz,A-MSDU 
3839,A-MPDU max 8191,RxMCS 0x>, htop=<20MHz chan 6,STA 
chanw 20MHz,htprot non-HT-mixed,basic MCS set 0x>, vendor 
0x0050f202010103a427a442435e0062322f00, 

So I guess it's pre-11n devices somewhere ruining my day.
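
The check described above (look for 'htprot' inside htop=<...> and see whether it says 'none') can be sketched as a small string helper. This is only an illustration that assumes the whole beacon is on one line, as tcpdump -l prints it; the function names htprot_value and beacon_allows_dynamic_rts are made up for this sketch and are not part of tcpdump or net80211.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the text following "htprot " in a tcpdump beacon line into out.
 * Returns 0 on success, -1 if the field is missing or does not fit.
 */
static int
htprot_value(const char *line, char *out, size_t outlen)
{
	static const char key[] = "htprot ";
	const char *start;
	size_t len;

	assert(line != NULL && out != NULL);

	start = strstr(line, key);
	if (start == NULL)
		return -1;
	start += sizeof(key) - 1;

	/* The value ends at the next ',' or at the '>' closing htop=<...>. */
	len = strcspn(start, ",>");
	if (len == 0 || len >= outlen)
		return -1;
	memcpy(out, start, len);
	out[len] = '\0';
	return 0;
}

/* 1 if the beacon advertises "htprot none", 0 otherwise. */
static int
beacon_allows_dynamic_rts(const char *line)
{
	char value[32];

	if (htprot_value(line, value, sizeof(value)) != 0)
		return 0;
	return strcmp(value, "none") == 0;
}
```

Fed the htop=<...> part of the beacon above, htprot_value() yields "non-HT-mixed" and beacon_allows_dynamic_rts() returns 0, which matches the conclusion that RTS stays enabled in this environment.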

>
> Note that the iwn(4) driver does not yet support MIMO in 11n mode.
> Once it does, 11n should become faster than 11g. I have seen an iwm(4) client
> which supports MIMO max out at 21 Mbps tcpbench towards my athn(4) AP, on a
> 5GHz channel with 'htprot none'.
> Unfortunately, tcpbench in the other direction (athn -> iwm) peaks at 10 Mbps.
> There is plenty of room for improvement.
>

I didn't think it would improve things yet, but I had the antenna so I figured
I'd stick it in the AP while I'm tweaking it anyway.

Speaking of 5GHz, my AP uses the athn chipset AR9280, which seems to support
both 2.4GHz and 5GHz. Can I use 5GHz with my AP to see which devices would break
after such a transition? I guess I would need to get a 5GHz antenna and just
attach it to my AP?
Can an OpenBSD AP work on both frequencies at the same time, or is that
something not yet supported?

>> At least what pops up in the measurements is that 11g gives more stable
>> behaviour. 11n speed seems to vary a lot in that 15s test compared to 11g.
>
> This could be explained by differences in rate scaling algorithms.
> In -current 11g uses AMRR, and 11n mode uses MiRa which jumps around more.
> In 6.0 everything used AMRR so comparing how a 6.0 iwn client performs
> in 11n mode would be useful (you could e.g. install 6.0 to a USB stick
> and boot from it for running a speed test).

I have only very limited observations: typing commands over ssh usually lags
with 11n but works pretty smoothly on 11g.

timo



Re: 11n support for athn(4)

2017-03-05 Thread Timo Myyrä
Stefan Sperling <s...@stsp.name> writes:

> On Tue, Jan 31, 2017 at 07:10:04AM +0200, Timo Myyrä wrote:
>> 11g: Client->AP: ~15Mbps, AP->Client: ~5Mbps
>> 11n: Client->AP: ~3Mbps, AP->Client: ~5Mbps
>
> I just committed a change which makes RTS optional in 11n mode.
> The AP starts out with RTS enabled. Every 30 seconds the AP checks for
> the presence of non-11n devices and enables/disables RTS accordingly.
>
> Can you update to -current and measure again?
> I would like to know if 11g and 11n performance still differs.

Hi,

I did some tcpbench testing and got the following results.
Each test was run with: tcpbench -s || tcpbench -t 15  commands.
Host AP: apu 2b4 with athn, client = thinkpad t430s with iwn (OpenBSD)

channel 9 running old snapshot etc:
11n client -> server: ~4mbps, server -> client: ~0mbps
11g client -> server: ~16mbps, server -> client: ~0-6mbps
---
updated to new snapshot: OpenBSD 6.0-current (GENERIC.MP) #206: Sat Mar  4 
09:22:00 MST 2017
= added another antenna, moved the ap to better spot, switched old ap off, 
changed channel to 11
11n client -> server: ~6mbps, server -> client ~0.2mbps
11g client -> server: ~7mbps, server -> client ~3mbps

---
switch to channel 3:
11n client -> server: ~7mbps, server -> client: ~0-5mbps
11g client -> server: 16mbps, server -> client: ~5mbps

---
applied dyn rts patch:
11n client -> server: 4-7mbps, server -> client 0.2-5.5mbps
11g client -> server: ~4mbps, server -> client: ~5.5mbps

At least what pops up in the measurements is that 11g gives more stable
behaviour. 11n speed seems to vary a lot in that 15s test compared to 11g.

Timo





Re: 11n support for athn(4)

2017-01-30 Thread Timo Myyrä
Stefan Sperling <s...@stsp.name> writes:

> On Sun, Jan 29, 2017 at 07:49:56AM +0200, Timo Myyrä wrote:
>> Hmm, I've been running the 11n for a while and it seems to be a lot slower
>> than 11g for me. Just did quick benchmark using tcpbench between OpenBSD
>> hostAP (athn) and laptop (iwn). It looks when my athn is in 11n mode I get
>> around ~3 Mbps:
>
> Which direction did you measure? Client->AP or AP->Client?
> Have you measured both directions? I expect client->AP to be faster if a
> non-OpenBSD client is used. Such clients will likely use Tx aggregation.
> But OpenBSD does not (yet).
>

Initially I just measured Client->AP. I did another measurement yesterday
looking at traffic in both directions and got pretty poor results. AP->Client
throughput in 11n mode was between 0-2 Mbps. I tested during the evening, so
there might have been a bit more interference, but still, that wasn't a very
promising result.

The good news is that I noticed your commits to athn and ran new benchmarks
with those changes:
11g: Client->AP: ~15Mbps, AP->Client: ~5Mbps
11n: Client->AP: ~3Mbps, AP->Client: ~5Mbps

So it seems to have improved the AP->Client traffic somewhat, or it's just that
I get the full bandwidth to myself in the mornings.


>> Quickly looking it seems the 11g is 3x faster than 11n which seems a bit odd.
>> I'd assume they should be roughly the same.
>> 
>> Any idea what could explain the difference?
>
> It might be due to RTS/CTS.
> https://en.wikipedia.org/wiki/IEEE_802.11_RTS/CTS
>
> Without TX aggregation, RTS/CTS causes up to 77% overhead on a 20MHz
> channel and a 1500 byte MTU. See Figure 10 in
> http://www.nle.com/literature/Airmagnet_impact_of_legacy_devices_on_80211n.pdf
> Perhaps this overhead is making the difference?
>

Perhaps, I need to take a moment to digest the document.

> You could take a look at what is happening at the wifi frame layer and
> compare 11n and 11g. To do so, get an iwn(4) device and put it in monitor
> mode on the same channel, and use tcpdump's -y IEEE802_11_RADIO option.
> This will show control frames such as rts/cts (but only with iwn(4)
> because most other drivers unfortunately filter these frames).
>
> You'll see RTS/CTS being exchanged for every frame with an OpenBSD 11n hostap.
> This is because OpenBSD 11n hostap forces "HT protection" on. This forces
> clients (and the AP) to reserve the medium with an RTS frame before sending
> a data frame to avoid collisions from legacy 11a/b/g clients. This is the
> simplest way of being standard compliant in any environment.
>
> HT protection could be switched off if only 11n clients exist on the channel
> (not just on the same AP), and with good commercial APs you'll see this
> behaviour. But OpenBSD's wifi stack does not yet monitor the channel in
> a way that allows a decision to be made about turning HT protection off.
>
> In any case, this will likely get better once OpenBSD supports Tx aggregation.
> Then it will send multiple frames after reserving the medium with RTS/CTS
> and the overhead is reduced.

Ok, good to know. If I find a good moment I'll try to mess around with tcpdump a
bit to see if anything pops up.
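
The point about Tx aggregation can be put into a rough formula: if one RTS/CTS reservation costs a fixed amount of airtime and is followed by n data frames, the share of airtime lost to the reservation shrinks as n grows. The sketch below only illustrates that relationship. The function name and the durations used with it are placeholders, not measured 802.11 timings, and it ignores ACKs, backoff and retries.

```c
#include <assert.h>

/*
 * Fraction of airtime spent on the RTS/CTS reservation when one
 * reservation is followed by nframes data frames of equal length.
 * reserve_us and frame_us are placeholder durations in microseconds.
 */
static double
rts_overhead_fraction(double reserve_us, double frame_us, unsigned int nframes)
{
	assert(reserve_us >= 0.0 && frame_us > 0.0 && nframes > 0);
	return reserve_us / (reserve_us + frame_us * nframes);
}
```

For example, with equal placeholder costs of 100 for the reservation and for one frame, a single frame per reservation gives a fraction of one half, while four frames per reservation bring it down to one fifth.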

>
>> There are 8 other wireless networks in range but all of those are on 
>> different channels.
>
> Are they on overlapping channels? See here for a good illustration:
> https://en.wikipedia.org/wiki/IEEE_802.11#/media/File:2.4_GHz_Wi-Fi_channels_(802.11b,g_WLAN).svg
>

Seems they are overlapping a bit. Will try to find a clearer channel for my wifi.

> If no other networks overlap, or all other networks use 11n only, then
> you don't need HT protection. The crude patch below should disable it
> and might make 11n and 11g perform equally in your environment.
> Obviously this patch is not a real fix and I don't intend to commit it.
>
> Index: ieee80211_node.c
> ===
> RCS file: /cvs/src/sys/net80211/ieee80211_node.c,v
> retrieving revision 1.112
> diff -u -p -r1.112 ieee80211_node.c
> --- ieee80211_node.c  16 Jan 2017 09:35:43 -  1.112
> +++ ieee80211_node.c  29 Jan 2017 08:37:47 -
> @@ -364,8 +364,12 @@ ieee80211_create_ibss(struct ieee80211co
>* beacons from other networks) which proves that only HT
>* STAs are on the air.
>*/
> +#if 0
>   ni->ni_htop1 = IEEE80211_HTPROT_NONMEMBER;
>   ic->ic_protmode = IEEE80211_PROT_RTSCTS;
> +#else
> + ni->ni_htop1 = IEEE80211_HTPROT_NONE;
> +#endif
>  
>   /* Configure QoS EDCA parameters. */
>   for (aci = 0; aci < EDCA_NUM_AC; aci++) {
> @@ -1476,9 +1480,11

Re: 11n support for athn(4)

2017-01-28 Thread Timo Myyrä
Stefan Sperling  writes:

> This diff adds 11n support to the athn(4) driver.
> Requires -current net80211 code from today.
>
> Tested in hostap mode and client mode with:
> athn0 at pci1 dev 0 function 0 "Atheros AR9281" rev 0x01: apic 2 int 16
> athn0: AR9280 rev 2 (2T2R), ROM rev 22, address xx:xx:xx:xx:xx:xx
>
> And in client mode with:
> athn0 at uhub1 port 2 configuration 1 interface 0 "ATHEROS USB2.0 WLAN" rev 
> 2.00/1.08 addr 2
> athn0: AR9271 rev 1 (1T1R), ROM rev 13, address xx:xx:xx:xx:xx:xx
>
> Hostap performance is not perfect yet but should be no worse than
> 11a/b/g modes in the same environment.
>
> For Linux clients a fix for WME params is needed which I also posted to tech@.
>
> This diff does not modify the known-broken and disabled ar9003 code,
> apart from making sure it still builds.
>
> I'm looking for both tests and OKs.
>
> Index: dev/cardbus/if_athn_cardbus.c
> ===
> RCS file: /cvs/src/sys/dev/cardbus/if_athn_cardbus.c,v
> retrieving revision 1.14
> diff -u -p -r1.14 if_athn_cardbus.c
> --- dev/cardbus/if_athn_cardbus.c 24 Nov 2015 17:11:39 -  1.14
> +++ dev/cardbus/if_athn_cardbus.c 8 Jan 2017 09:31:28 -
> @@ -43,6 +43,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> Index: dev/ic/ar5008.c
> ===
> RCS file: /cvs/src/sys/dev/ic/ar5008.c,v
> retrieving revision 1.37
> diff -u -p -r1.37 ar5008.c
> --- dev/ic/ar5008.c   29 Nov 2016 10:22:30 -  1.37
> +++ dev/ic/ar5008.c   9 Jan 2017 10:14:41 -
> @@ -51,6 +51,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -217,7 +218,7 @@ ar5008_attach(struct athn_softc *sc)
>   sc->flags |= ATHN_FLAG_11A;
>   if (base->opCapFlags & AR_OPFLAGS_11G)
>   sc->flags |= ATHN_FLAG_11G;
> - if (base->opCapFlags & AR_OPFLAGS_11N)
> + if ((base->opCapFlags & AR_OPFLAGS_11N_DISABLED) == 0)
>   sc->flags |= ATHN_FLAG_11N;
>  
>   IEEE80211_ADDR_COPY(ic->ic_myaddr, base->macAddr);
> @@ -952,9 +953,11 @@ ar5008_tx_process(struct athn_softc *sc,
>   struct ifnet *ifp = &ic->ic_if;
>   struct athn_txq *txq = &sc->txq[qid];
>   struct athn_node *an;
> + struct ieee80211_node *ni;
>   struct athn_tx_buf *bf;
>   struct ar_tx_desc *ds;
>   uint8_t failcnt;
> + int txfail;
>  
>   bf = SIMPLEQ_FIRST(&txq->head);
>   if (bf == NULL)
> @@ -970,13 +973,16 @@ ar5008_tx_process(struct athn_softc *sc,
>  
>   sc->sc_tx_timer = 0;
>  
> - if (ds->ds_status1 & AR_TXS1_EXCESSIVE_RETRIES)
> + txfail = (ds->ds_status1 & AR_TXS1_EXCESSIVE_RETRIES);
> + if (txfail)
>   ifp->if_oerrors++;
>  
>   if (ds->ds_status1 & AR_TXS1_UNDERRUN)
>   athn_inc_tx_trigger_level(sc);
>  
>   an = (struct athn_node *)bf->bf_ni;
> + ni = (struct ieee80211_node *)bf->bf_ni;
> +
>   /*
>* NB: the data fail count contains the number of un-acked tries
>* for the final series used.  We must add the number of tries for
> @@ -987,10 +993,27 @@ ar5008_tx_process(struct athn_softc *sc,
>   failcnt += MS(ds->ds_status9, AR_TXS9_FINAL_IDX) * 2;
>  
>   /* Update rate control statistics. */
> - an->amn.amn_txcnt++;
> - if (failcnt > 0)
> - an->amn.amn_retrycnt++;
> -
> + if (ni->ni_flags & IEEE80211_NODE_HT) {
> + an->mn.frames++;
> + an->mn.ampdu_size = bf->bf_m->m_pkthdr.len + IEEE80211_CRC_LEN;
> + an->mn.agglen = 1; /* XXX We do not yet support Tx agg. */
> + if (failcnt > 0)
> + an->mn.retries++;
> + if (txfail)
> + an->mn.txfail++;
> + if ((ic->ic_opmode == IEEE80211_M_STA &&
> + ic->ic_state == IEEE80211_S_RUN)
> +#ifndef IEEE80211_STA_ONLY
> + || (ic->ic_opmode == IEEE80211_M_HOSTAP &&
> + ni->ni_state == IEEE80211_STA_ASSOC)
> +#endif
> + )
> + ieee80211_mira_choose(&an->mn, ic, ni);
> + } else {
> + an->amn.amn_txcnt++;
> + if (failcnt > 0)
> + an->amn.amn_retrycnt++;
> + }
>   DPRINTFN(5, ("Tx done qid=%d status1=%d fail count=%d\n",
>   qid, ds->ds_status1, failcnt));
>  
> @@ -1110,7 +1133,7 @@ ar5008_swba_intr(struct athn_softc *sc)
>   ds->ds_ctl2 = SM(AR_TXC2_XMIT_DATA_TRIES0, 1);
>  
>   /* Write Tx rate. */
> - ridx = (ic->ic_curmode == IEEE80211_MODE_11A) ?
> + ridx = IEEE80211_IS_CHAN_5GHZ(ni->ni_chan) ?
>   ATHN_RIDX_OFDM6 : ATHN_RIDX_CCK1;
>   hwrate = athn_rates[ridx].hwrate;
>   ds->ds_ctl3 = SM(AR_TXC3_XMIT_RATE0, hwrate);
> @@ -1315,15 +1338,25 @@ ar5008_tx(struct athn_softc *sc, struct 
>   IEEE80211_FC0_TYPE_DATA) {
>   /* Use lowest rate for all tries. */
> 

Re: USB keyboards with multiple displays

2016-01-04 Thread Timo Myyrä
Mark Kettenis  writes:

> OpenBSD/amd64 and OpenBSD/i386 have been supporting multiple
> wsdisplay(4) devices for a while now.  Somewhat recently it became
> also possible to use inteldrm(4) as a secondary display device.  There
> have always been some issues with pairing wskbd(4) keyboard devices
> with wsdisplay(4) devices.  But since it is fairly common for people
> to have a desktop PC with both Intel integrated graphics and a
> discrete graphics card, these issues affect many more people now.
>
> On the amd64/i386 architecture, there is the concept of a primary
> graphics device.  This is the device that the BIOS uses for its output
> when the machine boots.  We try very hard to use this device as our
> console, and it should become wsdisplay0 if we detect it as such.  We
> also make sure that console keyboard attaches to wsdisplay0.  All
> other keyboards are attached, through wsmux(4), to the wsdisplay(4)
> device that attaches first.
>
> Now figuring out what the console keyboard is, is a bit tricky on
> amd64/i386.  The BIOS is a very poor excuse for a firmware and doesn't
> really tell us.  So we always attach pckbd(4) as the console keyboard,
> and only if we don't detect a pckbc(4) controller, we bombard the
> first USB keyboard as the console keyboard.  Since most desktop PCs
> still contain a PC keyboard controller, this means that if you're
> using a USB keyboard, it is unlikely to become the console keyboard.
> As a result it attaches to the first wsdisplay(4) device that
> attaches.  If that happens to be wsdisplay1, the keyboard appears to
> be non-functional.  And since radeondrm(4) doesn't fully attach until
> it can load its firmware, the USB keyboard will almost certainly
> attach to wsdisplay1 at inteldrm0.
>
> I'm still looking for a clever way to detect that the USB keyboard
> should become the console keyboard even in the presence of pckbc0.
> But it would already help these people if we made sure the non-console
> keyboards pair themselves with wsdisplay0 as well.  Fortunately, there
> is an easy way to do this.  These keyboards attach themselves to the
> keyboard wsmux(4) device wsmux1 (wsmux0 is used for mouse devices).
> By default wsdisplay(4) devices take control over wsmux1 if nobody else
> has done so yet.  But we can prevent this from happening by specifying
> "mux -1" in the kernel configuration file.
>
> I don't have a system with both inteldrm(4) and radeondrm(4), but on my
> system with two radeondrm(4) devices, this makes sure an additional
> USB keyboard connects itself to wsdisplay0 instead of wsdisplay1.
>
> ok if I commit the equivalent changes to i386 as well?
>
>
> Index: GENERIC
> ===
> RCS file: /cvs/src/sys/arch/amd64/conf/GENERIC,v
> retrieving revision 1.406
> diff -u -p -r1.406 GENERIC
> --- GENERIC   31 Dec 2015 13:06:49 -  1.406
> +++ GENERIC   3 Jan 2016 20:03:13 -
> @@ -313,12 +313,12 @@ agp*at intagp?
>  drm0 at inteldrm? console 1
>  drm* at inteldrm?
>  wsdisplay0   at inteldrm? console 1
> -wsdisplay*   at inteldrm?
> +wsdisplay*   at inteldrm? mux -1
>  radeondrm*   at pci? # ATI Radeon DRM driver
>  drm0 at radeondrm? console 1
>  drm* at radeondrm?
>  wsdisplay0   at radeondrm? console 1
> -wsdisplay*   at radeondrm?
> +wsdisplay*   at radeondrm? mux -1
>  
>  pcppi0   at isa?
>  

Just tested this on my desktop (amd64) and it attaches my usb keyboard
correctly.
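
For readers following along, the effect of "mux -1" described in the quoted mail can be modelled in a few lines. This is a toy model and not the kernel's attach code: it assumes that a display claims the keyboard mux only when its mux locator is 1 (the default implied by "take control over wsmux1") and that the first such display in attach order wins. The struct and function names are invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one attached wsdisplay(4) device. */
struct display {
	int unit;	/* wsdisplay unit number */
	int mux;	/* "mux" locator; -1 means: do not claim a mux */
};

/*
 * Walk the displays in attach order and return the unit of the first
 * one that claims keyboard mux 1, or -1 if none does.
 */
static int
keyboard_mux_owner(const struct display *attached, size_t n)
{
	size_t i;

	assert(attached != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (attached[i].mux == 1)
			return attached[i].unit;
	}
	return -1;
}
```

With wsdisplay1 attaching first, the model hands the keyboard mux to unit 1 when both displays use the default, and to unit 0 once the wildcard entry carries mux -1, which is the behaviour change the diff is after.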

Timo



Zlib regression tests

2015-12-26 Thread Timo Myyrä
Hi, 

I was a bit bored and noticed that the latest zlib version contains tests in
example.c. I've cleaned up the file and added it to the regress framework.
The testsuite shouldn't output anything if all is well.
I've changed the types quite a bit, so someone with a better understanding of C
and zlib should review them to make sure they don't contain any surprises.

Timo

Index: regress/lib/Makefile
===
RCS file: /cvs/src/regress/lib/Makefile,v
retrieving revision 1.18
diff -u -u -p -r1.18 Makefile
--- regress/lib/Makefile 31 Oct 2014 14:10:55 -  1.18
+++ regress/lib/Makefile 26 Dec 2015 08:58:54 -
@@ -1,7 +1,7 @@
 #  $OpenBSD: Makefile,v 1.18 2014/10/31 14:10:55 jsing Exp $
 
 SUBDIR+= csu libc libcrypto libevent libm libpthread libskey libssl libtls \
-libutil
+libutil libz
 
 install:
 
Index: regress/lib/libz/Makefile
===
RCS file: regress/lib/libz/Makefile
diff -N regress/lib/libz/Makefile
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/lib/libz/Makefile   26 Dec 2015 08:58:54 -
@@ -0,0 +1,6 @@
+PROG=zlib_testsuite
+LDADD=-lz
+
+CLEANFILES=foo.gz zlib_testsuite.o zlib_testsuite
+
+.include 
Index: regress/lib/libz/zlib_testsuite.c
===
RCS file: regress/lib/libz/zlib_testsuite.c
diff -N regress/lib/libz/zlib_testsuite.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ regress/lib/libz/zlib_testsuite.c   26 Dec 2015 08:58:54 -
@@ -0,0 +1,542 @@
+/*
+ * zlib_testsuite.c -- regression tests for zlib compression library
+ *
+ * Copyright (C) 1995-2006, 2011 Jean-loup Gailly.
+ * Copyright (C) 2015 Timo Myyrä.
+ *
+ * based on the example.c from upstream zlib distribution but modified for
+ * better fit for regression tests.
+ *
+ * For conditions of distribution and use, see copyright notice in zlib.h
+ */
+
+/* $OpenBSD$ */
+
+#include 
+#include 
+#include 
+#include 
+
+#define CHECK_ERR(err, msg) { \
+if (err != Z_OK) { \
+fprintf(stderr, "%s error: %d\n", msg, err); \
+exit(1); \
+} \
+}
+
+const char hello[] = "hello, hello!";
+/*
+ * "hello world" would be more standard, but the repeated "hello" stresses
+ * the compression code better, sorry...
+ */
+
+const char dictionary[] = "hello";
+unsigned long  dictId; /* Adler32 value of the dictionary */
+
+void   test_compress(void *, size_t, void *, size_t);
+void   test_deflate(void *, size_t);
+void   test_deflate_levels(void);
+void   test_dict_deflate(void *, size_t);
+void   test_dict_inflate(void *, size_t, void *, size_t);
+void   test_flush(void *, size_t);
+void   test_gzio(const char *, void *, size_t);
+void   test_inflate(void *, size_t, void *, size_t);
+void   test_large_deflate(void *, size_t, void *, size_t);
+void   test_large_inflate(void *, size_t, void *, size_t);
+void   test_sync(void *, size_t, void *, size_t);
+
+/*
+ * Test compress() and uncompress()
+ */
+void
+test_compress(void *compr, size_t comprLen, void *uncompr, size_t uncomprLen)
+{
+   int err;
+   size_t  len = strlen(hello) + 1;
+
+   err = compress(compr, &comprLen, (const unsigned char *)hello, len);
+   CHECK_ERR(err, "compress");
+
+   strlcpy((char *)uncompr, "garbage", uncomprLen);
+
+   err = uncompress(uncompr, &uncomprLen, compr, comprLen);
+   CHECK_ERR(err, "uncompress");
+
+   if (strcmp((char *)uncompr, hello) != 0) {
+   fprintf(stderr, "bad uncompress\n");
+   exit(1);
+   }
+}
+
+/*
+ * Test deflate() with small buffers
+ */
+void
+test_deflate(void *compr, size_t comprLen)
+{
+   z_stream c_stream;
+   int err;
+   size_t  len = strlen(hello) + 1;
+
+   c_stream.zalloc = NULL;
+   c_stream.zfree = NULL;
+   c_stream.opaque = NULL;
+
+   err = deflateInit(&c_stream, Z_DEFAULT_COMPRESSION);
+   CHECK_ERR(err, "deflateInit");
+
+   c_stream.next_in = (unsigned char *)hello;
+   c_stream.next_out = compr;
+
+   while (c_stream.total_in != (off_t)len && c_stream.total_out < (off_t)comprLen) {
+   /* force small buffers */
+   c_stream.avail_in = c_stream.avail_out = 1;
+   err = deflate(&c_stream, Z_NO_FLUSH);
+   CHECK_ERR(err, "deflate");
+   }
+   /* Finish the stream, still forcing small buffers: */
+   for (;;) {
+   c_stream.avail_out = 1;
+   err = deflate(&c_stream, Z_FINISH);
+   if (err == Z_STREAM_END)
+   break;
+   CHECK_ERR(err, "deflate");
+   }
+
+   err = deflateEnd(&c_stream);
+   CHECK_ERR(err, "deflateEnd");
+}
+
+void
+test_deflate_levels(void)
+{
+   z_stream strm;

Re: initial 11n support for iwn (n, not m)

2015-12-22 Thread Timo Myyrä
Stefan Sperling  writes:

> On Sat, Dec 19, 2015 at 01:08:26PM +0100, Stefan Sperling wrote:
>> On Fri, Dec 18, 2015 at 05:40:39PM -0500, David Hill wrote:
>> > With sthen@'s patch I can associate, dhcp, and use net.
>> 
>> Here's an updated iwn diff with a better approach for Stuart's fix.
>> 
>> Thanks for helping, Stuart, and to everyone who sent beacons which
>> allowed us to narrow this problem down to protection settings being
>> set up the wrong way in iwn_run().
>
> And another update (hopefully) fixing some reported issues, with some
> uncommitted net80211 changes included.
>
> I haven't put these diffs in yet because I'm still hearing about regressions
> in some form or another. Sometimes it's unclear what people are running,
> so I hope this version will linger for a bit and get tested.
> Thanks for all the help so far from more people than I expected!
>
> Index: dev/pci/if_iwn.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_iwn.c,v
> retrieving revision 1.148
> diff -u -p -r1.148 if_iwn.c
> --- dev/pci/if_iwn.c  25 Nov 2015 03:09:59 -  1.148
> +++ dev/pci/if_iwn.c  20 Dec 2015 11:18:52 -
> @@ -148,7 +148,7 @@ int   iwn_newstate(struct ieee80211com *,
>  void iwn_iter_func(void *, struct ieee80211_node *);
>  void iwn_calib_timeout(void *);
>  int  iwn_ccmp_decap(struct iwn_softc *, struct mbuf *,
> - struct ieee80211_key *);
> + struct ieee80211_node *);
>  void iwn_rx_phy(struct iwn_softc *, struct iwn_rx_desc *,
>   struct iwn_rx_data *);
>  void iwn_rx_done(struct iwn_softc *, struct iwn_rx_desc *,
> @@ -189,7 +189,7 @@ int   iwn5000_add_node(struct iwn_softc *
>   int);
>  int  iwn_set_link_quality(struct iwn_softc *,
>   struct ieee80211_node *);
> -int  iwn_add_broadcast_node(struct iwn_softc *, int);
> +int  iwn_add_broadcast_node(struct iwn_softc *, int, int);
>  void iwn_updateedca(struct ieee80211com *);
>  void iwn_set_led(struct iwn_softc *, uint8_t, uint8_t, uint8_t);
>  int  iwn_set_critical_temp(struct iwn_softc *);
> @@ -280,7 +280,7 @@ void  iwn_stop(struct ifnet *, int);
>  #ifdef IWN_DEBUG
>  #define DPRINTF(x)   do { if (iwn_debug > 0) printf x; } while (0)
>  #define DPRINTFN(n, x)   do { if (iwn_debug >= (n)) printf x; } while (0)
> -int iwn_debug = 0;
> +int iwn_debug = 1;
>  #else
>  #define DPRINTF(x)
>  #define DPRINTFN(n, x)
> @@ -458,6 +458,15 @@ iwn_attach(struct device *parent, struct
>   IEEE80211_C_PMGT;   /* power saving supported */
>  
>  #ifndef IEEE80211_NO_HT
> + /* No optional HT features supported for now, */
> + ic->ic_htcaps = 0;
> + ic->ic_htxcaps = 0;
> + ic->ic_txbfcaps = 0;
> + ic->ic_aselcaps = 0;
> +#endif
> +
> +#ifdef notyet
> +#ifndef IEEE80211_NO_HT
>   if (sc->sc_flags & IWN_FLAG_HAS_11N) {
>   /* Set HT capabilities. */
>   ic->ic_htcaps =
> @@ -475,6 +484,7 @@ iwn_attach(struct device *parent, struct
>   ic->ic_htcaps |= IEEE80211_HTCAP_SMPS_DIS;
>   }
>  #endif   /* !IEEE80211_NO_HT */
> +#endif   /* notyet */
>  
>   /* Set supported legacy rates. */
>   ic->ic_sup_rates[IEEE80211_MODE_11B] = ieee80211_std_rateset_11b;
> @@ -487,10 +497,12 @@ iwn_attach(struct device *parent, struct
>   if (sc->sc_flags & IWN_FLAG_HAS_11N) {
>   /* Set supported HT rates. */
>   ic->ic_sup_mcs[0] = 0xff;   /* MCS 0-7 */
> +#ifdef notyet
>   if (sc->nrxchains > 1)
> - ic->ic_sup_mcs[1] = 0xff;   /* MCS 7-15 */
> + ic->ic_sup_mcs[1] = 0xff;   /* MCS 8-15 */
>   if (sc->nrxchains > 2)
>   ic->ic_sup_mcs[2] = 0xff;   /* MCS 16-23 */
> +#endif
>   }
>  #endif
>  
> @@ -515,9 +527,11 @@ iwn_attach(struct device *parent, struct
>  #ifndef IEEE80211_NO_HT
>   ic->ic_ampdu_rx_start = iwn_ampdu_rx_start;
>   ic->ic_ampdu_rx_stop = iwn_ampdu_rx_stop;
> +#ifdef notyet
>   ic->ic_ampdu_tx_start = iwn_ampdu_tx_start;
>   ic->ic_ampdu_tx_stop = iwn_ampdu_tx_stop;
>  #endif
> +#endif
>  
>   /* Override 802.11 state transition machine. */
>   sc->sc_newstate = ic->ic_newstate;
> @@ -1635,6 +1649,11 @@ iwn_read_eeprom_channels(struct iwn_soft
>   /* Save maximum allowed TX power for this channel. */
>   sc->maxpwr[chan] = channels[i].maxpwr;
>  
> +#ifndef IEEE80211_NO_HT
> + if (sc->sc_flags & IWN_FLAG_HAS_11N)
> + ic->ic_channels[chan].ic_flags |= IEEE80211_CHAN_HT;
> +#endif
> +
>   DPRINTF(("adding chan %d flags=0x%x maxpwr=%d\n",
>   chan, channels[i].flags, sc->maxpwr[chan]));
>   }
> @@ -1693,13 +1712,18 @@ 
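
The ic_sup_mcs assignments in the diff above (0xff in byte 0 for MCS 0-7, byte 1 for MCS 8-15, and so on) treat the array as a bitmap with one bit per MCS index. A minimal sketch of that layout follows, assuming bit k of byte j stands for MCS 8*j + k. The helper names and MCS_BYTES are hypothetical and not taken from net80211.

```c
#include <assert.h>
#include <stdint.h>

#define MCS_BYTES 10	/* hypothetical bitmap length for this sketch */

/* Mark MCS indices first..last (inclusive) as supported. */
static void
mcs_set_range(uint8_t *map, unsigned int first, unsigned int last)
{
	unsigned int i;

	assert(first <= last && last < MCS_BYTES * 8);
	for (i = first; i <= last; i++)
		map[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Non-zero if MCS index n is marked as supported. */
static int
mcs_is_set(const uint8_t *map, unsigned int n)
{
	assert(n < MCS_BYTES * 8);
	return (map[n / 8] >> (n % 8)) & 1;
}
```

Under that assumption, setting the range 0 to 7 fills byte 0 with 0xff and leaves MCS 8 unset, which is why the corrected comment in the diff reads "MCS 8-15" for byte 1.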

Re: Unlock the reaper

2015-07-08 Thread Timo Myyrä
Mark Kettenis <mark.kette...@xs4all.nl> writes:

> I'm looking for testers for this diff.  This should be safe to run on
> amd64, i386 and sparc64.  But has been reported to lock up i386
> machines.  I can't reproduce this on any of my own systems.  So I'm
> looking for help.  I'm looking for people that are able to build a
> kernel with this diff and the MP_LOCKDEBUG option enabled
> (uncommented) in their GENERIC.MP kernel, run it on an MP machine and
> put some load on it to see if it locks up and/or panics.
>
> Being able to move forward with this would make OpenBSD run
> significantly better on MP systems.
>
> Thanks,
>
> Mark
>

> Index: uvm_addr.c
> ===
> RCS file: /home/cvs/src/sys/uvm/uvm_addr.c,v
> retrieving revision 1.13
> diff -u -p -r1.13 uvm_addr.c
> --- uvm_addr.c 30 Mar 2015 21:08:40 -  1.13
> +++ uvm_addr.c 4 Apr 2015 11:08:49 -
> @@ -287,14 +287,19 @@ uvm_addr_init(void)
>  {
>   pool_init(&uaddr_pool, sizeof(struct uvm_addr_state),
>   0, 0, PR_WAITOK, "uaddr", NULL);
> + pool_setipl(&uaddr_pool, IPL_VM);
>   pool_init(&uaddr_hint_pool, sizeof(struct uaddr_hint_state),
>   0, 0, PR_WAITOK, "uaddrhint", NULL);
> + pool_setipl(&uaddr_hint_pool, IPL_VM);
>   pool_init(&uaddr_bestfit_pool, sizeof(struct uaddr_bestfit_state),
>   0, 0, PR_WAITOK, "uaddrbest", NULL);
> + pool_setipl(&uaddr_bestfit_pool, IPL_VM);
>   pool_init(&uaddr_pivot_pool, sizeof(struct uaddr_pivot_state),
>   0, 0, PR_WAITOK, "uaddrpivot", NULL);
> + pool_setipl(&uaddr_pivot_pool, IPL_VM);
>   pool_init(&uaddr_rnd_pool, sizeof(struct uaddr_rnd_state),
>   0, 0, PR_WAITOK, "uaddrrnd", NULL);
> + pool_setipl(&uaddr_rnd_pool, IPL_VM);
>  
>   uaddr_kbootstrap.uaddr_minaddr = PAGE_SIZE;
>   uaddr_kbootstrap.uaddr_maxaddr = -(vaddr_t)PAGE_SIZE;
> Index: uvm_map.c
> ===
> RCS file: /home/cvs/src/sys/uvm/uvm_map.c,v
> retrieving revision 1.191
> diff -u -p -r1.191 uvm_map.c
> --- uvm_map.c 23 Apr 2015 00:49:37 -  1.191
> +++ uvm_map.c 28 Apr 2015 20:55:03 -
> @@ -1842,8 +1842,10 @@ uvm_unmap_kill_entry(struct vm_map *map,
>  {
>   /* Unwire removed map entry. */
>   if (VM_MAPENT_ISWIRED(entry)) {
> + KERNEL_LOCK();
>   entry->wired_count = 0;
>   uvm_fault_unwire_locked(map, entry->start, entry->end);
> + KERNEL_UNLOCK();
>   }
>  
>   /* Entry-type specific code. */
> @@ -2422,18 +2424,20 @@ void
>  uvm_map_teardown(struct vm_map *map)
>  {
>   struct uvm_map_deadq dead_entries;
> - int  i, waitok = 0;
>   struct vm_map_entry *entry, *tmp;
>  #ifdef VMMAP_DEBUG
>   size_t   numq, numt;
>  #endif
> + int  i;
>  
> - if ((map->flags & VM_MAP_INTRSAFE) == 0)
> - waitok = 1;
> - if (waitok) {
> - if (rw_enter(&map->lock, RW_NOSLEEP | RW_WRITE) != 0)
> - panic("uvm_map_teardown: rw_enter failed on free map");
> - }
> + KERNEL_ASSERT_LOCKED();
> + KERNEL_UNLOCK();
> + KERNEL_ASSERT_UNLOCKED();
> +
> + KASSERT((map->flags & VM_MAP_INTRSAFE) == 0);
> +
> + if (rw_enter(&map->lock, RW_NOSLEEP | RW_WRITE) != 0)
> + panic("uvm_map_teardown: rw_enter failed on free map");
>  
>   /* Remove address selectors. */
>   uvm_addr_destroy(map->uaddr_exe);
> @@ -2466,8 +2470,7 @@ uvm_map_teardown(struct vm_map *map)
>   if ((entry = RB_ROOT(&map->addr)) != NULL)
>   DEAD_ENTRY_PUSH(&dead_entries, entry);
>   while (entry != NULL) {
> - if (waitok)
> - uvm_pause();
> + sched_pause();
>   uvm_unmap_kill_entry(map, entry);
>   if ((tmp = RB_LEFT(entry, daddrs.addr_entry)) != NULL)
>   DEAD_ENTRY_PUSH(&dead_entries, tmp);
> @@ -2477,8 +2480,7 @@ uvm_map_teardown(struct vm_map *map)
>   entry = TAILQ_NEXT(entry, dfree.deadq);
>   }
>  
> - if (waitok)
> - rw_exit(&map->lock);
> + rw_exit(&map->lock);
>  
>  #ifdef VMMAP_DEBUG
>   numt = numq = 0;
> @@ -2488,7 +2490,10 @@ uvm_map_teardown(struct vm_map *map)
>   numq++;
>   KASSERT(numt == numq);
>  #endif
> - uvm_unmap_detach(&dead_entries, waitok ? UVM_PLA_WAITOK : 0);
> + uvm_unmap_detach(&dead_entries, UVM_PLA_WAITOK);
> +
> + KERNEL_LOCK();
> +
>   pmap_destroy(map->pmap);
>   map->pmap = NULL;
>  }
> @@ -3185,6 +3190,8 @@ void
>  uvmspace_init(struct vmspace *vm, struct pmap *pmap, vaddr_t min, vaddr_t max,
>   boolean_t pageable, boolean_t remove_holes)
>  {
> + KASSERT(pmap == NULL || pmap == pmap_kernel());
> +
>   if (pmap)
>   pmap_reference(pmap);
>   else

Hi,

Compiled a kernel with this on my Thinkpad X201 (amd64).
I ran a few compilations in parallel (kernel + a few ports) in the background
during normal desktop use and got the load up to ~8.0.
So far no regressions.

[PATCH] add TMP to /usr/share/misc/airport

2014-12-25 Thread Timo Myyrä
Hi,

Noticed that Tampere airport is missing from the list.

Timo

Index: airport
===
RCS file: /cvs/src/share/misc/airport,v
retrieving revision 1.44
diff -u -r1.44 airport
--- airport 7 Dec 2014 22:54:05 -   1.44
+++ airport 25 Dec 2014 20:57:53 -
@@ -1647,6 +1647,7 @@
 TLL:Ulemiste, Tallinn, Estonia
 TLS:Blagnac, Toulouse, France
 TLV:Ben-Gurion, Tel Aviv, Israel
+TMP:Tampere-Pirkkala Airport, Tampere, Finland
 TMS:Sao Tome International, Sao Tome and Principe
 TMW:Tamworth, New South Wales, Australia
 TNG:Boukhalef Souahel, Tangier, Morocco



Re: [PATCH] update zlib to 1.2.8

2014-12-24 Thread Timo Myyrä
Stuart Henderson st...@openbsd.org writes:

> On 2014/12/23 22:54, Timo Myyrä wrote:
>> I was trying to port notmuch mail indexer but got little stuck with it as it
>> requires newer Zlib version than whats in base.  I got little spare time in
>> my hands and here's a rough patch which updates the base libz 1.2.3 to
>> version 1.2.8.  So far I've only compiled the library itself and tried it
>> with notmuch on my amd64.
>
> This diff removes our local changes, most of which relate to saving space
> on ramdisks, we still need these.


Yeah, I noticed a few of them but missed quite a lot of others. I'll be more
careful in the future.

> Note that there are two main copies of libz (lib/libz and sys/lib/libz)
> to keep in sync. (there are also other copies in Perl and cvs, I'm not sure
> whether they are used or not).

Seems that the update isn't as trivial as I initially thought. 


> I prepared an update to 1.2.6 before (my method was to diff the two
> upstream versions and apply that to our tree, resolving conflicts by
> hand, so as not to lose local changes), but as the reaction was less
> than positive I didn't go as far as updating the kernel copy.

That would be clearer: no need to figure out which changes came from the
version update and which are local.

Timo



Re: [PATCH] update zlib to 1.2.8

2014-12-24 Thread Timo Myyrä
Marc Espie es...@nerim.net writes:

> Updating zlib correctly is surprisingly difficult.
>
> It's not just a question of whipping out the new version and doing a diff.
> You should make tests.
>
> Like, you missed the #ifdef SMALL at first. Those tests include making a
> full release, and making sure it still works.
>
> It's also likely some of those local diffs are there for some specific
> architectures. If you don't have them, you have to ask for tests.
>
> I'm not saying it's not doable.  But the fact you don't see the newer
> zlib in our tree kind-of indicates it takes quite a bit more testing than
> you would think.

Yes, I did the lazy version first: a crude diff, tested only against my own
case. The idea was to get a little feedback and see whether I'm heading in
the right direction.

Obviously a diff that updates all of zlib would require full release builds
with and without SMALL defined to cover the alternatives, and on different
architectures to catch any arch-specific issues.
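
To illustrate why both builds need testing, here is a hypothetical sketch of
how a SMALL guard changes what gets compiled. This is not code from libz;
describe_error() and its message table are made up for the example. The point
is only that the two branches are different code and each has to be built and
exercised separately.

```c
#include <stddef.h>

/* Hypothetical example; this is NOT taken from libz. */
const char *
describe_error(int code)
{
#ifdef SMALL
	/* Space-constrained build: the message table is compiled out. */
	(void)code;
	return "";
#else
	/* Regular build: full message table. */
	static const char *const msgs[] = {
		"ok", "stream error", "data error"
	};
	size_t n = sizeof(msgs) / sizeof(msgs[0]);

	if (code < 0 || (size_t)code >= n)
		return "unknown error";
	return msgs[code];
#endif
}
```

Compiling the same file once with -DSMALL and once without produces two
different functions, so a test run on only one of them says nothing about the
other.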

My original goal was to get notmuch working but, as pointed out, it was
trivially patched to work with the base version.

Timo



Re: inteldrm/radeondrm suspend/resume diff

2014-03-13 Thread Timo Myyrä
Mark Kettenis mark.kette...@xs4all.nl writes:

 The recent inteldrm suspend/resume regression thread pointed out
 that suspend/resume was quite horribly broken and only worked somewhat
 if you didn't heavily use the 3D acceleration stuff.  Here's a diff
 that should fix most of the problems, by making sure userland programs
 are properly blocked if they try to use drm while we're suspending or
 resuming the machine.

 I would like to see this diff tested some more by people who actually
 use all that eye candy.  The thing to watch for is hangs when you try
 to suspend your machine.

 Thanks,

 Mark

 P.S. This seems to make hibernation (ZZZ) work with both inteldrm(4)
 and radeondrm(4) on my t400.
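
For readers who want to see the blocking scheme in isolation, here is a
minimal userland sketch, assuming POSIX threads in place of the kernel's
mtx_enter()/msleep()/wakeup(). The names struct gate, gate_enter(),
gate_leave(), gate_quiesce() and gate_wakeup() are hypothetical. The exit path
in gate_leave() is an assumption, since the corresponding part of the diff is
cut off in this message.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the quiesce fields added to struct drm_device. */
struct gate {
	pthread_mutex_t mtx;
	pthread_cond_t  cv;
	int quiesce;        /* nonzero while suspend/resume is in progress */
	int quiesce_count;  /* callers currently inside the guarded region */
};

void
gate_init(struct gate *g)
{
	pthread_mutex_init(&g->mtx, NULL);
	pthread_cond_init(&g->cv, NULL);
	g->quiesce = 0;
	g->quiesce_count = 0;
}

/* Entry to an ioctl: wait out any suspend, then count ourselves in. */
void
gate_enter(struct gate *g)
{
	pthread_mutex_lock(&g->mtx);
	while (g->quiesce)
		pthread_cond_wait(&g->cv, &g->mtx);
	g->quiesce_count++;
	pthread_mutex_unlock(&g->mtx);
}

/* Exit from an ioctl (assumed): count out and wake a waiting suspend. */
void
gate_leave(struct gate *g)
{
	pthread_mutex_lock(&g->mtx);
	assert(g->quiesce_count > 0);
	g->quiesce_count--;
	pthread_cond_broadcast(&g->cv);
	pthread_mutex_unlock(&g->mtx);
}

/* Suspend side: close the gate, then wait for in-flight callers to drain. */
void
gate_quiesce(struct gate *g)
{
	pthread_mutex_lock(&g->mtx);
	g->quiesce = 1;
	while (g->quiesce_count > 0)
		pthread_cond_wait(&g->cv, &g->mtx);
	pthread_mutex_unlock(&g->mtx);
}

/* Resume side: reopen the gate and release blocked callers. */
void
gate_wakeup(struct gate *g)
{
	pthread_mutex_lock(&g->mtx);
	g->quiesce = 0;
	pthread_cond_broadcast(&g->cv);
	pthread_mutex_unlock(&g->mtx);
}
```

The sketch uses one condition variable with broadcast for both wait
conditions, which is simpler than the two separate sleep channels in the diff
but gives the same ordering: no new caller gets in after gate_quiesce() sets
the flag, and gate_quiesce() does not return until the count reaches zero.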


 Index: drmP.h
 ===
 RCS file: /cvs/src/sys/dev/pci/drm/drmP.h,v
 retrieving revision 1.169
 diff -u -p -r1.169 drmP.h
 --- drmP.h 9 Mar 2014 07:42:29 -   1.169
 +++ drmP.h 12 Mar 2014 21:38:43 -
 @@ -785,6 +785,10 @@ struct drm_device {
   bus_dma_tag_t   dmat;
   bus_space_tag_t bst;
  
 + struct mutex quiesce_mtx;
 + int quiesce;
 + int quiesce_count;
 +
   char  *unique;  /* Unique identifier: e.g., busid  */
   int   unique_len;   /* Length of unique field  */
   
 Index: drm_drv.c
 ===
 RCS file: /cvs/src/sys/dev/pci/drm/drm_drv.c,v
 retrieving revision 1.124
 diff -u -p -r1.124 drm_drv.c
 --- drm_drv.c 9 Mar 2014 07:42:29 -   1.124
 +++ drm_drv.c 12 Mar 2014 21:38:43 -
 @@ -63,8 +63,12 @@ int drm_lastclose(struct drm_device *);
  void  drm_attach(struct device *, struct device *, void *);
  int   drm_probe(struct device *, void *, void *);
  int   drm_detach(struct device *, int);
 +void  drm_quiesce(struct drm_device *);
 +void  drm_wakeup(struct drm_device *);
 +int   drm_activate(struct device *, int);
  int   drmprint(void *, const char *);
  int   drmsubmatch(struct device *, void *, void *);
 +int   drm_do_ioctl(struct drm_device *, int, u_long, caddr_t);
  int   drm_dequeue_event(struct drm_device *, struct drm_file *, size_t,
struct drm_pending_event **);
  
 @@ -212,6 +216,7 @@ drm_attach(struct device *parent, struct
  
   rw_init(&dev->dev_lock, "drmdevlk");
   mtx_init(&dev->event_lock, IPL_TTY);
 + mtx_init(&dev->quiesce_mtx, IPL_NONE);
  
   TAILQ_INIT(&dev->maplist);
   SPLAY_INIT(&dev->files);
 @@ -293,9 +298,47 @@ drm_detach(struct device *self, int flag
   return 0;
  }
  
 +void
 +drm_quiesce(struct drm_device *dev)
 +{
 + mtx_enter(&dev->quiesce_mtx);
 + dev->quiesce = 1;
 + while (dev->quiesce_count > 0) {
 + msleep(&dev->quiesce_count, &dev->quiesce_mtx,
 + PZERO, "drmqui", 0);
 + }
 + mtx_leave(&dev->quiesce_mtx);
 +}
 +
 +void
 +drm_wakeup(struct drm_device *dev)
 +{
 + mtx_enter(&dev->quiesce_mtx);
 + dev->quiesce = 0;
 + wakeup(&dev->quiesce);
 + mtx_leave(&dev->quiesce_mtx);
 +}
 +
 +int
 +drm_activate(struct device *self, int act)
 +{
 + struct drm_device *dev = (struct drm_device *)self;
 +
 + switch (act) {
 + case DVACT_QUIESCE:
 + drm_quiesce(dev);
 + break;
 + case DVACT_WAKEUP:
 + drm_wakeup(dev);
 + break;
 + }
 +
 + return (0);
 +}
 +
  struct cfattach drm_ca = {
   sizeof(struct drm_device), drm_probe, drm_attach,
 - drm_detach
 + drm_detach, drm_activate
  };
  
  struct cfdriver drm_cd = {
 @@ -540,20 +583,13 @@ done:
   return (retcode);
  }
  
 -/* drmioctl is called whenever a process performs an ioctl on /dev/drm.
 - */
  int
 -drmioctl(dev_t kdev, u_long cmd, caddr_t data, int flags, 
 -struct proc *p)
 +drm_do_ioctl(struct drm_device *dev, int minor, u_long cmd, caddr_t data)
  {
 - struct drm_device *dev = drm_get_device_from_kdev(kdev);
   struct drm_file *file_priv;
  
 - if (dev == NULL)
 - return ENODEV;
 -
   DRM_LOCK();
 - file_priv = drm_find_file_by_minor(dev, minor(kdev));
 + file_priv = drm_find_file_by_minor(dev, minor);
   DRM_UNLOCK();
   if (file_priv == NULL) {
   DRM_ERROR("can't find authenticator\n");
 @@ -715,6 +751,34 @@ drmioctl(dev_t kdev, u_long cmd, caddr_t
   return (EINVAL);
  }
  
 +/* drmioctl is called whenever a process performs an ioctl on /dev/drm.
 + */
 +int
 +drmioctl(dev_t kdev, u_long cmd, caddr_t data, int flags, struct proc *p)
 +{
 + struct drm_device *dev = drm_get_device_from_kdev(kdev);
 + int error;
 +
 + if (dev == NULL)
 + return ENODEV;
 +
 + mtx_enter(&dev->quiesce_mtx);
 + while (dev->quiesce)
 + msleep(&dev->quiesce, &dev->quiesce_mtx, PZERO, "drmioc", 0);
 + dev->quiesce_count++;
 + mtx_leave(&dev->quiesce_mtx);
 +
 + error = drm_do_ioctl(dev, minor(kdev), 

Re: Somewhat important ACPI diff

2013-05-20 Thread Timo Myyrä
Mark Kettenis mark.kette...@xs4all.nl writes:

 As diagnosed by some other people (armani@, jcs@?) a while ago, our
 code to deal with IndexField() operators in our AML interpreter is
 quite broken.  It works for fields that are less than a byte in size,
 but anything else is pretty much completely busted.  I wouldn't be
 surprised if this is the cause of many ACPI-related problems.  One
 that comes to mind are the ridiculous temperatures reported by
 acpitz(4) that have been plaguing us since basically forever.

 I don't have a lot of machines that have AML with IndexField()
 operators.  I've verified that those return the correct values, but
 this definitely could use further testing on a wide variety of
 i386/amd64 hardware please.
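
As a worked illustration of the arithmetic the diff below introduces, here is
a standalone sketch that only computes how a field gets split into
index/data register accesses; it touches no hardware. The names struct chunk
and split_index_field() are hypothetical, and the function is modelled on the
loop in aml_rwindexfield() rather than copied from it.

```c
#include <assert.h>

/* One access to the data register; hypothetical type for illustration. */
struct chunk {
	int indexval;	/* byte offset written to the index register */
	int bpos;	/* first bit used within the data register */
	int len;	/* number of bits transferred */
};

/*
 * Split a field of blen bits starting at bit position bpos into accesses
 * of sz bytes each (sz must be a power of two).  Returns the number of
 * chunks stored in out, or -1 if max is too small.
 */
int
split_index_field(int bpos, int blen, int sz, struct chunk *out, int max)
{
	int indexval, len, n = 0;

	assert(sz > 0 && (sz & (sz - 1)) == 0);

	indexval = (bpos >> 3) & ~(sz - 1);	/* byte offset, aligned down to sz */
	bpos -= indexval << 3;			/* bit offset inside first access */

	while (blen > 0) {
		len = (sz << 3) - bpos;		/* bits left in this access */
		if (len > blen)
			len = blen;
		if (n == max)
			return -1;
		out[n].indexval = indexval;
		out[n].bpos = bpos;
		out[n].len = len;
		n++;

		indexval += sz;
		blen -= len;
		bpos = 0;			/* later accesses start at bit 0 */
	}
	return n;
}
```

For a 16-bit field at bit position 12 with byte access (sz = 1), this yields
three accesses: index 1 with 4 bits starting at bit 4, index 2 with 8 bits,
and index 3 with 4 bits. With dword access (sz = 4) the same field fits in a
single access at index 0, starting at bit 12.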


 Index: dsdt.c
 ===
 RCS file: /cvs/src/sys/dev/acpi/dsdt.c,v
 retrieving revision 1.200
 diff -u -p -r1.200 dsdt.c
 --- dsdt.c 10 Apr 2013 01:35:55 -  1.200
 +++ dsdt.c 20 May 2013 16:55:27 -
 @@ -2221,8 +2221,9 @@ aml_evalhid(struct aml_node *node, struc
   return (0);
  }
  
 -void aml_rwfield(struct aml_value *, int, int, struct aml_value *, int);
  void aml_rwgas(struct aml_value *, int, int, struct aml_value *, int, int);
 +void aml_rwindexfield(struct aml_value *, struct aml_value *val, int);
 +void aml_rwfield(struct aml_value *, int, int, struct aml_value *, int);
  
  /* Get PCI address for opregion objects */
  int
 @@ -2341,6 +2342,81 @@ aml_rwgas(struct aml_value *rgn, int bpo
   aml_freevalue(tmp);
  }
  
 +
 +void
 +aml_rwindexfield(struct aml_value *fld, struct aml_value *val, int mode)
 +{
 + struct aml_value tmp, *ref1, *ref2;
 + void *tbit, *vbit;
 + int vpos, bpos, blen;
 + int indexval;
 + int sz, len;
 +
 + ref2 = fld->v_field.ref2;
 + ref1 = fld->v_field.ref1;
 + bpos = fld->v_field.bitpos;
 + blen = fld->v_field.bitlen;
 +
 + memset(&tmp, 0, sizeof(tmp));
 + tmp.refcnt = 99;
 +
 + /* Get field access size */
 + switch (AML_FIELD_ACCESS(fld->v_field.flags)) {
 + case AML_FIELD_WORDACC:
 + sz = 2;
 + break;
 + case AML_FIELD_DWORDACC:
 + sz = 4;
 + break;
 + case AML_FIELD_QWORDACC:
 + sz = 8;
 + break;
 + default:
 + sz = 1;
 + break;
 + }
 +
 + if (blen > aml_intlen) {
 + if (mode == ACPI_IOREAD) {
 + /* Read from a large field: create buffer */
 + _aml_setvalue(val, AML_OBJTYPE_BUFFER,
 + (blen + 7) >> 3, 0);
 + }
 + vbit = val->v_buffer;
 + } else {
 + if (mode == ACPI_IOREAD) {
 + /* Read from a short field: initialize integer */
 + _aml_setvalue(val, AML_OBJTYPE_INTEGER, 0, 0);
 + }
 + vbit = &val->v_integer;
 + }
 + tbit = &tmp.v_integer;
 + vpos = 0;
 +
 + indexval = (bpos >> 3) & ~(sz - 1);
 + bpos = bpos - (indexval << 3);
 +
 + while (blen > 0) {
 + len = min(blen, (sz << 3) - bpos);
 +
 + /* Write index register */
 + _aml_setvalue(&tmp, AML_OBJTYPE_INTEGER, indexval, 0);
 + aml_rwfield(ref2, 0, aml_intlen, &tmp, ACPI_IOWRITE);
 + indexval += sz;
 +
 + /* Read/write data register */
 + _aml_setvalue(&tmp, AML_OBJTYPE_INTEGER, 0, 0);
 + if (mode == ACPI_IOWRITE)
 + aml_bufcpy(tbit, 0, vbit, vpos, len);
 + aml_rwfield(ref1, bpos, len, &tmp, mode);
 + if (mode == ACPI_IOREAD)
 + aml_bufcpy(vbit, vpos, tbit, 0, len);
 + vpos += len;
 + blen -= len;
 + bpos = 0;
 + }
 +}
 +
  void
  aml_rwfield(struct aml_value *fld, int bpos, int blen, struct aml_value *val,
  int mode)
 @@ -2356,10 +2432,7 @@ aml_rwfield(struct aml_value *fld, int b
   memset(&tmp, 0, sizeof(tmp));
   aml_addref(&tmp, "fld.write");
   if (fld->v_field.type == AMLOP_INDEXFIELD) {
 - _aml_setvalue(&tmp, AML_OBJTYPE_INTEGER, fld->v_field.ref3, 0);
 - aml_rwfield(ref2, 0, aml_intlen, &tmp, ACPI_IOWRITE);
 - aml_rwfield(ref1, fld->v_field.bitpos, fld->v_field.bitlen,
 - val, mode);
 + aml_rwindexfield(fld, val, mode);
   } else if (fld->v_field.type == AMLOP_BANKFIELD) {
   _aml_setvalue(&tmp, AML_OBJTYPE_INTEGER, fld->v_field.ref3, 0);
   aml_rwfield(ref2, 0, aml_intlen, &tmp, ACPI_IOWRITE);
 @@ -2414,10 +2487,6 @@ aml_createfield(struct aml_value *field,
   opcode == AMLOP_BANKFIELD) ?
   AML_OBJTYPE_FIELDUNIT :
   AML_OBJTYPE_BUFFERFIELD;
 - if (opcode == AMLOP_INDEXFIELD) {
 - indexval = bpos >> 3;
 - bpos &= 7;
 - }
  
   if (field->type == AML_OBJTYPE_BUFFERFIELD &&
   data->type != 

Add missing comment to Sysctl.h

2012-04-07 Thread Timo Myyrä
sysctl.h has no comment for the p_comm field of struct kinfo_proc.
Add one, since all the other fields seem to be commented.

Timo

Index: sysctl.h
===
RCS file: /cvs/src/sys/sys/sysctl.h,v
retrieving revision 1.121
diff -u -u -r1.121 sysctl.h
--- sysctl.h 23 Mar 2012 15:51:26 -  1.121
+++ sysctl.h 7 Apr 2012 09:14:32 -
@@ -388,7 +388,7 @@
u_int16_t p_xstat;  /* U_SHORT: Exit status for
wait; also stop signal. */
u_int16_t p_acflag; /* U_SHORT: Accounting flags. */

-   char p_comm[KI_MAXCOMLEN];
+   char p_comm[KI_MAXCOMLEN];   /* original command name */

char p_wmesg[KI_WMESGLEN];   /* wchan message */
u_int64_t p_wchan;  /* PTR: sleep address. */