Re: Improve the accuracy of the TSC frequency calibration - Updated Patch

2017-08-23 Thread Adam Steen
On Thu, Aug 24, 2017 at 2:35 AM, Mike Larkin  wrote:
> On Wed, Aug 23, 2017 at 09:29:15PM +0800, Adam Steen wrote:
>> On Thu, Aug 17, 2017 at 12:19:28PM +0800, Adam Steen wrote:
>> > On Tue, Aug 8, 2017 at 10:12 PM, Mike Belopuhov  wrote:
>> > > On Tue, Aug 08, 2017 at 08:18 +0800, Adam Steen wrote:
>> > >> On Mon, Jul 31, 2017 at 3:58 PM, Mike Belopuhov  
>> > >> wrote:
>> > >> > On Mon, Jul 31, 2017 at 09:48 +0800, Adam Steen wrote:
>> > >> >> Ted Unangst  wrote:
>> > >> >> > we don't currently export this info, but we could add some 
>> > >> >> > sysctls. there's
>> > >> >> > some cpufeatures stuff there, but generally stuff isn't exported 
>> > >> >> > until
>> > >> >> > somebody finds a use for it... it shouldn't be too hard to add 
>> > >> >> > something to
>> > >> >> > amd64/machdep.c sysctl if you're interested.
>> > >> >>
>> > >> >> I am interested, as i need the info, i will look into it and 
>> > >> >> hopefully
>> > >> >> come back with a patch.
>> > >> >
>> > >> > This is a bad idea because TSC as the time source is only usable
>> > >> > by OpenBSD on Skylake and Kaby Lake CPUs since they encode the TSC
>> > >> > frequency in the CPUID. All older CPUs have their TSCs measured
>> > >> > against the PIT. Currently the measurement done by the kernel isn't
>> > >> > very precise and if TSC is selected as a timecounter, the machine
>> > >> > would be gaining time on a pace that cannot be corrected by our NTP
>> > >> > daemon. (IIRC, about an hour a day on my Haswell running with NTP).
>> > >> >
>> > >> > To be able to use TSC as a timecounter source on OpenBSD or Solo5
>> > >> > you'd have to improve the in-kernel measurement of the TSC frequency
>> > >> > first. I've tried to perform 10 measurements and take an average and
>> > >> > it does improve accuracy, however I believe we need to poach another
>> > >> > bit from Linux and re-calibrate TSC via HPET:
>> > >> >
>> > >> >  
>> > >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
>> > >> >
>> > >> > I think this is the most sane thing we can do. Here's a complete
>> > >> > procedure that Linux kernel undertakes:
>> > >> >
>> > >> >  
>> > >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
>> > >> >
>> > >> > Regards,
>> > >> > Mike
>> > >>
>> > >> Hi Mike/All
>> > >>
>> > >> I would like to improve the accuracy of TSC frequency calibration as
>> > >> Mike B. describes above.
>> > >>
>> > >> I initially thought the calibration would take place at line 470 of
>> > >> amd64/identcpu.c
>> > >> (https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64/amd64/identcpu.c?annotate=1.87)
>> > >>
>> > >
>> > > Indeed, it cannot happen there simply because you don't know at
>> > > that point whether or not HPET actually exists.
>> > >
>> > >> But I looked into using the acpihpet directly but it is never exposed
>> > >> outside of acpihpet.c.
>> > >>
>> > >
>> > > And it shouldn't be.
>> > >
>> > >> Could someone point me to were if would be appropriate to complete
>> > >> this calibration and how to use the acpihpet?
>> > >
>> > > The way I envision this is a multi-step approach:
>> > >
>> > > 1) TSC frequency is approximated with the PIT (possibly performing
>> > > multiple measurements and averaging them out; also keep in mind that
>> > > doing it 8 times means you can shift the sum right by 3 instead of
>> > > using actual integer division).  This is what should happen around
>> > > the line 470 of identcpu.c
>> > >
>> > > 2) A function can be provided by identcpu.c to further adjust the
>> > > TSC frequency once acpitimer(4) (this is a PM timer) and acpihpet(4)
>> > > (or any other timer for that matter) are attached.
>> > >
>> > > 3) Once acpitimer(4) or acpihpet(4) or any other timecounter source
>> > > are attached and are verified to be operating correctly, they can
>> > > perform TSC re-calibration and update the TSC frequency with their
>> > > measurements.  The idea here is that the function (or functions) that
>> > > facilitate this must abstract enough logic so that you don't have to
>> > > duplicate it in the acpitimer or acpihpet themselves.
>> > >
>> > >> (Will it need to be
>> > >> exposed like i8254_delay/delay_func/delay in machdep.c and cpu.h)
>> > >>
>> > >
>> > > No it won't.
>> > >
>> > >> Lastly should the calibration be done using both delay(i8254 pit) and
>> > >> hpet timers similar to Linux described above or just using the hpet?
>> > >>
>> > >
>> > > Well, that's what I was arguing for.  As I said in my initial mail
>> > > on misc (not quoted here), the TSC must be calibrated using separate
>> > > known clocks sources.
>> >
>> > Hi Mike
>> >
>> > Please see the below diff to improve the accuracy of the TSC
>> > frequency. It is model after the linux calibration you linked to
>> > earlier. https://marc.info/?l=openbsd-misc=150148792804747=2
>> >
>> > I feel like i don't know enough about the 

memory leak in sdmmc_io_scan()

2017-08-23 Thread Jonathan Gray
Coverity CID 1453042.

Index: sdmmc_io.c
===
RCS file: /cvs/src/sys/dev/sdmmc/sdmmc_io.c,v
retrieving revision 1.28
diff -u -p -U4 -r1.28 sdmmc_io.c
--- sdmmc_io.c  6 Apr 2017 17:00:53 -   1.28
+++ sdmmc_io.c  24 Aug 2017 04:14:49 -
@@ -136,9 +136,9 @@ sdmmc_io_scan(struct sdmmc_softc *sc)
sf0 = sdmmc_function_alloc(sc);
sf0->number = 0;
if (sdmmc_set_relative_addr(sc, sf0) != 0) {
printf("%s: can't set I/O RCA\n", DEVNAME(sc));
-   SET(sf0->flags, SFF_ERROR);
+   sdmmc_function_free(sf0);
return;
}
sc->sc_fn0 = sf0;
SIMPLEQ_INSERT_TAIL(>sf_head, sf0, sf_list);



freezero(NULL, 0)

2017-08-23 Thread Damien Miller
Hi,

memset(NULL, 0, 0) is (strictly speaking) undefined behaviour, but
there's no reason that freezero(3) needs to follow suit.

This mentions that freezero(NULL, 0) is valid in the manpage, so that
anyone who copies this API should get it right too.

ok?

Index: malloc.3
===
RCS file: /cvs/src/lib/libc/stdlib/malloc.3,v
retrieving revision 1.115
diff -u -p -r1.115 malloc.3
--- malloc.315 May 2017 18:05:34 -  1.115
+++ malloc.324 Aug 2017 01:31:52 -
@@ -210,6 +210,12 @@ argument must be equal or smaller than t
 that returned
 .Fa ptr .
 .Fn freezero
+may be called with a
+.Dv NULL
+pointer argument if the
+.Fa size
+argument is zero.
+.Fn freezero
 guarantees the memory range starting at
 .Fa ptr
 with length



Re: ospfd: add IMSG_IFADDRADD to deal with "sh /etc/netstart if"

2017-08-23 Thread Remi Locherer
On Wed, Aug 23, 2017 at 12:22:03AM +0200, Florian Riehm wrote:
> On 08/21/17 18:57, Remi Locherer wrote:
> > On Mon, Jul 24, 2017 at 04:59:46PM +0200, Remi Locherer wrote:
> > > On Fri, Jul 21, 2017 at 06:24:06PM +0200, Remi Locherer wrote:
> > > > On Fri, Jul 21, 2017 at 02:45:03PM +0200, Florian Riehm wrote:
> > > > > On 06/25/17 23:47, Remi Locherer wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > ospfd does not react nicely when running "sh /etc/netstart if".
> > > > > > 
> > > > > > This is because adding the same address again do an interface 
> > > > > > results
> > > > > > in RTM_DELADDR and RTM_NEWADDR. ospfd handles the former but the 
> > > > > > later.
> > > > > > If this happens ospfd says "interface vether0:192.168.250.1 gone".
> > > > > > Adjacencies on that interface are down and ospfd can not recover.
> > > > > > 
> > > > > > The below patch adds IMSG_IFADDRADD to deal with that. With it ospfd
> > > > > > logs the following after "ifconfig vether0 192.168.250.1/24" (same 
> > > > > > address
> > > > > > as active before):
> > > > > > 
> > > > > 
> > > > > Hi Remi,
> > > > > 
> > > > > thanks for your report and your patch.
> > > > > I think it is the right approach, but unfortunately it doesn't work 
> > > > > in my setup.
> > > > > If I run 'sh /etc/netstart vio1' vio1 goes down and stays down.
> > > > > Please have a look to my config / logs. What is the differece between 
> > > > > your and
> > > > > my test?
> > > > 
> > > > I tested with an interface connected to stub network. Your output shows 
> > > > an
> > > > interface connected to a transit network. In my test setup I did not 
> > > > get the
> > > > error message: "if_join_group: error IP_ADD_MEMBERSHIP"
> > > > 
> > > > I'll look into it and try to fix this.
> > > 
> > > I could reproduce the behaviour you see with my patch when testing with
> > > vmm and vio interfaces. It looks like in the IFADDRDEL case the interface
> > > can not leave the multicast group.
> > > 
> > > I do not see this when testing with vether, pair or vlan (over ix)
> > > interfaces. Could this be a bug with multicast handling in vio?
> > 
> > 
> > Today I did a test with an unpatched ospfd and a vio interface in vmm.
> > 
> > I started ospfd and waited till adjacency was up. Then I did
> > "ifconfig vio0 192.168.250.101/24" (same ip as set before).
> > 
> > The debug output from ospfd:
> > 
> > [...]
> > if_leave_group: error IP_DROP_MEMBERSHIP, interface vio0 address 224.0.0.5: 
> > Can't assign requested address
> > if_leave_group: error IP_DROP_MEMBERSHIP, interface vio0 address 224.0.0.6: 
> > Can't assign requested address
> > orig_rtr_lsa: area 0.0.0.0
> > orig_rtr_lsa: stub net, interface vio0
> > orig_rtr_lsa: stub net, interface vether0
> > if_act_elect: interface vio0 old dr 192.168.250.1 new dr 192.168.250.101, 
> > old bdr 192.168.250.101 new bdr none
> > orig_rtr_lsa: area 0.0.0.0
> > [...]
> > 
> > Doing the same with a pair or vether interface does not produce the message
> > "if_leave_group: error IP_DROP_MEMBERSHIP ".
> > 
> > My conclusion of this is that the error is problem with the vio driver and
> > not with my patch for ospfd.
> > 
> > Could we proceed with the proposed ospfd patch and attack the vio problem
> > separately?
> > 
> 
> Actually I would be fine with that approach after I am sure it is a vio(4)
> problem.
> 
> I think vether(4) and pair(4) are a bit too exotic to prove your patch works 
> ;)
> Tomorrow I will test your change with em(4).

I see. I also tested with vlan on top of trunk on top of ix. Looking forward
to your test results with em.

> 
> My problem is not the multicast error message. I just can't 'see' your
> fix, since my interfaces stay down after netstart.
 
You are talking about the interface state in ospfd (ospfctl show int) and
not the state displayed by ifconfig, right?

> Does your interface come up again, if the multicast error message occurs?

No. When this multicast error occurs ospfd can't join the multicast groups
anymore. But the interface as displayed with ifconfig is up and has link.

I noticed that the vio0 interface has the flag "ALLMULTI" set in addition
to "MULTICASST". The interfaces I tested with do not have "ALLMULTI" set.

Remi



iked: Do not accept superfluous arguments

2017-08-23 Thread Klemens Nanni
Calling `iked reload' when I meant `ikectl reload' showed that iked
happily returned 0 and and fired up another daemon.

Feedback?

Index: iked.c
===
RCS file: /cvs/src/sbin/iked/iked.c,v
retrieving revision 1.34
diff -u -p -r1.34 iked.c
--- iked.c  23 Mar 2017 05:29:48 -  1.34
+++ iked.c  23 Aug 2017 20:39:12 -
@@ -109,6 +109,11 @@ main(int argc, char *argv[])
usage();
}
}
+   argc -= optind;
+   argv += optind;
+
+   if (argc)
+   usage();
 
if ((env = calloc(1, sizeof(*env))) == NULL)
fatal("calloc: env");



Re: Improve the accuracy of the TSC frequency calibration - Updated Patch

2017-08-23 Thread Mike Larkin
On Wed, Aug 23, 2017 at 09:29:15PM +0800, Adam Steen wrote:
> On Thu, Aug 17, 2017 at 12:19:28PM +0800, Adam Steen wrote:
> > On Tue, Aug 8, 2017 at 10:12 PM, Mike Belopuhov  wrote:
> > > On Tue, Aug 08, 2017 at 08:18 +0800, Adam Steen wrote:
> > >> On Mon, Jul 31, 2017 at 3:58 PM, Mike Belopuhov  
> > >> wrote:
> > >> > On Mon, Jul 31, 2017 at 09:48 +0800, Adam Steen wrote:
> > >> >> Ted Unangst  wrote:
> > >> >> > we don't currently export this info, but we could add some sysctls. 
> > >> >> > there's
> > >> >> > some cpufeatures stuff there, but generally stuff isn't exported 
> > >> >> > until
> > >> >> > somebody finds a use for it... it shouldn't be too hard to add 
> > >> >> > something to
> > >> >> > amd64/machdep.c sysctl if you're interested.
> > >> >>
> > >> >> I am interested, as i need the info, i will look into it and hopefully
> > >> >> come back with a patch.
> > >> >
> > >> > This is a bad idea because TSC as the time source is only usable
> > >> > by OpenBSD on Skylake and Kaby Lake CPUs since they encode the TSC
> > >> > frequency in the CPUID. All older CPUs have their TSCs measured
> > >> > against the PIT. Currently the measurement done by the kernel isn't
> > >> > very precise and if TSC is selected as a timecounter, the machine
> > >> > would be gaining time on a pace that cannot be corrected by our NTP
> > >> > daemon. (IIRC, about an hour a day on my Haswell running with NTP).
> > >> >
> > >> > To be able to use TSC as a timecounter source on OpenBSD or Solo5
> > >> > you'd have to improve the in-kernel measurement of the TSC frequency
> > >> > first. I've tried to perform 10 measurements and take an average and
> > >> > it does improve accuracy, however I believe we need to poach another
> > >> > bit from Linux and re-calibrate TSC via HPET:
> > >> >
> > >> >  
> > >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
> > >> >
> > >> > I think this is the most sane thing we can do. Here's a complete
> > >> > procedure that Linux kernel undertakes:
> > >> >
> > >> >  
> > >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
> > >> >
> > >> > Regards,
> > >> > Mike
> > >>
> > >> Hi Mike/All
> > >>
> > >> I would like to improve the accuracy of TSC frequency calibration as
> > >> Mike B. describes above.
> > >>
> > >> I initially thought the calibration would take place at line 470 of
> > >> amd64/identcpu.c
> > >> (https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64/amd64/identcpu.c?annotate=1.87)
> > >>
> > >
> > > Indeed, it cannot happen there simply because you don't know at
> > > that point whether or not HPET actually exists.
> > >
> > >> But I looked into using the acpihpet directly but it is never exposed
> > >> outside of acpihpet.c.
> > >>
> > >
> > > And it shouldn't be.
> > >
> > >> Could someone point me to were if would be appropriate to complete
> > >> this calibration and how to use the acpihpet?
> > >
> > > The way I envision this is a multi-step approach:
> > >
> > > 1) TSC frequency is approximated with the PIT (possibly performing
> > > multiple measurements and averaging them out; also keep in mind that
> > > doing it 8 times means you can shift the sum right by 3 instead of
> > > using actual integer division).  This is what should happen around
> > > the line 470 of identcpu.c
> > >
> > > 2) A function can be provided by identcpu.c to further adjust the
> > > TSC frequency once acpitimer(4) (this is a PM timer) and acpihpet(4)
> > > (or any other timer for that matter) are attached.
> > >
> > > 3) Once acpitimer(4) or acpihpet(4) or any other timecounter source
> > > are attached and are verified to be operating correctly, they can
> > > perform TSC re-calibration and update the TSC frequency with their
> > > measurements.  The idea here is that the function (or functions) that
> > > facilitate this must abstract enough logic so that you don't have to
> > > duplicate it in the acpitimer or acpihpet themselves.
> > >
> > >> (Will it need to be
> > >> exposed like i8254_delay/delay_func/delay in machdep.c and cpu.h)
> > >>
> > >
> > > No it won't.
> > >
> > >> Lastly should the calibration be done using both delay(i8254 pit) and
> > >> hpet timers similar to Linux described above or just using the hpet?
> > >>
> > >
> > > Well, that's what I was arguing for.  As I said in my initial mail
> > > on misc (not quoted here), the TSC must be calibrated using separate
> > > known clocks sources.
> > 
> > Hi Mike
> > 
> > Please see the below diff to improve the accuracy of the TSC
> > frequency. It is model after the linux calibration you linked to
> > earlier. https://marc.info/?l=openbsd-misc=150148792804747=2
> > 
> > I feel like i don't know enough about the kernel internals, the
> > consistency of the results across reboots are not as close as i would
> > have liked, i feel the call to do the actual calibration should be
> > later in the boot cycle, 

Re: Improve the accuracy of the TSC frequency calibration - Updated Patch

2017-08-23 Thread Adam Steen
On Thu, Aug 17, 2017 at 12:19:28PM +0800, Adam Steen wrote:
> On Tue, Aug 8, 2017 at 10:12 PM, Mike Belopuhov  wrote:
> > On Tue, Aug 08, 2017 at 08:18 +0800, Adam Steen wrote:
> >> On Mon, Jul 31, 2017 at 3:58 PM, Mike Belopuhov  wrote:
> >> > On Mon, Jul 31, 2017 at 09:48 +0800, Adam Steen wrote:
> >> >> Ted Unangst  wrote:
> >> >> > we don't currently export this info, but we could add some sysctls. 
> >> >> > there's
> >> >> > some cpufeatures stuff there, but generally stuff isn't exported until
> >> >> > somebody finds a use for it... it shouldn't be too hard to add 
> >> >> > something to
> >> >> > amd64/machdep.c sysctl if you're interested.
> >> >>
> >> >> I am interested, as i need the info, i will look into it and hopefully
> >> >> come back with a patch.
> >> >
> >> > This is a bad idea because TSC as the time source is only usable
> >> > by OpenBSD on Skylake and Kaby Lake CPUs since they encode the TSC
> >> > frequency in the CPUID. All older CPUs have their TSCs measured
> >> > against the PIT. Currently the measurement done by the kernel isn't
> >> > very precise and if TSC is selected as a timecounter, the machine
> >> > would be gaining time on a pace that cannot be corrected by our NTP
> >> > daemon. (IIRC, about an hour a day on my Haswell running with NTP).
> >> >
> >> > To be able to use TSC as a timecounter source on OpenBSD or Solo5
> >> > you'd have to improve the in-kernel measurement of the TSC frequency
> >> > first. I've tried to perform 10 measurements and take an average and
> >> > it does improve accuracy, however I believe we need to poach another
> >> > bit from Linux and re-calibrate TSC via HPET:
> >> >
> >> >  
> >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L409
> >> >
> >> > I think this is the most sane thing we can do. Here's a complete
> >> > procedure that Linux kernel undertakes:
> >> >
> >> >  
> >> > http://elixir.free-electrons.com/linux/v4.12.4/source/arch/x86/kernel/tsc.c#L751
> >> >
> >> > Regards,
> >> > Mike
> >>
> >> Hi Mike/All
> >>
> >> I would like to improve the accuracy of TSC frequency calibration as
> >> Mike B. describes above.
> >>
> >> I initially thought the calibration would take place at line 470 of
> >> amd64/identcpu.c
> >> (https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/arch/amd64/amd64/identcpu.c?annotate=1.87)
> >>
> >
> > Indeed, it cannot happen there simply because you don't know at
> > that point whether or not HPET actually exists.
> >
> >> But I looked into using the acpihpet directly but it is never exposed
> >> outside of acpihpet.c.
> >>
> >
> > And it shouldn't be.
> >
> >> Could someone point me to were if would be appropriate to complete
> >> this calibration and how to use the acpihpet?
> >
> > The way I envision this is a multi-step approach:
> >
> > 1) TSC frequency is approximated with the PIT (possibly performing
> > multiple measurements and averaging them out; also keep in mind that
> > doing it 8 times means you can shift the sum right by 3 instead of
> > using actual integer division).  This is what should happen around
> > the line 470 of identcpu.c
> >
> > 2) A function can be provided by identcpu.c to further adjust the
> > TSC frequency once acpitimer(4) (this is a PM timer) and acpihpet(4)
> > (or any other timer for that matter) are attached.
> >
> > 3) Once acpitimer(4) or acpihpet(4) or any other timecounter source
> > are attached and are verified to be operating correctly, they can
> > perform TSC re-calibration and update the TSC frequency with their
> > measurements.  The idea here is that the function (or functions) that
> > facilitate this must abstract enough logic so that you don't have to
> > duplicate it in the acpitimer or acpihpet themselves.
> >
> >> (Will it need to be
> >> exposed like i8254_delay/delay_func/delay in machdep.c and cpu.h)
> >>
> >
> > No it won't.
> >
> >> Lastly should the calibration be done using both delay(i8254 pit) and
> >> hpet timers similar to Linux described above or just using the hpet?
> >>
> >
> > Well, that's what I was arguing for.  As I said in my initial mail
> > on misc (not quoted here), the TSC must be calibrated using separate
> > known clocks sources.
> 
> Hi Mike
> 
> Please see the below diff to improve the accuracy of the TSC
> frequency. It is model after the linux calibration you linked to
> earlier. https://marc.info/?l=openbsd-misc=150148792804747=2
> 
> I feel like i don't know enough about the kernel internals, the
> consistency of the results across reboots are not as close as i would
> have liked, i feel the call to do the actual calibration should be
> later in the boot cycle, when things have calmed down a little, but
> couldn't figure out the best way of doing this.
> 
> please bear with me i haven't been programming c for long, but the
> only way to get things done is to do it your self.
> 
> Cheers
> Adam



Thank you Mike on the feedback on the last patch, 

Re: allow iwm_stop to sleep

2017-08-23 Thread Stefan Sperling
Has anybody tried this diff?

On Mon, Aug 14, 2017 at 09:07:37AM +0200, Stefan Sperling wrote:
> This diff makes iwm_stop() always run in a process context.
> I want iwm_stop() to be able to sleep so that it can wait for asynchronous
> driver tasks, and perhaps even wait for firmware commands, in the future.
> 
> If the interrupt handler detects a fatal condition, instead of calling
> iwm_stop() directly, defer to the init task. The init task looks at flags
> to understand what happened and restarts or stops the device as appropriate.
> 
> I found that toggling the RF kill switch can trigger a fatal firmware error.
> Hence I am letting the interrupt handler check RF kill before checking for
> fatal firmware error. Provides better error reporting when the kill switch
> is used.
> 
> During suspend, bring the device down during the QUIESCE stage which
> is allowed to sleep.
> 
> dhclient/down/scan in a loop still works (as far as it always has, with
> various errors reported in dmesg). Suspend/resume still works.
> 
> ok?
> 
> Index: if_iwm.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_iwm.c,v
> retrieving revision 1.211
> diff -u -p -r1.211 if_iwm.c
> --- if_iwm.c  13 Aug 2017 18:08:03 -  1.211
> +++ if_iwm.c  14 Aug 2017 06:44:59 -
> @@ -6517,6 +6517,7 @@ iwm_stop(struct ifnet *ifp)
>   sc->sc_flags &= ~IWM_FLAG_BINDING_ACTIVE;
>   sc->sc_flags &= ~IWM_FLAG_STA_ACTIVE;
>   sc->sc_flags &= ~IWM_FLAG_TE_ACTIVE;
> + sc->sc_flags &= ~IWM_FLAG_HW_ERR;
>  
>   sc->sc_newstate(ic, IEEE80211_S_INIT, -1);
>  
> @@ -7169,7 +7170,6 @@ int
>  iwm_intr(void *arg)
>  {
>   struct iwm_softc *sc = arg;
> - struct ifnet *ifp = IC2IFP(>sc_ic);
>   int handled = 0;
>   int r1, r2, rv = 0;
>   int isperiodic = 0;
> @@ -7218,6 +7218,15 @@ iwm_intr(void *arg)
>   /* ignored */
>   handled |= (r1 & (IWM_CSR_INT_BIT_ALIVE /*| IWM_CSR_INT_BIT_SCD*/));
>  
> + if (r1 & IWM_CSR_INT_BIT_RF_KILL) {
> + handled |= IWM_CSR_INT_BIT_RF_KILL;
> + if (iwm_check_rfkill(sc)) {
> + task_add(systq, >init_task);
> + rv = 1;
> + goto out;
> + }
> + }
> +
>   if (r1 & IWM_CSR_INT_BIT_SW_ERR) {
>  #ifdef IWM_DEBUG
>   int i;
> @@ -7238,7 +7247,6 @@ iwm_intr(void *arg)
>  #endif
>  
>   printf("%s: fatal firmware error\n", DEVNAME(sc));
> - iwm_stop(ifp);
>   task_add(systq, >init_task);
>   rv = 1;
>   goto out;
> @@ -7248,7 +7256,8 @@ iwm_intr(void *arg)
>   if (r1 & IWM_CSR_INT_BIT_HW_ERR) {
>   handled |= IWM_CSR_INT_BIT_HW_ERR;
>   printf("%s: hardware error, stopping device \n", DEVNAME(sc));
> - iwm_stop(ifp);
> + sc->sc_flags |= IWM_FLAG_HW_ERR;
> + task_add(systq, >init_task);
>   rv = 1;
>   goto out;
>   }
> @@ -7262,13 +7271,6 @@ iwm_intr(void *arg)
>   wakeup(>sc_fw);
>   }
>  
> - if (r1 & IWM_CSR_INT_BIT_RF_KILL) {
> - handled |= IWM_CSR_INT_BIT_RF_KILL;
> - if (iwm_check_rfkill(sc) && (ifp->if_flags & IFF_UP)) {
> - iwm_stop(ifp);
> - }
> - }
> -
>   if (r1 & IWM_CSR_INT_BIT_RX_PERIODIC) {
>   handled |= IWM_CSR_INT_BIT_RX_PERIODIC;
>   IWM_WRITE(sc, IWM_CSR_INT, IWM_CSR_INT_BIT_RX_PERIODIC);
> @@ -7739,23 +7741,27 @@ iwm_init_task(void *arg1)
>  {
>   struct iwm_softc *sc = arg1;
>   struct ifnet *ifp = >sc_ic.ic_if;
> - int s;
> + int s = splnet();
>   int generation = sc->sc_generation;
> + int fatal = (sc->sc_flags & (IWM_FLAG_HW_ERR | IWM_FLAG_RFKILL));
>  
>   rw_enter_write(>ioctl_rwl);
>   if (generation != sc->sc_generation) {
>   rw_exit(>ioctl_rwl);
> + splx(s);
>   return;
>   }
> - s = splnet();
>  
>   if (ifp->if_flags & IFF_RUNNING)
>   iwm_stop(ifp);
> - if ((ifp->if_flags & (IFF_UP | IFF_RUNNING)) == IFF_UP)
> + else if (sc->sc_flags & IWM_FLAG_HW_ERR)
> + sc->sc_flags &= ~IWM_FLAG_HW_ERR;
> +
> + if (!fatal && (ifp->if_flags & (IFF_UP | IFF_RUNNING)) == IFF_UP)
>   iwm_init(ifp);
>  
> - splx(s);
>   rw_exit(>ioctl_rwl);
> + splx(s);
>  }
>  
>  int
> @@ -7778,9 +7784,12 @@ iwm_activate(struct device *self, int ac
>   int err = 0;
>  
>   switch (act) {
> - case DVACT_SUSPEND:
> - if (ifp->if_flags & IFF_RUNNING)
> + case DVACT_QUIESCE:
> + if (ifp->if_flags & IFF_RUNNING) {
> + rw_enter_write(>ioctl_rwl);
>   iwm_stop(ifp);
> + rw_exit(>ioctl_rwl);
> + }
>   break;
>   case DVACT_RESUME:
>   err = iwm_resume(sc);
> Index: if_iwmvar.h
> 

Re: [PATCH] Fix likely typo in if_iwm.c

2017-08-23 Thread Stefan Sperling
On Wed, Aug 23, 2017 at 09:57:46PM +0900, Bryan Linton wrote:
> I spotted what looks to be a typo in if_iwm.c given the context in
> which it occurs.  Patch attached.  Apologies if I'm mistaken.

Committed, thanks.

> 
> -- 
> Bryan
> 
> Index: if_iwm.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_iwm.c,v
> retrieving revision 1.211
> diff -u -r1.211 if_iwm.c
> --- if_iwm.c  13 Aug 2017 18:08:03 -  1.211
> +++ if_iwm.c  23 Aug 2017 12:43:19 -
> @@ -5643,7 +5643,7 @@
>  
>   err = iwm_power_update_device(sc);
>   if (err) {
> - printf("%s: could send power command (error %d)\n",
> + printf("%s: could not send power command (error %d)\n",
>   DEVNAME(sc), err);
>   return err;
>   }
> @@ -6313,7 +6313,7 @@
>  
>   err = iwm_power_update_device(sc);
>   if (err) {
> - printf("%s: could send power command (error %d)\n",
> + printf("%s: could not send power command (error %d)\n",
>   DEVNAME(sc), err);
>   goto err;
>   }
> 



[PATCH] Fix likely typo in if_iwm.c

2017-08-23 Thread Bryan Linton
I spotted what looks to be a typo in if_iwm.c given the context in
which it occurs.  Patch attached.  Apologies if I'm mistaken.

-- 
Bryan

Index: if_iwm.c
===
RCS file: /cvs/src/sys/dev/pci/if_iwm.c,v
retrieving revision 1.211
diff -u -r1.211 if_iwm.c
--- if_iwm.c13 Aug 2017 18:08:03 -  1.211
+++ if_iwm.c23 Aug 2017 12:43:19 -
@@ -5643,7 +5643,7 @@
 
err = iwm_power_update_device(sc);
if (err) {
-   printf("%s: could send power command (error %d)\n",
+   printf("%s: could not send power command (error %d)\n",
DEVNAME(sc), err);
return err;
}
@@ -6313,7 +6313,7 @@
 
err = iwm_power_update_device(sc);
if (err) {
-   printf("%s: could send power command (error %d)\n",
+   printf("%s: could not send power command (error %d)\n",
DEVNAME(sc), err);
goto err;
}



Re: relayd interrupted transfers

2017-08-23 Thread Rivo Nurges
Hi!

I'll look into it.

Rivo

On 23/08/2017, 14:42, "Alexander Bluhm"  wrote:

On Tue, Aug 22, 2017 at 05:31:17PM +, Rivo Nurges wrote:
> relay_error() sets se_done even if write buffer is not drained and
> relay_write() will close the connection if se_done is set

I have seen a sporadic fail with chunked encoding in the daily
regression test run.  So something might be wrong.  Your idea of
not closing the relay if there is data in the buffer, makes sense.

Unfortunately the regression tests in /usr/src/regress/usr.sbin/relayd
fail with your patch.  It hangs as the EOF is not properly propagated.
I think the change in relay_error() affects too much.

bluhm




Re: relayd interrupted transfers

2017-08-23 Thread Alexander Bluhm
On Tue, Aug 22, 2017 at 05:31:17PM +, Rivo Nurges wrote:
> relay_error() sets se_done even if write buffer is not drained and
> relay_write() will close the connection if se_done is set

I have seen a sporadic fail with chunked encoding in the daily
regression test run.  So something might be wrong.  Your idea of
not closing the relay if there is data in the buffer, makes sense.

Unfortunately the regression tests in /usr/src/regress/usr.sbin/relayd
fail with your patch.  It hangs as the EOF is not properly propagated.
I think the change in relay_error() affects too much.

bluhm