Re: power-off delay/hang due to commit 6d25be57 (mainline)

2021-01-24 Thread Stephen Berman
On Mon, 04 Jan 2021 16:38:43 +0100 Stephen Berman  
wrote:

> On Thu, 31 Dec 2020 21:46:11 +0100 "Rafael J. Wysocki"  
> wrote:
>
>> ATM, I'm tempted to do something like the patch below (with the rationale
>> that it shouldn't be necessary to read the temperature right after updating
>> the trip points if polling is in use, because the next update through polling
>> will cause it to be read anyway and it will trigger trip point actions as
>> needed).
>>
>> Stephen, can you give it a go, please?
>
> On Sat, 02 Jan 2021 12:03:17 +0100 "Rafael J. Wysocki"  
> wrote:
>
>> There is one more way to address this, probably better: instead of checking 
>> the
>> temperature right away in acpi_thermal_notify(), queue that on
>> acpi_thermal_pm_queue
>> and so only if another thermal check is not pending.
>>
>> This way there will be at most one temperature check coming from
>> acpi_thermal_notify() queued up at any time which should prevent the
>> build-up of work items from taking place.
>>
>> So something like this:
>
> Thanks for the patches.  I'll try them as soon as I can.

FTR, since this is the thread I started for this bug, I've confirmed in
https://lore.kernel.org/lkml/87y2gi78sg@gmx.net/T/#t that the latest
patch fixes the bug.

Steve Berman


Re: [PATCH] ACPI: thermal: Do not call acpi_thermal_check() directly

2021-01-24 Thread Stephen Berman
On Fri, 22 Jan 2021 17:42:59 +0100 "Rafael J. Wysocki"  
wrote:

> On Fri, Jan 22, 2021 at 5:39 PM Stephen Berman  wrote:
>>
>> On Fri, 22 Jan 2021 17:23:36 +0100 "Rafael J. Wysocki"  
>> wrote:
>>
>> > On Thu, Jan 14, 2021 at 7:35 PM Rafael J. Wysocki  
>> > wrote:
>> >>
>> >> From: Rafael J. Wysocki 
>> >>
>> >> Calling acpi_thermal_check() from acpi_thermal_notify() directly
>> >> is problematic if _TMP triggers Notify () on the thermal zone for
>> >> which it has been evaluated (which happens on some systems), because
>> >> it causes a new acpi_thermal_notify() invocation to be queued up
>> >> every time and if that takes place too often, an indefinite number of
>> >> pending work items may accumulate in kacpi_notify_wq over time.
>> >>
>> >> Besides, it is not really useful to queue up a new invocation of
>> >> acpi_thermal_check() if one of them is pending already.
>> >>
>> >> For these reasons, rework acpi_thermal_notify() to queue up a thermal
>> >> check instead of calling acpi_thermal_check() directly and only allow
>> >> one thermal check to be pending at a time.  Moreover, only allow one
>> >> acpi_thermal_check_fn() instance at a time to run
>> >> thermal_zone_device_update() for one thermal zone and make it return
>> >> early if it sees other instances running for the same thermal zone.
>> >>
>> >> While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(),
>> >> as it is only called from there after the other changes made here.
>> >>
>> >> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877
>> >> Reported-by: Stephen Berman 
>> >> Diagnosed-by: Sebastian Andrzej Siewior 
>> >> Signed-off-by: Rafael J. Wysocki 
>> >
>> > Well, it's been over a week since this was posted.
>> >
>> > Does anyone have any comments?
>>
>> Sorry, I haven't been able to make time to test the patch yet, but I'll
>> try to do so this weekend.  Is it just the patch below that I should
>> apply, ignoring the previous patches you sent?
>
> Yes.
>
>> And can I apply it to the current mainline kernel?
>
> Yes, it should be applicable to the current mainline (at least as of 
> 5.11-rc4).
>
> Thanks!

I've now updated my local repo to 5.11.0-rc4+, installed your patch,
rebuilt and installed the kernel, rebooted (without adding
'thermal.tzp=300' to the kernel command line), did some normal activity,
then ran 'shutdown -h now', and the machine did just that.  So your
patch seems to have fixed the problem I reported.  Many thanks!

Steve Berman


Re: [PATCH] ACPI: thermal: Do not call acpi_thermal_check() directly

2021-01-22 Thread Stephen Berman
On Fri, 22 Jan 2021 17:23:36 +0100 "Rafael J. Wysocki"  
wrote:

> On Thu, Jan 14, 2021 at 7:35 PM Rafael J. Wysocki  wrote:
>>
>> From: Rafael J. Wysocki 
>>
>> Calling acpi_thermal_check() from acpi_thermal_notify() directly
>> is problematic if _TMP triggers Notify () on the thermal zone for
>> which it has been evaluated (which happens on some systems), because
>> it causes a new acpi_thermal_notify() invocation to be queued up
>> every time and if that takes place too often, an indefinite number of
>> pending work items may accumulate in kacpi_notify_wq over time.
>>
>> Besides, it is not really useful to queue up a new invocation of
>> acpi_thermal_check() if one of them is pending already.
>>
>> For these reasons, rework acpi_thermal_notify() to queue up a thermal
>> check instead of calling acpi_thermal_check() directly and only allow
>> one thermal check to be pending at a time.  Moreover, only allow one
>> acpi_thermal_check_fn() instance at a time to run
>> thermal_zone_device_update() for one thermal zone and make it return
>> early if it sees other instances running for the same thermal zone.
>>
>> While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(),
>> as it is only called from there after the other changes made here.
>>
>> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877
>> Reported-by: Stephen Berman 
>> Diagnosed-by: Sebastian Andrzej Siewior 
>> Signed-off-by: Rafael J. Wysocki 
>
> Well, it's been over a week since this was posted.
>
> Does anyone have any comments?

Sorry, I haven't been able to make time to test the patch yet, but I'll
try to do so this weekend.  Is it just the patch below that I should
apply, ignoring the previous patches you sent?  And can I apply it to
the current mainline kernel?

Thanks,
Steve Berman

>> ---
>>  drivers/acpi/thermal.c |   46 +-
>>  1 file changed, 33 insertions(+), 13 deletions(-)
>>
>> Index: linux-pm/drivers/acpi/thermal.c
>> ===
>> --- linux-pm.orig/drivers/acpi/thermal.c
>> +++ linux-pm/drivers/acpi/thermal.c
>> @@ -174,6 +174,8 @@ struct acpi_thermal {
>> struct thermal_zone_device *thermal_zone;
>> int kelvin_offset;  /* in millidegrees */
>> struct work_struct thermal_check_work;
>> +   struct mutex thermal_check_lock;
>> +   refcount_t thermal_check_count;
>>  };
>>
>>  /* 
>> --
>> @@ -495,14 +497,6 @@ static int acpi_thermal_get_trip_points(
>> return 0;
>>  }
>>
>> -static void acpi_thermal_check(void *data)
>> -{
>> -   struct acpi_thermal *tz = data;
>> -
>> -   thermal_zone_device_update(tz->thermal_zone,
>> -  THERMAL_EVENT_UNSPECIFIED);
>> -}
>> -
>>  /* sys I/F for generic thermal sysfs support */
>>
>>  static int thermal_get_temp(struct thermal_zone_device *thermal, int *temp)
>> @@ -900,6 +894,12 @@ static void acpi_thermal_unregister_ther
>>   Driver Interface
>> 
>> -- */
>>
>> +static void acpi_queue_thermal_check(struct acpi_thermal *tz)
>> +{
>> +   if (!work_pending(>thermal_check_work))
>> +   queue_work(acpi_thermal_pm_queue, >thermal_check_work);
>> +}
>> +
>>  static void acpi_thermal_notify(struct acpi_device *device, u32 event)
>>  {
>> struct acpi_thermal *tz = acpi_driver_data(device);
>> @@ -910,17 +910,17 @@ static void acpi_thermal_notify(struct a
>>
>> switch (event) {
>> case ACPI_THERMAL_NOTIFY_TEMPERATURE:
>> -   acpi_thermal_check(tz);
>> +   acpi_queue_thermal_check(tz);
>> break;
>> case ACPI_THERMAL_NOTIFY_THRESHOLDS:
>> acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_THRESHOLDS);
>> -   acpi_thermal_check(tz);
>> +   acpi_queue_thermal_check(tz);
>> acpi_bus_generate_netlink_event(device->pnp.device_class,
>>   dev_name(>dev), 
>> event, 0);
>> break;
>> case ACPI_THERMAL_NOTIFY_DEVICES:
>> acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_DEVICES);
>> -   acpi_thermal_check(tz);

Re: power-off delay/hang due to commit 6d25be57 (mainline)

2021-01-04 Thread Stephen Berman
On Thu, 31 Dec 2020 21:46:11 +0100 "Rafael J. Wysocki"  
wrote:

> ATM, I'm tempted to do something like the patch below (with the rationale
> that it shouldn't be necessary to read the temperature right after updating
> the trip points if polling is in use, because the next update through polling
> will cause it to be read anyway and it will trigger trip point actions as
> needed).
>
> Stephen, can you give it a go, please?

On Sat, 02 Jan 2021 12:03:17 +0100 "Rafael J. Wysocki"  
wrote:

> There is one more way to address this, probably better: instead of checking 
> the
> temperature right away in acpi_thermal_notify(), queue that on 
> acpi_thermal_pm_queue
> and so only if another thermal check is not pending.
>
> This way there will be at most one temperature check coming from
> acpi_thermal_notify() queued up at any time which should prevent the
> build-up of work items from taking place.
>
> So something like this:

Thanks for the patches.  I'll try them as soon as I can.

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-08-11 Thread Stephen Berman
On Sun, 19 Jul 2020 12:07:14 +0200 Stephen Berman  
wrote:

> On Tue, 14 Jul 2020 16:11:35 +0200 Sebastian Andrzej Siewior
>  wrote:
>
[...]
>> Stephen, the patch attached adds a WARN_ON() statement which will
>> produce a stack trace (4 or so). Could please run 'dmesg' after a while
>> and send it back. There should be a
>> "WARNING in drivers/acpi/acpica/evmisc.c"
>>
>> statement or something along the lines.
>
> I've now rebuilt 5.6.4 with the new patch; sorry for the delay.  Output
> of dmesg attached.

FWIW, while you were away I got an additional datapoint: someone who saw
my original post about this problem to one of the Linux From Scratch
mailing lists wrote me that he had the same problem, also getting
messages about the cdrom timing out ("sr 3:0:0:0" in his case); I told
him about your workaround of adding 'thermal.tzp=300' to the kernel
commandline, and he replied that this works for him too.  And it turns
out we have similar motherboards: I have a Gigabyte Z390 M Gaming
Rev. 1001 board and he has Gigabyte Z390 Designare rev 1.0.

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-07-19 Thread Stephen Berman
On Tue, 14 Jul 2020 16:11:35 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-07-14 15:54:57 [+0200], Rafael J. Wysocki wrote:
>> On Tue, Jul 14, 2020 at 3:44 PM Sebastian Andrzej Siewior
>>  wrote:>
>> > On 2020-06-24 23:49:52 [+0200], Stephen Berman wrote:
>> >
>> > Let me summarize the thread here:
>> >
>> > On Stephen's system, ACPI informs the thermal zone driver to poll the
>> > temperature every second and the driver does so.
>> > The driver queries the temperature by invoking acpi_evaluate_integer()
>> > which invokes (at some point) acpi_ev_queue_notify_request().
>>
>> Well, I don't quite see how acpi_ev_queue_notify_request() can be
>> invoked from the acpi_evaluate_integer() code path.
>>
>> Do you have a call trace showing that?
>
> So the trace in
>  https://lore.kernel.org/linux-acpi/87o8pjh1i0@gmx.net/
>
> shows the pattern and we nailed it down that it comes from
> thermal_get_temp(). I assumed acpi_ex_opcode_2A_0T_0R() since the other
> candidate was acpi_ev_asynch_execute_gpe_method().
>
> Stephen, the patch attached adds a WARN_ON() statement which will
> produce a stack trace (4 or so). Could please run 'dmesg' after a while
> and send it back. There should be a
> "WARNING in drivers/acpi/acpica/evmisc.c"
>
> statement or something along the lines.

I've now rebuilt 5.6.4 with the new patch; sorry for the delay.  Output
of dmesg attached.

Steve Berman



dmesg-5.6.4.gz
Description: Binary data


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-24 Thread Stephen Berman
On Wed, 24 Jun 2020 22:11:56 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-17 23:09:44 [+0200], Stephen Berman wrote:
>>
>> Attached.
>
> I did not forget about this but had recently little time to look into
> this.

No problem!

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-17 Thread Stephen Berman
On Wed, 17 Jun 2020 16:27:34 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-16 22:28:43 [+0200], Stephen Berman wrote:
>> Your assessment and predictions are right on the mark!
> perfect.
>
>> I'm fine with the thermal.tzp=300 workaround, but it would be good to
>> find out why this problem started with commit 6d25be57, if my git
>> bisection was correct, or if it wasn't, then at least somewhere between
>> 5.1.0 and 5.2.0.  Or can you already deduce why?  If not, I'd be more
>> than happy to continue applying any patches or trying any suggestions
>> you have, if you want to continue debugging this issue.  In any case,
>> thanks for pursuing it to this point.
>
> I have no idea why the commit in question should make any difference.
> Could please apply the tracing patch on v5.1 and send the trace?

Attached.

Steve Berman



trace-5.1.0.txt.gz
Description: Binary data


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-16 Thread Stephen Berman
On Tue, 16 Jun 2020 17:55:01 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-16 10:13:27 [+0200], Stephen Berman wrote:
>> Yes, thanks, that did it.  Trace attached.
>
> So TZ10 is a temperature sensor of some kind on your motherboard. In
> your v5.6 dmesg there is:
> | thermal LNXTHERM:00: registered as thermal_zone0
> | ACPI: Thermal Zone [TZ10] (17 C)
>
> So. In /sys/class/thermal/thermal_zone0/device/path you should also see
> TZ10. And /sys/class/thermal/thermal_zone0/temp should show the actual
> value.
> This comes from the "thermal" module.

Yes, TZ10 was in the thermal_zone0/device/path and the value in
thermal_zone0/temp was 16800.

> Looking at the trace, might query the temperature every second which
> somehow results in "Dispatching Notify on". I don't understand how it
> gets from reading of the temperature to the notify part, maybe it is
> part of the ACPI…
>
> However. Could you please make sure that the thermal module is not
> loaded at system startup? Adding
> thermal.off=1
>
> to the kernel commandline should do the trick. And you should see
>thermal control disabled
>
> in dmesg. 

Confirmed.  And the value in thermal_zone0/temp was now 33000.

>   That means your thermal_zone0 with TZ10 does not show up in
> /sys and nothing should schedule the work-items. This in turn should
> allow you to shutdown your system without the delay.

It did!

> If this works, could you please try to load the module with tzp=300?
> If you add this
>   thermal.tzp=300
>   
> to the kernel commandline then it should do the trick. You can verify it
> by
>cat /sys/module/thermal/parameters/tzp 
>
> This should change the polling interval from what ACPI says to 30secs.
> This should ensure that you don't have so many worker waiting. So you
> should also be able to shutdown the system.

Your assessment and predictions are right on the mark!

I'm fine with the thermal.tzp=300 workaround, but it would be good to
find out why this problem started with commit 6d25be57, if my git
bisection was correct, or if it wasn't, then at least somewhere between
5.1.0 and 5.2.0.  Or can you already deduce why?  If not, I'd be more
than happy to continue applying any patches or trying any suggestions
you have, if you want to continue debugging this issue.  In any case,
thanks for pursuing it to this point.

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-16 Thread Stephen Berman
On Tue, 16 Jun 2020 09:38:27 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-16 09:14:37 [+0200], Stephen Berman wrote:
>>
>> I set CONFIG_ACPI_DEBUG=y, applied the patch and rebuilt 5.6.4, but:
>>
>> $ cat /sys/kernel/debug/tracing/trace > trace.txt
>> cat: /sys/kernel/debug/tracing/trace: No such file or directory
>>
>> Here are all the 5.6.4 config options with "TRACING" or "TRACE"; do I
>> need to set any that aren't set?
>
> I see that "Kernel hacking  ---> Tracers" is enabled. You should have
> one tracer enabled for that trace_printk() to work. You have context
> switch tracer so it should be enough.
>
> You might need CONFIG_DEBUG_FS (but it is set in your 5.1 config) or it
> is just not mounted. So I have here:
>
> | debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
> | tracefs on /sys/kernel/debug/tracing type tracefs 
> (rw,nosuid,nodev,noexec,relatime)
>
> Does it work for you if you mount it?

# mount -t debugfs debugfs /sys/kernel/debug
# mount -t tracefs tracefs /sys/kernel/debug/tracing
# mount
...
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)

Yes, thanks, that did it.  Trace attached.

Steve Berman



trace.txt.gz
Description: Binary data


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-16 Thread Stephen Berman
On Mon, 15 Jun 2020 16:51:30 +0200 Sebastian Andrzej Siewior 
 wrote:

> I attached a modified acpi_dbg.patch. Please enable:
> - CONFIG_ACPI_DEBUG=y
>
> Looking at your 5.1 you have tracing enabled (hope it still is).
>
> The attached patch will dump the date into the tracing buffer, so you
> console should remain "clean". Once it records 300 of those "requests
> for acpi_ev_notify_dispatch" it will stop recording.
> After 4-5 minutes please do
>   cat /sys/kernel/debug/tracing/trace > file.txt
>
> compress it and send it.

I set CONFIG_ACPI_DEBUG=y, applied the patch and rebuilt 5.6.4, but:

$ cat /sys/kernel/debug/tracing/trace > trace.txt
cat: /sys/kernel/debug/tracing/trace: No such file or directory

Here are all the 5.6.4 config options with "TRACING" or "TRACE"; do I
need to set any that aren't set?

CONFIG_SYSCTL_EXCEPTION_TRACE=y
CONFIG_TRACEPOINTS=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_PM_TRACE=y
CONFIG_PM_TRACE_RTC=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_RELIABLE_STACKTRACE=y
# CONFIG_MAC80211_MESSAGE_TRACING is not set
# CONFIG_DMA_FENCE_TRACE is not set
CONFIG_IWLWIFI_DEVICE_TRACING=y
# CONFIG_TRACE_SINK is not set
# HW tracing support
# end of HW tracing support
CONFIG_STACKTRACE=y
CONFIG_RCU_TRACE=y
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_REGS=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_TRACE_CLOCK=y
CONFIG_EVENT_TRACING=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_TRACING=y
CONFIG_GENERIC_TRACER=y
CONFIG_TRACING_SUPPORT=y
CONFIG_FTRACE=y
# CONFIG_BOOTTIME_TRACING is not set
# CONFIG_FUNCTION_TRACER is not set
# CONFIG_STACK_TRACER is not set
# CONFIG_IRQSOFF_TRACER is not set
# CONFIG_SCHED_TRACER is not set
# CONFIG_HWLAT_TRACER is not set
# CONFIG_MMIOTRACE is not set
# CONFIG_FTRACE_SYSCALLS is not set
# CONFIG_TRACER_SNAPSHOT is not set
CONFIG_BLK_DEV_IO_TRACE=y
# CONFIG_TRACE_EVENT_INJECT is not set
# CONFIG_TRACEPOINT_BENCHMARK is not set
# CONFIG_TRACE_EVAL_MAP_FILE is not set
# CONFIG_FTRACE_STARTUP_TEST is not set
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_CC_HAS_SANCOV_TRACE_PC=y
# CONFIG_BACKTRACE_SELF_TEST is not set

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-15 Thread Stephen Berman
On Mon, 15 Jun 2020 17:58:46 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-15 17:41:06 [+0200], Stephen Berman wrote:
>> > If you do this "t" then there should be a lot of output on your console.
>> > If you do this from an xterm then you can see the output after typing
>> > "dmesg". The output should appear also in your system log.
>> 
>> Ah, ok, I do see it in the log, it looks basically the same as the call
>> trace I posted upthread.  I wonder why there was no such output in the
>> console, could there be some setting that suppresses it even though
>> /proc/sys/kernel/sysrq has `1'?
>
> You have ignore_loglevel soo everything should appear on your console.
> The is true for your tty / ctrl+alt+f1 but your xterm shell.

Yet that appears not to be the case here, or I'm still doing something
wrong 路

>> There were in fact 99 "Start .* acpi_ev_notify_dispatch" messages and 98
>> "End .* acpi_ev_notify_dispatch" messages.  Here's the last of them
>> before I rebooted:
[...] 
>> acpi_os_execute_deferred_notify(851) Start 8d7aa8758a80
>> acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e70a0)
>> Jun 15 08:58:26 strobe-jhalfs kernel: [ 194.339017] acpi_os_execute(1109)
>> Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6730 
>> Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.363005] acpi_os_execute(1109)
>> Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6780 
>> Jun 15 08:58:27 strobe-jhalfs kernel: [ 195.367978]
>> acpi_os_execute_deferred_notify(853) End 8d7aa8758a80
[...] 

> Okay. So there is "Adding" of four events/work items in total, each
> almost every second.
> Processing of one work-item took two seconds, the other one took a
> second. These events are "old" so I don't see the "adding" line for the
> Start/End.
> You could look in the log for 8d7aa8758a80 to seen when it was
> enqueued but it feels like ACPI enqueues more events then it is capable
> of processing.

It's there, 196 lines above the Start 8d7aa8758a80 line:

Jun 15 08:56:58 strobe-jhalfs kernel: [  106.275356] acpi_os_execute(1109) 
Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa84e70a0 

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-15 Thread Stephen Berman
On Mon, 15 Jun 2020 16:51:30 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-15 09:58:00 [+0200], Stephen Berman wrote:
>> Ok, sorry, I had misunderstood, but now I've looked at the
>> documentation.  I had in fact already done `echo t >
>> /proc/sysrq-trigger' in an xterm (as root) and there was no output.
>> Later, after booting kernel 5.1.0 because of the message flooding with
>> 5.6.4, I did `echo t > /proc/sysrq-trigger' in a virtual tty (also as
>> root) and the only output was: `[ ] sysrq: Show State'.  Is
>> this expected?  (In /proc/sys/kernel/sysrq there is `1'.)  I couldn't
>> try it in a 5.6.4 virtual tty because of the message flooding (see
>> below).
>
> If you do this "t" then there should be a lot of output on your console.
> If you do this from an xterm then you can see the output after typing
> "dmesg". The output should appear also in your system log.

Ah, ok, I do see it in the log, it looks basically the same as the call
trace I posted upthread.  I wonder why there was no such output in the
console, could there be some setting that suppresses it even though
/proc/sys/kernel/sysrq has `1'?

>> The kernel log shows 305 of these messages in the 4 minutes and 17
>> seconds between the start of klogd and when I rebooted.
>
> Okay, this is a lot.
>
>> > The interesting part is to see if there is a acpi_os_execute() adding a
>> > specific event multiple times which does not complete. Maybe at runtime,
>> > maybe at shutdown time. If that is the case then ignoring this specific
>> > event might fix the shutdown problem. With all this information so far,
>> > I don't see a relation with this problem and the commit…
>> 
>> In those 3 minutes and 8 seconds there were 5 "Adding
>> acpi_ev_asynch_enable_gpe" messages at the beginning, then 185 "Adding
>> acpi_ev_notify_dispatch" messages, which kept coming until I rebooted.
>
> 5 acpi_ev_asynch_enable_gpe() in three minutes since boot isn't much.
> 185 of "Adding acpi_ev_notify_dispatch" is a lot.
>
> But to be clear, only 
>   acpi_os_execute(1109) Adding acpi_ev_notify_dispatch" messages?
> No
>   "acpi_os_execute_deferred_notify() Start/End acpi_ev_notify_dispatch"? 
>
> The syslog should have captured the log on disk.

There were in fact 99 "Start .* acpi_ev_notify_dispatch" messages and 98
"End .* acpi_ev_notify_dispatch" messages.  Here's the last of them
before I rebooted:

Jun 15 08:58:25 strobe-jhalfs kernel: [  193.315014] acpi_os_execute(1109) 
Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a66e0 
Jun 15 08:58:25 strobe-jhalfs kernel: [  193.319929] 
acpi_os_execute_deferred_notify(853) End   8d7aa8758d00 
acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7eb0)
Jun 15 08:58:25 strobe-jhalfs kernel: [  193.321242] 
acpi_os_execute_deferred_notify(851) Start 8d7aa8758a80 
acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e70a0)
Jun 15 08:58:26 strobe-jhalfs kernel: [  194.339017] acpi_os_execute(1109) 
Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6730 
Jun 15 08:58:27 strobe-jhalfs kernel: [  195.363005] acpi_os_execute(1109) 
Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a6780 
Jun 15 08:58:27 strobe-jhalfs kernel: [  195.367978] 
acpi_os_execute_deferred_notify(853) End   8d7aa8758a80 
acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e70a0)
Jun 15 08:58:27 strobe-jhalfs kernel: [  195.369234] 
acpi_os_execute_deferred_notify(851) Start 8d7aa8758240 
acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7a00)
Jun 15 08:58:28 strobe-jhalfs kernel: [  196.387017] acpi_os_execute(1109) 
Adding acpi_ev_notify_dispatch+0x0/0x55 8d7aa99a68c0 
Jun 15 08:58:28 strobe-jhalfs kernel: [  196.388508] 
acpi_os_execute_deferred_notify(853) End   8d7aa8758240 
acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7a00)
Jun 15 08:58:28 strobe-jhalfs kernel: [  196.390128] 
acpi_os_execute_deferred_notify(851) Start 8d7aa8758b80 
acpi_ev_notify_dispatch+0x0/0x55(8d7aa84e7460)
Jun 15 08:58:29 strobe-jhalfs kernel: [  197.208714] wlan1: deauthenticating 
from 7c:ff:4d:08:df:22 by local choice (Reason: 3=DEAUTH_LEAVING)
Jun 15 08:58:29 strobe-jhalfs kernel: [  197.232214] ip (4614) used greatest 
stack depth: 11272 bytes left
Jun 15 08:58:29 strobe-jhalfs kernel: Kernel logging (proc) stopped.
Jun 15 08:58:29 strobe-jhalfs kernel: Kernel log daemon terminating.

> I attached a modified acpi_dbg.patch. Please enable:
> - CONFIG_ACPI_DEBUG=y
>
> Looking at your 5.1 you have tracing enabled (hope it still is).
>
> The attached patch will dump the date into the tracing buffer, so you
> console should remain "clean". Once it records 300 of those "requests
> for acpi_ev_notify_dispatch" it will stop recording.
> After 4-5 minutes please do
>   cat /sys/kernel/debug/tracing/trace > file.txt
>
> compress it and send it.

Will do as soon as I can, hopefully later today or tomorrow.  Thanks.

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-15 Thread Stephen Berman
On Sun, 14 Jun 2020 19:10:05 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-14 14:12:18 [+0200], Stephen Berman wrote:
[...]
>> What am I supposed to do after "echo t > /proc/sysrq-trigger"?  Both
>> before and after doing that I get an error trying to open the file:
>> 
>> root [ ~ ]# cat /proc/sysrq-trigger
>> cat: /proc/sysrq-trigger: Input/output error
>
>  echo "t > /proc/sysrq-trigger"
>
> not cat.

Ok, sorry, I had misunderstood, but now I've looked at the
documentation.  I had in fact already done `echo t >
/proc/sysrq-trigger' in an xterm (as root) and there was no output.
Later, after booting kernel 5.1.0 because of the message flooding with
5.6.4, I did `echo t > /proc/sysrq-trigger' in a virtual tty (also as
root) and the only output was: `[ ] sysrq: Show State'.  Is
this expected?  (In /proc/sys/kernel/sysrq there is `1'.)  I couldn't
try it in a 5.6.4 virtual tty because of the message flooding (see
below).

[...]
>> I applied this patch to 5.6.4 and recompiled, and on the next boot with
>> that kernel, the kernel buffer (and kernel and system logs) began to get
>> flooded with these messages:
>> 
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.943987]
>> acpi_os_execute_deferred(843) Start 8fb82c7b6500
>> 3edf1e05(8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944102] acpi_os_execute(1101)
>> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944104]
>> acpi_os_execute_deferred(845) End 8fb82c7b6500
>> 3edf1e05(8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944105]
>> acpi_os_execute_deferred(843) Start 8fb82b844800
>> 2ba560ea(8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944124]
>> acpi_os_execute_deferred(845) End 8fb82b844800
>> 2ba560ea(8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944288]
>> acpi_os_execute_deferred(843) Start 8fb82c7b6540
>> 3edf1e05(8fb82c492990)
>> Jun 14 10:37:13 strobe-jhalfs kernel: [ 5.944387] acpi_os_execute(1101)
>> Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990
>> 
>> and so on without stopping.  I could start X and avoid seeing the
>> messages, but was afraid the logs would fill up the root partition if I
>> let it keep going, so I rebooted with another kernel.
>> 
>> Was this message flood because I booted with "ignore_loglevel
>> initcall_debug"?  In the logs there are also lots of messages like this:
>
> Is there a acpi_os_execute_* flood? The *few* at what appears to system
> startup and might be normal. If there appear *many* more and are
> constantly printing (check with dmesg) then we might be to something.

The kernel log shows 305 of these messages in the 4 minutes and 17
seconds between the start of klogd and when I rebooted.

[...]
> I attached the updated acpi patch. It limits the prints to the
> kacpi_notify_wq queue which appears to stuck at shutdown.

I applied this patch and rebuilt the kernel.  After booting 5.6.4 this
morning there was again message flooding and this time I was unable to
log in because of it, so had to do a hard reboot and booted with 5.1.0.
The kernel log shows 207 acpi_os_execute_deferred_notify messages in 3
minutes and 6 seconds; here are the first:

Jun 15 08:55:22 strobe-jhalfs kernel: [5.943848] acpi_os_execute(1109) 
Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8d7aaba18990 
Jun 15 08:55:22 strobe-jhalfs kernel: [5.943851] 
acpi_os_execute_deferred_notify(851) Start 8d7aab157fc0 
acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [5.943865] 
acpi_os_execute_deferred_notify(853) End   8d7aab157fc0 
acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [5.944104] acpi_os_execute(1109) 
Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8d7aaba18990 
Jun 15 08:55:22 strobe-jhalfs kernel: [5.944106] 
acpi_os_execute_deferred_notify(851) Start 8d7aab157fc0 
acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [5.944115] 
acpi_os_execute_deferred_notify(853) End   8d7aab157fc0 
acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [5.944406] acpi_os_execute(1109) 
Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8d7aaba18990 
Jun 15 08:55:22 strobe-jhalfs kernel: [5.944408] 
acpi_os_execute_deferred_notify(851) Start 8d7aab157fc0 
acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990)
Jun 15 08:55:22 strobe-jhalfs kernel: [5.944417] 
acpi_os_execute_deferred_notify(853) End   8d7aab157fc0 
acpi_ev_asynch_enable_gpe+0x0/0x2f(8d7aaba18990)

>

Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-14 Thread Stephen Berman
On Fri, 12 Jun 2020 13:01:22 +0200 Sebastian Andrzej Siewior 
 wrote:

> + ACPI in case the ACPI folks see something obvious.
[...]
>> The "acpi_os_execute_deferred" messages were repeated many times in the
>> above line, then every 20-30 seconds again for several minutes.  Then
>> suddenly a call trace appeared which was similar but not identical to
>> the one I posted upthread, and each line of the trace was followed by
>> the line ", acpi_os_execute_deferred".  This went by quite quickly even
>> with the printk_delay I added, and I was unable to photograph the start
>> of it and couldn't get all of the subsequent output, but the screenshots
>> show some of it.  After several minutes of this output I again did a
>> hard reboot.
>
> All good. I though that you will have one worker that is blocking but
> you have a lot of them. It appears that one is active and stuck and more
> are waiting.
>
> Could you look at acpi in /proc/interrupts 10 secs apart to see if it
> increments?
>
>grep -E 'acpi|smbus' /proc/interrupts

I tried this several times, 10 seconds apart and longer, but saw no
change, it was always this:

steve [ ~ ]$ grep -E 'acpi|smbus' /proc/interrupts
   9:  0  5  0  0  0  0 
 0  0  0  0  0  0   IO-APIC
9-fasteoi   acpi
  16:  0  0  0  0  0  0 
 0  0  0  0  0  0   IO-APIC   
16-fasteoi   i801_smbus

> You could also do "echo t > /proc/sysrq-trigger" which gives you a lot
> of task state information, but at the end you will also see "howing busy
> workqueues and worker pools:" regarding the workqueue state. I am
> curious to see if you already have worker stuck in kacpid_notify with
> acpi_os_execute_deferred.

What am I supposed to do after "echo t > /proc/sysrq-trigger"?  Both
before and after doing that I get an error trying to open the file:

root [ ~ ]# cat /proc/sysrq-trigger
cat: /proc/sysrq-trigger: Input/output error

> Now that we know that know that acpi_os_execute_deferred() is stuck,
> lets shed some light in what it is trying to do. The patch at the end
> will dump this information into the console buffer (The `dmesg' command
> will print the whole kernel buffer). I have no idea if this starts
> printing while the system is running or during shutdown.  I would expect
> to see the Start line in acpi_os_execute_deferred() but not the End one.
>
> diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
> index 41168c027a5a4..0e983c558bcb5 100644
> --- a/drivers/acpi/osl.c
> +++ b/drivers/acpi/osl.c
> @@ -840,7 +840,9 @@ static void acpi_os_execute_deferred(struct work_struct 
> *work)
>  {
>   struct acpi_os_dpc *dpc = container_of(work, struct acpi_os_dpc, work);
>
> + pr_err("%s(%d) Start %px %pF(%px)\n", __func__, __LINE__, dpc, 
> dpc->function, dpc->context);
>   dpc->function(dpc->context);
> + pr_err("%s(%d) End   %px %pF(%px)\n", __func__, __LINE__, dpc, 
> dpc->function, dpc->context);
>   kfree(dpc);
>  }
>
> @@ -1096,6 +1098,8 @@ acpi_status acpi_os_execute(acpi_execute_type type,
>*/
>   if (type == OSL_NOTIFY_HANDLER) {
>   queue = kacpi_notify_wq;
> + pr_err("%s(%d) Adding %pS %px\n", __func__, __LINE__, function,
> +context);
>   INIT_WORK(>work, acpi_os_execute_deferred);
>   } else if (type == OSL_GPE_HANDLER) {
>   queue = kacpid_wq;

I applied this patch to 5.6.4 and recompiled, and on the next boot with
that kernel, the kernel buffer (and kernel and system logs) began to get
flooded with these messages:

Jun 14 10:37:13 strobe-jhalfs kernel: [5.943987] 
acpi_os_execute_deferred(843) Start 8fb82c7b6500 
3edf1e05(8fb82c492990)
Jun 14 10:37:13 strobe-jhalfs kernel: [5.944102] acpi_os_execute(1101) 
Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990
Jun 14 10:37:13 strobe-jhalfs kernel: [5.944104] 
acpi_os_execute_deferred(845) End   8fb82c7b6500 
3edf1e05(8fb82c492990)
Jun 14 10:37:13 strobe-jhalfs kernel: [5.944105] 
acpi_os_execute_deferred(843) Start 8fb82b844800 
2ba560ea(8fb82c492990)
Jun 14 10:37:13 strobe-jhalfs kernel: [5.944124] 
acpi_os_execute_deferred(845) End   8fb82b844800 
2ba560ea(8fb82c492990)
Jun 14 10:37:13 strobe-jhalfs kernel: [5.944288] 
acpi_os_execute_deferred(843) Start 8fb82c7b6540 
3edf1e05(8fb82c492990)
Jun 14 10:37:13 strobe-jhalfs kernel: [5.944387] acpi_os_execute(1101) 
Adding acpi_ev_asynch_enable_gpe+0x0/0x2f 8fb82c492990

and so on without stopping.  I could start X and avoid seeing the
messages, but was afraid the logs would fill up the root partition if I
let it keep going, so I rebooted with another kernel.

Was this message flood because I booted with "ignore_loglevel
initcall_debug"?  In the logs there are also 

Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-11 Thread Stephen Berman
On Thu, 11 Jun 2020 00:49:26 +0200 Stephen Berman  
wrote:

> On Wed, 10 Jun 2020 12:25:14 +0200 Sebastian Andrzej Siewior
>  wrote:
> [...]
>>> By the other patch do you mean the following?  (This email was also put
>>> into my spam by GMX and AFAICT has not yet shown up on the vger list.)
>>
>> Yes. But now that I played a little with it, I made dis:
> [...]
>> So you should have output like:
>> |[   14.131696] Flushing events_freezable_power_
>> the flush that hangs
>>
>> |[   17.203246] Flushing events_freezable_power_ over, 0
>> It has been noticed (0 at the end)
>>
>> |[   17.203849] Showing busy workqueues and worker pools:
>> |[   17.204400] workqueue events_freezable_power_: flags=0x84
>> |[   17.205009]   pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 
>> refcnt=2
>> |[   17.205774] in-flight: 2147:disk_events_workfn
>> |[   17.206268] workqueue kblockd: flags=0x18
>> |[   17.206682]   pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 
>> refcnt=2
>> |[   17.207428] pending: blk_mq_requeue_work
>> |[ 17.207958] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=2s workers=3 idle:
>> | 535 42
>>
>> A dump of all pending worqueue items.
>>
>> |[   20.275239] Flushing events_freezable_power_ over, 0
>> |[   20.275812] Showing busy workqueues and worker pools:
>> |[   20.276385] workqueue events_freezable_power_: flags=0x84
>> |[   20.276972]   pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 
>> refcnt=2
>> |[   20.277712] in-flight: 2147:disk_events_workfn
>> |[   20.278245] workqueue kblockd: flags=0x18
>> |[   20.278685]   pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 
>> refcnt=2
>> |[   20.279471] pending: blk_mq_requeue_work
>> |[ 20.279989] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=5s workers=3 idle:
>> | 535 42
>>
>> Another atempt. As you see `blk_mq_requeue_work' is references in both
>> outputs but it makes progress (it is scheduled multiple times). The item
>> `disk_events_workfn' stucks.
>
> I rebuilt kernel 5.6.4 with that patch, but before calling `shutdown -h
> now' I guess I should have set printk_delay, because this time there was
> a lot of output and it went by too fast.  I'll try to delay it the next
> time.

I've done that now.  I've sent you screenshots offlist.  Here's a brief
summary: The initial shutdown log output is essentially the same as the
transcription I posted upthread, except that it is interspersed with
messages from your patch like "Flush hid_reset+0x0/0x80".  Then comes:

ACPI: Preparing to enter system sleep state S5
Flushing kacpid
Flushing kacpid_notify
Flushing kacpid_notify over, 0
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
  pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
 pending: drm_fb_helper_diry_work
workqueue kacpid_notify: flags=0x0
  pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1 refcnt=4185
 in-flight: 1578:acpi_os_execute_deferred
 delayed: acpi_os_execute_deferred, acpi_os_execute_deferred, ...

The "acpi_os_execute_deferred" messages were repeated many times in the
above line, then every 20-30 seconds again for several minutes.  Then
suddenly a call trace appeared which was similar but not identical to
the one I posted upthread, and each line of the trace was followed by
the line ", acpi_os_execute_deferred".  This went by quite quickly even
with the printk_delay I added, and I was unable to photograph the start
of it and couldn't get all of the subsequent output, but the screenshots
show some of it.  After several minutes of this output I again did a
hard reboot.

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-10 Thread Stephen Berman
On Wed, 10 Jun 2020 12:25:14 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-10 10:21:29 [+0200], Stephen Berman wrote:
>> (GMX put your email into my spam folder, so I didn't see it before I
>> sent my followup about removing the wifi firmware.)
>
> okay.
>
>> On Tue, 9 Jun 2020 22:23:39 +0200 Sebastian Andrzej Siewior
>>  wrote:
>> >   scripts/decode_stacktrace.sh vmlinux $(pwd)/ kernel_power_off+0x3d/0x70
>> >
>> > should reveal that.
>>
>> I saved the call trace lines to a file and did this (with `$(pwd)/' the
>> script did not run):
>>
>> steve [ /sources/linux-5.6.4 ]$ ./scripts/decode_stacktrace.sh vmlinux
>> . kernel_power_off+0x3d/0x70 < /home/steve/temp/call_trace
>> [1352.***] Call Trace:
>> [1352.***] ? __schedule (core.c:?)
>> [1352.***] schedule (??:?)
>> [1352.***] schedule_timeout (??:?)
>> [1352.***] ? acpi_os_release_object (??:?)
>> [1352.***] ? acpi_ut_update_object_reference (??:?)
>> [1352.***] wait_for_completion (??:?)
>> [1352.***] ? wake_up_q (??:?)
>> [1352.***] flush_workqueue (??:?)
>> [1352.***] kernel_power_off (??:?)
>> [1352.***] __do_sys_reboot (reboot.c:?)
>> [1352.***] ? do_send_sig_info (??:?)
>> [1352.***] ? kill_pid_info (??:?)
>> [1352.***] ? kill_something_info (signal.c:?)
>> [1352.***] ? __x64_sys_kill (??:?)
>> [1352.***] do_syscall_64 (??:?)
>> [1352.***] entry_SYSCALL_64_after_hwframe (??:?)
>> [1352.***] RIP: 0033:0x7f95dd6992c3
>> [1352.***] Code: Bad RIP value.
>> objdump: '/tmp/tmp.BlpNDdSQpG.o': No such file
>>
>> Did I do something wrong or do I need to pass certain compiler flags
>> when building the kernel (when I build Emacs I pass CFLAGS='-Og -g3')?
>
> the arguments last argument should not be kernel_power_off+0x3d/0x70. I
> should have been entered after "enter" but you can also pipe the whole
> file as you did.
> If it gives you the ?? then might need
>   CONFIG_DEBUG_INFO

Thanks.  I set that, recompiled and now get this:



decode_stacktrace
Description: application/text/plain

[...]
>> By the other patch do you mean the following?  (This email was also put
>> into my spam by GMX and AFAICT has not yet shown up on the vger list.)
>
> Yes. But now that I played a little with it, I made dis:
[...]
> So you should have output like:
> |[   14.131696] Flushing events_freezable_power_
> the flush that hangs
>
> |[   17.203246] Flushing events_freezable_power_ over, 0
> It has been noticed (0 at the end)
>
> |[   17.203849] Showing busy workqueues and worker pools:
> |[   17.204400] workqueue events_freezable_power_: flags=0x84
> |[   17.205009]   pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
> |[   17.205774] in-flight: 2147:disk_events_workfn
> |[   17.206268] workqueue kblockd: flags=0x18
> |[   17.206682]   pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 
> refcnt=2
> |[   17.207428] pending: blk_mq_requeue_work
> |[ 17.207958] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=2s workers=3 idle:
> | 535 42
>
> A dump of all pending worqueue items.
>
> |[   20.275239] Flushing events_freezable_power_ over, 0
> |[   20.275812] Showing busy workqueues and worker pools:
> |[   20.276385] workqueue events_freezable_power_: flags=0x84
> |[   20.276972]   pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
> |[   20.277712] in-flight: 2147:disk_events_workfn
> |[   20.278245] workqueue kblockd: flags=0x18
> |[   20.278685]   pwq 13: cpus=6 node=0 flags=0x0 nice=-20 active=1/256 
> refcnt=2
> |[   20.279471] pending: blk_mq_requeue_work
> |[ 20.279989] pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=5s workers=3 idle:
> | 535 42
>
> Another atempt. As you see `blk_mq_requeue_work' is references in both
> outputs but it makes progress (it is scheduled multiple times). The item
> `disk_events_workfn' stucks.

I rebuilt kernel 5.6.4 with that patch, but before calling `shutdown -h
now' I guess I should have set printk_delay, because this time there was
a lot of output and it went by too fast.  I'll try to delay it the next
time.  After the initial output (which went by too fast for me to
photograph) there were endlessly repeated screenfuls of output like this:

[ 2578.402398] , acpi_os_execute_deferred, acpi_os_execute_deferred, 
acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, 
acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred, 
acpi_os_execute_deferred, acpi_os_execute_deferred, acpi_os_execute_deferred

and at the bottom of the screen this:

[ 2578.427680] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=4 idle: 
1331

After this line there was a brief pause (so I was able to photograph it)
and then it repeated, always the same.  In the last line pool, cpus,
node, flags, nice and hung were always 0, and workers was always 4.
Only idle changed, but seemed to cycle through a few values.  After
watching this repetition for several minutes, I finally did a hard
reboot.

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-10 Thread Stephen Berman
(GMX put your email into my spam folder, so I didn't see it before I
sent my followup about removing the wifi firmware.)

On Tue, 9 Jun 2020 22:23:39 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-06-09 12:06:23 [+0200], Stephen Berman wrote:
>> I recompiled kernel 5.6.4 with the printk() call you suggested, then
>> booted the kernel with "ignore_loglevel initcall_debug" (but leaving the
>> CDROM and wifi intact for now).  After working as I normally do, I
>> called `shutdown -h now', again as usual.  After the "Bringing down the
>> loopback interface" message there were these two messages:
>> 
>> reboot: __do_sys_reboot(317)CMD: 89abcdef
>> reboot: __do_sys_reboot(317)CMD: 4321fedc
>> 
>> Then nothing more for two minutes, then, as previously:
>> 
>> sr 5:0:0:0: tag#10 timing out command, waited 120 seconds.
>> 
>> Then I did a hard reboot.
>> 
>> This morning I detached the cables to the CDROM (but left the disk in
>> the box) and again booted 5.6.4 with "ignore_loglevel initcall_debug".
>> After working for a bit, I called `shutdown -h now', and now there were
>> quite a few more messages, but again the machine did not power off.
>> Here is the ouput, which I transcribed carefully, hopefully without
>> error (I omitted the fractional parts of the timestamps and some of the
>> usb1-portX messages):
>
> If it helps you could delay printks on shutdown (via
> /proc/sys/kernel/printk_delay) and record a video clip. I could make
> storage available for an upload.

So far this hasn't been necessary, since the new output until the first
two-minute pause fit on one screen and I could photograph it.

> | *   Bringing down the loopback interface...
> | [1123.***] reboot: __do_sys_reboot(317)CMD: 89abcdef
> | [1123.***] reboot: __do_sys_reboot(317)CMD: 4321fedc
> This happens quicker than I expected. Initially I assumed that you are
> not there yet and userland is still poking at your cdrom. Accordin
> …
> | [1124.***] sd 4:0:0:0: shutdown
> | [1124.***] sd 4:0:0:0: shutdown [sda] Synchronizing SCSI cache
> | [1124.***] sd 4:0:0:0: shutdown [sda] Stopping disk
>
> After this, there shouldn't be any outstanding disk requests.
> …
> | [1125.***] ahci :00:17.0: shutdown
>
> This disables the ahci controller which means disk/cdrom requests won't
> be answered by the hardware.
> …
> | [1125.***] ACPI: Preparing to enter system sleep state S5
> | [1352.***] INFO: task halt:5187 blocked for more than 122 seconds.
> | [1352.***]  Not tainted 5.6.4 #4
> | [1352.***] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
> message
> | [1352.***] haltD13856  5187   5186 0x4000
> | [1352.***] Call Trace:
> | [1352.***]  ? __schedule+0x272/0x5b0
> | [1352.***]  schedule+0x45/0xb0
> | [1352.***]  schedule_timeout+0x204/0x2f0
> | [1352.***]  ? acpi_os_release_object+0x5/0x10
> | [1352.***]  ? acpi_ut_update_object_reference+0x14e/0x1d2
> | [1352.***]  wait_for_completion+0xa3/0x100
> | [1352.***]  ? wake_up_q+0x90/0x90
> | [1352.***]  flush_workqueue+0x130/0x420
>
> I have no idea where this flush_workqueue() is coming from. The command
>   scripts/decode_stacktrace.sh vmlinux $(pwd)/ kernel_power_off+0x3d/0x70
>
> should reveal that.

I saved the call trace lines to a file and did this (with `$(pwd)/' the
script did not run):

steve [ /sources/linux-5.6.4 ]$ ./scripts/decode_stacktrace.sh vmlinux . 
kernel_power_off+0x3d/0x70 < /home/steve/temp/call_trace 
[1352.***] Call Trace:
[1352.***] ? __schedule (core.c:?) 
[1352.***] schedule (??:?) 
[1352.***] schedule_timeout (??:?) 
[1352.***] ? acpi_os_release_object (??:?) 
[1352.***] ? acpi_ut_update_object_reference (??:?) 
[1352.***] wait_for_completion (??:?) 
[1352.***] ? wake_up_q (??:?) 
[1352.***] flush_workqueue (??:?) 
[1352.***] kernel_power_off (??:?) 
[1352.***] __do_sys_reboot (reboot.c:?) 
[1352.***] ? do_send_sig_info (??:?) 
[1352.***] ? kill_pid_info (??:?) 
[1352.***] ? kill_something_info (signal.c:?) 
[1352.***] ? __x64_sys_kill (??:?) 
[1352.***] do_syscall_64 (??:?) 
[1352.***] entry_SYSCALL_64_after_hwframe (??:?) 
[1352.***] RIP: 0033:0x7f95dd6992c3
[1352.***] Code: Bad RIP value.
objdump: '/tmp/tmp.BlpNDdSQpG.o': No such file

Did I do something wrong or do I need to pass certain compiler flags
when building the kernel (when I build Emacs I pass CFLAGS='-Og -g3')?

> The cdrom is polled by the kernel every two seconds. I *think* the
> kernel is blocked for some reason and then the cdrom polled. This can't
> complete and you see the srX warning. The window is quite small but not
> impossible. 
> I managed to fabricate this case but after 
> |[  137.581613] sr 3:0:0:0: tag#1 timing ou

Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-09 Thread Stephen Berman
On Tue, 09 Jun 2020 12:06:23 +0200 Stephen Berman  
wrote:

> Do you still want me to check whether removing the iwlwifi driver makes
> a differece?  And with the CDROM still detached, or does that not
> matter?

I'm not actually sure just what you mean by removing the wifi driver,
but I just now tried moving the iwlwifi firmware out of /lib/firmware,
then booted 5.6.4 as before (with "ignore_loglevel initcall_debug" and
with the CDROM still detached), and of course there was now no wireless
interface.  I started X, Emacs, Firefox (of course with no internet),
VLC, used them a bit then closed them all, then called `shutdown -h now'
and the message output was essentially identical to what I posted
previously (but of course without "iwlwifi :02:00.0: shutdown") and
the machine did not power off.  So if this was the test of removing the
wifi driver, then it made no difference.

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-06-09 Thread Stephen Berman
On Fri, 22 May 2020 18:40:12 +0200 Sebastian Andrzej Siewior 
 wrote:

> Sorry for the late reply.

No problem, since as it turned out, I didn't have to time till now to
follow up on your latest suggestions.  Details below.

> On 2020-05-14 23:39:40 [+0200], Stephen Berman wrote:
>> >> How will I know if that happens, is there a specific message in the tty?
>> >
>> > On the tty console where you see the "timing out command, waited"
>> > message, there should be something starting with
>> > |BUG: workqueue lockup - pool
>> >
>> > following with the pool information that got stuck. That code checks the
>> > workqueues every 30secs by default. So if you waited >= 60secs then
>> > system is not detecting a stall.
>>
>> As you can see in the photo, there was no message about a workqueue
>> lockup, only "task halt:5320 blocked for more than  seconds" every
>> two minutes.  I suppose that comes from one of the other options I
>> enabled.  Does it reveal anything about the problem?
>
>>From the picture, you are on your way to level 0, which would issue the
> final shutdown command, but you are not quite there yet.
>
> If you add a printk() to the reboot syscall, then I wouldn't expect you
> to see it. (something like that):
>
> diff --git a/kernel/reboot.c b/kernel/reboot.c
> index c4d472b7f1b42..19bc35bc0cda0 100644
> --- a/kernel/reboot.c
> +++ b/kernel/reboot.c
> @@ -314,6 +314,7 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, 
> unsigned int, cmd,
>   char buffer[256];
>   int ret = 0;
>
> + pr_err("%s(%d)CMD: %lx\n", __func__, __LINE__, cmd);
>   /* We only trust the superuser with rebooting the system. */
>   if (!ns_capable(pid_ns->user_ns, CAP_SYS_BOOT))
>   return -EPERM;
>
> If you add "ignore_loglevel initcall_debug" to the command line then you
> should see the init callbacks of each driver. But there will be nothing
> on your shutdown (as I expect it).
>
> The "task X blocked for more than 120 secs" is part of the hung task
> detector. With the "ignore_loglevel" above you should be able to see the
> callchain of the task. I suspect that the task poked the cd-drive which
> isn't answering. So from detector's point of view, the task issued a
> system call which appears to hang an makes no progress.
>
>> > Could
>> > you please check if the stall-dector says something?
>>
>> Is that the message I repeated above or do you mean the workqueue?
>
> The hung message is not workqueue related. It is the task `halt' that
> makes no progress. There is not stall of the workqueue as far as the
> system can tell.
> The two boot options, I suggested above, may reveal additional
> information that are printed but suppressed due to the loglevel.
>
> My guess now is that maybe shutting down wifi also paused the AHCI
> controller which makes no progress now. So booting without cdrom/disk
> should not cause any problems.
>
> Could you please:
> - try booting with "ignore_loglevel initcall_debug" and see if
>   additional information is printed on the console.
>
> - Remove cd / ATA-disk to check if anything else causes a stall. As by
>   your report you only mentioned those two (and if I see it correctly,
>   you rootFS is on nvme so removing the disk might be doable).
>
> - Could you remove the Wifi (just the driver, no the physical hw) to see
>   if it makes any difference?

I recompiled kernel 5.6.4 with the printk() call you suggested, then
booted the kernel with "ignore_loglevel initcall_debug" (but leaving the
CDROM and wifi intact for now).  After working as I normally do, I
called `shutdown -h now', again as usual.  After the "Bringing down the
loopback interface" message there were these two messages:

reboot: __do_sys_reboot(317)CMD: 89abcdef
reboot: __do_sys_reboot(317)CMD: 4321fedc

Then nothing more for two minutes, then, as previously:

sr 5:0:0:0: tag#10 timing out command, waited 120 seconds.

Then I did a hard reboot.

This morning I detached the cables to the CDROM (but left the disk in
the box) and again booted 5.6.4 with "ignore_loglevel initcall_debug".
After working for a bit, I called `shutdown -h now', and now there were
quite a few more messages, but again the machine did not power off.
Here is the ouput, which I transcribed carefully, hopefully without
error (I omitted the fractional parts of the timestamps and some of the
usb1-portX messages):



no-cdrom-shutdown-5.6.4
Description: application/text/plain

After the last message here, the "echo", Call Trace, RIP and register
messages were repeated exactly as above.  At this point I did a hard
reboot.  A web search found something similar to the above output
("System hangs (bad RIP value) when disk used in pool is removed"), but
didn't otherwise seem to be related to the problem I'm having.

Do you still want me to check whether removing the iwlwifi driver makes
a differece?  And with the CDROM still detached, or does that not
matter?

Steve Berman


Re: power-off delay/hang due to commit 6d25be57 (mainline)

2020-05-14 Thread Stephen Berman
On Thu, 14 May 2020 00:04:28 +0200 Sebastian Andrzej Siewior 
 wrote:

> On 2020-05-08 23:30:45 [+0200], Stephen Berman wrote:
>> > Can you log the output on the serial console?
>>
>> How do I do that?
>
> The spec for your mainboard says "serial port header". You would need to
> connect a cable there to another computer and log its output.
> The alternative would be to delay the output on the console and use a
> camera.

It's easiest for me to take a picture, since there isn't much output and
in any case the delay happens on it's own ;-).  I'm sending you the
image (from kernel 5.6.4) off-list since even after reducing it it's 1.2
MB large.

>> > If the commit you cited is really the problem then it would mean that a
>> > worker isn't scheduled for some reason. Could you please enable
>> > CONFIG_WQ_WATCHDOG to see if workqueue core code notices that a worker
>> > isn't making progress?

I enabled that and also CONFIG_SOFTLOCKUP_DETECTOR,
CONFIG_HARDLOCKUP_DETECTOR and CONFIG_DETECT_HUNG_TASK, which had all
been unset previously.

>> How will I know if that happens, is there a specific message in the tty?
>
> On the tty console where you see the "timing out command, waited"
> message, there should be something starting with
> |BUG: workqueue lockup - pool
>
> following with the pool information that got stuck. That code checks the
> workqueues every 30secs by default. So if you waited >= 60secs then
> system is not detecting a stall.

As you can see in the photo, there was no message about a workqueue
lockup, only "task halt:5320 blocked for more than  seconds" every
two minutes.  I suppose that comes from one of the other options I
enabled.  Does it reveal anything about the problem?

> As far as I can tell, there is nothing special on your system. The CD
> and disk drives are served by the AHCI controller. There is no special
> SCSI/SATA/SAS controller.
> Right now I have no idea how the workqueues fit in the picture. Could
> you please check if the stall-dector says something?

Is that the message I repeated above or do you mean the workqueue?

> Is it possible to show me output when the timeout message comes? My
> guess is that the system is going down and before unounting/remount RO
> the filesystem it flushes its last data. But this is done before issuing
> the "halt-syscall".

The entire output from `shutdown -h now' is in the picture; after the
fourth "timing out command" message, I pressed the reset button.

Steve Berman


power-off delay/hang due to commit 6d25be57 (mainline)

2020-05-01 Thread Stephen Berman
I'm experiencing a delay or hang in powering off my computer after `halt
-d -f -i -p' and I've bisected it to this commit in the mainline tree:

commit 6d25be5782e482eb93e3de0c94d0a517879377d0
Author: Thomas Gleixner 
Date:   Wed Mar 13 17:55:48 2019 +0100

sched/core, workqueues: Distangle worker accounting from rq lock

The delays have varied in length from ~20 seconds to seeminingly
indefinitely long (the longest I've waited before pressing the start
button on the machine is 6 minutes).  With kernels prior to this commit,
my machine powers off within 4 seconds after the halt invocation.

I first noticed this problem with the stable kernel 5.5.9 that I built
with Linux From Scratch (LFS).  My previous LFS system had stable kernel
4.20.12 and there was no such delay.  I also tried stable kernel 5.6.4
and the delay/hang also happens with it.  After consulting with LFS
developers I cloned the mainline kernel repository and determined that
problem happens with commit with v5.2 (0ecfebd2) but not with v5.1
(e93c9c99), then ran the bisection with those boundaries.

With mainline kernels (I only tried with v.5.3 and then commits between
v5.1 and v5.2 during the bisection) there are no kernel messages during
the delay/hang, but with 5.5.9 and 5.6.4, when the delay is longer than
two minutes, a message like the following is displayed:

  sr 5:0:0:0: tag#21 timing out command, waited 120s

repeating every 2 minutes with a different tag#.  This made me think the
problem was due to the cdrom device in the machine, but when I compiled
5.3.0 without /dev/cdrom (by not setting CONFIG_BLK_DEV_SR, which also
left CONFIG_CDROM unset), the delay/hang still happened.

The length of the delay seems to depend not only on the kernel but also
the number or kind of programs used.  The 20 second delay has happened
only went I invoke `shutdown -h now' (which runs `halt -d -f -i -p' in
LFS) immediately after booting.  When I ran `more' in one tty and
`shutdown -h now' in another, the machine powered off in less than two
minutes.  But after running emacs in a tty (and then killing it), the
delay was seemingly indefinite, as it is with my usual workflow with X,
emacs, firefox, etc. running (but I always kill them before invoking
`shutdown -h now' from a tty).  During the bisection I tested each
kernel build by running startx, emacs, firefox, killing these, then
`shutdown -h now'.  I've attached the git log of the bisection augmented
with power-off timing notes.  (FWIW, when I reboot the machine with
`shutdown -r now' there is no delay with the problematic kernels.)

One piece of advice I got from the LFS mailing list was to try and
revert the above commit pinpointed by the bisection in a recent mainline
or stable kernel.  But it fails to revert cleanly (using `patch -R') in
such kernels and I am too unfamiliar with the code to try and revert it
manually.

Another pointer I got was to the recent commit 62849a96, which fixes a
bug due to the commit that causes my problem.  I applied 62849a96 to
5.6.4, but it did not prevent the delay/hang.

I've found nothing on the web about this problem, so it seems to be
rare.  That's why I'm seeking your help (on the advice of LFS
developers).  In addition to the bisection log I've also attached the
kernel 5.1.0 config file (I used that during the bisection, accepting
all defaults of later options, as with the configs of 5.5.9 and 5.6.4)
and the contents of /proc/cpuinfo.  Please let me know if there's any
other information I can provide or anything else I can do to help with
debugging this issue.  I hope you can fix it.

Thanks,
Steve Berman



bisect.log
Description: application/text/plain


config-5.1.0
Description: application/text/plain
processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 158
model name  : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
stepping: 10
microcode   : 0xaa
cpu MHz : 1225.910
cache size  : 12288 KB
physical id : 0
siblings: 12
core id : 0
cpu cores   : 6
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 22
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb 
rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology 
nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl 
vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe 
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch 
cpuid_fault invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi 
flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms 
invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 
xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d
bugs: cpu_meltdown spectre_v1