There are certainly no user inactivity monitors in the default Ubuntu
Server install and we have never seen systemd-suspend.service triggered
on any of our systems. I don't think this is a systemd bug - it's
either a configuration error, or it's a bug in whatever is triggering
systemd-suspend.service (which systemd itself does not do).
This bug report indicates that there is desktop software installed on
the system where this bug was encountered. There certainly may be idle
timers in desktop software that trigger a suspend; and there may be bugs
in how those triggers detect that the system is "idle"; possibly
including platform-specific bugs. Canonical does not support the Ubuntu
Desktop on POWER. Is this problem reproducible for you with a pristine
install of Ubuntu Server?
** Changed in: systemd (Ubuntu)
Status: New => Incomplete
--
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1758273
Title:
DD2.2 freezes/hangs after 20mins of uptime
Status in The Ubuntu-power-systems project:
Triaged
Status in systemd package in Ubuntu:
Incomplete
Bug description:
== Comment: #0 - Application Cdeadmin - 2018-03-19 09:30:53 ==
== Comment: #1 - Application Cdeadmin <> - 2018-03-19 09:30:55 ==
== Comment: #2 - Application Cdeadmin <> - 2018-03-19 09:30:57 ==
------- Comment From brihh 2018-03-19 09:10:30 EDT -------
Needless to say, machine is pretty unusable. Needed for Performance testing
for release.
== Comment: #3 - Application Cdeadmin <> - 2018-03-19 10:10:59 ==
------- Comment From vaibhav92 2018-03-19 10:06:17 EDT -------
@pridhiviraj Any idea who can look into this ?
------- Comment From dougmill-ibm 2018-03-19 10:09:59 EDT -------
If the machine is locked up and the console is unresponsive, try collecting
eSEL data from the BMC.
== Comment: #4 - Application Cdeadmin <c> - 2018-03-19 10:40:59 ==
------- Comment From pridhiviraj 2018-03-19 10:38:47 EDT -------
@brihh Can you use the latest 03/15 PNOR and re-create the issue? Also, before
it hangs, please collect the OPAL and kernel logs. @vaibhav92 If it is
re-creatable, the OPAL/EM team needs to look at it.
== Comment: #5 - Application Cdeadmin <> - 2018-03-19 11:01:58 ==
------- Comment From vaibhav92 2018-03-19 10:41:53 EDT -------
Saw this in the kernel-log of the system:
[ 1247.404962] PM: suspend entry (s2idle)
[ 1247.404970] PM: Syncing filesystems ... done.
Looks like it's getting suspended after 20 mins of inactivity. Looking at
/etc/systemd/logind.conf, I see that IdleAction is 'ignore' by default:
#IdleAction=ignore
#IdleActionSec=30min
But clearly someone is issuing a suspend to the system. So this probably needs
to be looked at by the distro/Power-Management/EM team.
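As a rough illustration of why the two commented-out lines above leave the defaults in force, here is a minimal Python sketch that reads logind.conf-style text and reports the effective values. The function name and the `defaults` dictionary are hypothetical; the default values are simply the ones shown in the commented lines, and drop-in files under /etc/systemd/logind.conf.d/ are not handled.

```python
def effective_logind_settings(conf_text, defaults=None):
    """Return effective key/value settings from logind.conf-style text.

    Lines starting with '#' or ';' are comments, so a commented-out
    '#IdleAction=ignore' does not override the built-in default.
    """
    settings = dict(defaults or {})
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and section headers such as "[Login]".
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Assumption: defaults mirror the commented lines quoted in this report.
defaults = {"IdleAction": "ignore", "IdleActionSec": "30min"}
sample = "[Login]\n#IdleAction=ignore\n#IdleActionSec=30min\n"
print(effective_logind_settings(sample, defaults))
```

If this reports `IdleAction` as `ignore`, logind's own idle handling is unlikely to be the source of the suspend, which is consistent with the conclusion above that something else is requesting it.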
------- Comment From megcurry 2018-03-19 10:43:14 EDT -------
Please advise re. Assignments and Labels that would get the right team
working on this.....Mirror label gets a Bz opened, and that is necessary or at
least useful for some of the LTC teams to look at things, right?
------- Comment From brihh 2018-03-19 10:47:45 EDT -------
Odd about inactivity: I once had a test running for the whole 20 mins and it
still froze:
`08:52:23 up 20 min, 4 users, load average: 64.40, 98.94, 64.32`
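The 20-minute figure can be cross-checked against the kernel timestamp quoted earlier in this report. A small Python sketch (the helper name is hypothetical) that converts the bracketed seconds-since-boot value into minutes:

```python
import re


def kernel_uptime_minutes(log_line):
    """Extract the '[ seconds.micros]' kernel timestamp and return minutes."""
    match = re.search(r"\[\s*(\d+\.\d+)\]", log_line)
    if match is None:
        raise ValueError("no kernel timestamp found")
    return float(match.group(1)) / 60.0


line = "[ 1247.404962] PM: suspend entry (s2idle)"
print(f"{kernel_uptime_minutes(line):.1f} min")  # prints "20.8 min"
```

The suspend entry therefore lands just under 21 minutes after boot, matching the uptime output above even though the load average shows the machine was busy.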
------- Comment From brihh 2018-03-19 10:49:30 EDT -------
@pridhiviraj where is the latest 3/15 PNOR that I can load?
== Comment: #8 - Application Cdeadmin <> - 2018-03-19 12:40:54 ==
------- Comment From pridhiviraj 2018-03-19 12:34:18 EDT -------
```
Mar 19 11:35:01 p215n15 rsyslogd-2007: action 'action 13' suspended, next
retry is Mon Mar 19 11:35:31 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 19 11:35:01 p215n15 CRON[6464]: (root) CMD (command -v debian-sa1 >
/dev/null && debian-sa1 1 1)
Mar 19 11:42:39 p215n15 systemd[1]: Starting Cleanup of Temporary
Directories...
Mar 19 11:42:39 p215n15 rsyslogd-2007: action 'action 13' suspended, next
retry is Mon Mar 19 11:43:39 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 19 11:42:40 p215n15 systemd-tmpfiles[6504]:
[/usr/lib/tmpfiles.d/var.conf:14] Duplicate line for path "/var/log", ignoring.
Mar 19 11:42:40 p215n15 systemd[1]: Started Cleanup of Temporary Directories.
Mar 19 11:45:01 p215n15 rsyslogd-2007: action 'action 13' suspended, next
retry is Mon Mar 19 11:46:01 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 19 11:45:01 p215n15 CRON[6524]: (root) CMD (command -v debian-sa1 >
/dev/null && debian-sa1 1 1)
Mar 19 11:47:39 p215n15 systemd[1]: Starting Message of the Day...
Mar 19 11:47:39 p215n15 rsyslogd-2007: action 'action 13' suspended, next
retry is Mon Mar 19 11:48:39 2018 [v8.16.0 try http://www.rsyslog.com/e/2007 ]
Mar 19 11:47:40 p215n15 50-motd-news[6528]: * Meltdown, Spectre and Ubuntu:
What are the attack vectors,
Mar 19 11:47:40 p215n15 50-motd-news[6528]: how the fixes work, and
everything else you need to know
Mar 19 11:47:40 p215n15 50-motd-news[6528]: - https://ubu.one/u2Know
Mar 19 11:47:40 p215n15 systemd[1]: Started Message of the Day.
Mar 19 11:48:02 p215n15 NetworkManager[4662]: <info> [1521474482.7252]
manager: sleep: sleep requested (sleeping: no enabled: yes)
Mar 19 11:48:02 p215n15 NetworkManager[4662]: <info> [1521474482.7259]
manager: NetworkManager state is now ASLEEP
Mar 19 11:48:02 p215n15 gnome-shell[5538]: Screen lock is locked down, not
locking
Mar 19 11:48:02 p215n15 whoopsie[5264]: [11:48:02] offline
Mar 19 11:48:02 p215n15 systemd[1]: Reached target Sleep.
Mar 19 11:48:02 p215n15 systemd[1]: Starting Suspend...
Mar 19 11:48:02 p215n15 systemd-sleep[6576]: Suspending system...
Mar 19 11:48:02 p215n15 kernel: [ 1246.906320] PM: suspend entry (s2idle)
```
@vaibhav92 You are right. From the above messages it looks like the system is
being suspended, by some cron job I guess.
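One way to narrow down who requested the suspend is to list which programs logged in the same second as the "Starting Suspend..." message. The Python sketch below does that for syslog-style lines; the function name is hypothetical, the sample lines are copied (unwrapped) from the log above, and logging in the same second is only a hint, not proof of who triggered the suspend.

```python
import re

# Matches the classic syslog prefix "Mon DD HH:MM:SS host tag[pid]: message".
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<tag>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s*(?P<msg>.*)$"
)


def sources_at_suspend(lines, marker="Starting Suspend"):
    """Return the program tags that logged in the same second as `marker`."""
    parsed = [m.groupdict() for m in map(LINE_RE.match, lines) if m]
    suspend_ts = next((p["ts"] for p in parsed if marker in p["msg"]), None)
    if suspend_ts is None:
        return []
    seen = []
    for entry in parsed:
        if entry["ts"] == suspend_ts and entry["tag"] not in seen:
            seen.append(entry["tag"])
    return seen


sample_lines = [
    "Mar 19 11:45:01 p215n15 CRON[6524]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)",
    "Mar 19 11:47:40 p215n15 systemd[1]: Started Message of the Day.",
    "Mar 19 11:48:02 p215n15 NetworkManager[4662]: <info> [1521474482.7252] manager: sleep: sleep requested (sleeping: no enabled: yes)",
    "Mar 19 11:48:02 p215n15 gnome-shell[5538]: Screen lock is locked down, not locking",
    "Mar 19 11:48:02 p215n15 whoopsie[5264]: [11:48:02] offline",
    "Mar 19 11:48:02 p215n15 systemd[1]: Reached target Sleep.",
    "Mar 19 11:48:02 p215n15 systemd[1]: Starting Suspend...",
    "Mar 19 11:48:02 p215n15 systemd-sleep[6576]: Suspending system...",
    "Mar 19 11:48:02 p215n15 kernel: [ 1246.906320] PM: suspend entry (s2idle)",
]
print(sources_at_suspend(sample_lines))
```

On the excerpt above, no CRON entry shares the suspend timestamp (the last cron run is three minutes earlier), while gnome-shell and NetworkManager do log in that second, so desktop components are also worth checking.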
== Comment: #9 - Application Cdeadmin <> - 2018-03-19 13:30:57 ==
------- Comment From brihh 2018-03-19 13:28:36 EDT -------
@pridhiviraj interesting. know offhand how i can shut that off? seems a bit
annoying .. :-)
== Comment: #11 - Application Cdeadmin <> - 2018-03-19 17:10:51 ==
------- Comment From mzipse 2018-03-19 16:52:11 EDT -------
At our daily defect call, it was suggested that we check to see if Opal-PRD
is running, which is a prereq for the Firmware recovery to work properly.
Opal-PRD is an app that should be part of the distro and should automatically
be started. Nevertheless, if you want to check on it, here's the command
to run at the OS.....
sudo service opal-prd status
You'll need root authority to do this.
The output will look something like this.....
# sudo service opal-prd status
● opal-prd.service - OPAL PRD daemon
Loaded: loaded (/lib/systemd/system/opal-prd.service; enabled; vendor
preset: enabled)
Active: active (running) since Wed 2018-03-14 19:04:36 CDT; 4 days ago
Docs: man:opal-prd(8)
Main PID: 5085 (opal-prd)
Tasks: 1
CGroup: /system.slice/opal-prd.service
└─5085 /usr/sbin/opal-prd --pnor /dev/mtd0
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:~[0x00d0] 1d000000
bd021c00 0000bf02 1c000000 *................*
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:~[0x00e0] c1021b00
0000c302 1d000000 00030000 *................*
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:~[0x00f0] 0000ff06
21465245 51000206 16000000 *....!FREQ.......*
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:~[0x0100] 5d08f600
00006008 f6000000 6308f600 *].....`.....c...*
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:~[0x0110] 00006608
f6000000 6908f600 00006c08 *..f.....i.....l.*
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:~[0x0120] f6000000
6f08f600 00007208 *....o.....r. *
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:OCC1 rsp status=0x00,
length=0x01CC
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:rsp data: (up to 16 bytes)
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:~[0x0000] 03100200
03000000 00000000 00000000 *................*
Mar 18 19:52:21 ws003p1 opal-prd[5085]: HBRT: HTMGT:<<processOccError()
== Comment: #13 - Application Cdeadmin <> - 2018-03-19 17:30:51 ==
------- Comment From brihh 2018-03-19 17:22:06 EDT -------
@mzipse seems to be there and running:
```
root@p215n15:~# service opal-prd status
● opal-prd.service - OPAL PRD daemon
Loaded: loaded (/lib/systemd/system/opal-prd.service; enabled; vendor
preset: enabled)
Active: active (running) since Mon 2018-03-19 18:19:58 EDT; 1min 35s ago
Docs: man:opal-prd(8)
Main PID: 4648 (opal-prd)
Tasks: 1 (limit: 27033)
CGroup: /system.slice/opal-prd.service
└─4648 /usr/sbin/opal-prd
Mar 19 18:19:58 p215n15 opal-prd[4648]: SCOM: read: chip 0x8, addr 0x8010a50,
val 0x0, rc 0
Mar 19 18:19:58 p215n15 opal-prd[4648]: SCOM: read: chip 0x8, addr 0x8010a54,
val 0x0, rc 0
Mar 19 18:19:58 p215n15 opal-prd[4648]: HBRT: PRDF:<<PRDF::noLock_initialize()
Mar 19 18:19:58 p215n15 opal-prd[4648]: HBRT: PRDF:<<PRDF::initialize()
Mar 19 18:19:58 p215n15 opal-prd[4648]: HBRT:
ATTN_SLOW:I>Service::enableAttns() enter
Mar 19 18:19:58 p215n15 opal-prd[4648]: HBRT:
ATTN_SLOW:I>Service::enableAttns() exit
Mar 19 18:19:58 p215n15 opal-prd[4648]: HBRT:
ATTN_SLOW:I><<ATTN_RT::enableAttns rc: 0
Mar 19 18:19:58 p215n15 opal-prd[4648]: HBRT: calling get_ipoll_events
Mar 19 18:19:58 p215n15 opal-prd[4648]: HBRT: enabling IPOLL events
0x5b90000000000000
Mar 19 18:19:58 p215n15 opal-prd[4648]: FW: writing init message
```
== Comment: #14 - Application Cdeadmin <> - 2018-03-20 03:50:53 ==
------- Comment From vaibhav92 2018-03-20 03:47:08 EDT -------
Did some kernel tracing and it seems that someone is starting the
systemd-suspend.service that invokes systemd-sleep which then forces a suspend
by writing to /sys/power/state file. Further investigation is needed as to who
is invoking the systemd-suspend.service. In the meantime, you can issue the
following command on the host to mask the service and prevent the system from
being suspended:
`systemctl mask systemd-suspend.service`
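To confirm that the mask took effect, the unit-file state can be queried with `systemctl is-enabled`, which prints `masked` (or `masked-runtime`) for a masked unit. A minimal Python sketch, with hypothetical helper names:

```python
import subprocess


def unit_file_state(unit):
    """Return the state printed by `systemctl is-enabled <unit>`."""
    result = subprocess.run(
        ["systemctl", "is-enabled", unit],
        capture_output=True,
        text=True,
        check=False,  # is-enabled exits non-zero for masked units
    )
    return result.stdout.strip()


def is_masked(state):
    """True if the given is-enabled output indicates a masked unit."""
    return state.strip() in {"masked", "masked-runtime"}
```

On the affected host, `is_masked(unit_file_state("systemd-suspend.service"))` should return `True` once the mask command has been run. The mask can be reverted later with `systemctl unmask systemd-suspend.service`.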
Actually, it is also possible that the presence of the suspend config
indicates user error. If Ubuntu Server was installed, I don't think suspend
should have been configured. Did someone install "desktop Ubuntu" (or some
other non-server config) on a server?
Also, 18.04 should be used, not 17.10 (and never 17.04), on P9.
== Comment: #25 - Application Cdeadmin <> - 2018-03-23 00:20:53 ==
------- Comment From vaibhav92 2018-03-23 00:19:40 EDT -------
Hi @mzipse @stewart-ibm. AFAIK this issue is not related to CAPP or CAPI at
all. @brian_horton had confirmed that he was seeing this even without
enabling/running any CAPI workloads. Disabling the systemd-suspend.service made
the problem go away, hence I asked for the bug to be mirrored to Canonical.
@mzipse I'm not sure what CAPP errors the PRD team saw. Can you please ask them
to get in touch with me?
== Comment: #26 - Vaibhav Jain <> - 2018-03-23 00:39:45 ==
Summary of the issue:
Ubuntu 18.04 forces a system suspend about 20 minutes after boot. The
suspend is forced even if the system is running a workload or a user is logged
in on the terminal and actively working. The issue goes away if the
systemd-suspend service is disabled via:
"systemctl mask systemd-suspend.service"
So I am requesting that Canonical look into this issue as a possible bug in
systemd or in a user inactivity monitor.
~ Vaibhav
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1758273/+subscriptions