You have been subscribed to a public bug:

 
### uname -a (64-bit ARM, official image):
`Linux ubuntu 5.4.0-1015-raspi #15-Ubuntu SMP Fri Jul 10 05:34:24 UTC 2020 
aarch64 aarch64 aarch64 GNU/Linux`
### LSB release (Ubuntu *Server*, focal):
Description:    Ubuntu 20.04.1 LTS
### Interesting packages installed
- zfs-dkms (with initramfs support) @ 0.8.3-1ubuntu12.2  
  * spl-dkms @ 0.8.3-1ubuntu12.2  
- dphys-swapfile
### Hardware model:
Raspberry Pi 3 Model B
- 32 GiB SD card with root partition
  * had a swap partition; now unused
  * migrated to dphys-swapfile
- Attached 32 GiB USB stick as zpool for storage (not root FS)
- Current PSU is reportedly rated at 2.4 A
  * Still get occasional undervolt warnings (the Pi formally requires 2.5 A)
  * The lightning-bolt indicator does not appear, however
- Connected over wireless networking
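
The undervolt warnings above can be cross-checked against the firmware's throttled bitmask, which is presumably the same value the `Failed to get throttled` message refers to. A minimal sketch of decoding it, assuming the `vcgencmd get_throttled` tool is available on this image (not verified here). The bit meanings follow the Raspberry Pi documentation; the function name and the example value `0x50005` are illustrative only and were not read from this machine:

```python
# Bit meanings of the value reported by `vcgencmd get_throttled`,
# as described in the Raspberry Pi documentation.
THROTTLED_BITS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(value):
    """Return the names of all flags set in a throttled bitmask."""
    return [
        name
        for bit, name in sorted(THROTTLED_BITS.items())
        if value & (1 << bit)
    ]


# Hypothetical value, chosen only to illustrate the decoding.
example = decode_throttled(0x50005)
for flag in example:
    print(flag)
```

A non-zero result in bits 0 or 16 would confirm that the supply dipped at some point, even when the on-screen indicator never showed.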

## Issue
- Under significant computational load, the machine eventually appears to 
freeze.
  * I usually log in headless via ssh, so from the outside the machine is 
frozen and I need to pull the power cable
- Connecting an HDMI monitor shows the following messages, in a different order 
each time:

```terminal
cpu cpu0: dev_pm_opp_set_rate: failed to find current OPP for freq 
9,223,372,036,854,775,698 ({illegible on my photograph, presumably -110})
hwmon hwmon1: Failed to get throttled (-110)
raspberrypi-clk firmware clocks: Failed to change plib frequency: -110
mmc0: timeout waiting for hardware interrupt
# mmc0 would be the root partition

### ... typically later on in the output

rcu: INFO: rcu_sched detected stalls on CPU/tasks
rcu: $1-...0: (1 GPs behind) idle=.../1/0x40000{more 0s...}02 
softirq=66377/66378 {or 26106/26107} fqs={this value varies}
INFO: task kworker/{...} blocked for more than 120 seconds
   TAINTED: P    WC OE 5.4.0-1015-raspi #15-ubuntu
watchdog: BUG: soft lockup - CPU #3 stuck for 22s!

```

The OPP frequency above looks to me like it may be the cause of the
issue. I added the commas to the output myself, but the value appears to
be garbage. [This](https://lkml.org/lkml/2020/7/24/683) mailing list
thread, which I found while searching for terms from the messages,
appears to support my belief that we should be seeing a sensible CPU
frequency here, expressed in integer Hz. The value above would be
9.2 EHz if Hertz are the base unit, and higher still if the unit is kHz,
MHz or GHz. My best guess is that this value has been picked up
somewhere as garbage and that, understandably, the system then fails to
scale the clock speed, with the resulting crashes presumably following
from that.
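
As a quick arithmetic check on the figure quoted above: it is exactly 110 below 2^63, and 110 is the Linux errno for ETIMEDOUT, the same code as the `-110` in the other messages. The sketch below only demonstrates the arithmetic. The variable names are illustrative, and the suggestion that an error code leaked into the frequency is a guess, not something confirmed from the kernel source:

```python
# Value transcribed from the console photograph (commas removed).
observed_hz = 9_223_372_036_854_775_698

# Linux errno for ETIMEDOUT; hard-coded because errno values differ by platform.
ETIMEDOUT_LINUX = 110

offset_from_2_63 = 2**63 - observed_hz
exahertz = observed_hz / 1e18

print(offset_from_2_63)                     # 110
print(offset_from_2_63 == ETIMEDOUT_LINUX)  # True
print(round(exahertz, 1))                   # 9.2
```

If that reading is right, the "frequency" is not random garbage but a timeout error being treated as a clock rate somewhere along the way.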

Beyond this point there is no kernel panic, but the machine locks up as
seen from outside: it does not respond to NumLock on a USB keyboard and
is invisible on the network, while more and more errors are gradually
output to the console on the HDMI display, the most notable being that
the SD card is not responding.

Just before encountering this issue I had added a swap partition to the
SD card, as I had none by default and the system seemed to hang when it
was presumably sending bad_allocs to userland processes after failing to
allocate memory. As the SD card is mentioned in the messages, I tried a
variety of power supplies (I was getting several undervolt warnings) and
eventually removed the swap partition in favour of a swapfile managed by
`dphys-swapfile`, knowing that the way the Pi accesses the SD card is
somewhat different from a typical machine. However, neither change seems
to have resolved the issue, which is further evidence that the frequency
scaling may well be the primary issue and the rest is simply the carnage
that ensues.


## Steps to Reproduce
- Seems to happen sporadically when the machine is under stress, within 5-25 
minutes
- Currently I am trying to set up a rootless docker compose file
  * Attempting to pull the images eventually leads to the issue
  * The images are being downloaded to the zpool on the USB stick and *not* the 
SD card
- The system seems to hang initially while waiting on the SD card to respond to 
an IRQ; however, I believe the CPU scaling message points to the root cause
- None of the important messages appear in the `syslog`; I need an external 
HDMI monitor to see the kernel ring buffer output on screen
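
Because the messages only reach the HDMI console, they have to be transcribed by hand, and it can help to pick out which of them share the `-110` code. A minimal sketch, assuming the lines have been typed into a list as in the Issue section. The function name and the regular expression are illustrative and not part of any kernel tooling:

```python
import re

# Linux errno 110 is ETIMEDOUT.
ETIMEDOUT = 110

# Matches a trailing negative error code such as "(-110)" or ": -110".
ERR_CODE = re.compile(r"[(:]\s*-(\d+)\)?\s*$")


def timed_out_lines(lines):
    """Return the lines whose trailing error code is -ETIMEDOUT."""
    hits = []
    for line in lines:
        stripped = line.strip()
        match = ERR_CODE.search(stripped)
        if match and int(match.group(1)) == ETIMEDOUT:
            hits.append(stripped)
    return hits


# Lines copied from the transcription in the Issue section.
sample = [
    "hwmon hwmon1: Failed to get throttled (-110)",
    "raspberrypi-clk firmware clocks: Failed to change plib frequency: -110",
    "mmc0: timeout waiting for hardware interrupt",
]

for line in timed_out_lines(sample):
    print(line)
```

On this sample the hwmon and clock messages are selected, while the mmc0 message is not, since it reports a timeout in words rather than as an error code.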

## Links

- [Related AskUbuntu 
question](https://askubuntu.com/questions/1241412/ubuntu-20-04-lts-hangs-with-error-hwmon1-failed-to-get-throttled-110)
- [Potentially related bug - the frequency issue seems to be the same, however 
the specific cause and a workaround are 
different](https://bugs.launchpad.net/ubuntu/+source/linux-raspi/+bug/1875148)


## Extra
- Attaching /proc/cpuinfo
- Please let me know if any more diagnostics are required; I would use hardinfo 
or inxi, but both want to install large parts of X, which I don't want to do

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: bot-comment raspberrypi
-- 
Raspberry Pi 3B hangs - dev_pm_opp_set_rate: failed to find current OPP,   
Failed to get throttled, Failed to change plib frequency; mmc timeout waiting 
for hardware interrupt
https://bugs.launchpad.net/bugs/1889637
You received this bug notification because you are a member of Kernel Packages, 
which is subscribed to linux in Ubuntu.

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
