------- Comment From mranw...@us.ibm.com 2018-12-06 02:22 EDT-------
I recreated the problem where I could see the errors in dmesg (and the console) 
and then added the firmware to /lib/firmware/nvidia/gv100.  After that:
mranweil@ltc-wspoon5:~$ dmesg|grep -i nouv
[    6.632529] nouveau 0004:04:00.0: enabling device (0140 -> 0142)
[    6.632613] nouveau 0004:04:00.0: Using 32-bit DMA via iommu
[    6.632721] nouveau 0004:04:00.0: NVIDIA GV100 (140000a1)
<snip>
[    7.061963] nouveau 0035:03:00.0: DRM: Pointer to TMDS table invalid
[    7.061966] nouveau 0035:03:00.0: DRM: DCB version 4.1
[    7.063141] nouveau 0035:03:00.0: DRM: MM: using COPY for buffer copies
[    7.063154] [drm] Initialized nouveau 1.3.1 20120801 for 0035:03:00.0 on minor 2
mranweil@ltc-wspoon5:~$

So it looks like the firmware from the current git tree addresses the error
messages.  I didn't do anything further with the driver.
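
A minimal sketch of how one might check that a firmware blob is present where the kernel looks for it. The relative name nvidia/gv100/gr/sw_nonctx.bin is taken from the dmesg output quoted further down in this report; the /lib/firmware root is assumed to be the search location in use, and the helper name missing_firmware is hypothetical.

```python
import os


def missing_firmware(names, root="/lib/firmware"):
    """Return the firmware names that are not regular files under root."""
    return [n for n in names if not os.path.isfile(os.path.join(root, n))]


if __name__ == "__main__":
    # Name copied from the "Direct firmware load ... failed" message in the log.
    wanted = ["nvidia/gv100/gr/sw_nonctx.bin"]
    for name in missing_firmware(wanted):
        print("missing:", name)
```

An empty result only says the file exists at that path; it does not say the driver accepted it, so the dmesg check above is still the real confirmation.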

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1794055

Title:
  [Witherspoon-DD2.2][Ubu 18.10] [4.18.0-7-generic ] OS booting thrown
  with nouveau errors; OS booted successfully

Status in The Ubuntu-power-systems project:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Cosmic:
  Incomplete

Bug description:
  == Comment: #0 - Kalpana Shetty <kalsh...@in.ibm.com> - 2018-09-15 23:55:13 ==
  ---Problem Description---
  [Witherspoon-DD2.2][Ubu 18.10] [4.18.0-7-generic ] OS booting thrown with nouveau errors
   
  Contact Information = kalsh...@in.ibm.com, preeti.tha...@in.ibm.com 
   
  ---uname output---
  root@ltc-wcwsp3:~# uname -a
  Linux ltc-wcwsp3 4.18.0-7-generic #8-Ubuntu SMP Tue Aug 28 18:20:56 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
   
  Machine Type = Witherspoon DD2.2 LC 
   
  Steps:
  1. Netinstall Ubu 18.10 on Witherspoon-LC-DD2.2 6GPU system ------> PASS
  2. Boot the OS ---> PASS, but errors related to the open source NVIDIA driver (nouveau) are thrown on the console.

    [Disk: sdb2 / c0302064-c5a3-49a7-8bd4-402283e6fcbe]
      Ubuntu, with Linux 4.18.0-7-generic (recovery mode)
      Ubuntu, with Linux 4.18.0-7-generic
      Ubuntu
    [Disk: nvme0n1p2 / c5d042f1-812e-49e0-94b2-ade477084061]
      Ubuntu, with Linux 4.18.0-7-generic (recovery mode)
   *  Ubuntu, with Linux 4.18.0-7-generic
      Ubuntu

    System information
    System configuration
    System status log
    Language
    Rescan devices
    Retrieve config from URL
    Plugins (0)
    Exit to shell
   
   ------------------------------------------------------------------------------
   Enter=accept, e=edit, n=new, x=exit, l=language, g=log, h=help
  The system is going down NOW!
  Sent SIGTERM to all processes
  Sent SIGKILL to all processes
  [   57.513329] kexec_core: Starting new kernel
  [  149.358703978,5] OPAL: Switch to big-endian OS
  [  153.355498935,5] OPAL: Switch to little-endian OS
  [    2.943735] integrity: Unable to open file: /etc/keys/x509_ima.der (-2)
  [    2.943738] integrity: Unable to open file: /etc/keys/x509_evm.der (-2)
  [    3.132733] vio vio: uevent: failed to send synthetic uevent
  [    4.058698] nouveau 0004:04:00.0: gr: failed to load gr/sw_nonctx
  [    4.129215] nouveau 0004:04:00.0: DRM: failed to create kernel channel, -22
  [   19.126509] nouveau 0004:04:00.0: DRM: failed to idle channel 0 [DRM]
  [   19.281450] nouveau 0004:05:00.0: gr: failed to load gr/sw_nonctx
  [   19.351322] nouveau 0004:05:00.0: DRM: failed to create kernel channel, -22
  [   34.350509] nouveau 0004:05:00.0: DRM: failed to idle channel 0 [DRM]
  [   34.502063] nouveau 0004:06:00.0: gr: failed to load gr/sw_nonctx
  [   34.572144] nouveau 0004:06:00.0: DRM: failed to create kernel channel, -22
  [   49.570509] nouveau 0004:06:00.0: DRM: failed to idle channel 0 [DRM]
  [   49.734754] nouveau 0035:03:00.0: gr: failed to load gr/sw_nonctx
  [   49.805057] nouveau 0035:03:00.0: DRM: failed to create kernel channel, -22
  [   64.802510] nouveau 0035:03:00.0: DRM: failed to idle channel 0 [DRM]
  [   64.955442] nouveau 0035:04:00.0: gr: failed to load gr/sw_nonctx
  [   65.025537] nouveau 0035:04:00.0: DRM: failed to create kernel channel, -22

  [   80.022509] nouveau 0035:04:00.0: DRM: failed to idle channel 0 [DRM]
  [   80.181169] nouveau 0035:05:00.0: gr: failed to load gr/sw_nonctx
  [   80.251481] nouveau 0035:05:00.0: DRM: failed to create kernel channel, -22
  [   95.250509] nouveau 0035:05:00.0: DRM: failed to idle channel 0 [DRM]
  /dev/nvme0n1p2: recovering journal
  /dev/nvme0n1p2: clean, 72569/97681408 files, 7384418/390701312 blocks
  -.mount
  kmod-static-nodes.service
  dev-hugepages.mount
  dev-mqueue.mount
  sys-kernel-debug.mount
  ufw.service
  lvm2-lvmetad.service
  systemd-remount-fs.service
  systemd-random-seed.service
  systemd-sysusers.service
  keyboard-setup.service
  systemd-tmpfiles-setup-dev.service
  lvm2-monitor.service
  finalrd.service
  console-setup.service
  swapfile.swap
  ebtables.service
  systemd-udevd.service
  systemd-journald.service
  systemd-journal-flush.service
  systemd-tmpfiles-setup.service
  systemd-update-utmp.service
  [  100.997765] vio vio: uevent: failed to send synthetic uevent
  systemd-udev-trigger.service
  systemd-timesyncd.service
  apparmor.service
  lvm2-pvscan@8:3.service
  systemd-modules-load.service
  sys-kernel-config.mount
  sys-fs-fuse-connections.mount
  systemd-sysctl.service
  ondemand.service
  dbus.service
  irqbalance.service
  opal-prd.service
  lxcfs.service
  atd.service
  cron.service
  iprdump.service
  iprinit.service
  systemd-logind.service
  iprupdate.service
  systemd-networkd.service
  rsyslog.service
  polkit.service
  accounts-daemon.service
  lxd-containers.service
  networkd-dispatcher.service
  var-lib-lxcfs.mount
  tmp-selftest\x2dmountpoint\x2d039055037.mount
  snapd.service
  snapd.seeded.service
  systemd-resolved.service
  systemd-networkd-wait-online.service
  blk-availability.service
  systemd-user-sessions.service
  apport.service

  Ubuntu Cosmic Cuttlefish (development branch) ltc-wcwsp3 hvc0

  ltc-wcwsp3 login:
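
In the console log above, each GPU appears to stall for roughly 15 seconds between "failed to create kernel channel" and "failed to idle channel". A minimal sketch for measuring that gap per PCI device from a saved copy of the console output follows; the function name stall_per_device and the regular expression are illustrative assumptions, not part of any nouveau tooling.

```python
import re

# Matches the two nouveau DRM messages seen in the console log above.
LINE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+nouveau\s+(\S+):\s+DRM: failed to "
    r"(create kernel channel|idle channel)"
)


def stall_per_device(log_text):
    """Map PCI address -> seconds between the 'create' and 'idle' failures."""
    started, stalls = {}, {}
    for line in log_text.splitlines():
        m = LINE.search(line)
        if not m:
            continue
        ts, dev, what = float(m.group(1)), m.group(2), m.group(3)
        if what == "create kernel channel":
            started[dev] = ts
        elif dev in started:
            stalls[dev] = ts - started.pop(dev)
    return stalls
```

Summing the returned values gives an estimate of how much the missing firmware adds to boot time on a multi-GPU system.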

  == Comment: #2 - Kalpana Shetty <kalsh...@in.ibm.com> - 2018-09-16 00:07:26 ==
  sosreport -> http://9.114.13.132/repo/bugs/ubu/sosreport-BZ171506.171506-20180915235600.tar.xz

  == Comment: #3 - Kalpana Shetty <kalsh...@in.ibm.com> - 2018-09-16 00:33:02 ==

  
  == Comment: #4 - Praveen K. Pandey <praveen.pan...@in.ibm.com> - 2018-09-19 05:52:23 ==
  Facing nouveau-related errors on a POWER8 system as well.

  
  [    4.764818] nouveau 0002:01:00.0: fifo: fault 00 [READ] at 0000000000020000 engine 0c [HOST6] client 06 [GPC0/L1_2] reason 02 [PTE] on channel 0 [03ffb18000 DRM]
  [    4.942169] nouveau 000a:01:00.0: fifo: fault 00 [READ] at 0000000000020000 engine 0c [HOST6] client 06 [GPC0/L1_2] reason 02 [PTE] on channel 0 [03ffb18000 DRM]
  /dev/sdb2: clean, 132397/61054976 files, 5995714/244188416 blocks
  [   11.206278] vio vio: uevent: failed to send synthetic uevent
  [  OK  ] Started Show Plymouth Boot Screen.
  [  OK  ] Reached target Local Encrypted Volumes.
  [  OK  ] Started Forward Password Requests to Plymouth Directory Watch.
  plymouth-start.service
  [  OK  ] Started ebtables ruleset management.

  == Comment: #5 - Chandni Verma <chand...@in.ibm.com> - 2018-09-20 16:41:49 ==
  --- screening ---

  From provided dmesg, I notice:

  
  1294 [   19.281478] nouveau 0004:05:00.0: bios: version 88.00.13.00.02
  1295 [   19.282753] nouveau 0004:05:00.0: Direct firmware load for nvidia/gv100/gr/sw_nonctx.bin failed with error -2
  1296 [   19.282755] nouveau 0004:05:00.0: gr: failed to load gr/sw_nonctx
  1297 [   19.282813] nouveau 0004:05:00.0: Using 32-bit DMA via iommu

  ..

  1322 [   34.367713] nouveau 0004:06:00.0: NVIDIA GV100 (140000a1)
  1323 [   34.497152] nouveau 0004:06:00.0: bios: version 88.00.13.00.02
  1324 [   34.502736] nouveau 0004:06:00.0: Direct firmware load for nvidia/gv100/gr/sw_nonctx.bin failed with error -2
  1325 [   34.502738] nouveau 0004:06:00.0: gr: failed to load gr/sw_nonctx
  1326 [   34.502797] nouveau 0004:06:00.0: Using 32-bit DMA via iommu

  ..

  up to 6 instances of the above...

  
  Looks like an NVIDIA firmware issue.

  == Comment: #6 - Luciano Chavez <cha...@us.ibm.com> - 2018-09-20 17:03:31 ==
  (In reply to comment #5)

  Well, I think those messages mean that the nouveau module can't find
  the firmware file, as opposed to it being a firmware issue. It might be a
  packaging issue, if this is not actually causing any real problems.
  Probably best to mirror this to Canonical for their comment.
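
On Linux, error -2 is ENOENT ("No such file or directory"), which is consistent with the reading that the file is simply not installed. A minimal sketch for pulling the failing firmware names out of a saved dmesg and decoding the error number follows; the function name firmware_failures and the regular expression are illustrative assumptions.

```python
import errno
import re

# Matches "Direct firmware load for <name> failed with error -<n>" for nouveau
# devices; \s+ tolerates the line wrap seen in the quoted log.
FW_FAIL = re.compile(
    r"nouveau (\S+): Direct firmware load for\s+(\S+) failed with error -(\d+)"
)


def firmware_failures(log_text):
    """Return (pci_address, firmware_name, errno_name) for each failed load."""
    return [
        (dev, name, errno.errorcode.get(int(code), "unknown"))
        for dev, name, code in FW_FAIL.findall(log_text)
    ]
```

Any result whose errno name is not ENOENT would point away from a packaging problem and be worth a closer look.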

  == Comment: #10 - Chandni Verma <chand...@in.ibm.com> - 2018-09-24 03:25:35 ==

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1794055/+subscriptions
