[Bug 2059353] Re: kernel 6.5.0-26-generic causes 640x480 default resolution on aspeed (drm/ast) video

2024-05-13 Thread Quesar
Can the "drm/ast: report connection status on Display Port" patch please
be re-added to the kernel to fix this issue?  Is there a particular
reason not to include it?  It was in earlier 6.5 releases but was later
reverted, and I don't see anything logged about why.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2059353

Title:
  kernel 6.5.0-26-generic causes 640x480 default resolution on aspeed
  (drm/ast) video

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-signed-hwe-6.5/+bug/2059353/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 2055222] Re: ucx library fails with Genoa CPUs and InfiniBand

2024-04-04 Thread Quesar
This bug report includes the solution.  Can someone please acknowledge
and respond to it?  This is an easy fix at this point but it has been
ignored for over a month already.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2055222

Title:
  ucx library fails with Genoa CPUs and InfiniBand

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ucx/+bug/2055222/+subscriptions



[Bug 2059353] [NEW] kernel 6.5.0-26-generic causes 640x480 default resolution on aspeed (drm/ast) video

2024-03-27 Thread Quesar
Public bug reported:

The previous kernel 6.5.0-17 worked fine.  With -26, the gdm login
screen is only 640x480.  I inspected the changelog from -17 to -26 and
found '- Revert "drm/ast: report connection status on Display Port."' in
the list.  I found the reverted patch at:
https://www.spinics.net/lists/stable/msg682026.html

I then built the 6.5.0-26 package with the "report connection" patch
added and the resulting kernel has the correct resolution now.

Why was the "report connection" patch reverted? Can it please be put
back in?

Hardware tested: Supermicro SYS-241E-TNRTTP.  lspci shows this info for
the ASPEED video controller:
02:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)
02:00.0 0300: 1a03:2000 (rev 52)

** Affects: linux-signed-hwe-6.5 (Ubuntu)
 Importance: Undecided
 Status: New


[Bug 2055222] Re: ucx library fails with Genoa CPUs and InfiniBand

2024-03-18 Thread Quesar
I reproduced this on a Sapphire Rapids cluster now too, and the same
patches fixed it.


[Bug 2055222] Re: ucx library fails with Genoa CPUs and InfiniBand

2024-03-07 Thread Quesar
Can these patches be added to the ucx package please?  This issue is
affecting all Genoa clusters with InfiniBand.

Here's the type of error it causes:

root@rschhpc210:~# ucx_perftest
[1698428074.879303] [rschhpc210:13557:0] perftest.c:899 UCX WARN CPU affinity is not set (bound to 384 cpus). Performance may be impacted.
Waiting for connection...
Accepted connection from 10.3.8.219:54350
+--+
| API: protocol layer |
| Test: am latency |
| Data layout: (automatic) |
| Send memory: host |
| Recv memory: host |
| Message size: 1048576 |
| AM header size: 0 |
+--+
[rschhpc210:13557:0:13557] ib_mlx5_log.c:162 Remote access on mlx5_0:1/IB (synd 0x13 vend 0x88 hw_synd 0/0)
[rschhpc210:13557:0:13557] ib_mlx5_log.c:162 RC QP 0x3177 wqe[60241]: RDMA_READ s-- [rva 0x7fc08799c000 rkey 0x2f1b1] [va 0x7fc4e3f63000 len 1048576 lkey 0x1bdd26] [rqpn 0x102 dlid=33 sl=0 port=1 src_path_bits=0]
 backtrace (tid: 13557) 
0 /lib/x86_64-linux-gnu/libucs.so.0(ucs_handle_error+0x2e4) [0x7fc4e5535fc4]
1 /lib/x86_64-linux-gnu/libucs.so.0(ucs_fatal_error_message+0xb6) [0x7fc4e5536176]
2 /lib/x86_64-linux-gnu/libucs.so.0(+0x25c9a) [0x7fc4e553ac9a]
3 /lib/x86_64-linux-gnu/libucs.so.0(ucs_log_dispatch+0xe4) [0x7fc4e55344a4]
4 /lib/x86_64-linux-gnu/ucx/libuct_ib.so.0(uct_ib_mlx5_completion_with_err+0x5ed) [0x7fc4e509d6fd]
5 /lib/x86_64-linux-gnu/ucx/libuct_ib.so.0(+0x3eb16) [0x7fc4e50b9b16]
6 /lib/x86_64-linux-gnu/libucp.so.0(ucp_worker_progress+0x7a) [0x7fc4e55ed28a]
7 ucx_perftest(+0x416de) [0x56329edf56de]
8 ucx_perftest(+0x1ff92) [0x56329edd3f92]
9 ucx_perftest(+0x82ea) [0x56329edbc2ea]
10 ucx_perftest(+0x5a94) [0x56329edb9a94]
11 /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7fc4e5229d90]
12 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7fc4e5229e40]
13 ucx_perftest(+0x6375) [0x56329edba375]
=
Aborted (core dumped)


[Bug 2055222] [NEW] ucx library fails with Genoa CPUs and InfiniBand

2024-02-27 Thread Quesar
Public bug reported:

Running MPI jobs or some ucx_perftest tests with Genoa CPUs and
InfiniBand fails when ucx is built with a gcc version newer than 10.3,
due to optimizations that convert code into "memmove" calls.  I worked
with Nvidia to identify and resolve the issues.  Here are the links to
the two patches that resolve them:

https://github.com/openucx/ucx/pull/9692
https://github.com/openucx/ucx/pull/9714


Please include these patches in the ucx package to resolve the issues.
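
For readers unfamiliar with the optimization mentioned above, here is a
minimal C sketch of the general pattern.  It is not taken from the linked
pull requests, and the function names copy_plain and copy_bytewise are
made up for illustration.  With optimization enabled, GCC may recognize a
plain copy loop and replace it with a library call such as memcpy or
memmove; declaring the pointers volatile is one common way to keep the
accesses as written.

```c
#include <stddef.h>

/* Plain byte-copy loop.  With optimization enabled, GCC's loop pattern
 * recognition may compile this into a memcpy/memmove library call
 * instead of the individual loads and stores written here. */
void copy_plain(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Same loop through volatile pointers.  Each volatile access has to be
 * performed as written, so the compiler cannot merge the loop into a
 * single library call.  (Illustrative only; the real UCX fix may take a
 * different approach.) */
void copy_bytewise(volatile unsigned char *dst,
                   const volatile unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

Both functions produce the same result in memory; they differ only in how
much freedom the compiler has.  Building with GCC's
-fno-tree-loop-distribute-patterns option is another way to turn off this
loop-to-library-call transformation, at the cost of applying to the whole
translation unit.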

** Affects: ucx (Ubuntu)
 Importance: Undecided
 Status: New


[Bug 1822184] Re: clear_console locks up video when X is running and you log out from a plain text console

2019-04-04 Thread Quesar
Thanks!  This also affects 18.04 and will need to be fixed there too,
please.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1822184

Title:
  clear_console locks up video when X is running and you log out from a
  plain text console

To manage notifications about this bug go to:
https://bugs.launchpad.net/xorg-server/+bug/1822184/+subscriptions


[Bug 1822184] [NEW] clear_console locks up video when X is running and you log out from a plain text console

2019-03-28 Thread Quesar
Public bug reported:

References:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810660
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=866898


If X is running and you switch to a plain text console (e.g. Ctrl-Alt-F3),
log in as a regular user, and then log out, the display switches back to
the graphical login screen and locks up.  The system is still functioning
(you can ssh in, etc.), but the video output is frozen.

The underlying cause is:
https://gitlab.freedesktop.org/xorg/xserver/issues/492

However, the problem can at least be worked around by making
clear_console switch to tty 6 and back instead of switching to 1 and
back.  This patch corrects the problem:


--- bash-4.4.18.orig/debian/clear_console.c 2019-03-28 12:09:02.415907787 -0400
+++ bash-4.4.18/debian/clear_console.c  2019-03-28 12:08:11.984366858 -0400
@@ -205,7 +205,7 @@
 #if defined(__linux__)
   num = vtstat.v_active;
 #endif
-  tmp_num = (num == 1 ? 2 : 1);
+  tmp_num = (num == 6 ? 5 : 6);
 
   /* switch vt to clear the scrollback buffer */
   if (ioctl(fd, VT_ACTIVATE, tmp_num))
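
The logic of the patched line can be sketched in isolation as follows.
This is an illustrative sketch rather than the actual clear_console.c:
the helper names pick_tmp_vt and bounce_vt are hypothetical, and the
VT_WAITACTIVE calls are an assumption about how the switch is completed.
Only VT_ACTIVATE appears in the excerpt above.

```c
#include <sys/ioctl.h>
#include <linux/vt.h>

/* Mirrors the patched expression: use tty6 as the temporary VT, or tty5
 * when tty6 is already the active one. */
int pick_tmp_vt(int active_vt)
{
    return (active_vt == 6) ? 5 : 6;
}

/* Switch to the temporary VT and back again, which is what clears the
 * scrollback buffer.  fd must refer to a console device.
 * Returns 0 on success, -1 as soon as any ioctl fails. */
int bounce_vt(int fd, int active_vt)
{
    int tmp_vt = pick_tmp_vt(active_vt);

    if (ioctl(fd, VT_ACTIVATE, tmp_vt) || ioctl(fd, VT_WAITACTIVE, tmp_vt))
        return -1;
    if (ioctl(fd, VT_ACTIVATE, active_vt) || ioctl(fd, VT_WAITACTIVE, active_vt))
        return -1;
    return 0;
}
```

With this selection the temporary switch never lands on tty1 or tty2,
which is the point of the workaround.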

** Affects: bash (Ubuntu)
 Importance: Undecided
 Status: New

** Patch added: "makes clear_console switch to either tty6 or tty5 instead of tty1/tty2"
   https://bugs.launchpad.net/bugs/1822184/+attachment/5250280/+files/fix-clear_console-GUI-lockup.patch

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1822184

Title:
  clear_console locks up video when X is running and you log out from a
  plain text console

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/bash/+bug/1822184/+subscriptions


[Bug 1572412] Re: nvidia-361 361.42-0ubuntu2: nvidia-361 kernel module failed to build [conftest.sh: Cannot fork]

2016-05-12 Thread Quesar
Manually running "dkms -m nvidia-361 -v 361.42 build" and then
"dkms -m nvidia-361 -v 361.42 install" worked for me after this error.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1572412

Title:
  nvidia-361 361.42-0ubuntu2: nvidia-361 kernel module failed to build
  [conftest.sh: Cannot fork]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-361/+bug/1572412/+subscriptions



[Bug 1577849] [NEW] openmpi 1.10.2-8ubuntu1 fails running job with cpuset started from torque 5

2016-05-03 Thread Quesar
Public bug reported:

OpenMPI 1.10.2 has a bug handling cpusets.  Here is a link to the
mailing list discussion including the patch I'll attach:


https://www.mail-archive.com/users%40open-mpi.org/msg00273.html

I encountered this error after adding torque 5.1.1.2, with cpuset
support enabled, to the system.  When I run a job I get this error:

--
A request for multiple cpus-per-proc was given, but a directive
was also give to map to an object level that has less cpus than
requested ones:

  #cpus-per-proc:  1
  number of cpus:  0
  map-by:  BYSOCKET

Please specify a mapping level that has more cpus, or else let us
define a default mapping that will allow multiple cpus-per-proc.
--

Adding the patch to the deb package and rebuilding it resolved the
issue.
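
One way to check what the kernel actually allows inside such a job is to
read the Cpus_allowed_list field from /proc/self/status.  The sketch
below is a diagnostic aid only and is unrelated to the attached hwloc
patch; the function name allowed_cpu_list is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Copies the kernel's "Cpus_allowed_list" value for the calling process
 * into out (for example "0-3,8").  Returns 0 on success, -1 if the field
 * is missing or /proc/self/status cannot be read. */
int allowed_cpu_list(char *out, size_t outlen)
{
    static const char key[] = "Cpus_allowed_list:";
    char line[4096];
    FILE *fp;
    int rc = -1;

    if (out == NULL || outlen == 0)
        return -1;

    fp = fopen("/proc/self/status", "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (strncmp(line, key, sizeof(key) - 1) == 0) {
            const char *val = line + sizeof(key) - 1;

            /* Skip the whitespace between the key and its value. */
            while (*val == ' ' || *val == '\t')
                val++;
            snprintf(out, outlen, "%s", val);
            out[strcspn(out, "\n")] = '\0';  /* drop trailing newline */
            rc = 0;
            break;
        }
    }
    fclose(fp);
    return rc;
}
```

Calling this from inside a torque job and from a normal login shell on
the same node should show whether the cpuset is being applied to the job.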

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: openmpi-bin 1.10.2-8ubuntu1
ProcVersionSignature: Ubuntu 4.4.0-21.37-generic 4.4.6
Uname: Linux 4.4.0-21-generic x86_64
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
CurrentDesktop: Unity
Date: Tue May  3 12:11:00 2016
InstallationDate: Installed on 2016-04-27 (6 days ago)
InstallationMedia: Ubuntu 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.1)
SourcePackage: openmpi
UpgradeStatus: No upgrade log present (probably fresh install)

** Affects: openmpi (Ubuntu)
 Importance: Undecided
 Status: New


** Tags: amd64 apport-bug cpuset openmpi patch torque xenial

** Patch added: "fixes hwloc cpuset support"
   https://bugs.launchpad.net/bugs/1577849/+attachment/4654620/+files/hwloc-cpuset-fix.patch

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1577849

Title:
  openmpi 1.10.2-8ubuntu1 fails running job with cpuset started from
  torque 5

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/1577849/+subscriptions



[Bug 1274320] Re: Error: diskfilter writes are not supported

2014-04-17 Thread Quesar
I just made a permanent, clean fix for this, at least for MD (software
RAID).  It can easily be adapted to cover LVM too.  Edit
/etc/grub.d/00_header and change the recordfail section to this:


if [ "$quick_boot" = 1 ]; then
cat 

[Bug 107326] Re: non working gpt labels

2007-12-06 Thread Quesar
I had the same problem working with a 3TB array.  I created 7 partitions
on a GPT label with parted from the Ubuntu 7.10 64-bit desktop.  I then
used hdparm -z /dev/sda to reread the partitions, but the OS only saw
the first 4.  I then chrooted into SLES 10 and used its parted, which
gave the same error message about the bad GPT label that has already
been reported.  I made one small change so that the SLES 10 version of
parted would rewrite the partition table, and after that hdparm -z
/dev/sda read the partition table correctly.

Also, I have a patched version of grub that works with GPT partition
tables.  I can provide it if necessary.  This patched grub also would
not work until I used the parted from SLES 10; the parted from the
Ubuntu 7.10 64-bit desktop left the partition table in a state that it
could not handle.

-- 
non working gpt labels
https://bugs.launchpad.net/bugs/107326
You received this bug notification because you are a member of Ubuntu
Bugs, which is the bug contact for Ubuntu.
