Re: [CentOS] Update to Centos 7.7 / Arch ppc64le / Problem with nvidia driver
Hello Fabian,

On 26.09.19 09:46, Fabian Arrotin wrote:
> On 25/09/2019 10:30, Ralf Aumüller wrote:
>> ... today I updated a CentOS 7.6 ppc64le machine to CentOS 7.7. After
>> reboot to the new kernel (4.18.0-80.7.2.el7.ppc64le) dkms could not
>> build the nvidia-module. ...
>> Any comments?

thanks for your quick response.

> Well, if you use that kernel, that means you're on the Power9 variant,
> and that architecture doesn't exist anymore upstream (so no RHEL 7.7
> for Power9). As almost all packages are just ppc64le (which still
> exists upstream), the decision was to still provide 7.7.1908 for Power9
> users, but using the kernel from CentOS 8, rebuilt for CentOS 7.
> (The same is also true for aarch64.)
>
> For that kernel to be built, we had to use a newer gcc, which you can
> find/use through devtoolset-8:
> http://mirror.centos.org/altarch/7/sclo/ppc64le/rh/devtoolset-8/

OK. So I will try to install devtoolset-8 and build the nvidia driver with that gcc.

> Curious: which kind of machine do you have that has both a Power9 and
> nvidia? That seems to *not* be an IBM node, but a kind of openpower
> workstation?

It's an IBM Power System AC922 (8335-GTH) with Nvidia Tesla V100 graphics cards. The supercomputer "Summit" uses these nodes (https://www.olcf.ornl.gov/summit/).

> PS2: worth creating a bug report on https://bugs.centos.org for easier
> tracking and also indexing, so that other people in your situation
> would follow the bug report (indexed by crawlers) and eventually
> discussion can happen there.

I will do that.

Best regards,
Ralf

___
CentOS mailing list
CentOS@centos.org
https://lists.centos.org/mailman/listinfo/centos
[CentOS] Update to Centos 7.7 / Arch ppc64le / Problem with nvidia driver
Hello,

today I updated a CentOS 7.6 ppc64le machine to CentOS 7.7. After rebooting into the new kernel (4.18.0-80.7.2.el7.ppc64le), dkms could not build the nvidia module.

Error message from dkms:

    Compiler version check failed:
    The major and minor number of the compiler used to compile the kernel:
    gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)
    does not match the compiler used here:
    cc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)

Output of /proc/version with the new kernel running:

    Linux version 4.18.0-80.7.2.el7.ppc64le (mockbu...@ppc64le-01.bsys.centos.org) (gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)) #1 SMP Thu Sep 12 15:45:05 UTC 2019

The problem seems to be that the kernel was compiled with gcc 8.3.1, while the installed gcc is 4.8.5. All previous kernels were compiled with gcc 4.8.5. See:

    # cat /usr/src/kernels/*/include/generated/compile.h | grep LINUX_COMPILER
    #define LINUX_COMPILER "gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)"
    #define LINUX_COMPILER "gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)"
    #define LINUX_COMPILER "gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)"
    #define LINUX_COMPILER "gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)"
    #define LINUX_COMPILER "gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC)"
    #define LINUX_COMPILER "gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)"
    #define LINUX_COMPILER "gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)"

Any comments?

Best regards,
Ralf
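The mismatch reported in this message can be checked mechanically. The following is a minimal Python sketch, not the actual check performed by dkms or the NVIDIA installer: it only mimics the major/minor comparison that the error message describes. The function names and both regular expressions are assumptions made for illustration, and the patterns only cover banners shaped like the two quoted in the message.

```python
import re

# Assumption: the kernel banner embeds "gcc version X.Y.Z", as in the
# /proc/version output quoted in the message.
KERNEL_GCC_RE = re.compile(r"gcc version (\d+)\.(\d+)")
# Assumption: the local compiler banner looks like "cc (GCC) X.Y.Z ...".
COMPILER_RE = re.compile(r"\(GCC\)\s+(\d+)\.(\d+)")


def kernel_gcc_version(proc_version):
    """Return (major, minor) of the gcc that built the kernel, or None."""
    match = KERNEL_GCC_RE.search(proc_version)
    return (int(match.group(1)), int(match.group(2))) if match else None


def compiler_version(cc_banner):
    """Return (major, minor) of the local compiler banner, or None."""
    match = COMPILER_RE.search(cc_banner)
    return (int(match.group(1)), int(match.group(2))) if match else None


def compilers_match(proc_version, cc_banner):
    """True only if both versions were parsed and major.minor agree."""
    kernel = kernel_gcc_version(proc_version)
    local = compiler_version(cc_banner)
    return kernel is not None and kernel == local


if __name__ == "__main__":
    # Both strings are copied from the message (the build host is elided there).
    proc_version = (
        "Linux version 4.18.0-80.7.2.el7.ppc64le "
        "(mockbu...@ppc64le-01.bsys.centos.org) "
        "(gcc version 8.3.1 20190311 (Red Hat 8.3.1-3) (GCC)) "
        "#1 SMP Thu Sep 12 15:45:05 UTC 2019"
    )
    cc_banner = "cc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)"
    print("kernel built with gcc:", kernel_gcc_version(proc_version))
    print("local compiler:", compiler_version(cc_banner))
    print("match:", compilers_match(proc_version, cc_banner))
```

On a live system the two inputs would come from reading /proc/version and from the first line of the compiler's --version output. The sketch takes them as strings so it can be tried without the affected machine.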
Re: [CentOS] Update from 6.6 to 6.7 > automount logs error message
Hello,

>>> after an update from 6.6 to 6.7 the following error message is logged to
>>> /var/log/messages when I login (per ssh):
>>>
>>> Aug 11 16:31:21 a1234 automount[1598]: set_tsd_user_vars: failed to get
>>> passwd info from getpwuid_r

I did some more tests:

I compiled autofs with additional logging of the UID/GID in the autofs function "set_tsd_user_vars". Just before the error is logged, autofs tries to get the passwd info for, e.g., UID 409651584 and GID 4294936577 (which don't exist). Then the error message is logged.

A fully updated 6.7 system running the latest 6.6 kernel (2.6.32-504.30.3.el6.x86_64) does not log the error message. I checked the changelog of kernel 2.6.32-573.3.1.el6.x86_64 and found some autofs patches added since version 504.30.3. But I can't test any further because the kernel SRPM no longer includes the individual patches.

Maybe someone with deeper kernel knowledge has an idea?

Best regards,
Ralf
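To reproduce the "don't exist" observation outside of autofs, the same kind of lookup can be tried from user space. This is a minimal Python sketch under stated assumptions: the helper names are hypothetical, and Python's pwd and grp modules are used as stand-ins for the getpwuid_r call that autofs makes. The as_signed32 helper only shows how the logged values read when interpreted as signed 32-bit integers; it makes no claim about why the kernel hands these values to automount.

```python
import grp
import pwd


def as_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def uid_exists(uid):
    """True if the system's passwd lookup (files, NIS, ...) knows this UID."""
    try:
        pwd.getpwuid(uid)
        return True
    except (KeyError, OverflowError):
        return False


def gid_exists(gid):
    """True if the system's group lookup knows this GID."""
    try:
        grp.getgrgid(gid)
        return True
    except (KeyError, OverflowError):
        return False


if __name__ == "__main__":
    # The two values are the ones reported in the message.
    checks = (("UID", 409651584, uid_exists), ("GID", 4294936577, gid_exists))
    for label, value, exists in checks:
        print(label, value, "as signed 32-bit:", as_signed32(value),
              "resolvable:", exists(value))
```

The GID from the message, read as a signed 32-bit number, is a small negative value, while the UID is unchanged. That may or may not be a useful hint when reading the autofs patches.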
[CentOS] Update from 6.6 to 6.7 > automount logs error message
Hello,

after an update from 6.6 to 6.7 the following error message is logged to /var/log/messages when I log in (via ssh):

    Aug 11 16:31:21 a1234 automount[1598]: set_tsd_user_vars: failed to get passwd info from getpwuid_r

I checked all log files of my systems running 6.6 with the same configuration and never got such a message. (We use NFS/autofs for home directories, NIS, and tcsh as the login shell.)

Everything seems to work, but before I update all machines to 6.7 I want to know what's going on.

Any comments?

Best regards,
Ralf
Re: [CentOS] md5sum mismatch between CentOS 6.4 and 6.5 repository
Hello,

On 12/09/2013 03:16 PM, Johnny Hughes wrote:
> Yes, it is signed by the same key ... but the rpm is not identical. (The
> difference being a different md5sum because rpm metadata (signature
> time) is different).
>
> He does not say HOW the installation fails ... one would have to look at
> the kickstart file to see.

anaconda crashes while installing packages. Checking the log files of the traceback, I found something about a wrong md5sum for the package python-slip-dbus.

> If the ks install is looking at a real 6.5 and pointing at it, it should
> work fine. If he is pointing at something else, maybe not. I would
> personally just delete the files in question and rsync again to make
> sure they are replaced.
>
> Rsync, if it sees the same file and the same date, will not validate the
> crc without a -c switch ... that would take forever for a whole tree, so
> I would delete the files in question and sync again from a 6.5 tree.

That was the way I fixed the kickstart installation.

Thank you very much for your explanation of the md5sum difference.

Best regards,
Ralf
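The rsync behaviour Johnny describes can be illustrated with a simplified model. The Python sketch below is not rsync's implementation: would_transfer is a hypothetical helper that approximates the default decision (skip a file when size and modification time agree) and the -c decision (compare content checksums), with MD5 used here only as a stand-in checksum.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large packages need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def quick_check_same(src, dst):
    """Simplified quick check: same size and same mtime (whole seconds)."""
    a, b = os.stat(src), os.stat(dst)
    return a.st_size == b.st_size and int(a.st_mtime) == int(b.st_mtime)


def would_transfer(src, dst, checksum=False):
    """Model whether a sync would copy src over dst."""
    if not os.path.exists(dst):
        return True  # a deleted file is always fetched again
    if checksum:
        return md5_of_file(src) != md5_of_file(dst)
    return not quick_check_same(src, dst)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "upstream.rpm")
        dst = os.path.join(tmp, "mirror.rpm")
        # Same length, different bytes: stands in for a re-signed package.
        with open(src, "wb") as handle:
            handle.write(b"new-signature")
        with open(dst, "wb") as handle:
            handle.write(b"old-signature")
        # Give both files the same timestamp, as in the reported case.
        for path in (src, dst):
            os.utime(path, (1332720000, 1332720000))
        print("default sync copies:", would_transfer(src, dst))
        print("checksum sync copies:", would_transfer(src, dst, checksum=True))
        os.remove(dst)
        print("after deleting the stale copy:", would_transfer(src, dst))
```

The last step corresponds to the fix used here: deleting the stale files makes the next sync fetch them without paying for a checksum pass over the whole tree.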
[CentOS] md5sum mismatch between CentOS 6.4 and 6.5 repository
Hello,

when I download python-slip-dbus-0.2.20-1.el6_2.noarch.rpm from the CentOS 6.5 repository, its md5sum is different from the one I get when I download the same file from 6.4.

    wget http://msync.centos.org/centos-6/6.5/os/x86_64/Packages/python-slip-dbus-0.2.20-1.el6_2.noarch.rpm \
        -O python-slip-dbus-0.2.20-1.el6_2.noarch.rpm.65
    wget http://msync.centos.org/centos-6/6.4/os/x86_64/Packages/python-slip-dbus-0.2.20-1.el6_2.noarch.rpm \
        -O python-slip-dbus-0.2.20-1.el6_2.noarch.rpm.64

    ls -l
    . 30844 Mar 26 2012 python-slip-dbus-0.2.20-1.el6_2.noarch.rpm.64
    . 30844 Mar 26 2012 python-slip-dbus-0.2.20-1.el6_2.noarch.rpm.65

    md5sum python-slip-dbus-0.2.20-1.el6_2.noarch.rpm.*
    20bb02e6f3b7b71e09dcaff7f3b0ca02 python-slip-dbus-0.2.20-1.el6_2.noarch.rpm.64
    d37fe4404a7a5fdb27b29f9b5ed09c73 python-slip-dbus-0.2.20-1.el6_2.noarch.rpm.65

Any comments?

Background: We have a local CentOS mirror, and after updating to 6.5 the kickstart installation fails because of the wrong md5sum of python-slip-dbus. We mirror with rsync (no -c), so we still had the version from 6.4 in our 6.5 repository.

(The same seems to be true for python-paste-script-1.7.3-5.el6_3.noarch.rpm and slf4j-javadoc-1.5.8-8.el6.noarch.rpm.)

Best regards,
Ralf
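Since more than one package seems to be affected, the comparison done by hand above could be run over two whole package directories. This is a minimal Python sketch assuming both trees are available as local directories; same_size_mismatches is a hypothetical helper name, and the temporary directories in the demo stand in for real 6.4 and 6.5 Packages directories.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large packages need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_size_mismatches(dir_a, dir_b):
    """Names present in both directories with equal size but different MD5."""
    mismatches = []
    for name in sorted(os.listdir(dir_a)):
        path_a = os.path.join(dir_a, name)
        path_b = os.path.join(dir_b, name)
        if not (os.path.isfile(path_a) and os.path.isfile(path_b)):
            continue  # only compare files that exist in both trees
        if os.path.getsize(path_a) != os.path.getsize(path_b):
            continue  # a size difference is not the silent case of interest
        if md5_of_file(path_a) != md5_of_file(path_b):
            mismatches.append(name)
    return mismatches


if __name__ == "__main__":
    # Demo with throwaway directories and made-up file contents.
    with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
        for directory, payload in ((old, b"signed-2012"), (new, b"signed-2013")):
            with open(os.path.join(directory, "example.noarch.rpm"), "wb") as handle:
                handle.write(payload)
        print(same_size_mismatches(old, new))
```

Files reported by such a scan are the ones a sync that relies on size and date alone would leave untouched.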