Alpha has gone to its reward
We had a horrific electrical storm on the 28th, and a lightning strike took out my home air-conditioning units, my cable modem, my Wifi router, a 16-port switch, my Ooma Telo, my main computer, my printer, and... my PWS-433au :-(.

The Alpha isn't worth repairing, and I'm not going to go to the trouble of replacing it. I had my fun with it, and frankly, it lasted far longer in 24x7x365 use than I had any right to expect.

Bottom line: After more years than I care to remember, I'm out of the race. Will continue to lurk and help where/when I can, but I won't be doing any actual testing or debugging on Alpha.

Sincere thanks to the experts hanging out here who have helped me through many a rough spot with the Alpha platform.

Respectfully,
--Bob
Re: 5.17.0 boot issue on Miata
On Mon, Apr 25, 2022 at 10:26:46AM +0100, John Garry wrote:
> Please try v5.18-rc2 as it should have a fix in commit eaba83b5b850

Up and running on v5.18-rc5 as I type this. Fix confirmed. Thanks!

--Bob
Re: 5.17.0 boot issue on Miata
(Adding linux-scsi and linux-kernel, now that bisection is complete.)

On Wed, Apr 06, 2022 at 05:44:01PM -0500, Bob Tracy wrote:
> v5.17-rc2 ok. v5.17-rc3 I get the disk sector errors and hang that I
> reported in the first message in this thread.

This is on an Alpha Miata platform (PWS 433au) with QLogic ISP1020 controller. Here's the implicated commit:

edb854a3680bacc9ef9b91ec0c5ff6105886f6f3 is the first bad commit
commit edb854a3680bacc9ef9b91ec0c5ff6105886f6f3
Author: Ming Lei
Date:   Thu Jan 27 23:37:33 2022 +0800

    scsi: core: Reallocate device's budget map on queue depth change

    We currently use ->cmd_per_lun as initial queue depth for setting up the
    budget_map. Martin Wilck reported that it is common for the queue_depth to
    be subsequently updated in slave_configure() based on detected hardware
    characteristics.

    As a result, for some drivers, the static host template settings for
    cmd_per_lun and can_queue won't actually get used in practice. And if the
    default values are used to allocate the budget_map, memory may be consumed
    unnecessarily.

    Fix the issue by reallocating the budget_map after ->slave_configure()
    returns. At that time the device queue_depth should accurately reflect what
    the hardware needs.

    Link: https://lore.kernel.org/r/20220127153733.409132-1-ming@redhat.com
    Cc: Bart Van Assche
    Reported-by: Martin Wilck
    Suggested-by: Martin Wilck
    Tested-by: Martin Wilck
    Reviewed-by: Martin Wilck
    Reviewed-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Martin K. Petersen

 drivers/scsi/scsi_scan.c | 55 +++-
 1 file changed, 50 insertions(+), 5 deletions(-)

Respectfully,
--Bob
Re: 5.17.0 boot issue on Miata
On Wed, Apr 06, 2022 at 05:44:01PM -0500, Bob Tracy wrote:
> v5.17-rc2 ok. v5.17-rc3 I get the disk sector errors and hang that I
> reported in the first message in this thread.
>
> I'm going to try a native build of '-rc3' just to rule out any
> cross-compiler strangeness. Should have something to report in another
> 34 hours or so :-(.

Confirmed: the native build was just as broken as the cross build. The bug was introduced somewhere between v5.17-rc2 and v5.17-rc3. But at least I have a bit more confidence in the integrity of what the cross tools build.

Interesting aside: the cross build's vmlinux.gz was approx. 200k larger. That might be due to gcc version differences (native toolchain is 11.2, and the cross toolchain is 11.1).

I'll start the actual bisection process today. If I don't finish today, it will be at least another week before I can get back to this, so apologies in advance.

--Bob
Re: 5.17.0 boot issue on Miata
On Tue, Apr 05, 2022 at 08:22:48PM +0200, Helge Deller wrote:
> You don't need to enable it, but for alpha it's probably beneficial to
> enable it.
> When enabled, you will see a big speed improvement when logging in to a
> graphics text console and printing info. E.g. try "time dmesg" with and
> without that option...
> The "dmesg" will scroll the screen, and that's what it accelerates (only
> if the driver has such hardware bitblt-support).

v5.17-rc2 ok. v5.17-rc3 I get the disk sector errors and hang that I reported in the first message in this thread. (Unrelated, I *did* enable the framebuffer option, and that part of the boot worked just fine.)

I'm going to try a native build of '-rc3' just to rule out any cross-compiler strangeness. Should have something to report in another 34 hours or so :-(.

--Bob
Re: 5.17.0 boot issue on Miata
On Tue, Apr 05, 2022 at 05:01:25PM +1200, Michael Cree wrote:
> On Mon, Apr 04, 2022 at 08:42:38PM -0500, Bob Tracy wrote:
> > On Sun, Mar 27, 2022 at 11:21:57AM +1300, Michael Cree wrote:
> > > On Thu, Mar 24, 2022 at 09:54:15PM -0500, Bob Tracy wrote:
> > > > When I attempt to boot a 5.17.0 kernel built from the kernel.org
> > > > sources, I see disk sector errors on my "sda" device, and the boot
> > > > process hangs at the point where "systemd-udevd.service" starts.
> > > >
> > > > Rebooting on 5.16.0 works with no disk I/O errors of any kind.
> > >
> > > Oh, you can run a 5.16.y kernel on Alpha? I have had problems
> > > with everything since 5.9.y with rare, random, corruptions in
> > > memory in user space (exhibiting as glibc detected memory
> > > corruptions or segfaults).
> >
> > Did we have this painted into the "SMP vs. not-SMP" corner at one point?
>
> No, this affects both ES45 (with 3 cpus) and XP1000 (one cpu).
>
> The problem is rare. I often have to run tests for 12 hours on
> the XP1000 before I see a problem. On miata it might occur even
> less often.
>
> I hope I am getting close to the bad commit, but it is taking
> time when I run testing for a whole day before I feel confident
> enough to mark the kernel as good. And I have been wrong on
> that one a couple of times now, having to repeat part of the
> bisection.

Stupid question, but possibly related to what I'm seeing in v5.17-final. Beginning with "-rc3" there's a new FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION configuration option. Do I need this enabled on Miata if I normally boot in a video mode that displays a logo? I'll try "no" for the "-rc3" build if/when "-rc2" boots properly.

--Bob
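When comparing builds with and without an option such as FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION, it can help to confirm what actually ended up in each kernel ".config". The following is only a sketch: the function name kconfig_state and the sample text are made up for illustration, and it relies on nothing more than the usual ".config" conventions ("CONFIG_X=y" for a set option, "# CONFIG_X is not set" for a disabled one).

```python
import re


def kconfig_state(config_text, option):
    """Report how `option` (given without the CONFIG_ prefix) appears in a .config.

    Returns the assigned value ('y', 'm', ...), 'n' if the option is
    explicitly "not set", or None if the option is not mentioned at all.
    """
    assigned = re.compile(r"^CONFIG_%s=(.*)$" % re.escape(option))
    unset = "# CONFIG_%s is not set" % option
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset:
            return "n"
        match = assigned.match(line)
        if match:
            return match.group(1)
    return None


# Illustrative sample only, not taken from a real Miata configuration.
sample = """\
CONFIG_FRAMEBUFFER_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION is not set
"""

print(kconfig_state(sample, "FRAMEBUFFER_CONSOLE_LEGACY_ACCELERATION"))
```

For a real tree, the text would come from reading the ".config" file at the top of the kernel build directory.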
Re: 5.17.0 boot issue on Miata
On Sun, Mar 27, 2022 at 11:21:57AM +1300, Michael Cree wrote:
> On Thu, Mar 24, 2022 at 09:54:15PM -0500, Bob Tracy wrote:
> > When I attempt to boot a 5.17.0 kernel built from the kernel.org
> > sources, I see disk sector errors on my "sda" device, and the boot
> > process hangs at the point where "systemd-udevd.service" starts.
> >
> > Rebooting on 5.16.0 works with no disk I/O errors of any kind.
>
> Oh, you can run a 5.16.y kernel on Alpha? I have had problems
> with everything since 5.9.y with rare, random, corruptions in
> memory in user space (exhibiting as glibc detected memory
> corruptions or segfaults).

Did we have this painted into the "SMP vs. not-SMP" corner at one point? Miata is an automatic not-SMP case for hand-built kernels for that architecture, which might explain why I'm not seeing the problems with user space memory corruption.

Just successfully booted on v5.17-rc1 a little while ago. Moving on to "-rc2".

--Bob
5.17.0 boot issue on Miata
When I attempt to boot a 5.17.0 kernel built from the kernel.org sources, I see disk sector errors on my "sda" device, and the boot process hangs at the point where "systemd-udevd.service" starts.

Rebooting on 5.16.0 works with no disk I/O errors of any kind.

Assuming the 5.17.0 kernel or its associated initrd had bad sectors, I rebuilt both and saw no I/O errors during the build nor afterward when copying the new kernel into place under "/boot". Even tried a cross-compile build of a 5.17.0 alpha kernel on my x86_64 platform to save build time (34 hours for a native build on a PWS 433au vs. 2 hours on the x86_64 platform). That build produced identical results when I tried booting on it.

If anyone else is seeing this and can get a head-start on bisecting, that would be very much appreciated. I won't be able to get to it for about a week and a half :-(. 5.16.0 works. 5.17.0 doesn't. Might get lucky and find that the offending changes happened in the first 5.17.0 release candidate.

As always, sincere thanks in advance.

--Bob
Re: X11 system lockup with 5.11.0 kernel
On Mon, Sep 06, 2021 at 11:00:27AM +1200, Michael Cree wrote:
> I had intended to assist in testing with real hardware but there
> are other issues due to the 5.10 kernel on Alpha that need fixing
> first and I am working on that. I am hoping to come back to this
> and can run some tests in the near future.

I am delighted to report that this issue has finally been resolved as of the 5.14.0 mainline kernel.

There is a completely unrelated annoyance involving APC UPSs and a flood of "hid-generic: control queue full" messages on the console. This was reported and fixed as of 01 Sep 2021, but the fix hasn't made it into mainline yet :-(.

Respectfully,
--Bob
Re: X11 system lockup with 5.11.0 kernel
Quick update: if this issue was ever fixed, the patch hasn't made it into the mainline kernel as of 5.13.0. I'm still getting the system lock-up when X11 starts, and have to hit the reset switch to recover.

For whatever it might be worth, the mainline 5.10.0 kernel continues to work properly alongside all the user space changes in "sid" that have happened since late last year.

Respectfully,
--Bob

On Fri, Jun 04, 2021 at 12:37:14AM -0500, Bob Tracy wrote:
> On Fri, Jun 04, 2021 at 12:18:58AM -0500, Bob Tracy wrote:
> > On Thu, Jun 03, 2021 at 03:15:05PM +0200, Maciej W. Rozycki wrote:
> > > I have lost track about this issue, so please fill me in as to whether
> > > the offending commit causing the regression has been bisected or not.
> >
> > It has. Michael Cree reported the following back on April 5th:
> >
> > And the first bad commit is:
> >
> > 0fe3cf3a53b5c1205ec7d321be1185b075dff205 is the first bad commit
> > commit 0fe3cf3a53b5c1205ec7d321be1185b075dff205
> > Author: Christian König
> > Date: Sat Oct 24 13:12:23 2020 +0200
> >
> > drm/radeon: switch to new allocator v2
> >
> > It should be able to handle all cases here.
> >
> > v2: fix debugfs as well
> >
> > Signed-off-by: Christian König
> > Reviewed-by: Dave Airlie
> > Reviewed-by: Madhav Chauhan
> > Tested-by: Huang Rui
> > Link: https://patchwork.freedesktop.org/patch/397088/?series=83051&rev=1
> >
> > :040000 040000 4e643ef861b921392bc67be21a42298c91c7ff7a
> > b36453567c3176a3cd50fa0b23886b0fd642560d M drivers
>
> There were a few follow-up messages in this thread that left me with the
> impression there *may* have been a patch submitted, although Christian
> complained at the time he was having problems locating Alpha hardware to
> test with.
>
> The current (5.12.0 kernel) problem symptoms show some "improvement".
> I at least got to the point that the login screen displayed, but it
> had a bit of pixelation/distortion in a few areas indicative of "bad
> things about to happen". Then I got the expected system lock-up,
> just as I originally reported: had to hit the reset switch to recover.
Re: X11 system lockup with 5.11.0 kernel
On Fri, Jun 04, 2021 at 12:18:58AM -0500, Bob Tracy wrote:
> On Thu, Jun 03, 2021 at 03:15:05PM +0200, Maciej W. Rozycki wrote:
> > I have lost track about this issue, so please fill me in as to whether
> > the offending commit causing the regression has been bisected or not.
>
> It has. Michael Cree reported the following back on April 5th:
>
> And the first bad commit is:
>
> 0fe3cf3a53b5c1205ec7d321be1185b075dff205 is the first bad commit
> commit 0fe3cf3a53b5c1205ec7d321be1185b075dff205
> Author: Christian König
> Date: Sat Oct 24 13:12:23 2020 +0200
>
> drm/radeon: switch to new allocator v2
>
> It should be able to handle all cases here.
>
> v2: fix debugfs as well
>
> Signed-off-by: Christian König
> Reviewed-by: Dave Airlie
> Reviewed-by: Madhav Chauhan
> Tested-by: Huang Rui
> Link: https://patchwork.freedesktop.org/patch/397088/?series=83051&rev=1
>
> :040000 040000 4e643ef861b921392bc67be21a42298c91c7ff7a
> b36453567c3176a3cd50fa0b23886b0fd642560d M drivers

There were a few follow-up messages in this thread that left me with the impression there *may* have been a patch submitted, although Christian complained at the time he was having problems locating Alpha hardware to test with.

The current (5.12.0 kernel) problem symptoms show some "improvement". I at least got to the point that the login screen displayed, but it had a bit of pixelation/distortion in a few areas indicative of "bad things about to happen". Then I got the expected system lock-up, just as I originally reported: had to hit the reset switch to recover.

--Bob
Re: X11 system lockup with 5.11.0 kernel
On Thu, Jun 03, 2021 at 03:15:05PM +0200, Maciej W. Rozycki wrote:
> I have lost track about this issue, so please fill me in as to whether
> the offending commit causing the regression has been bisected or not.

It has. Michael Cree reported the following back on April 5th:

And the first bad commit is:

0fe3cf3a53b5c1205ec7d321be1185b075dff205 is the first bad commit
commit 0fe3cf3a53b5c1205ec7d321be1185b075dff205
Author: Christian König
Date:   Sat Oct 24 13:12:23 2020 +0200

    drm/radeon: switch to new allocator v2

    It should be able to handle all cases here.

    v2: fix debugfs as well

    Signed-off-by: Christian König
    Reviewed-by: Dave Airlie
    Reviewed-by: Madhav Chauhan
    Tested-by: Huang Rui
    Link: https://patchwork.freedesktop.org/patch/397088/?series=83051&rev=1

:040000 040000 4e643ef861b921392bc67be21a42298c91c7ff7a b36453567c3176a3cd50fa0b23886b0fd642560d M drivers

--Bob
Re: X11 system lockup with 5.11.0 kernel
On Tue, Apr 06, 2021 at 12:19:29PM +0200, John Paul Adrian Glaubitz wrote:
> We're also supporting everything else that most commercial vendors
> consider obsolete such as PA-RISC, M68k, big-endian PowerPC (32 and
> 64 bits) SPARC and so on, in case you need testing there.

(Mostly including the above just as a reference to the most recent posting in this thread...)

As of mainline kernel 5.12.0, the fix I (we) have been waiting for still hasn't been included. My alpha still locks up when X11 starts. Stuck at kernel version 5.10.0 for the time being.

Respectfully,
--Bob
Re: X11 system lockup with 5.11.0 kernel
On Wed, Mar 31, 2021 at 11:04:42AM +0200, Maciej W. Rozycki wrote:
> I think the only feasible way of determining what has happened here is
> that you track the offending change down by bisecting the upstream kernel
> repository with `git bisect'.

That would normally be what I would do, and it may yet happen. Problem is, I don't have a 64-bit system with enough disk space to do the kernel builds with a cross-compiler, and local (native) builds on the PWS are taking 36+ hours each these days. Unless I get *really* lucky with the bisects, the task will take a couple of weeks.

Anyway, I've whined enough :-). Might as well get started...

--Bob
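The "couple of weeks" figure can be sanity-checked with a little arithmetic: a bisection over N candidate commits needs roughly log2(N) build-and-boot cycles. The sketch below only illustrates that estimate. The commit count of 10000 is a made-up round number, not the real distance between v5.10 and v5.11, and the 36 hours per build is the native-build time mentioned above.

```python
import math


def bisect_estimate(commits, hours_per_build):
    """Return (steps, total_hours) for bisecting `commits` candidate commits.

    Each bisection step halves the remaining range, so about
    ceil(log2(commits)) kernels have to be built and booted.
    """
    steps = math.ceil(math.log2(commits)) if commits > 1 else 0
    return steps, steps * hours_per_build


# Hypothetical commit count; 36 hours is the quoted native build time.
steps, hours = bisect_estimate(10000, 36)
print(steps, "builds,", hours, "hours, about", hours / 24, "days")
```

The estimate ignores boot and test time per step, and any steps that have to be skipped or repeated, so it is a lower bound at best.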
Re: X11 system lockup with 5.11.0 kernel
On Wed, Mar 24, 2021 at 09:48:46AM -0500, Bob Tracy wrote:
> (...)
> Everything worked as well as it's going to for kernel versions up
> through v5.10.0. When I boot on v5.11.0, "lightdm" starts, the screen
> goes blank as usual, I get a mouse pointer as usual, and shortly after
> that, the system locks up solid (completely nonresponsive except for
> being able to ping it -- can't login remotely). Recovery is via the
> reset switch at that point :-(.
> (...)

Same results for 5.12.0-rc4 kernel.

--Bob
X11 system lockup with 5.11.0 kernel
All,

First an apology for being "dark" for so long. There are still a few of us out here using Alpha computers...

Another apology for the crappy "bug report" that follows, but first, a little background information.

I'm not in the habit of running X11 on my PWS 433au these days, except for periodically verifying "lightdm" and "afterstep" still work. The natural order of things is "increased bloat over time", and the elephant barely walks these days, much less dances, i.e., even an extremely lightweight window manager like AfterStep is barely usable on my PWS.

Everything worked as well as it's going to for kernel versions up through v5.10.0. When I boot on v5.11.0, "lightdm" starts, the screen goes blank as usual, I get a mouse pointer as usual, and shortly after that, the system locks up solid (completely nonresponsive except for being able to ping it -- can't login remotely). Recovery is via the reset switch at that point :-(.

It doesn't seem to be a userspace problem: rebooting the system on the old v5.10.0 kernel works fine. Nothing is showing up in any of the system logs, unfortunately, so I'm at a loss how to troubleshoot this further. The PWS has the same Radeon 7500 video card in it that I've had for years.

Any ideas/help appreciated, as always. In the meantime, my default strategy is to press on with trying the v5.12 release candidates (standard kernel.org source tree).

Respectfully,
--Bob
Re: directory sticky bit strangeness following libc6 update
On Sun, Apr 19, 2020 at 01:01:17AM +0200, Matthias Ferdinand wrote:
> On Sat, Apr 18, 2020 at 07:48:27AM -0500, Bob Tracy wrote:
> > > If the rules had changed, it should not succeed even without
> > > O_CREAT. A bug?
> >
> > That's *my* take on the matter. It will be a day or so before I can
> > check upstream and see if any bug reports have been opened against
> > libc6, but if someone else would care to look in the meantime :-) ...
>
> found https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=954230, added
> O_CREAT information to it.
>
> Matthias Ferdinand

An update to that bug report suggested checking /proc/sys/fs/protected_regular, which is now set to 2 by default on my alpha. No idea where the new setting is coming from. It's a sysctl setting that has evidently been around for a good while. Other systems I have access to that are running the same kernel have that value set to 0.

So I guess the current verdict is, "works as documented". Would still like to know what changed, because it's not being touched by the kernel build process: else, the other systems running the same kernel would be exhibiting the same behavior.

--Bob
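For comparing machines, the setting can be read straight from procfs. This is a small sketch, not a diagnostic tool: the function name and the MEANING table are illustrative, and the descriptions paraphrase the kernel's sysctl documentation for fs.protected_regular (0 disables the check, 1 refuses O_CREAT opens of regular files the caller does not own in world-writable sticky directories unless the file belongs to the directory owner, 2 extends that to group-writable sticky directories).

```python
from pathlib import Path

# Paraphrased from the kernel sysctl documentation; wording is mine.
MEANING = {
    0: "writes to regular files are unrestricted",
    1: "no O_CREAT open of files we don't own in world-writable sticky dirs",
    2: "as 1, and also applies to group-writable sticky dirs",
}


def read_protected_regular(path="/proc/sys/fs/protected_regular"):
    """Return the sysctl value as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, PermissionError, ValueError):
        return None


value = read_protected_regular()
print(value, "->", MEANING.get(value, "unknown or not available"))
```

Running it on the alpha and on one of the systems reporting 0 would at least confirm that the behavior difference tracks this one setting.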
Re: directory sticky bit strangeness following libc6 update
On Sat, Apr 18, 2020 at 12:25:11PM +0200, Matthias Ferdinand wrote:
> On Fri, Apr 17, 2020 at 02:17:46PM -0500, Bob Tracy wrote:
> > (directory sticky bit handling strangeness)
>
> it seems the difference lies in handling of O_CREAT.
>
> (...)
>
> not Alpha specific; this was done on x86_64 Ubuntu 20.04 beta:
>
> # uname -a; dpkg -l 'libc6' | grep ^.i
> Linux xyz 5.4.0-24-generic #28-Ubuntu SMP Thu Apr 9 22:16:42 UTC 2020
> x86_64 x86_64 x86_64 GNU/Linux
> ii libc6:amd64 2.31-0ubuntu7 amd64 GNU C Library: Shared
> libraries
>
>
> same kernel installed on an x86_64 Ubuntu 18.04, I get no "Permission denied":
>
> # uname -a; dpkg -l 'libc6' | grep ^.i
> Linux xyz18 5.4.0-24-generic #28-Ubuntu SMP Thu Apr 9 22:16:42 UTC 2020
> x86_64 x86_64 x86_64 GNU/Linux
> ii libc6:amd64 2.27-3ubuntu1 amd64 GNU C Library: Shared
> libraries
>
>
> So it seems not to be caused by the kernel version; strange how the same
> syscalls give different results depending on libc version.
>
> If the rules had changed, it should not succeed even without
> O_CREAT. A bug?

That's *my* take on the matter. It will be a day or so before I can check upstream and see if any bug reports have been opened against libc6, but if someone else would care to look in the meantime :-) ...

--Bob
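The O_CREAT observation is easy to reproduce without strace. The sketch below opens an existing file for writing twice, once without and once with O_CREAT, and reports the errno name for each attempt. The function name probe_open is hypothetical. Run it as "root" against a file owned by another user in a mode-1777 directory to see whether only the O_CREAT case is refused (a shell redirection such as "> packages" opens with O_CREAT).

```python
import errno
import os
import sys


def probe_open(path):
    """Try write-opens of `path` with and without O_CREAT.

    Returns a dict mapping a flag label to "ok" or to the errno name
    (for example "EACCES") that the open() call failed with.
    """
    results = {}
    attempts = (
        ("O_WRONLY", os.O_WRONLY),
        ("O_WRONLY|O_CREAT", os.O_WRONLY | os.O_CREAT),
    )
    for label, flags in attempts:
        try:
            fd = os.open(path, flags, 0o644)
        except OSError as exc:
            results[label] = errno.errorcode.get(exc.errno, str(exc.errno))
        else:
            os.close(fd)
            results[label] = "ok"
    return results


if __name__ == "__main__":
    # Only probe files that already exist, so O_CREAT never creates anything.
    for candidate in sys.argv[1:]:
        if os.path.exists(candidate):
            print(candidate, probe_open(candidate))
```

The file is never truncated or written to, so the probe should be safe to point at a real file such as the "packages" list from the original report.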
directory sticky bit strangeness following libc6 update
All,

This likely isn't unique to Debian, much less the alpha platform, but I first encountered this strangeness on my alpha running Debian unstable. Best way to explain what I'm seeing is by example.

A fairly common thing to do is create temporary or download directories with octal mode 1777 that are accessible by everyone. The directory can be read/written by everyone, but users (with the exception of "root") cannot delete files in the directory that they do not own. Otherwise, normal file permissions are applied as far as operations that can be performed on a particular file, and the expected (pre-libc6 update) behavior is that "root" can do anything with a particular file in the absence of extended ACL or selinux interference.

"/var/tmp" is one such directory, and a thing I like to do is maintain a list of currently-installed packages by running "dpkg -l > packages" in that directory as a normal user. Prior to the libc6 update, "root" could update that file with an editor or by running the same "dpkg -l > packages" command. After the libc6 update, "root" can't do anything with the file except delete it. The file's owner is the only user that can update it, EVEN IF THE FILE PERMISSIONS ALLOW WRITING BY EVERYONE. Even more odd: "root" can change the permissions on the file to, say, "-rw-rw-rw-", and STILL cannot update the file.

Outside of the directory having the sticky bit set, "root" can still do anything/everything to another user's files as expected.

I'm currently running an up-to-date "unstable" distro. Kernel version is 5.5.0, and libc6 is "libc6.1:alpha 2.30-4".

Maybe the rules have changed. If so, a pointer to the relevant documentation would be appreciated.

Thanks.

--Bob
packaging error: cmake-3.16.3-1
"apt-get upgrade" is failing on "cmake_3.16.3-1_alpha.deb" with the following errors for the past day:

Get:1 http://ftp.ports.debian.org/debian-ports unstable/main alpha cmake alpha 3.16.3-1 [3,531 kB]
Err:1 http://ftp.ports.debian.org/debian-ports unstable/main alpha cmake alpha 3.16.3-1
  File has unexpected size (3529328 != 3530656). Mirror sync in progress? [IP: 2001:4f8:1:c::15 80]
  Hashes of expected file:
   - SHA512:f2edd4855fb556ab719479c89b72f05054624b8e905240d613f4631e473e90f681d70c7c8f46d6063bd21b54e1f8d65bf3b6bb94532f16458be03c798a1f610d
   - SHA256:f0b6df816e19a005727a49ed362f5ddbf66c8f088ad9def36f443f687d985af7
   - SHA1:6fbd019ddd13bfd770d1ce0468c860f6ac857c33 [weak]
   - MD5Sum:b531b2ce6a59006d99ee0eff9eccd8af [weak]
   - Filesize:3530656 [weak]
E: Failed to fetch http://ftp.ports.debian.org/debian-ports/pool-alpha/main/c/cmake/cmake_3.16.3-1_alpha.deb  File has unexpected size (3529328 != 3530656). Mirror sync in progress? [IP: 2001:4f8:1:c::15 80]
   Hashes of expected file:
    - SHA512:f2edd4855fb556ab719479c89b72f05054624b8e905240d613f4631e473e90f681d70c7c8f46d6063bd21b54e1f8d65bf3b6bb94532f16458be03c798a1f610d
    - SHA256:f0b6df816e19a005727a49ed362f5ddbf66c8f088ad9def36f443f687d985af7
    - SHA1:6fbd019ddd13bfd770d1ce0468c860f6ac857c33 [weak]
    - MD5Sum:b531b2ce6a59006d99ee0eff9eccd8af [weak]
    - Filesize:3530656 [weak]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
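To tell a stale index from a damaged pool file, the package can be fetched by hand and compared against the size and SHA256 that apt expected. The following is a sketch of that comparison only. The function name check_download and the local path in the usage comment are hypothetical; the expected values in the comment are the ones from the error output above.

```python
import hashlib
import os


def check_download(path, expected_size, expected_sha256):
    """Compare a local file against an expected byte size and SHA256 digest."""
    actual_size = os.path.getsize(path)
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large packages are not loaded whole.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return {
        "actual_size": actual_size,
        "size_ok": actual_size == expected_size,
        "sha256_ok": digest.hexdigest() == expected_sha256.lower(),
    }


# Example call (the local path is hypothetical):
# check_download(
#     "/tmp/cmake_3.16.3-1_alpha.deb",
#     3530656,
#     "f0b6df816e19a005727a49ed362f5ddbf66c8f088ad9def36f443f687d985af7",
# )
```

If the hand-fetched file is 3529328 bytes, the pool copy and the Packages index disagree on the mirror, which only the archive side can fix.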
Re: dbus-daemon unaligned accesses
On Sat, Jan 18, 2020 at 05:33:31PM +0000, Witold Baryluk wrote:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=932381

At first glance, that certainly appears to be the issue. The conversation seems to have stalled-out as of July 2019. memcpy() looks to be a good way of handling the problem, for the reasons mentioned. Did you try that fix? If so, did it work for you?

--Bob
dbus-daemon unaligned accesses
On my alpha, the system logs are getting spammed with unaligned trap errors as follows:

[34656.586748] dbus-daemon(700): unaligned trap at 020a9720: d68c7222 28 18
[34656.599443] dbus-daemon(700): unaligned trap at 020a9720: d68c7222 28 18
[34656.612138] dbus-daemon(700): unaligned trap at 020a9720: d68c7222 28 18
[34656.617021] dbus-daemon(700): unaligned trap at 020a9720: d68c7222 28 18
[34656.624833] dbus-daemon(700): unaligned trap at 020a9720: d68c7222 28 18

The current "dbus" package version is 1.12.16-2.

If no one else is actively working to fix this annoyance, I'll see what I can do. For me, it's mostly a matter of finding the time to download the source package and its dependencies, build a debug version with symbols in it that "gdb" can use, and then *maybe* figure out the best way to code around the unaligned access.

If anyone else has the time and would like to have a go at it, the following two links might be useful:

https://wiki.gentoo.org/wiki/Project:Alpha/Porting_guide#Unaligned_accesses
https://www.redhat.com/archives/axp-list/2000-May/msg00151.html

(Yes, the problem has been around at least as long as the alpha architecture :-) ).

Michael Cree et al.: do we have a working "gdb" on alpha these days? I seem to recall brokenness there in the not-too-distant past.

--Bob
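Before reaching for "gdb", it may be worth checking whether the flood comes from a single instruction or from many. The sketch below tallies the trap lines by process, PID and the address printed after "unaligned trap at". The regular expression is an assumption based solely on the five sample lines above, not on a documented log format, and the function name is made up.

```python
import re
from collections import Counter

# Pattern inferred from the sample lines above; treat it as an assumption.
TRAP_RE = re.compile(
    r"(?P<comm>\S+)\((?P<pid>\d+)\): unaligned trap at (?P<pc>[0-9a-fA-F]+):"
)


def summarize_traps(lines):
    """Count unaligned-trap messages per (command, pid, address)."""
    counts = Counter()
    for line in lines:
        match = TRAP_RE.search(line)
        if match:
            key = (match.group("comm"), int(match.group("pid")), match.group("pc"))
            counts[key] += 1
    return counts


sample = [
    "[34656.586748] dbus-daemon(700): unaligned trap at 020a9720: d68c7222 28 18",
    "[34656.599443] dbus-daemon(700): unaligned trap at 020a9720: d68c7222 28 18",
]
for key, count in summarize_traps(sample).items():
    print(key, count)
```

A single dominant address, as in the sample, would suggest one code path to fix rather than a general alignment problem in the package.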
libcrypt1 1:4.4.10-5 packaging error?
Tried to install the latest libc6.1 this evening, and ran into an issue with the "libcrypt1" installation. Specifically, "perl" is looking for "libcrypt.so.1.1", and after the new "libcrypt1" package gets installed, the following library and symlinks exist under "/usr/lib/alpha-linux-gnu":

libcrypt.so -> libcrypt.so.1.1.0
libcrypt.so.1 -> libcrypt.so.1.1.0
libcrypt.so.1.1.0

Without a "libcrypt.so.1.1" symlink, the new "libc6.1" package and its dependencies cannot be configured. Adding the symlink and then running "dpkg --configure -a" takes care of the problem.

--Bob
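The workaround amounts to one relative symlink. The sketch below wraps it in a function that refuses to overwrite an existing entry and checks that the target library is really there first. The function name ensure_symlink is hypothetical, and nothing is created until it is called explicitly (as "root", for the real directory).

```python
import os


def ensure_symlink(libdir, link_name, target):
    """Create libdir/link_name -> target (relative) if it does not exist yet.

    Returns True if the link was created, False if something with that
    name was already present. Raises FileNotFoundError if the target
    library is missing from libdir.
    """
    link_path = os.path.join(libdir, link_name)
    if os.path.lexists(link_path):
        return False
    if not os.path.exists(os.path.join(libdir, target)):
        raise FileNotFoundError(os.path.join(libdir, target))
    os.symlink(target, link_path)
    return True


# Intended use for the case described above:
# ensure_symlink("/usr/lib/alpha-linux-gnu", "libcrypt.so.1.1", "libcrypt.so.1.1.0")
```

After the link exists, "dpkg --configure -a" is still needed to finish configuring the pending packages, as noted above.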
Re: Updated installation images for Debian Ports 2019-11-22
On Sat, Nov 30, 2019 at 05:51:45PM +0100, John Paul Adrian Glaubitz wrote:
> > On Nov 30, 2019, at 4:54 PM, Skye wrote:
> >
> > Bob, that is excellent information. Thank you for sharing!
>
> I suggest turning this into a patch. Fixing guile-2.0 and guile-2.2 on alpha
> is dearly needed, so patches are really welcome.
>
> Adrian

I definitely appreciate that fixing the guile-2.0 and guile-2.2 builds on alpha is a priority, and if there was anything useful I could contribute beyond demonstrating it can be done, I'd be happy to provide patches.

The problem *I* ran into was entirely due to how s-l-o-w my system is. Since the issue is associated with exactly *one* of the guile-2.2 tests (for the "guild" compiler), I'm reluctant to have a "hack" workaround become part of the test suite source, especially since the problem will never be seen on one of the "buildd" hosts. I didn't see the problem with the exact same test on the "guile-2.0" build because 2.0 runs more efficiently on older, slower systems.

If you feel otherwise as far as wanting a patch, the simple diff is appended below. Nothing magical about the "sleep" values I picked. The first one is to allow enough time for the "guild" compiler to actually begin doing something, and *may* be too long to wait for a machine that can actually get out of its own way :-(. The second sleep value can be anything less than the 100 seconds allowed by the test script for the compile to complete, but needs to be long enough to allow the "guild" compiler to receive and process the sent SIGINT.

All that being said, I'd *definitely* think twice about blindly changing the sleep values. Again, you'll never see this issue on the "buildd" systems. If I were the package maintainer, I'd reject this patch :-).

(file is in "guile-2.2-2.2.6+1/test-suite/standalone" after extracting the source package)

--- test-guild-compile.orig	2019-11-30 17:56:39.276270948 -0600
+++ test-guild-compile	2019-11-30 17:57:18.874959718 -0600
@@ -23,10 +23,10 @@
 pid="$!"
 
 # Send SIGINT.
-sleep 2 && kill -INT "$pid"
+sleep 5 && kill -INT "$pid"
 
 # Wait for 'guild compile' to terminate.
-sleep 2
+sleep 15
 
 # Check whether there are any leftovers.
 for file in "$target"*
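One way to avoid tuning the second sleep per machine is to wait for the interrupted process with an upper bound instead of sleeping for a fixed time. The sketch below shows that idea in Python purely for illustration. It is not what the guile test suite does (the real test is a shell script), and the function name and timings are made up.

```python
import signal
import subprocess
import time


def interrupt_and_wait(proc, grace, timeout):
    """Send SIGINT to `proc` after `grace` seconds, then wait up to `timeout`.

    Returns True if the process exited within the timeout. If it did not,
    the process is killed and False is returned.
    """
    time.sleep(grace)                 # let the process start real work
    proc.send_signal(signal.SIGINT)
    try:
        proc.wait(timeout=timeout)    # returns as soon as the process exits
        return True
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        return False
```

On a fast host the wait returns almost immediately, and on a slow one it can use the whole allowance, so only the initial grace period still has to be guessed.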
Re: Updated installation images for Debian Ports 2019-11-22
On Sat, Nov 30, 2019 at 01:59:36PM +1300, Michael Cree wrote:
> (...) It passes more often than not and
> only fails occasionally. I see that there is a patch in the
> debian/patches directory to avoid a race condition in this test.
> But I don't know guile so don't understand the code.

There are a few of the "guile" tests that have some timing aspects where sometimes you "win" the race, and other times you "lose". In an earlier private message, I indicated one such test where I had to lengthen the sleep intervals before following actions were taken (because my system is so slow relative to modern hardware). If I didn't mention the specific test, it had to do with making sure the "guild" compiler would clean up after itself if interrupted. On the PWS, it was taking a few more seconds for the interrupt to be received and processed than the test originally allowed. You wouldn't have seen or experienced that particular problem on any of the "buildd" systems.

--Bob
Re: Updated installation images for Debian Ports 2019-11-22
On Sat, Nov 30, 2019 at 12:10:28PM +1300, Michael Cree wrote:
> ERROR: 00-repl-server.test: repl-server: HTTP inter-protocol attack -
> arguments: ((system-error "fport_write" "~A" ("Broken pipe") (32)))
>
> Bob: how did you get past this test or did it pass on your build?

It passed on mine. I didn't save the build log for the 2.0 build, but here's the relevant section of the 2.2 build log:

(...)
make check-TESTS
make[4]: Entering directory '/opt/downloads/work/guile-2.2/guile-2.2-2.2.6+1'
Testing /opt/downloads/work/guile-2.2/guile-2.2-2.2.6+1/meta/guile ...
with GUILE_LOAD_PATH=/opt/downloads/work/guile-2.2/guile-2.2-2.2.6+1/test-suite
Running 00-initial-env.test
Running 00-repl-server.test
Running 00-socket.test
Running alist.test
(...)

--Bob
Re: Updated installation images for Debian Ports 2019-11-22
(This is a separate copy to the list, just to keep everyone informed. No attachment included.)

On Tue, Nov 26, 2019 at 04:49:15PM +1300, Michael Cree wrote:
> I don't seem to have received that message.

I'll try sending again just to you... The attached "packages" file was on the order of 500k, and it's possible an upstream mailer got offended at the message size. In *this* letter, I'll append the list gzipped.

Here's the relevant portion of that earlier posting:

gcc is version 9.2.1 (Debian 9.2.1-19)
ld is version 2.33.1 (binutils 2.33.1-4)
kernel version is 5.3.0 built from the kernel.org source tree
Other packages are as in the attached "packages" file ("dpkg -l" output).

Started the "guile-2.2" build. So far, so good after 12+ hours :-).

--Bob
Re: Updated installation images for Debian Ports 2019-11-22
On Tue, Nov 26, 2019 at 12:00:59PM +1300, Michael Cree wrote:
> Did you build with latest toolchain? I suspect the issue has
> appeared with toolchain changes (hard to pin down when because there
> was quite a period in which a new version of guile-2.0 was not
> uploaded).
>
> And the bug (a segfault when texi documentation is built with the
> recently built guild executable) looks to be present elsewhere too
> (take a look at #941218 where comment #10 seen on Ubuntu looks
> suspiciously like what we see on Alpha assuming it occurs at the
> same place).

I think I answered the toolchain question in my reply to Adrian's earlier message. There was an attached "packages" file with the complete list of what I've got installed on the PWS.

> Unless built in clean chroot with only the build dependencies installed
> and with an up to date toolchain they won't be much use to us.

The toolchain is up-to-date, but I don't have the infrastructure to support a clean chroot environment, even on another local system if I were to try and use a cross-compiler vs. a native build.

In reference to the build dependencies, if particular versions aren't specified, or are only loosely specified (e.g., >= some value), are the dependencies considered "best" satisfied with a stable package version meeting the requirement? Or is the current unstable version of a dependency preferred when building for "sid"? Many variables to consider, I guess.

--Bob
Re: Updated installation images for Debian Ports 2019-11-22
On Sat, Nov 23, 2019 at 07:36:11AM +1300, Michael Cree wrote:
> That's not going to help at the moment because vim is bd-uninstallable.
>
> The real problem is guile-2.0 and guile-2.2, both of which FTBFS, and
> are blocking the building of many other packages.

I downloaded the Debian source for "guile-2.0_2.0.13+1-5.3" and successfully built the binary packages on my PWS-433au without having to modify anything. My guess is some kind of toolchain or other build environment issue on the "buildd" servers.

Michael -- I've got the following ".deb" packages available, and you're welcome to them if they would be of any help getting us unstuck:

guile-2.0_2.0.13+1-5.3_alpha.deb
guile-2.0-libs_2.0.13+1-5.3_alpha.deb
guile-2.0-dev_2.0.13+1-5.3_alpha.deb
guile-2.0-libs-dbgsym_2.0.13+1-5.3_alpha.deb
guile-2.0-doc_2.0.13+1-5.3_all.deb

Just need a place to upload them where you can get to them, or I could send them as e-mail attachments if all else fails: the "libs" package is the largest at 2,262,128 bytes.

I'll get started on trying to build "guile-2.2" later today.

--Bob
Re: Updated installation images for Debian Ports 2019-11-22
On Fri, Nov 22, 2019 at 11:23:10AM +0100, John Paul Adrian Glaubitz wrote: > (...) > The images for alpha and ia64 could have issues because of the missing > vim package [3]. Someone needs to have a look at vim on these two > architectures. > > (...) > > [3] https://buildd.debian.org/status/package.php?p=vim=sid I've noticed I haven't been able to update the "vim" packages for a long time. Michael Cree -- if you see this, I think you explained the problem to me many moons ago. In any event, there has been a held update to "vim-common" on alpha for at *least* the past year. We seem to be stuck at version "2:8.1.0875-5". --Bob
Re: congratulations in order
On Sat, Sep 28, 2019 at 04:15:17PM -0600, Skye wrote: > Congrats! Can you tell us how you got to that point? I need to bring up a > series of servers next week and dreading my ignorance. They are currently > running an old release of Red Hat. Short answer: up-to-date Debian "sid" (unstable) on a PWS 433au with a kernel built from the latest kernel.org source tree. Longer answer: your mileage *will* vary, depending on your hardware. The hardest part of "getting to that point" is going to be bootstrapping from nothing. The debian-alpha archives have *many* postings that will attest to that :-(. I probably missed it, but we might have an install CD at this point that includes enough of the needed drivers to accomplish an installation. If not, the known traditional trouble spots are video and disk controller support. If you clear that hurdle, successfully partitioning hard disks on alpha is more difficult than it should be, and depends entirely on what tool you choose: recent "fdisk" versions on alpha are broken -- see the debian-alpha archives for workarounds. If all else fails, you can try either the last official Debian release for alpha, or maybe a Gentoo boot CD. I would encourage you to try the latest Debian CD though... and document here exactly what doesn't work so there's a chance of getting it fixed. Upgrading from the ancient Debian stable release *will* be problematic, and I can't really recommend that option. --Bob
congratulations in order
Seriously. I just experienced the first "flawless" boot of my Alpha in over two years. All devices initialized and came up perfectly, including in particular the network interfaces, the X11 graphical login screen, all configured file systems, and even the hardware clock. The latter has been an issue for some time, and until today, hadn't survived a reboot without me having to manually reset it from the system clock. Current kernel is 5.3.0, built from the kernel.org source tree with the gcc-9.2 compiler and associated current (unstable release) tool chain. --Bob
vmlinux.o linker warning
Using gcc 9.2 to build current 5.X kernel.org source trees, I'm seeing several warnings like the following during the linking of "vmlinux.o": WARNING: vmlinux.o(__ex_table+0x11d0): Section mismatch in reference from the (unknown reference) (unknown) to the (unknown reference) .alphalib:(unknown) The relocation at __ex_table+0x11d0 references section ".alphalib" which is not in the list of authorized sections. If you're adding a new section and/or if this reference is valid, add ".alphalib" to the list of authorized sections to jump to on fault. This can be achieved by adding ".alphalib" to OTHER_TEXT_SECTIONS in scripts/mod/modpost.c. This is directly due to the workarounds we put in place on alpha for dealing with relocation errors during the linking of large executables (such as "firefox" :-)). These are warnings, but sending a fix upstream sooner rather than later would probably be a good idea. --Bob
Re: systemd network interface configuration (was "Re: systemd woes continue")
On Thu, Sep 19, 2019 at 09:10:20AM +0200, John Paul Adrian Glaubitz wrote: > (...) > So, can you please type "ip a" and check what device name is actually assigned > to your wired card and if it differs from "eth0", adjust your /etc/network/ > interfaces file? > > If your wired card is actually named "eth0", then the problem is somewhere > else and we need to proceed in your next mail. The wired card really *is* "eth0". "ip a" shows five interfaces in the current active (correct) environment: 1: lo 2: eth0 (my primary network interface, connected to a pre-CIDR routable class C network) 3: enx00e04c6881f7 (USB NIC connected to an internal non-routable class C network) 4: sit0@NONE (tunnel for IPv6-in-IPv4 traffic) 7: he-ipv6@NONE (a point-to-point IPv4 connection to the IPv6 tunnel broker) The wireless interface in the "interfaces" file corresponds to a USB adapter I haven't used with the Alpha in a long time. I left the configuration info there as a reminder of how to do that if/when it becomes necessary :-). At the risk of providing too much information, the Alpha is serving as a local IPv6 gateway router. "/etc/sysctl.conf" has "net.ipv6.conf.all.forwarding=1" which is appropriate (required) for a router, because otherwise, "radvd" will "unexpectedly" configure an additional global IPv6 address for "eth0" which you definitely do not want. The only global scope IPv6 addresses are statically assigned to the "eth0" and "he-ipv6" interfaces. There remains a bit of strangeness, even if/when the interfaces are brought up correctly, because of the "gateway" configuration line associated with the USB interface. I need to comment that out: multiple default routes at identical priorities is a legitimate configuration error in my setup. At one time, there was a legitimate reason for that gateway line to be there: no need to go into that level of detail at present :-). --Bob
systemd network interface configuration (was "Re: systemd woes continue")
On Wed, Sep 18, 2019 at 11:46:06AM +0200, John Paul Adrian Glaubitz wrote: > Your permanent bashing of systemd makes answering your mails stressful > for me. Adrian -- please accept my apology for my rantings... They contribute nothing to the conversation, and as you note, irritate the very people in the best position to render needed assistance. Going back to a previous message you sent, you suggested looking at a few systemd network-related services: (1) systemd-networkd: this is currently showing "disabled" on my system (vendor preset: enabled). (2) resolver-related systemd services such as "resolvconf" and "systemd-resolved": "resolvconf" is "enabled", but "systemd-resolved" is "disabled" (vendor preset: enabled). None of the services mentioned above have any configuration files other than the defaults. So, I guess the main question on the table is, what's the best path forward to ensure network interfaces are brought up and configured automatically at boot time? Related to that question: is the use of "/etc/network/interfaces" deprecated? That's where my network configuration details currently exist, and that used to be sufficient, even after the migration from the old-style init program/scripts to "systemd". A sanitized copy of my current "interfaces" file is attached. Thanks in advance for the assist. --Bob

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
allow-hotplug eth0
iface eth0 inet static
    address (masked)
    netmask 255.255.255.240
    network (masked)
    broadcast (masked)
    gateway (masked)
    # dns-* options are implemented by the resolvconf package, if installed
    dns-nameservers (masked)
    dns-search (my registered domain)

# /64 prefix assigned by Hurricane Electric
iface eth0 inet6 static
    address (masked)
    netmask 64
    scope global

# Wireless settings for D-Link DWA-131 (r8712u driver from staging -- sigh)
# The initial interface name is wlan0, but that gets remapped to the name
# below by systemd+udev.
allow-hotplug wlx1c7ee513fb7b
iface wlx1c7ee513fb7b inet dhcp
    wireless-mode Managed
    wpa-driver wext
    wpa-ssid (masked)
    wpa-psk (masked)

# USB RTL8153 Gigabit Ethernet Adapter
allow-hotplug enx00e04c6881f7
iface enx00e04c6881f7 inet static
    address (masked)
    netmask 255.255.255.0
    network (masked)
    broadcast (masked)
    gateway (masked)

# Hurricane Electric tunnel: ID# (masked) est. 01 May 2016
auto he-ipv6
iface he-ipv6 inet6 v4tunnel
    address (masked)
    netmask 64
    endpoint (masked)
    local (masked, but IPv4 address of eth0)
    ttl 255
    gateway (masked)
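The file above sets a "gateway" on both the "eth0" stanza and the USB adapter stanza within the same address family, which is how duplicate default routes arise. A minimal sketch of a checker for that condition follows. The function name and the sample addresses (taken from the documentation ranges) are illustrative assumptions, and the parser handles only the simple stanza layout shown here, not every feature described in interfaces(5).

```python
from collections import defaultdict


def gateways_by_family(interfaces_text):
    """Map address family -> interfaces whose stanza contains a gateway line."""
    result = defaultdict(list)
    current = None  # (interface name, address family) of the stanza being read
    for raw in interfaces_text.splitlines():
        line = raw.strip()
        # Only whole-line comments are skipped.
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] == "iface" and len(words) >= 3:
            current = (words[1], words[2])
        elif words[0] in ("auto", "source", "mapping") or words[0].startswith("allow-"):
            current = None  # these keywords end the previous iface stanza
        elif words[0] == "gateway" and current is not None:
            name, family = current
            result[family].append(name)
    return dict(result)


# Hypothetical stand-in for the masked file; the addresses are placeholders.
SAMPLE = """\
auto lo
iface lo inet loopback

allow-hotplug eth0
iface eth0 inet static
    address 192.0.2.2
    netmask 255.255.255.240
    gateway 192.0.2.1

allow-hotplug enx00e04c6881f7
iface enx00e04c6881f7 inet static
    address 198.51.100.2
    netmask 255.255.255.0
    gateway 198.51.100.1
"""

for family, names in gateways_by_family(SAMPLE).items():
    if len(names) > 1:
        print(f"{family}: gateway configured on {', '.join(names)}")
```

Any family reported with more than one interface is a candidate for commenting out one of the "gateway" lines.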
Re: systemd woes continue
On Mon, Sep 16, 2019 at 09:38:55PM -0500, Bob Tracy wrote: > I say "almost clean", because a recent update managed to break my > resolver configuration: "/etc/resolv.conf" came up completely empty, in > spite of "/etc/network/interfaces" having the requisite > "dns-nameservers" and "dns-search" lines for my primary interface (eth0). > The "resolvconf" package is definitely installed, but I'm guessing some > new interaction between "systemd", "NetworkManager", and possibly > "resolvconf" is, er, ah, unfortunate. It appears that all the configuration info in "/etc/network/interfaces" for my "eth0" interface was completely ignored other than the IPv4 address. No IPv6 address, no "dns-*" config items, etc. Thanks, systemd! Not :-(. I blame "systemd", because, when it was still dropping me into emergency mode, "eth0" came up correctly. --Bob
Re: systemd woes continue
On Mon, Sep 09, 2019 at 10:49:25PM +0200, Aurelien Jarno wrote: > > ... > > > > You are in emergency mode. After logging in, type "journalctl -xb" to view > > system logs, "systemctl reboot" to reboot, "systemctl default" or "exit" > > to boot into default mode. > > Give root password for maintenance > > (or press Control-D to continue): > > In my case, upgrading glibc to 2.29 (now available in unstable) fixed > the issue. Verified. Life is now better, if not good :-). This is the first time I've been able to accomplish an "almost clean" boot of my Alpha in MANY months. I say "almost clean", because a recent update managed to break my resolver configuration: "/etc/resolv.conf" came up completely empty, in spite of "/etc/network/interfaces" having the requisite "dns-nameservers" and "dns-search" lines for my primary interface (eth0). The "resolvconf" package is definitely installed, but I'm guessing some new interaction between "systemd", "NetworkManager", and possibly "resolvconf" is, er, ah, unfortunate. --Bob
Re: systemd woes continue
On Wed, Jul 17, 2019 at 01:45:21PM +, Witold Baryluk wrote: > Hey John, > > > The gist is: A lot of projects don't test their code on systems with > separate > > /usr partitions anymore, so things get silently broken. > > I don't have separate /usr, just single / (ext4) partition, and just > separate /boot (ext2), and still systemd fails to mount this /boot file > system, similar to Michael issue. So, I dont think it is really related to > separate /usr vs non separate /usr. > > PS. On my amd64 system with systemd, I do have separate /usr, and it does > work. Recall my original statement of the issue. Separate "/usr" is ok. Other persistent filesystems, say, "/boot", do *not* work. This is consistent with what the rest of us are seeing who have run into this problem with "systemd". I agree with the assertion/speculation that what we're seeing shouldn't be unique to Alpha, and I don't think it is. The "everything on /" default partitioning scheme is sensible for people new to UN*X who simply want to get Linux up and running as quickly and easily as possible. For people with a little bit of experience and a sense of historical perspective, separate persistent filesystems are the way to go. Most of the "heavy" disk I/O is associated with directories like "/tmp" and "/var", not to mention swap partitions. "/tmp" is non-persistent in modern distros, so we'll ignore it for now. On the other hand, "/var" sees quite a bit of disk I/O (software updates, spooling, e-mail, etc.), and it makes sense for "/var" to be on a separate partition so that *when* a crash occurs, the resulting filesystem corruption won't affect "/". Other persistent filesystems are a matter of individual taste/preference, but in general, the idea is to distinguish between those parts of the directory hierarchy that are relatively stable vs. those that are likely to see a lot of writing.
Modern filesystems aren't as likely to be unrecoverable after a crash, so we can get away with a monolithic "everything on /" philosophy. That wasn't always the case, and old fossils like me that got burned decades ago haven't forgotten the lessons that were learned the hard way :-(. Another consideration is that disks used to be much smaller, and multiple persistent partitions were pretty much a forced choice, i.e., you couldn't put everything on a single spindle, and there was no way to construct a large logical volume out of multiple smaller physical ones as we can do today. The standard hard disks available for a DEC PWS when that system first hit the market would certainly have required either separate persistent filesystems or logical volume capability to be useful. At the time I first installed Linux on an Alpha, I think only RedHat was using LVM by default (although they no longer supported Alpha at that point), and I can't remember whether anyone else's installer even offered it as an option. In summary, I guess I'm saying there are valid reasons in 2019 for having multiple persistent filesystems, even with large physical disks and the ability to create large logical volumes out of multiple smaller ones. --Bob
Re: systemd woes continue
On Wed, Jul 17, 2019 at 03:16:09PM +0200, Frank Scheiner wrote: > To sum things up: what Adrian intends to do for Alpha - pre-include the > firmware on the installer discs - seems to be the only way to get this > problem fixed w/o manual intervention during installation. Many other things to comment on in this thread, but not much time to do so at the moment. I *did* want to comment on the above... It more than likely would be completely against Debian policy/guidelines to do so, but the best *technical* solution to the qlogic firmware issue on Alpha is to build the firmware into the kernel along with the driver. That is how it was done for older Debian releases (and other distros) before kernel developers adopted the default position of NOT building the firmware into the kernel. The fact that the firmware is non-free from a licensing perspective only made matters worse. As a side-note, if you build your own kernel from source and build in a driver (rather than build it as a module) that requires firmware, you *have* to include the firmware in the kernel as well -- the kernel configurator neither detects nor enforces inclusion of the corresponding firmware -- you have to manually specify you want that done. My custom kernels have all the drivers for hardware that's present and required at boot time built into the kernel, along with the required firmware. For the Miata platform I use (PWS 433au), the two affected drivers are for the qlogic SCSI card and the Radeon-based graphics card. Things get complicated in a hurry when required drivers are built as modules if the firmware isn't available when the hardware gets auto-detected and the module gets loaded.
As one person in this thread already observed, one workaround is to blacklist the driver module to prevent it from being loaded until the firmware can somehow be made available -- a less than satisfactory solution because the driver must then be loaded manually -- definitely not user-friendly in any sense of the phrase. To summarize... If *I* were trying to put together an installer image for the Alpha platform, it would include drivers for the default disk controllers for each Alpha variant, drivers for the standard DEC video cards, and drivers for the commonly-used Radeon cards that seem to be the agreed-upon upgrade used by the Alpha community. I'm not familiar with Alpha platforms other than the one I use (Miata), but I think it would be helpful/useful for us to put together a list of disk controller and video drivers that might reasonably be needed in an installer image. This would help Adrian get the Debian installer where it needs to be in fewer iterations. Respectfully, --Bob
Re: systemd woes continue
On Thu, Jul 11, 2019 at 09:48:14PM +0900, John Blake wrote: > (...) > I have a DS10L 617mhz and I can't figure out which version is the best to > attempt to install on it.?? I'd rather avoid things like this issue with > systemd where they obviously haven't tried to actually test it on an alpha > processor... I don't believe the systemd issue I'm experiencing is unique to the alpha architecture. Apologies if I left you with that impression. That being said, I'm pretty sure most of the people reading this know how I feel about "systemd", and I'll state here and now, for the record, that my feelings are irrelevant at this point. That battle was lost a long time ago, and the community is best served by trying to identify and fix the legitimate bugs. > The other question I have is whether or not someone has fixed the issue with > fdisk on the system (...) Check back in the relatively recent (no more than a year ago) archives for this mailing list, but I believe we agreed that "fdisk" was not the correct tool to use for setting up the disk partitions on Alpha. The criticism about "fdisk" being mentioned in the installation documentation is legitimate, and that should be fixed. However, since this is an Alpha-specific thing, and Alpha is no longer a release architecture for Debian, the chances of getting the documentation updated are tending more toward "not happening" these days :-(. If there's a "systemd" wonk tracking this conversation, the main issue I'm seeing with the multiple persistent filesystems is that the dependency service scripts for filesystems other than "/" and "/usr" are dynamically generated at boot time based on what's defined in "/etc/fstab". The other filesystems are being correctly discovered and enumerated (based on the messages I see on the console), but for some reason, "systemd" is unable to figure out how to choose and run the appropriate "fsck" variant ("e2fsck" in my case), so the dependencies (remaining filesystems) fail. 
Other than this recent crap with more than one process trying to read input from the console at the same time, the workaround for the remaining persistent filesystems is straightforward: (1) when dropped into "emergency mode", enter the root password; (2) run the appropriate filesystem checker for each of the remaining persistent filesystems, and mount them; (3) exit "emergency mode", and the system *should* finish coming up multi-user. I usually do a few other things before exiting emergency mode, such as bring up my primary network interface so I can run "ntpdate" and set the system clock (on my PWS 433au, the hardware clock is *always* *way* off following a reboot, and yes, the battery on the motherboard is good). --Bob
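The manual steps above can be sketched as a small helper that reads "/etc/fstab" and prints the check and mount commands for every entry systemd did not handle. This is only an illustration under stated assumptions: the function name and the sample fstab are hypothetical (only "/dev/sda2" as "/boot" appears elsewhere in this thread), and the "-p" flag assumes ext2/3/4 filesystems where "fsck.<fstype>" is "e2fsck".

```python
def manual_check_commands(fstab_text, handled=("/", "/usr")):
    """Return fsck and mount commands for filesystems left unmounted at boot."""
    commands = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue  # a missing pass number means "do not check"
        device, mountpoint, fstype, _options, _dump, passno = fields[:6]
        # Skip swap ("none"), entries already handled, and unchecked filesystems.
        if not mountpoint.startswith("/") or mountpoint in handled or passno == "0":
            continue
        commands.append(f"fsck.{fstype} -p {device}")
        commands.append(f"mount {mountpoint}")
    return commands


# Hypothetical fstab; device names other than /dev/sda2 are placeholders.
SAMPLE_FSTAB = """\
# <device>  <mount>  <type>  <options>  <dump>  <pass>
/dev/sda3   /        ext3    defaults   0       1
/dev/sda2   /boot    ext2    defaults   0       2
/dev/sda4   none     swap    sw         0       0
/dev/sda6   /usr     ext3    defaults   0       2
/dev/sda7   /opt     ext3    defaults   0       2
tmpfs       /run     tmpfs   defaults   0       0
"""

for command in manual_check_commands(SAMPLE_FSTAB):
    print(command)
```

The output is meant to be read and typed at the emergency shell, not piped into one; checking each command by eye first is the safer habit on a half-booted system.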
systemd woes continue
Greetings. It has been a while since I last checked in. Thought I'd let the rest of the Alpha community know I'm still around :-). I'm up and running on kernel version 5.2.0, built from the kernel.org source tree as is my usual pattern. The previous kernel running on my system was 5.1.0-rc7. Between then and today, something changed in user space that made the expected drop into systemd's "emergency mode" more painful than usual. First, "systemd" still cannot handle systems with persistent filesystems other than "/" and "/usr". As far as I know, the bug report I filed against "systemd" is still open, and no progress has been made on that front. The added complication when I rebooted the system today was multiple processes attempting to read input from the console at the same time. Both the old kernel and the new one behaved identically, which is why I'm assuming a problem with userspace. If you immediately type in the "root" password when prompted (without waiting for additional background init tasks to finish), things work normally up to the point where the console font gets loaded. Sometime after that, part of what you type goes to the command line, and the rest goes to ???. Tty echo is disabled, so you can't tell which input characters are going to the interactive shell, and which ones are going to ???. A workaround I discovered by accident is to keep typing "\n" until the "emergency mode" shell exits and "systemd" attempts to continue with normal startup. That fails, and "systemd" drops back into "emergency mode" again. However, only an interactive shell is listening at that point, so you can go about the usual cleanup tasks (run "fsck" on the remaining filesystems, mount them, bring up the primary network interface, etc.), and *then* type "" to continue with normal system startup. 
If you wait until *after* the console font gets loaded before trying to type the "root" password, the only way forward might be to try typing "\n" multiple times until "systemd" attempts to continue with normal startup, fails, and then drops you back into "emergency mode" again. I didn't try that. Typing "" works, at least, to restart the system and give you another crack at entering the "root" password immediately after the "emergency mode" prompt appears. No idea which startup process is competing with the "emergency mode" interactive shell for input from the console keyboard. --Bob
is there a working UP generic 4.X kernel available?
I'm finally starting to get a bit of traction on Debian bug #919825, but Michael Biebl would really like to see me testing with a Debian-provided kernel instead of my hand-built kernel.org versions (now running 5.0.0). I saw where Ben Hutchings grabbed the fix referenced here (https://salsa.debian.org/kernel-team/linux/merge_requests/79) for inclusion in sid, and the corresponding issue was closed approximately four weeks ago. A quick check of the kernels available over at "http://ftp.ports.debian.org/debian-ports/pool-alpha/main/l/linux/" doesn't show anything with a late enough date stamp to include the fix. If I'm mistaken as to the availability of a working generic kernel for single-processor alpha systems, kindly point me to it and I'll be happy to give it a try. Otherwise, what's the targeted kernel version for the fix in sid? And approximately when might that show up? As usual, many thanks in advance. --Bob
Re: Re: PWS 433au (Miata) recovery update
On Wed, Jan 30, 2019 at 04:42:23PM -0500, Alex Winbow wrote: > I've heard word that /usr destined to be going away, but frankly I'm > very > surprised that multiple local filesystems is a rarity these days. The debian > installer even creates these semi-automatically. It is seriously the case > that "everyone" has /var and /tmp on the root filesystem? Can't speak to all distros, but at least for RedHat/CentOS and derivatives, the default partitioning scheme is to put all persistent files on a single partition, possibly spanning multiple spindles (a logical volume). Files of a temporary nature get put on filesystems of a correspondingly temporary nature, which is to say "/run", "/dev", "/tmp", and anything I may have left out that does not need to survive a reboot gets put on separate filesystems (other than "/"). Distributions that have made their peace with "systemd" *have* to have "/usr" present at boot or shortly thereafter. CentOS 7.X has no separate "/usr": it's all symlinked to corresponding directories under "/". Example: "/usr/lib" --> "/lib". That's why there's no supported upgrade path from CentOS 6.X to CentOS 7.X (except for *very* carefully defined server configurations), just in case you were wondering. The "old school" way of partitioning disks tried to separate things into partitions based somewhat on where the contents came from and how likely they were to change. Three basic categories of software: (1) came with the OS; (2) other supported -- typically commercial -- software; and (3) home-grown (user written) software. There was always debate about "/usr" vs. "/opt" for the second category, and the UN*X OS vendor typically decided for you :-). Prior to the advent of true temporary filesystems (only became possible as memory became cheap and plentiful), you wanted to give filesystems with high activity their own separate partitions. 
This would obviously include things like "/tmp", and "/var" was usually a good candidate for a separate partition because that's where print spoolers, mail directories, USENET feeds, and so forth typically live(d). USENET was a particularly "nasty" case where the default filesystem creation parameters were typically not what you wanted -- news feeds required lots of inodes to support many small files rather than few large files. These days, the average Linux hobbyist doesn't know or care about the history or reasons why separate filesystems might be a good idea. Disk drives are large, cheap, and fast enough that one generally doesn't have to worry about where the swap partition is created relative to the rest of the disk -- the performance impact just isn't that great, particularly if the system has a metric arse-load of RAM to begin with and hardly ever touches swap. Consider yourself one of the "lucky" hobbyists if you've got an Alpha PWS to play with, because you *don't* have enough RAM or disk space that you can afford to be so cavalier in your attitude about disk partitioning :-). I try to do things like put "swap" and "/tmp" near the center of the spindle to minimize seek times/distances from other partitions, and for my PWS, "/tmp" is a local persistent filesystem rather than tmpfs -- there's simply not enough memory on a PWS to waste it. No choice in the matter as far as "/dev", thanks to "systemd" and "udev", but the amount of memory consumed there is minimal relative to the init system itself. Well, I hope this has at least been somewhat entertaining if not helpful :-). I've been playing around with different flavors of UN*X since 1977. First exposure was AT&T UNIX Sixth Edition on a PDP 11/70. First Linux system was SoftLanding Systems (SLS) Linux on a 386 back in 1992.
I inherited my Alpha PWS-433au from a fellow who originally bought it to do some mail server software development in a Digital UNIX environment, then decided he needed a cantilever shelf for his equipment rack more than he needed the computer: I *think* I got the better end of the deal, even with having to make the 90-mile round trip out to his house to deliver the shelf and pick up the computer :-). --Bob
Re: Re: PWS 433au (Miata) recovery update
On Sun, Jan 27, 2019 at 12:25:52PM -0800, Alex Winbow wrote: > I'm seeing this also, after installing using the Debian 8.0 installer > and dist-upgrade'ing to unstable (using the SMP kernel trick to get past the > GENERIC issue). My understanding is that it's not initramfs-tools that > mounts all the (non-root) local filesystems, but systemd (which it looks > like you've reported as a bug elsewhere). I was able to pseudo-fix this by > changing the fs_passno field in /etc/fstab to '0'. This tells us (or me, anyway) that systemd's logic for automatically setting up and running "fsck.fstype" for local filesystems is broken. I don't think the dynamic generation of services and dependencies for handling local filesystems was part of the "special sauce" for systemd versions prior to version 235-X, which was when things broke on my system. Setting the fs_passno field to '0' (electing to not run file system checks before mounting) will bite you eventually. It's annoying to have to manually run "e2fsck -p" on three local filesystems (other than '/' and '/usr') and mount them, but at least the boot process isn't so badly broken I can't do it. > Couple other things I found: in /etc/network/interfaces, > "allow-hotplug eth0" doesn't seem to work nicely with systemd, but "auto > eth0" does. Hadn't dug into this enough to determine what's going on, but did notice that running "ifup eth0" (static IP configuration) while in the emergency shell was effective. It's interesting you mention "auto eth0" working, because I've got an IPv6 tunnel interface designated "auto" that depends on "eth0" being up to function properly, and "systemd" happily configures the tunnel interface without "eth0" being present. Have I mentioned today how much I detest "systemd"? :-) This will get solved eventually, but it would get solved more quickly if the case of multiple local filesystems was more common today. --Bob
[PATCH] 4.18-rc7 on alpha: bitsperlong issue
Apologies for what is essentially a repost with a proper subject header in the sense of trying to get the attention of people who collect/approve patches for submission upstream. See my posting from earlier today (followup: [FTBFS] kernel 4.18-rc7 bitsperlong.h issue on alpha) for the back story. As mentioned there, this patch applies cleanly to at least all mainline kernel source trees >= version 4.18. Further apologies for including the patch as an attachment, but I don't trust my mailer not to impose unintended formatting. --Bob

Signed-off-by: Bob Tracy
Tested-by: Bob Tracy

--- a/tools/include/uapi/asm/bitsperlong.h 2019-01-20 14:40:32.522422998 -0600
+++ b/tools/include/uapi/asm/bitsperlong.h 2019-01-21 09:51:45.336938260 -0600
@@ -13,6 +13,8 @@
 #include "../../arch/mips/include/uapi/asm/bitsperlong.h"
 #elif defined(__ia64__)
 #include "../../arch/ia64/include/uapi/asm/bitsperlong.h"
+#elif defined(__alpha__)
+#include "../../arch/alpha/include/uapi/asm/bitsperlong.h"
 #else
 #include <asm-generic/bitsperlong.h>
 #endif
followup: [FTBFS] kernel 4.18-rc7 bitsperlong.h issue on alpha
On July 30, 2018, I reported the following to linux-kernel, linux-alpha, etc.: On an alpha system, got the following build error on the 4.18-rc7 mainline kernel source tree:

  HOSTCC  net/bpfilter/main.o
In file included from tools/include/uapi/asm/bitsperlong.h:17,
                 from /usr/include/asm-generic/int-ll64.h:12,
                 from /usr/include/alpha-linux-gnu/asm/types.h:24,
                 from tools/include/linux/types.h:10,
                 from ./include/uapi/linux/bpf.h:11,
                 from net/bpfilter/main.c:9:
tools/include/asm-generic/bitsperlong.h:14:2: error: #error Inconsistent word size. Check asm/bitsperlong.h
 #error Inconsistent word size. Check asm/bitsperlong.h
  ^
scripts/Makefile.host:107: recipe for target 'net/bpfilter/main.o' failed
make[2]: *** [net/bpfilter/main.o] Error 1
scripts/Makefile.build:558: recipe for target 'net/bpfilter' failed
make[1]: *** [net/bpfilter] Error 2
Makefile:1029: recipe for target 'net' failed
make: *** [net] Error 2

I implemented a crap workaround at the time in "linux/tools/include/asm-generic/bitsperlong.h", similar to what some frustrated person did for the x86-64 case in "linux/include/asm-generic/bitsperlong.h". Never sent that in because I knew it was the wrong approach. The proper fix is attached. If needed, consider this my official "signed off by" and/or "tested by". This applies cleanly to at least all kernel mainline source trees from 4.18 to current. Thanks. --Bob

--- a/tools/include/uapi/asm/bitsperlong.h 2019-01-20 14:40:32.522422998 -0600
+++ b/tools/include/uapi/asm/bitsperlong.h 2019-01-21 09:51:45.336938260 -0600
@@ -13,6 +13,8 @@
 #include "../../arch/mips/include/uapi/asm/bitsperlong.h"
 #elif defined(__ia64__)
 #include "../../arch/ia64/include/uapi/asm/bitsperlong.h"
+#elif defined(__alpha__)
+#include "../../arch/alpha/include/uapi/asm/bitsperlong.h"
 #else
 #include <asm-generic/bitsperlong.h>
 #endif
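For readers wondering why a missing "#elif" produces an "Inconsistent word size" error, here is a rough model of the header logic in Python. It rests on an assumption about the headers that the message itself does not state: that the generic header falls back to 32 bits whenever no architecture-specific header has declared a value, and that the build then compares the declared value with the compiler's actual size of "long". All names and the table contents are illustrative.

```python
# Simplified model only; the real logic lives in C preprocessor headers.
GENERIC_DEFAULT_BITS = 32  # assumed fallback when no arch header declares a size


def declared_bits(arch_macro, arch_headers):
    """Bits per long as declared by the header chain for this architecture."""
    return arch_headers.get(arch_macro, GENERIC_DEFAULT_BITS)


def word_size_check(arch_macro, sizeof_long, arch_headers):
    """Compare the declared size with what the compiler actually uses."""
    actual = 8 * sizeof_long
    declared = declared_bits(arch_macro, arch_headers)
    if actual == declared:
        return "ok"
    return f"Inconsistent word size: declared {declared}, actual {actual}"


# Architectures the wrapper header knows about (only one listed for brevity).
BEFORE_PATCH = {"__ia64__": 64}
# The patch adds an alpha branch that pulls in alpha's own 64-bit declaration.
AFTER_PATCH = {**BEFORE_PATCH, "__alpha__": 64}

print(word_size_check("__alpha__", 8, BEFORE_PATCH))
print(word_size_check("__alpha__", 8, AFTER_PATCH))
```

Alpha has an 8-byte "long", so without its own branch it lands on the 32-bit fallback and the comparison fails; with the branch added, the two values agree.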
Re: Updated installation images 2019-01-20
On Sun, Jan 20, 2019 at 03:32:10PM +0100, John Paul Adrian Glaubitz wrote: > Hello! > > I have created updated installation images for Debian Ports, > please find the updated images below and test them [1]. > > Feedback welcome. > > Adrian > > > [1] https://cdimage.debian.org/cdimage/ports/2019-01-20/ Thank you! I'll give the Alpha version a try in the next few days. Waiting on a 4.18 kernel build: should be done in the next 24 hours or so. --Bob
Re: PWS 433au (Miata) recovery update
On Fri, Jan 18, 2019 at 12:19:52AM +, Maciej W. Rozycki wrote: > (much useful information set in the appropriate historical context) Thank you for your thoughts. The earlier reported problem with "/lib/systemd/systemd-udevd" evidently requiring AF_UNIX socket support to be built-in rather than modular has been confirmed. Setting "CONFIG_UNIX=y" in the kernel configuration was enough to get me past that particular problem I was seeing with the initial ramdisk. So, per advice I was given a long time ago, *do* examine the "systemd" README file under "/usr/share/doc": many kernel configuration requirements are mentioned there. As far as gleaning the additional udev-related info, one *might* infer it from the error messages produced by the executable, *or* one can examine the udev- related files under "/lib/systemd/system", one of which explicitly mentions AF_UNIX in the context of a restricted address family. I also note that the current "initramfs-tools" have evidently forgotten how to automatically check and mount local file systems *other* than "/" and "/usr". Every boot since restoring my PWS thus far has dropped me into emergency mode with everything mounted read-write and ready to go (including swap) *other* than the local non-tmpfs file systems. Manually running the appropriate flavor of "fsck" and mounting the file systems before exiting emergency mode results in the expected normal startup of multi-user system services. "journalctl -xb" has, for the case of one such file system that didn't get checked/mounted, the following messages: (...) -- The job identifier is 271 and the job result is done. Dec 21 13:02:10 smirkin systemd[1]: Starting of /dev/sda2 not supported. -- Subject: A start job for unit dev-sda2.device has failed -- Defined-By: systemd -- Support: https://www.debian.org/support -- -- A start job for unit dev-sda2.device has finished with a failure. -- -- The job identifier is 307 and the job result is unsupported. 
Dec 21 13:02:10 smirkin systemd[1]: Dependency failed for File System Check on /dev/sda2. -- Subject: A start job for unit systemd-fsck@dev-sda2.service has failed -- Defined-By: systemd -- Support: https://www.debian.org/support -- -- A start job for unit systemd-fsck@dev-sda2.service has finished with a failure. -- -- The job identifier is 306 and the job result is dependency. Dec 21 13:02:10 smirkin systemd[1]: Dependency failed for /boot. -- Subject: A start job for unit boot.mount has failed -- Defined-By: systemd -- Support: https://www.debian.org/support -- -- A start job for unit boot.mount has finished with a failure. (...) My guess is, device naming conventions have, once again, changed as far as what the systemd service descriptions/templates expect. Anyone have any idea how and/or where to fix this most efficiently? --Bob
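For readers trying to match the unit names in the journal excerpt above to the devices and mount points they stand for, here is a minimal Python sketch of how systemd derives unit names from paths. It is a simplified model of what "systemd-escape --path" does (it does not collapse repeated slashes, for instance), and the helper names are made up for illustration. It does not diagnose the failure; it only shows that "/dev/sda2" and "/boot" map to the names seen in the log.

```python
def systemd_escape_path(path: str) -> str:
    """Simplified model of systemd's path escaping (assumption: no '//' runs)."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-"  # the root directory is represented as "-"
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif ch.isascii() and (ch.isalnum() or ch in ":_" or (ch == "." and i > 0)):
            out.append(ch)  # safe characters pass through unchanged
        else:
            # everything else (including a literal "-") becomes \xNN per byte
            out.extend("\\x%02x" % b for b in ch.encode("utf-8"))
    return "".join(out)


def device_unit(device: str) -> str:
    """Hypothetical helper: device node path -> .device unit name."""
    return systemd_escape_path(device) + ".device"


def mount_unit(mount_point: str) -> str:
    """Hypothetical helper: mount point -> .mount unit name."""
    return systemd_escape_path(mount_point) + ".mount"


def fsck_unit(device: str) -> str:
    """Hypothetical helper: device node path -> systemd-fsck@ instance name."""
    return "systemd-fsck@" + systemd_escape_path(device) + ".service"


if __name__ == "__main__":
    print(device_unit("/dev/sda2"))
    print(fsck_unit("/dev/sda2"))
    print(mount_unit("/boot"))
```

The three names printed correspond to the three failed jobs in the excerpt: the device unit, the file system check that depends on it, and the mount that depends on the check.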
Re: PWS 433au (Miata) recovery update
On Wed, Jan 16, 2019 at 01:10:14AM -0600, Bob Tracy wrote: > (initramfs / systemd / udevd issue) I think I may have this one painted into a corner... CONFIG_UNIX (at least) needs to be "y" instead of "m". This is a relatively new dependency. For userland applications used in the boot process to have such dependencies on kernel configuration options is "unfortunate" to put it mildly. --Bob
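A check along these lines can catch the "m" versus "y" problem before building an initramfs. The following is a minimal Python sketch, not a Debian or systemd tool: the function names are hypothetical, and the only requirement taken from this thread is that CONFIG_UNIX be built in. Any other options would have to come from the systemd README.

```python
import re


def option_state(config_text: str, name: str) -> str:
    """Return the value assigned to CONFIG_<name>, or 'n' if unset or absent."""
    assign = re.compile(r"^CONFIG_%s=(.*)$" % re.escape(name))
    for line in config_text.splitlines():
        match = assign.match(line.strip())
        if match:
            return match.group(1)
    return "n"  # "# CONFIG_X is not set" and a missing line both mean "off"


def not_builtin(config_text: str, names) -> list:
    """List the options from `names` that are not built in (anything but 'y')."""
    return [n for n in names if option_state(config_text, n) != "y"]


if __name__ == "__main__":
    # Made-up config fragment for illustration only.
    sample = "CONFIG_SMP=y\nCONFIG_UNIX=m\n# CONFIG_AUDIT is not set\n"
    print(not_builtin(sample, ["UNIX"]))
```

In practice the text would come from the ".config" in the kernel build tree, or from "/proc/config.gz" on kernels built with CONFIG_IKCONFIG_PROC.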
PWS 433au (Miata) recovery update
Gentlemen, Figured it was past time for an update, now that I actually have the Alpha back on-line and functioning in its pre-meltdown capacity as my IPv6 router and Linux kernel git repository. The following narrative is going to necessarily be somewhat long-winded, seeing as it's intended to be a modern synthesis of knowledge gleaned from out-of-date Debian and Gentoo installation documents, mailing list archives, bitter experience, and source code examination. Make whatever use of it you will. My intent is to have this written down *somewhere* for "the next time". The ability to recover the machine in a somewhat timely fashion was predicated on having reasonably current backups and a way to get them onto the Alpha. I never figured the latter consideration would be the most difficult part of the job. Not to put too fine a point on it, but functional boot media for Alpha is more scarce than it should be. Special thanks to the people at Gentoo (and Matt Turner in particular) for being responsive and fixing the "qla1280" firmware issue that was preventing the effective use of Gentoo's "install-alpha-minimal" image as a recovery tool. After a few off-line conversations with Michael, I'm cautiously optimistic we'll eventually see a useful Debian NETINST image at some point in the not-too-distant future. The Gentoo image had neither the requisite USB drivers nor "ntfs-3g" filesystem support, so I had to mount my external USB drive remotely and copy my backups across the network. Not too much pain, even over a relatively slow 10 Mbit/s link. Perhaps somewhat fortuitously, I used a 36 GB disk in the PWS with a layout something like this:

    (reserved for aboot)
    /boot (about 75 MB)
    /     (about 4 GB)
    swap  (about 2 GB)
    /tmp  (about 3 GB)
    /usr  (about 13.5 GB)
    /opt  (about 13.5 GB)

Out of the lot, the real contents of "/opt" didn't have to be present for the system to function when I initially booted off the hard drive, so that was my staging area for the backups.
Once I was running off the hard drive, the plan was to hook up the USB drive and restore "/opt". Odd thing about the disk partitioning scheme. The disk label definitely has to be "bsd" for SRM to be happy, but if Linux is the only OS on the disk, all the rest of the BSD partitioning conventions don't have to be observed as far as slice "c" spanning the entire disk, slice "a" being the "boot" slice, slice "b" being "swap", and so forth. I doubt dual-booting with Digital UNIX or one of the *BSD variants is a practical possibility for most people, particularly those with PWS systems having limited disk space. A brief note about partitioning programs: "fdisk" is NOT your friend on the Alpha, especially in "modern" times. Use "parted" and save yourself much frustration. Using "parted", I set the default units to "cyl" and created a "sacrificial" first partition beginning on cylinder 0 and ending on cylinder 1. This is detected by Linux as, for example "sda1" and should not be used for anything on the off chance "aboot" installation overwrites it. So, the sequence of partition creation commands was:

    mkpart 0 1
    mkpart 1 a
    mkpart a b
    mkpart b c
    mkpart c d
    mkpart d e
    mkpart e f

where the letters "a" through "f" represent starting and ending cylinder numbers for each partition, and the starting cylinder for each partition is the ending cylinder of the preceding partition, and yes, "parted" makes sure things don't overlap. Bonus: when it comes time to do "swriteboot", you don't have to specify "-f3" because there's no slice 3 spanning the entire disk to prevent "swriteboot" from writing the boot sectors. Once I copied my backups into place (with the exception of "/opt" as mentioned earlier) and wrote the boot sector, I ran into an interesting show-stopper.
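The chaining rule for the "mkpart" boundaries (each partition starts on the cylinder where the previous one ends) is easy to get wrong by hand, so here is a small Python sketch that generates the command list from partition sizes. The sizes in the example are invented, and the output uses the same abbreviated "mkpart START END" form as the list in this message; a given "parted" version may want additional arguments.

```python
def mkpart_commands(sizes_in_cylinders):
    """Build chained mkpart lines: each partition starts where the last ended."""
    commands = []
    start = 0  # the sacrificial first partition begins on cylinder 0
    for size in sizes_in_cylinders:
        end = start + size
        commands.append(f"mkpart {start} {end}")
        start = end  # next partition begins on this ending cylinder
    return commands


if __name__ == "__main__":
    # Hypothetical sizes: 1 cylinder reserved for aboot, then two partitions.
    for line in mkpart_commands([1, 9, 490]):
        print(line)
```

With those made-up sizes the sketch prints "mkpart 0 1", "mkpart 1 10" and "mkpart 10 500", i.e. the same pattern as "mkpart 0 1", "mkpart 1 a", "mkpart a b" with a=10 and b=500.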
I had evidently upgraded the "initramfs-tools" package prior to creating my backups, and, long story short, I was getting dropped into an interactive shell with an "(initramfs)" prompt due to the following braindamage:

(1) "systemd" (and "udevd" by extension) don't play well with "/usr" being on a separate partition from "/". If I have *any* advice to offer both the battle-scarred veteran and the newbie, it would be to consider consolidating those two partitions into a single partition. Me? I'd prefer the younger generation of system programmers consider the perfectly valid reasons why those filesystems might have been separate to begin with, and respect those reasons. (Hint: much smaller disks.)

(2) Perhaps as a consequence of (1), "/lib/systemd/systemd-udevd" refuses to start/run on the initramfs, in spite of the appropriate support being enabled in the kernel configuration per systemd's README file. The error messages appearing on the console are:

    error getting socket: Address family not supported by protocol
    error initializing udev control socket
    could not listen on fds: Invalid argument

This isn't necessarily a fatal error, EXCEPT...

(3) The brain-dead
Re: fdisk vs. BSD disklabels and slices
On Sun, Jan 06, 2019 at 02:46:54PM -0800, Matt Turner wrote: > On Sun, Jan 6, 2019 at 2:31 PM Bob Tracy wrote: > > > > Has anyone reading this used a recent version of "fdisk" to create a BSD > > disklabel and disk slices from scratch on an Alpha? If so, would you > > please describe the procedure in enough detail that relevant Linux > > installation documentation could be updated? It would seem to be anything > > *but* intuitive :-(. > > fdisk's BSD disklabel support has been unusable raw disks, as far as I > understand, since v2.23. See > https://www.spinics.net/lists/util-linux-ng/msg11869.html > > > If "fdisk" is the wrong utility, I need a pointer to the correct one(s). > > At this point, I'm seriously considering digging out my old Debian 4.0 > > CDs and harvesting the essential pieces. > > > > As always, thanks in advance. > > Use 'parted' instead. It works well. Reminds me that I need to change > the Gentoo handbook to reference parted instead of fdisk. > > > For what it's worth, the new Gentoo "install-alpha-minimal" image fixed > > the "qla1280" firmware loading issue for the most part (a module reload > > is required to get the firmware to load). > > Good to hear. I suspect the module is in the initramfs but the > firmware is on the root file system. Hmm.. Re: parted. Thanks. Presumably included in the i-a-m image: count on me to say something if not :-). Appreciate the steer. --Bob
fdisk vs. BSD disklabels and slices
Has anyone reading this used a recent version of "fdisk" to create a BSD disklabel and disk slices from scratch on an Alpha? If so, would you please describe the procedure in enough detail that relevant Linux installation documentation could be updated? It would seem to be anything *but* intuitive :-(. If "fdisk" is the wrong utility, I need a pointer to the correct one(s). At this point, I'm seriously considering digging out my old Debian 4.0 CDs and harvesting the essential pieces. As always, thanks in advance. For what it's worth, the new Gentoo "install-alpha-minimal" image fixed the "qla1280" firmware loading issue for the most part (a module reload is required to get the firmware to load). --Bob
Re: Use SMP kernel for Alpha (udeb) builds
On Sat, Dec 08, 2018 at 07:41:15PM +0100, Frank Scheiner wrote: > On 12/8/18 15:05, Bob Tracy wrote: > > From the "image.squashfs" file on the Gentoo "install-alpha-minimal" > > image, attached is "etc/kernels/kernel-config-alpha-4.14.65-gentoo" > > which appears to correspond to the "nolsa" kernel variant. To your > > question about whether SMP is configured, most definitely "yes" with > > CONFIG_NR_CPUS=32. > > Thanks for checking. This seems to be definitely a SMP capable kernel, as > `CONFIG_SMP=y` is also set. > > About the `CONFIG_ALPHA_LEGACY_START_ADDRESS`, [1] mentions this is actually > needed for older boot loaders only which hardcoded the kernel start address. > And the Gentoo config shows it as inactive: `# > CONFIG_ALPHA_LEGACY_START_ADDRESS is not set` > > [1]: https://cateee.net/lkddb/web-lkddb/ALPHA_LEGACY_START_ADDRESS.html > > But interesting, [1] also says, that this option depends on > CONFIG_ALPHA_GENERIC, which is actually set (`CONFIG_ALPHA_GENERIC=y`) in > the Gentoo config. > > So can we assume `CONFIG_ALPHA_GENERIC=y` also activates > `CONFIG_ALPHA_LEGACY_START_ADDRESS`? I wouldn't assume so, particularly for the Gentoo kernel source tree to whatever extent it differs from the kernel.org source tree. What the dependency is saying is, you can't have the legacy start address config option force-enabled unless you're building a generic kernel. Otherwise, the (alpha) processor-specific config options presumably dictate whether the legacy start address is used. This is, I think, why Gentoo includes a generic+lsa kernel and a generic+nolsa kernel in their install image. BUT, in your defense, it's possible an unpatched kernel.org source tree might be doing (or might have done -- this could have been patched upstream) exactly as you suggest. I haven't investigated this, because I've never used the alpha generic kernel except for the initial installation on a system. 
Just to be clear, Gentoo's generic kernel *does* have SMP configured, and *with* the legacy start address enabled should boot just fine on your PWS as it does on mine. The kernel version is 4.14(.65). --Bob
Re: Use SMP kernel for Alpha (udeb) builds
On Sat, Dec 08, 2018 at 11:15:21AM +0100, Frank Scheiner wrote: > Is this Gentoo generic installer kernel SMP capable? I believe these Gentoo > kernels have the config included in the kernel image, so available as > `/proc/config.gz` during runtime, I think. From the "image.squashfs" file on the Gentoo "install-alpha-minimal" image, attached is "etc/kernels/kernel-config-alpha-4.14.65-gentoo" which appears to correspond to the "nolsa" kernel variant. To your question about whether SMP is configured, most definitely "yes" with CONFIG_NR_CPUS=32. --Bob # # Automatically generated file; DO NOT EDIT. # Linux/alpha 4.14.65-gentoo Kernel Configuration # # # Gentoo Linux # CONFIG_GENTOO_LINUX=y CONFIG_GENTOO_LINUX_UDEV=y CONFIG_GENTOO_LINUX_PORTAGE=y # # Support for init systems, system and service managers # CONFIG_GENTOO_LINUX_INIT_SCRIPT=y # CONFIG_GENTOO_LINUX_INIT_SYSTEMD is not set CONFIG_ALPHA=y CONFIG_64BIT=y CONFIG_MMU=y CONFIG_RWSEM_XCHGADD_ALGORITHM=y # CONFIG_ARCH_HAS_ILOG2_U32 is not set # CONFIG_ARCH_HAS_ILOG2_U64 is not set CONFIG_GENERIC_CALIBRATE_DELAY=y CONFIG_ZONE_DMA=y CONFIG_ARCH_DMA_ADDR_T_64BIT=y CONFIG_NEED_DMA_MAP_STATE=y CONFIG_NEED_SG_DMA_LENGTH=y CONFIG_GENERIC_ISA_DMA=y CONFIG_PGTABLE_LEVELS=3 CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" CONFIG_IRQ_WORK=y # # General setup # CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_CROSS_COMPILE="" # CONFIG_COMPILE_TEST is not set CONFIG_LOCALVERSION="" CONFIG_LOCALVERSION_AUTO=y CONFIG_DEFAULT_HOSTNAME="(none)" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y # CONFIG_POSIX_MQUEUE is not set CONFIG_CROSS_MEMORY_ATTACH=y CONFIG_FHANDLE=y CONFIG_USELIB=y # CONFIG_AUDIT is not set CONFIG_HAVE_ARCH_AUDITSYSCALL=y # # IRQ subsystem # CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_IRQ_SHOW=y CONFIG_AUTO_IRQ_AFFINITY=y CONFIG_IRQ_DOMAIN=y CONFIG_GENERIC_CLOCKEVENTS=y # # Timers subsystem # CONFIG_HZ_PERIODIC=y # CONFIG_NO_HZ_IDLE is not set # CONFIG_NO_HZ is not set # CONFIG_HIGH_RES_TIMERS is not set # 
# CPU/Task time and stats accounting # CONFIG_TICK_CPU_ACCOUNTING=y # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # # RCU Subsystem # CONFIG_TREE_RCU=y # CONFIG_RCU_EXPERT is not set CONFIG_SRCU=y CONFIG_TREE_SRCU=y # CONFIG_TASKS_RCU is not set CONFIG_RCU_STALL_COMMON=y CONFIG_RCU_NEED_SEGCBLIST=y CONFIG_BUILD_BIN2C=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=15 CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=13 CONFIG_CGROUPS=y # CONFIG_MEMCG is not set # CONFIG_BLK_CGROUP is not set # CONFIG_CGROUP_SCHED is not set # CONFIG_CGROUP_PIDS is not set # CONFIG_CGROUP_RDMA is not set # CONFIG_CGROUP_FREEZER is not set # CONFIG_CPUSETS is not set # CONFIG_CGROUP_DEVICE is not set # CONFIG_CGROUP_CPUACCT is not set # CONFIG_SOCK_CGROUP_DATA is not set # CONFIG_CHECKPOINT_RESTORE is not set CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set CONFIG_IPC_NS=y # CONFIG_USER_NS is not set # CONFIG_PID_NS is not set CONFIG_NET_NS=y # CONFIG_SCHED_AUTOGROUP is not set # CONFIG_SYSFS_DEPRECATED is not set # CONFIG_RELAY is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_RD_GZIP=y CONFIG_RD_BZIP2=y CONFIG_RD_LZMA=y CONFIG_RD_XZ=y CONFIG_RD_LZO=y CONFIG_RD_LZ4=y CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y # CONFIG_CC_OPTIMIZE_FOR_SIZE is not set CONFIG_SYSCTL=y CONFIG_ANON_INODES=y CONFIG_HAVE_PCSPKR_PLATFORM=y CONFIG_BPF=y # CONFIG_EXPERT is not set CONFIG_MULTIUSER=y # CONFIG_SGETMASK_SYSCALL is not set CONFIG_SYSFS_SYSCALL=y # CONFIG_SYSCTL_SYSCALL is not set CONFIG_POSIX_TIMERS=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ABSOLUTE_PERCPU is not set CONFIG_KALLSYMS_BASE_RELATIVE=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_PCSPKR_PLATFORM=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_FUTEX_PI=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y # CONFIG_BPF_SYSCALL is not set CONFIG_SHMEM=y CONFIG_AIO=y CONFIG_ADVISE_SYSCALLS=y # CONFIG_USERFAULTFD is not set CONFIG_PCI_QUIRKS=y 
CONFIG_MEMBARRIER=y # CONFIG_EMBEDDED is not set CONFIG_HAVE_PERF_EVENTS=y # CONFIG_PC104 is not set # # Kernel Performance Events And Counters # # CONFIG_PERF_EVENTS is not set CONFIG_VM_EVENT_COUNTERS=y CONFIG_COMPAT_BRK=y CONFIG_SLAB=y # CONFIG_SLUB is not set CONFIG_SLAB_MERGE_DEFAULT=y # CONFIG_SLAB_FREELIST_RANDOM is not set # CONFIG_SYSTEM_DATA_VERIFICATION is not set # CONFIG_PROFILING is not set CONFIG_HAVE_OPROFILE=y CONFIG_HAVE_64BIT_ALIGNED_ACCESS=y CONFIG_GENERIC_SMP_IDLE_THREAD=y CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG=y CONFIG_ARCH_WANT_IPC_PARSE_VERSION=y # CONFIG_CC_STACKPROTECTOR is not set CONFIG_THIN_ARCHIVES=y CONFIG_HAVE_VIRT_CPU_ACCOUNTING_GEN=y CONFIG_HAVE_MOD_ARCH_SPECIFIC=y CONFIG_MODULES_USE_ELF_RELA=y # CONFIG_HAVE_ARCH_HASH is not set CONFIG_ISA_BUS_API=y CONFIG_ODD_RT_SIGACTION=y CONFIG_OLD_SIGSUSPEND=y CONFIG_CPU_NO_EFFICIENT_FFS=y # CONFIG_HAVE_ARCH_VMAP_STACK is not set # CONFIG_ARCH_OPTIONAL_KERNEL_RWX is not set # CONFIG_ARCH_OPTIONAL_KERNEL_RWX_DEFAULT is not set
Re: Use SMP kernel for Alpha (udeb) builds
On Sat, Dec 08, 2018 at 10:06:25AM +1300, Michael Cree wrote: > On Tue, Dec 04, 2018 at 05:38:51PM +0100, Frank Scheiner wrote: > > As per [1] and our recent discussions the generic 4.x kernels seem to no > > longer work on Alpha machines which also renders any installer images using > > the generic 4.x kernels non-working. > > Yes, that was noted some time ago. A generic kernel does not boot > since about 3.13. I can't remember why I never attempted bisecting > this back when it was first noted to be a problem, maybe because it > didn't affect me because I normally run my own spun kernels. Ditto on this end. I figure a first pass at the problem would be to compare our respective kernel configs against the generic one, just to get a reading on what code *may* be involved. I can provide my Miata config for a 4.14 kernel (and that's about all I can do until I'm back up and running) if that would be helpful. Another data point to consider would be the kernel config for the current (as of the end of November) Gentoo "install-alpha-minimal" image, which works on Miata at least (modulo the missing Qlogic firmware issue). The associated kernel is "4.14.65-gentoo", and two variants are present on the image -- a "generic" one, and one without a "legacy start address". The "aboot.conf" file has the following comment: # Some later alphas need a special kernel without legacy start address, most # notably the DS15A and DS25 workstations as well as the ES45, ES47 and GS # series of servers. The Miata boots fine with the "generic" kernel, and panics when I try the "nolsa" kernel. Bottom line: I think the way forward will be easier from a Debian perspective if the Debian installer for alpha includes a >= 4.14 kernel, because the 4.8 and 4.9 kernels are known to have issues anyway. An upgrade would also put alpha closer to being in-sync with the "testing" distro on Intel/AMD platforms. --Bob
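The "compare our respective kernel configs against the generic one" first pass could be automated along the following lines. This is a minimal Python sketch with hypothetical function names; the two config fragments are invented for illustration and are not taken from either the Miata or the Gentoo config.

```python
def parse_config(text: str) -> dict:
    """Map option name -> value, treating '# CONFIG_X is not set' as 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
    return options


def diff_configs(mine: str, generic: str) -> dict:
    """Return {option: (my value, generic value)} for options that differ."""
    a, b = parse_config(mine), parse_config(generic)
    return {
        name: (a.get(name, "n"), b.get(name, "n"))
        for name in sorted(set(a) | set(b))
        if a.get(name, "n") != b.get(name, "n")  # absent counts as 'n'
    }


if __name__ == "__main__":
    # Invented fragments: a platform-specific config vs. a generic one.
    mine = "CONFIG_SMP=y\nCONFIG_ALPHA_MIATA=y\n# CONFIG_ALPHA_GENERIC is not set\n"
    generic = "CONFIG_SMP=y\nCONFIG_ALPHA_GENERIC=y\n"
    for name, (left, right) in diff_configs(mine, generic).items():
        print(f"{name}: mine={left} generic={right}")
```

Options that agree in both files are dropped, so the output is only the list of candidates worth a closer look.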
Re: Use SMP kernel for Alpha (udeb) builds
On Tue, Dec 04, 2018 at 07:37:13PM +0100, Frank Scheiner wrote: > On 12/4/18 17:45, John Paul Adrian Glaubitz wrote: > > > ## Patches ## > > > > > > 1. > > > https://salsa.debian.org/frank-scheiner-guest/linux/commit/865cacfd7722b346629082ab3094b6ad93964095 > > > > > > 2. > > > https://salsa.debian.org/frank-scheiner-guest/debian-installer/commit/7269679bec8bae997ef5ed7619e9f8df2e184134 > > > > > > I think both patches are already enough to produce the needed alpha-smp > > > udebs and will allow to produce working installer images (e.g. netboot > > > images might work instantly and could be an alternative way for Bob to > > > reinstall his PWS). > > > > > > What do you think? Is there anything obvious missing? > > > > Can you open PRs so that these changes can get merged? I will then build > > new images. > > Sure, created them now: > > * First part: https://salsa.debian.org/kernel-team/linux/merge_requests/79 > > * Second part: > https://salsa.debian.org/installer-team/debian-installer/merge_requests/6 Much appreciated, gentlemen. Wish I could do more than offer my system up as a test platform, but so it goes... I'll be happy to help with determining the "actual problem which is yet unknown" with the alpha generic kernel, once my system is back up and running :-). --Bob
Re: [alpha] Debian 9.0 NETINST fails
On Sat, Nov 24, 2018 at 09:07:09AM +1300, Michael Cree wrote: > On Fri, Nov 23, 2018 at 01:41:17PM -0600, Bob Tracy wrote: > > Trying the 20181120 minimal installation CD fails due to the firmware > > for the Qlogic ISP1020 (1040.bin) SCSI adapter not being present, either > > as built-in to the kernel or as a standalone file. No firmware means no > > hard disks detected means "full stop" -- can't get there from here. > > Yeah, that's a problem. (Missing context: I was trying the Gentoo "install-alpha-minimal" CD when I ran into the missing firmware issue. Their CD is based on a later kernel than Debian's, and boots/runs ok other than the missing firmware issue. Yes, I'm keeping "debian-alpha" in the list, because our community is small enough to make keeping maximal eyes on Alpha installation issues a good thing.) Submitted a bug report via Gentoo's bugzilla page to get the ball rolling on that side of the fence. Doesn't help as far as fixing the problems with the Debian installer though... Other architectures officially supported by Debian have moved to a later kernel for their installation CDs, and we should do likewise. There are known issues with 4.9.X which prompted the move to, I think, 4.12.X for other Debian install CDs. We (Debian) ran into a Qlogic firmware issue about a year ago when the 4.14 kernel was current, because that was around the time firmware was removed from the kernel source tree. I mention that as a reminder in case we decide to use a later kernel than the other architectures. --Bob
Re: [alpha] Debian 9.0 NETINST fails
Just a quick update, and hopefully there's a Gentoo advocate/developer reading this... Trying the 20181120 minimal installation CD fails due to the firmware for the Qlogic ISP1020 (1040.bin) SCSI adapter not being present, either as built-in to the kernel or as a standalone file. No firmware means no hard disks detected means "full stop" -- can't get there from here. --Bob
Re: [alpha] Debian 9.0 NETINST fails
On Wed, Nov 07, 2018 at 04:12:27PM +0100, John Paul Adrian Glaubitz wrote: > I can unfortunately not build updated installer images for Alpha since I > don't have an Alpha porterbox available where I can build the debian-installer > package for Alpha. > > I do have two AlphaStation 233 sitting in the basement at the university now, > but I don't have any time and space to set them up to be used as porterboxes. > > The only porterbox currently currently available to me is Michael Cree's > "electro" > but the repositories of the chroots there don't include the debian-installer > component packages, so building d-i fails - at least last time I tried. I've carved out a bit of time to deal with getting my PWS back on-line. The above is the most recent post in the thread, as far as I can tell... Michael -- is it a relatively easy thing to do to make the d-i component packages available on "electro"? Another option: is the SMP version of the 4.9.0-3 kernel available to substitute into the existing netinst image? Unless I'm missing something, rebuilding the ISO with that substitution appears to be a simple task (assuming the existing initrd image for the generic kernel can be used). I *did* see further postings indicating that, while the SMP kernel *did* boot where the generic one wouldn't, there were other issues encountered subsequent to booting. I have not yet tried a prior netinst image, so there's a window of opportunity here as far as me being a willing/motivated tester of the installer if a working one can be cobbled together. After all, I'm currently sitting at the "worst case scenario" state and would lose nothing but a bit of time by trying a new installer :-). For that matter, if "the community" thinks a better thing to do would be to try out the Gentoo installer, I think I could get to the desired end state via that route as well. Opinions welcome. --Bob
Re: [alpha] Debian 9.0 NETINST fails
On Tue, Nov 06, 2018 at 01:36:45PM +0100, Frank Scheiner wrote: > sorry, looks like I missed your mails to the debian-alpha list until now. Not a problem. As a temporary workaround, I've got an "amd64" "testing" distro loaded on a spare i5-based system. We have time to explore options and possibly put together a "recent" alpha installer that will work. > (...) > > > As a workaround, could it work to netboot the matching stock SMP kernel > (4.9.0-3) with the netboot installer initrd from [1]? I don't know how to > extract the initrd from the `netabootwrap`ed image though. Example: # mount -t iso9660 debian-9.0-alpha-NETINST-1.iso /mnt/cdrom -o loop,ro > Or could it work to netboot the SMP kernel with the cdrom installer initrd > from [1] and the installer CDROM in the CDROM drive? > > [1]. > http://ftp.ports.debian.org/debian-ports/pool-alpha/main/d/debian-installer/debian-installer-images_20170615_alpha.tar.gz > > > > Other approach: as per [2] hppa for example uses two kernels. So could we > just change [2] for alpha to also include the SMP kernel, with e.g. that > patch: > > ``` > --- debian/installer/kernel-versions 2018-11-06 13:30:54.152319148 +0100 > +++ debian/installer/kernel-versions-new 2018-11-06 13:31:38.992320296 > +0100 > @@ -1,5 +1,6 @@ > # arch version flavour installedname suffix build-depends > alpha - alpha-generic - y - > +alpha - alpha-smp - y - > amd64 - amd64 - - - > arm64 - arm64 - - - > armel - marvell - y - > ``` > > ...and the installer images will also include the SMP kernel? > > [2]: > https://salsa.debian.org/kernel-team/linux/raw/master/debian/installer/kernel-versions I'm not going to be able to look at this too closely for the next few days, but my thought was to simply replace the generic kernel with the smp kernel on the existing "netinst" ISO. The most obvious issue is finding (or building) the correct version of the smp kernel. 
Rebuilding the ISO image once the kernel binary has been copied into place looks to be a straightforward procedure: the relevant details are present on the existing ISO image in the ".disk" directory (the "mkisofs" file). This isn't the correct fix by any means, but as workarounds go, not too distasteful because everything else on the installer image remains unchanged (except for possibly a few file checksums if anything involved in the actual installation process cares about them). At the moment, I don't have the necessary infrastructure to build an alpha kernel. If someone could build the appropriate 4.9.0-3 smp kernel and make it available for download, I think I can handle the rest... I guess while I'm poking around in the existing ISO image, I should look to see which video drivers are included in the "initrd" image. A non-graphical method of installation is perfectly acceptable if that's an option: for all I know, that *is* the option :-). As far as thinking outside the box a little bit, it's helpful to consider what my actual goal(s) might be. What I really need to be able to do is (a) partition the hard disk; (b) create file systems; (c) install the "aboot" boot sector; and (d) copy my backup into place from an external USB device. If having USB support on a CD booted in "rescue" mode is expecting a bit much, the restoration could be accomplished over a network connection (trivially if NFS support is built into the mix). In other words, I don't really need to be able to do even a minimal Debian installation. Depending on the "rescue" feature set available, even an older known-to-work alpha installer image might be sufficient to accomplish what I want. With an older installer image, the main concerns would be support for "ext3/ext4" file system types, and having the most recent version of "aboot". The Debian 8.0 installer might meet those basic requirements: I haven't checked. --Bob
Re: [alpha] Debian 9.0 NETINST fails
(Dyslexia-related failure on original copy to Michael) On Fri, Nov 02, 2018 at 01:41:51PM -0500, Bob Tracy wrote: > Quick background info: I'm having to do a "from scratch" install on my > PWS 433au (miata) due to a SCSI disk failure. > > The Debian 9.0 NETINST image (from > "http://cdimage.debian.org/cdimage/ports/9.0/alpha/iso-cd/") seems to > boot ok from SRM (>>> b dka[device_spec]) and takes me to the usual > "aboot" menu. Typing "l" to get a list of pre-configured kernels gets > me three items, all of which are designated "n" rather than the expected > "0", "1", "2". The first kernel seems to be the desired one, as the > other two expect a serial console on ttyS0 and ttyS1 respectively. > > Typing "0" at the "aboot" prompt seems to do the right thing as far as > selecting the first item in the list. The "initrd" loads, and at the > point where a message gets printed out to the effect it's booting the > kernel with the expected options (as listed for the first kernel in the > aboot menu), I get a halt with code 5 (CPU 0 halted), and I'm back at > the SRM prompt. Completely repeatable. > > Before I try the 8.0 NETINST image, if anyone has noticed anything > fundamentally wrong with how I'm trying to boot the 9.0 image, kindly > let me know. > > Possibly relevant: I'm using the Radeon video card that was in the > machine when the disk failed. I have the original TGA video card if the > NETINST kernel can't handle the Radeon, but would rather not have to > swap out the video card if I don't absolutely have to. Additional info... Frank Scheiner reported similar badness on his PWS back in March of 2017. See the "debian-alpha" archive link: https://lists.debian.org/debian-alpha/2017/03/msg7.html Executive summary: SMP 4.x kernels work fine, but the generic Debian kernel does *not* (or at least didn't at that time). --Bob
[alpha] Debian 9.0 NETINST fails
Quick background info: I'm having to do a "from scratch" install on my PWS 433au (miata) due to a SCSI disk failure. The Debian 9.0 NETINST image (from "http://cdimage.debian.org/cdimage/ports/9.0/alpha/iso-cd/") seems to boot ok from SRM (>>> b dka[device_spec]) and takes me to the usual "aboot" menu. Typing "l" to get a list of pre-configured kernels gets me three items, all of which are designated "n" rather than the expected "0", "1", "2". The first kernel seems to be the desired one, as the other two expect a serial console on ttyS0 and ttyS1 respectively. Typing "0" at the "aboot" prompt seems to do the right thing as far as selecting the first item in the list. The "initrd" loads, and at the point where a message gets printed out to the effect it's booting the kernel with the expected options (as listed for the first kernel in the aboot menu), I get a halt with code 5 (CPU 0 halted), and I'm back at the SRM prompt. Completely repeatable. Before I try the 8.0 NETINST image, if anyone has noticed anything fundamentally wrong with how I'm trying to boot the 9.0 image, kindly let me know. Possibly relevant: I'm using the Radeon video card that was in the machine when the disk failed. I have the original TGA video card if the NETINST kernel can't handle the Radeon, but would rather not have to swap out the video card if I don't absolutely have to. Thanks. --Bob
Re: vmlinux ld relocation errors on Alpha
On Wed, Oct 24, 2018 at 09:46:29AM -0500, Bob Tracy wrote: > Back in January of 2017, a patch set was created for the 4.9.0 kernel and > *hopefully* sent upstream to address the subject issue. > (...). > Am currently attempting a native 4.19.0 build with an up-to-date debian > "sid" toolchain. If the build succeeds, then the issue became OBE for > whatever reason. Otherwise, it would be nice to know what happened to > the patch set. Should know either way as far as the build in a few more > hours. Successful build, so I guess "OBE" applies. If someone reading this knows, how was this fixed? My guess would be improvements in the compiler tool chain, most particularly "binutils". Sorry for the noise. --Bob
vmlinux ld relocation errors on Alpha
Back in January of 2017, a patch set was created for the 4.9.0 kernel and *hopefully* sent upstream to address the subject issue. The patch implemented a workaround: defining a new "alphalib" section for all the Alpha-specific library functions to be linked into the final vmlinux. The patch was definitely not implemented in a vacuum: Helge Deller and Michael Cree suggested the fix approach, and both Maciej W. Rozycki and Matt Turner provided significant input along the way. At this late date, I was forced to resync my Alpha build tree with the one from "kernel.org", and noticed the patch set never made it into mainline. There's a definite possibility it was never submitted upstream, and a further possibility it was NACK'd for some reason. Am currently attempting a native 4.19.0 build with an up-to-date debian "sid" toolchain. If the build succeeds, then the issue became OBE for whatever reason. Otherwise, it would be nice to know what happened to the patch set. Should know either way as far as the build in a few more hours. --Bob
X11 on Alpha running Debian "sid"
It had been an *extremely* long time since I dared try to run a graphical console on my PWS 433au, so I screwed up my courage and gave it a try... "sddm" is extremely slow to initialize, and even slower to respond to both keyboard and mouse input. Usually, this means something screwy going on with acceleration, which historically has barely worked on this platform anyway. AIGLX is something new showing up in the "Xorg.0.log" file that wasn't there the last time I had X11 working (which should tell you something about how long it has been), so that's a prime suspect. On the system configuration side of things, the video card is a Radeon 7500 (RV200), which is pretty much as good as it gets on the Alpha. Modern Xorg releases do a respectable job of figuring things out on their own as far as not needing to override anything (or at least not much) in a "xorg.conf" file, so I moved that out of the way as part of the debugging effort. That didn't make any difference as best I can tell, i.e., the display still initializes properly, and while Xorg.0.log has a lot more information related to "figuring things out" in it, there's nothing there to raise an eyebrow. KDE and GNOME are both elephantine on a 1.5 GB machine in 2018, so I'm attempting to run with AfterStep. It seems to mostly work, or *will* when I replace all the "x-terminal-emulator" in-/re-direction going on in the Wharf configuration. Out of the box, the configuration seems predisposed to use the gnome terminal, whereas AfterStep should default to aterm --> urxvt. So, if someone can provide a hint or three as to how to make "sddm" more responsive, that would be much appreciated. Alternatively, if I need to be using a different display manager, suggestions would be appreciated as long as they're window-manager-agnostic. In particular, substantial dependencies on KDE infrastructure should probably be avoided :-(. --Bob
Re: kde build deps issue?
On Wed, Oct 03, 2018 at 07:58:27AM +1300, Michael Cree wrote: > Hope you don't mind me CCing the debian-alpha list as it would be good > to get other eyes seeing the problems. Not a problem. The second of the two issues you mention would seem to be the long pole in the tent, as it were. > There are at least two problems: > > #896658 easily fixed, but when I made an account on the upstream bug > tracker they started spamming me with their product advertisements so > I am having absolutely nothing more to do with that. Went to the Debian bug site to have a look-see. Did Lisandro ever file the bug upstream per your request? The only real issue I see with a Debian-specific patch is, as has been the case with other large source packages, a point is eventually reached where it's no longer possible to apply custom patches. Much better if upstream can make the accommodation so it can be forward-ported as necessary. > And their is a much bigger problem: qttools-opensource-src build > depends on libclang which is not available on many of the ports. > I get the impression that upstream is not interested in making that > build dependency optional. I have taken a look at the old Alpha > back-end for LLVM but there is a substantial amount of work to get > it building with a recent LLVM. I started this but on realising the > amount of work still required have had to abandon it. There is a > possibility of getting clang support without needing an LLVM backend > but I haven't looked at that, and probably won't be able to in the > near future. In case other readers have forgotten an earlier thread, this is an issue with future "firefox" releases as well -- without LLVM, no "Rust", and if no "Rust", no "firefox". 
As far as "firefox" is concerned, unless most of the people reading this have large Alpha servers with plenty of RAM, I believe the point of diminishing returns has been reached -- future releases, assuming they can be built, are probably going to require more horsepower to run than my puny PWS can bring to the party. > So my view is, unless someone wants to pick up these issues and > provide fixes, building kde on Alpha has come to an end. (Begin tangential rant -- feel free to ignore :-)) Frankly, increased complexity and bloat are a given as time goes by. There was an article published recently on reasons why this happens, and at the risk of oversimplifying the author's point, throwing more resources at an inefficiently-written application is "better" (for some definition of the word) than spending time rewriting code. Not to put too fine a point on it, but Moore's Law simply doesn't apply where legacy iron is concerned. It probably won't be Debian when the smoke clears, but what about something like DSL for the Alpha and other legacy platforms currently dying a slow "death by ant bites"? It's a wondrous thing to behold what the hardware is capable of doing when the software can get out of its way :-). --Bob
[FTBFS] 4.18-rc7 bitsperlong.h issue on alpha
On an alpha system, got the following build error on the 4.18-rc7 mainline kernel source tree:

  HOSTCC  net/bpfilter/main.o
In file included from tools/include/uapi/asm/bitsperlong.h:17,
                 from /usr/include/asm-generic/int-ll64.h:12,
                 from /usr/include/alpha-linux-gnu/asm/types.h:24,
                 from tools/include/linux/types.h:10,
                 from ./include/uapi/linux/bpf.h:11,
                 from net/bpfilter/main.c:9:
tools/include/asm-generic/bitsperlong.h:14:2: error: #error Inconsistent word size. Check asm/bitsperlong.h
 #error Inconsistent word size. Check asm/bitsperlong.h
  ^
scripts/Makefile.host:107: recipe for target 'net/bpfilter/main.o' failed
make[2]: *** [net/bpfilter/main.o] Error 1
scripts/Makefile.build:558: recipe for target 'net/bpfilter' failed
make[1]: *** [net/bpfilter] Error 2
Makefile:1029: recipe for target 'net' failed
make: *** [net] Error 2

Encountering this kind of error is not unusual on alpha. --Bob
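For readers unfamiliar with this failure mode: going by the text of the #error alone, the header is comparing the word size it has been told about with the word size the compiler actually uses (64 bits for "long" on alpha). The following is only a user-space sketch of that kind of comparison, not the contents of the kernel header. The function names and the "claimed_bits" parameter are made up for illustration, and nothing here has been confirmed as the actual cause of the build failure.

```c
#include <assert.h>
#include <limits.h>

/* Width of "long" as the compiler actually sees it, in bits. */
static int actual_bits_per_long(void)
{
	return (int)(sizeof(long) * CHAR_BIT);
}

/*
 * claimed_bits stands in for whatever value a bitsperlong.h header
 * advertises (hypothetical parameter, for illustration only).
 * Returns 1 when the claim matches the compiler, 0 otherwise.
 */
static int word_size_is_consistent(int claimed_bits)
{
	assert(claimed_bits > 0);
	return claimed_bits == actual_bits_per_long();
}
```

If a header in the include chain claimed 32 while the compiler used 64, word_size_is_consistent(32) would return 0, which is the run-time analogue of the compile-time #error in the log.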
firefox-esr 52.6.0 available for alpha
Yes. Just in case you don't regularly check for updates to the Debian "unstable/sid" distribution, the long wait for a modern version of Firefox on the Alpha is over. For now... No idea how much longer Michael and I can keep resurrecting Lazarus. At some point in the near future, it will no longer be possible to build Firefox on an Alpha unless someone takes up the gauntlet and puts together a working Rust compiler. Considering how few of us there are who might benefit from the effort, it would definitely have to be a labor of love, performed by someone who does the impossible just because someone said he couldn't do it. Of potential interest are some build statistics for Firefox on at least two different Alpha platforms. On one of the Debian buildd systems (an ES45 with 3 CPUs and an unknown amount of RAM), the build took 11:13:17 and approx. 12 GB of disk space. On my PWS 433au with 1.5 GB of RAM (maximum amount) and 2.5 GB of swap (pretty much the minimum required), the build takes over 5 *days* if tests are enabled. Anyway, enjoy, and pass along your thanks and kudos to Michael. I'm pretty much done with Firefox builds on my local machine after this go-round. The PWS is too underpowered and otherwise resource-starved to continue down this road :-(. --Bob
Re: package updates waiting on akregator update
On Tue, Jan 16, 2018 at 12:44:03PM -0600, Bob Tracy wrote: > Pretty well called it :-(. One of the build dependencies for > "qtwebengine-opensource-src" is a javascript optimizing compiler. I > suspect it's time to abandon "akregator" on "alpha". "kmail" is similarly afflicted with a dependency on the QT5-based web engine viewer. The irony here is especially thick... Back when dinosaurs roamed the earth, we used to protest "not all the world is a VAX". Fast-forward to 2017+, and we now have the exact same bias toward Intel chipsets and clones. I suppose the bias is a bit more justified in the present day, since Apple threw in the towel a few years ago on desktop/notebook computers, and Android-compatible hardware makes up the bulk of the rest of what's likely to be owned by the typical end-user. --Bob
package updates waiting on akregator update
Following the buildd log chain, the hold-up seems to be build dependencies that cannot be installed. At the base of the list of dependencies is "qtwebengine-opensource-src" for which "alpha" is no longer an architecture listed by the maintainer(s). Here's the high-level dependency chain: "akregator" depends on source package "kf5-messagelib" which depends on source package "qtwebengine-opensource-src". I'm looking into whether "qtwebengine-opensource-src" is buildable on "alpha", but I suspect this is going to be similar to the "iceweasel" dependency on "firefox-esr", i.e., not easily fixed. --Bob
Re: package updates waiting on akregator update
On Tue, Jan 16, 2018 at 12:36:03PM -0600, Bob Tracy wrote: > Following the buildd log chain, the hold-up seems to be build > dependencies that cannot be installed. At the base of the list of > dependencies is "qtwebengine-opensource-src" for which "alpha" is no > longer an architecture listed by the maintainer(s). > > Here's the high-level dependency chain: > > "akregator" depends on source package "kf5-messagelib" which depends on > source package "qtwebengine-opensource-src". > > I'm looking into whether "qtwebengine-opensource-src" is buildable on > "alpha", but I suspect this is going to be similar to the "iceweasel" > dependency on "firefox-esr", i.e., not easily fixed. Pretty well called it :-(. One of the build dependencies for "qtwebengine-opensource-src" is a javascript optimizing compiler. I suspect it's time to abandon "akregator" on "alpha". --Bob
Re: [BUG] 4.14 cannot find configured disks/partitions
On Fri, Dec 01, 2017 at 06:22:50PM -0600, Bob Tracy wrote: > On Sat, Dec 02, 2017 at 10:16:28AM +1300, Michael Cree wrote: > > Just got a chance to try out 4.14 on an Alpha and look like firmware > > for built-in drivers are being treated differently. I presume you > > have a qlogic scsi driver. Make sure qlogic/1040.bin (or whatever is > > appropriate for your scsi driver) is listed under Drivers->Generic > > Driver Options->Include in-kernel firmware blobs. > > Glad I saw this before trying a freshly-built 4.14 with a few more > kernel config options enabled per the README file for systemd. Doesn't > mean there isn't a systemd/udev issue as well, but if the firmware for > the SCSI controller isn't getting loaded, that makes detection of the > host and attached devices pretty much a non-starter. That was *it*. Up and running on 4.14 with no issues. Have reported success and requested closure of bug #883089. FYI, "1040.bin" *is* the correct firmware for the Qlogic ISP1020 card as confirmed by examination of the driver source. Oddly enough, I've been including the firmware for the video card as part of the kernel build since forever: the console wouldn't initialize properly without it. Never occurred to me the same thing might be needed for the built-in SCSI driver, mostly because everything worked great up through kernel version 4.13. Thanks again. --Bob
Re: [BUG] 4.14 cannot find configured disks/partitions
On Sat, Dec 02, 2017 at 10:16:28AM +1300, Michael Cree wrote: > On Tue, Nov 28, 2017 at 11:49:55PM -0600, Bob Tracy wrote: > > On Thu, Nov 23, 2017 at 11:10:10PM -0600, Bob Tracy wrote: > > > Perhaps the subject isn't entirely accurate, but that's what seems to be > > > the case. After loading the initial ramdisk, the boot process stalls > > > (loops indefinitely) with "mdadm" complaining about not being able to > > > scan > > > the disks defined in its configuration file, which is bone-stock. What > > > makes this particularly infuriating is, I don't have anything configured > > > for "mdadm" to worry about, which is what's normally found, i.e., > > > nothing of interest. > > > > > > The 4.13 kernel loads and boots just fine with exactly the same > > > up-to-date (unstable/experimental) userspace libraries and utilities. > > > > > > Bottom line: the 4.14 kernel doesn't seem to detect my SCSI disks for > > > some reason. All the recent patches (end of October time frame) for PCI > > > on alpha (I *think* mostly having to do with addressing some weird kind > > > of libata conflict) didn't make any difference. > > Just got a chance to try out 4.14 on an Alpha and look like firmware > for built-in drivers are being treated differently. I presume you > have a qlogic scsi driver. Make sure qlogic/1040.bin (or whatever is > appropriate for your scsi driver) is listed under Drivers->Generic > Driver Options->Include in-kernel firmware blobs. Glad I saw this before trying a freshly-built 4.14 with a few more kernel config options enabled per the README file for systemd. Doesn't mean there isn't a systemd/udev issue as well, but if the firmware for the SCSI controller isn't getting loaded, that makes detection of the host and attached devices pretty much a non-starter. Thanks! --Bob
Re: [BUG] 4.14 cannot find configured disks/partitions
Debian bug #883089 opened for this issue. Assuming there's a 4.13 or later Debian kernel for alpha available and anyone reading this is brave enough to try it, I'd be interested in knowing if it boots properly. Without a "yes, it works" from someone, I'm extremely reluctant to give it a try myself: been burned too many times trying to boot the Debian kernels on my system :-(. Many thanks in advance for anyone's time and trouble in helping me get to the bottom of this issue. --Bob
Re: [BUG] 4.14 cannot find configured disks/partitions
On Tue, Nov 28, 2017 at 11:49:55PM -0600, Bob Tracy wrote: > (...) > > Upon trying to reboot on my 4.13 kernel, I discovered *it's* now broken > as well, thanks to a recent udev update :-(. Now I get a bunch of > timeouts for all the filesystems (including the swap partition) not > mounted immediately at boot time. "journalctl -xb" is littered with > applicable messages of the form > > dev-sdaX.device: Job dev-sdaX.device/start timed out. > Timed out waiting for device dev-sdaX.device. > > In systemd's emergency mode, I can manually mount (or swapon as > appropriate) the various sdaX partitions. I *thought* this particular scenario sounded familiar. Nearly three years ago, this problem showed up when "systemd" and "udev" became dependent on CONFIG_FHANDLE being set in kernel builds. I didn't suddenly start unsetting it again, so what changed in 2017? --Bob
Re: [BUG] 4.14 cannot find configured disks/partitions
On Thu, Nov 23, 2017 at 11:10:10PM -0600, Bob Tracy wrote: > Perhaps the subject isn't entirely accurate, but that's what seems to be > the case. After loading the initial ramdisk, the boot process stalls > (loops indefinitely) with "mdadm" complaining about not being able to scan > the disks defined in its configuration file, which is bone-stock. What > makes this particularly infuriating is, I don't have anything configured > for "mdadm" to worry about, which is what's normally found, i.e., > nothing of interest. > > The 4.13 kernel loads and boots just fine with exactly the same > up-to-date (unstable/experimental) userspace libraries and utilities. > > Bottom line: the 4.14 kernel doesn't seem to detect my SCSI disks for > some reason. All the recent patches (end of October time frame) for PCI > on alpha (I *think* mostly having to do with addressing some weird kind > of libata conflict) didn't make any difference. > > Any idea what's causing this? Thanks... Another follow-up... Tried booting on 4.14.0 (final) this evening, not expecting that the upgrade from -rc8 would make any difference. It didn't. The SCSI host adapter is not being detected for whatever reason. The only device present on the system that's visible when I do "cat /proc/scsi/scsi" in the initramfs shell is the Toshiba IDE cdrom device. Normally, I'd also see the real SCSI host adapter and its associated SCSI disk. Upon trying to reboot on my 4.13 kernel, I discovered *it's* now broken as well, thanks to a recent udev update :-(. Now I get a bunch of timeouts for all the filesystems (including the swap partition) not mounted immediately at boot time. "journalctl -xb" is littered with applicable messages of the form dev-sdaX.device: Job dev-sdaX.device/start timed out. Timed out waiting for device dev-sdaX.device. In systemd's emergency mode, I can manually mount (or swapon as appropriate) the various sdaX partitions. USB detection/startup is similarly broken. 
I had to manually load the usb core and host modules. So... It appears there's something currently badly amiss in the systemd+udev universe. Such are the hazards of running the unstable/experimental distribution, and I suppose I should consider myself fortunate I can even get to a mostly-functional multi-user run state at this point. Did I miss a discussion somewhere about "/dev/sdaX" in "/etc/fstab" being deprecated? Moving to UUIDs might help with the current 4.13 brokenness, but won't solve the host adapter detection issue in 4.14. --Bob
[BUG] 4.14 cannot find configured disks/partitions
Perhaps the subject isn't entirely accurate, but that's what seems to be the case. After loading the initial ramdisk, the boot process stalls (loops indefinitely) with "mdadm" complaining about not being able to scan the disks defined in its configuration file, which is bone-stock. What makes this particularly infuriating is, I don't have anything configured for "mdadm" to worry about, which is what's normally found, i.e., nothing of interest. The 4.13 kernel loads and boots just fine with exactly the same up-to-date (unstable/experimental) userspace libraries and utilities. Bottom line: the 4.14 kernel doesn't seem to detect my SCSI disks for some reason. All the recent patches (end of October time frame) for PCI on alpha (I *think* mostly having to do with addressing some weird kind of libata conflict) didn't make any difference. Any idea what's causing this? Thanks... --Bob
Re: [BUG] 4.13.0 kernel build error on Alpha
Yesterday, I thought what I wanted to do was the ".c" file equivalent of the '.section .alphalib,"ax"' substitution we made to the ".S" files. I'm getting a good kernel build with the following patch:

--CUT HERE--
--- linux/arch/alpha/lib/memcpy.c.orig	2016-10-20 01:11:37.0 -0500
+++ linux/arch/alpha/lib/memcpy.c	2017-09-11 22:38:41.634495379 -0500
@@ -149,7 +149,7 @@
 	DO_REST_ALIGNED_DN(d,s,n);
 }
 
-void * memcpy(void * dest, const void *src, size_t n)
+__attribute__((section(".alphalib"))) void * memcpy(void * dest, const void *src, size_t n)
 {
 	if (!(((unsigned long) dest ^ (unsigned long) src) & 7)) {
 		__memcpy_aligned_up ((unsigned long) dest, (unsigned long) src,
--TUC EREH--

The GNU C documentation concerning the "section" attribute as applied to functions implies I should have been able to specify __attribute__((section(".alphalib,\"ax\""))) but the compiler didn't like the comma and argument following the section name. I'm guessing I'll probably end up having to do this same fixup for the rest of the ".c" files in the "arch/alpha/lib" directory at some point, but I'll cross that bridge when I come to it. --Bob
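For anyone who wants to try the "section" attribute outside the kernel tree, here is a small stand-alone sketch. It assumes GCC (or a compatible compiler) targeting ELF. The ALPHALIB macro and the demo_copy() function are invented for the example and are not part of the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical convenience macro; the name ALPHALIB is illustrative only. */
#define ALPHALIB __attribute__((section(".alphalib")))

/*
 * A plain byte-wise copy. The attribute asks the compiler to emit this
 * function into a section named ".alphalib" instead of the default ".text".
 */
ALPHALIB void *demo_copy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	assert(dest != NULL && src != NULL);
	while (n--)
		*d++ = *s++;
	return dest;
}
```

Compiling this to an object file and running "objdump -h" on it should list a ".alphalib" section alongside ".text". Note that only the section name goes in the attribute string, which matches the observation above about the compiler rejecting the extra "ax" flags.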
Re: [BUG] 4.13.0 kernel build error on Alpha
On Sun, Sep 10, 2017 at 10:16:41PM -0500, Bob Tracy wrote: > On Sun, Sep 10, 2017 at 07:59:40PM -0700, Matt Turner wrote: > > On Sun, Sep 10, 2017 at 3:34 PM, Bob Tracy <r...@gherkin.frus.com> wrote: > > > Here we go again :-(. Tool versions as follows: > > > (...) > > > > > > MODPOST vmlinux.o > > > WARNING: EXPORT symbol "callback_setenv" [vmlinux] version generation > > > failed, symbol will not be versioned. > > > (...) > > > WARNING: modpost: Found 24 section mismatch(es). > > > To see full details build your kernel with: > > > 'make CONFIG_DEBUG_SECTION_MISMATCH=y' > > > > All of this is fixed by > > > > commit 873f9b5bcbf27f6e89e1879714abe4532cacf5d7 > > Author: Ben Hutchings <b...@decadent.org.uk> > > Date: Wed Jul 19 01:01:16 2017 +0100 > > > > alpha: Restore symbol versions for symbols exported from assembly > > I guess that commit hasn't made it into Linus' tree :-(. If the patch > is short, please forward if you would be so kind. Many thanks in > advance. Never mind. Linus pulled it five days ago as I type this. The fixes obviously didn't make it in time for 4.13-final, but should be in 4.14. --Bob
Re: [BUG] 4.13.0 kernel build error on Alpha
On Sun, Sep 10, 2017 at 07:59:40PM -0700, Matt Turner wrote: > On Sun, Sep 10, 2017 at 3:34 PM, Bob Tracy <r...@gherkin.frus.com> wrote: > > Here we go again :-(. Tool versions as follows: > > > > gcc version 7.2.0 (Debian 7.2.0-3) > > GNU ld (GNU Binutils for Debian) 2.29 (binutils 2.29-9) > > > > Note evidence of the ".alphalib" section patch first tried with the 4.9 > > kernel source. It has worked well up through 4.12. I didn't try > > building any of the 4.13 release candidates because of all the compiler > > updates that came through during that time. > > > > MODPOST vmlinux.o > > WARNING: EXPORT symbol "callback_setenv" [vmlinux] version generation > > failed, symbol will not be versioned. > > (...) > > WARNING: modpost: Found 24 section mismatch(es). > > To see full details build your kernel with: > > 'make CONFIG_DEBUG_SECTION_MISMATCH=y' > > All of this is fixed by > > commit 873f9b5bcbf27f6e89e1879714abe4532cacf5d7 > Author: Ben Hutchings <b...@decadent.org.uk> > Date: Wed Jul 19 01:01:16 2017 +0100 > > alpha: Restore symbol versions for symbols exported from assembly I guess that commit hasn't made it into Linus' tree :-(. If the patch is short, please forward if you would be so kind. Many thanks in advance. > > arch/alpha/lib/memmove.o: In function `memmove': > > (.alphalib+0x2c): relocation truncated to fit: BRSGP against symbol > > `memcpy' defined in .text section in arch/alpha/lib/memcpy.o > > Makefile:1000: recipe for target 'vmlinux' failed > > make: *** [vmlinux] Error 1 > > I have not yet seen this. I *think* what I want to do is the equivalent of the ".S" file '.text --> .section .alphalib,"ax"' substitution for the affected ".c" files in "arch/alpha/lib". At the risk of baring my ignorance to the world, is there a straightforward way of accomplishing that? The "objdump" tool confirms it's not a strict renaming of one section to another: the ".text" section still exists in the compiled ".S" files that were patched. --Bob
[BUG] 4.13.0 kernel build error on Alpha
Here we go again :-(. Tool versions as follows:

gcc version 7.2.0 (Debian 7.2.0-3)
GNU ld (GNU Binutils for Debian) 2.29 (binutils 2.29-9)

Note evidence of the ".alphalib" section patch first tried with the 4.9 kernel source. It has worked well up through 4.12. I didn't try building any of the 4.13 release candidates because of all the compiler updates that came through during that time.

  MODPOST vmlinux.o
WARNING: EXPORT symbol "callback_setenv" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "strrchr" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__divl" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__divqu" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__memsetw" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "strchr" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__reml" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "strcat" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__copy_user" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__remq" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "clear_page" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "strncpy" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "memmove" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__remqu" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "memchr" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__memset" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "copy_page" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__divlu" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "strlen" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "strncat" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "callback_save_env" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "memset" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: "saved_config" [vmlinux] is COMMON symbol
WARNING: EXPORT symbol "__clear_user" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "callback_getenv" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__divq" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "strcpy" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "___memset" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__remlu" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "csum_ipv6_magic" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "__constant_c_memset" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: modpost: Found 24 section mismatch(es).
To see full details build your kernel with:
'make CONFIG_DEBUG_SECTION_MISMATCH=y'
arch/alpha/lib/memmove.o: In function `memmove':
(.alphalib+0x2c): relocation truncated to fit: BRSGP against symbol `memcpy' defined in .text section in arch/alpha/lib/memcpy.o
Makefile:1000: recipe for target 'vmlinux' failed
make: *** [vmlinux] Error 1
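Some background on the final "relocation truncated to fit" error, offered as a likely reading rather than a confirmed diagnosis: an Alpha branch instruction carries a signed 21-bit displacement counted in 4-byte instruction words, so a direct branch can only reach roughly 4 MiB either side of itself. With memmove in ".alphalib" and memcpy still in ".text", the two can end up too far apart in the linked image. The sketch below just does that range arithmetic; the function name is made up and it does not model the "same GP" part of BRSGP.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if a branch located at branch_pc can reach target using a
 * signed 21-bit word displacement (target = branch_pc + 4 + 4 * disp),
 * and 0 otherwise. Hypothetical helper, for illustration only.
 */
static int alpha_branch_in_range(uint64_t branch_pc, uint64_t target)
{
	int64_t delta = (int64_t)(target - (branch_pc + 4));
	int64_t disp;

	if (delta & 3)		/* instructions are 4-byte aligned */
		return 0;
	disp = delta / 4;
	return disp >= -(INT64_C(1) << 20) && disp < (INT64_C(1) << 20);
}
```

When the check fails for a real pair of symbols, the linker has no way to encode the branch, which is what "truncated to fit" is reporting. Putting caller and callee in the same section keeps them close together in the image.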
[BUG] ntpd fails to start on alpha
No idea how long this has been broken. Noticed it this morning. The NTP daemon appears to start normally based on entries in "syslog", but dies almost immediately afterward with the following message: Apr 13 08:34:09 smirkin ntpd[6676]: Cannot find user ID 110 The appropriate user and group entries exist in /etc/passwd, /etc/shadow, /etc/group, /etc/gshadow. Current "ntp" package version is 1:4.2.8p10+dfsg-1. --Bob
Re: [BUG] alpha: module xxx: Unknown relocation: 1
On Wed, Apr 12, 2017 at 07:36:36PM +1200, Michael Cree wrote: > On Wed, Apr 12, 2017 at 07:57:52AM +0200, Helge Deller wrote: > > On 12.04.2017 04:59, Bob Tracy wrote: > > > Bottom line is, no kernel I've built since 4.9 can load a module. All > > > attempts to load a module result in the error message emitted by > > > "arch/alpha/kernel/module.c" as follows: > > > > > > module XXX: Unknown relocation: 1 > > > > I assume it's due this commmit "modversions: treat symbol CRCs as 32 bit > > quantities": > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71810db27c1c853b335675bee335d893bc3d324b > > > > For parisc this patch solves it: > > parisc: support R_PARISC_SECREL32 relocation in modules > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5f655322b1ba4bd46e26e307d04098f9c84df764 > > > > > module XXX: Unknown relocation: 1 > > > > For alpha it seems you need to add similar code to handle R_ALPHA_REFLONG > > to apply_relocate_add() in arch/alpha/kernel/module.c > > Would the attached patch fix it? Untested because I don't see the > above issue. I'm up and running on 4.11.0-rc6. The patch works. Feel free to add me as the "Tested-by". Much appreciated! --Bob
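The patch itself was an attachment and is not reproduced in this message. As a rough illustration of the idea Helge describes (teaching the relocation handler about type 1, R_ALPHA_REFLONG, a direct 32-bit reference), here is a user-space sketch. It is not the kernel code: the DEMO_ constants mirror the ELF relocation numbers for Alpha, while demo_apply_reloc() and its return convention are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Relocation type numbers as defined for Alpha in <elf.h>. */
#define DEMO_R_ALPHA_REFLONG 1	/* direct 32-bit reference */
#define DEMO_R_ALPHA_REFQUAD 2	/* direct 64-bit reference */

/*
 * Store "value" at "location" according to the relocation type.
 * Returns 0 on success, -1 for a type this sketch does not handle
 * (the analogue of the "Unknown relocation" message).
 */
static int demo_apply_reloc(unsigned int type, void *location, uint64_t value)
{
	uint32_t low32;

	assert(location != NULL);
	switch (type) {
	case DEMO_R_ALPHA_REFLONG:
		low32 = (uint32_t)value;	/* keep only the low 32 bits */
		memcpy(location, &low32, sizeof(low32));
		return 0;
	case DEMO_R_ALPHA_REFQUAD:
		memcpy(location, &value, sizeof(value));
		return 0;
	default:
		return -1;
	}
}
```

The point of the sketch is the new 32-bit case: once symbol CRCs became 32-bit quantities, modules started carrying 32-bit relocations that the existing handler had never needed to process.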
[BUG] alpha: module xxx: Unknown relocation: 1
(Adding linux-kernel to the distribution. The issue seems to be architecture-specific, but I'm trying to understand what broke.) The 4.10-rc1 patch set made fairly extensive modifications to "a/kernel/module.c" (I'm leaving the "a" there so there's no doubt I mean the top-level "kernel/module.c" file and not any of the architecture-specific ones). One of the changes was to replace an include of with . This is potentially significant because of the mod we made to alpha's to fix the BRSGP relocation error on __copy_user() issue. Bottom line is, no kernel I've built since 4.9 can load a module. All attempts to load a module result in the error message emitted by "arch/alpha/kernel/module.c" as follows: module XXX: Unknown relocation: 1 I'll start attempting to revert the recent module patches to see if that helps. If anyone reading this knows what's happening, feel free to weigh-in before I spend too much time rebuilding kernels on a slow machine. --Bob
Re: current toolchain on Alpha is crap?
On Tue, Apr 11, 2017 at 06:12:32PM +1200, Michael Cree wrote: > > > commit e811f8794673153858eac86448d827002e50ac0a > > > Author: Michael Cree> > > Date: Wed Feb 8 13:49:02 2017 +1300 > > > > > > alpha: fix link errors in user_copy(). > > > > Missed both the above. > > This one attached. It might help with your relocation problems. Ah... I recognize this one as part of the "alphalib" patch set. Already applied. --Bob
Re: current toolchain on Alpha is crap?
On Mon, Apr 10, 2017 at 09:42:50PM +1200, Michael Cree wrote: > On Sun, Apr 09, 2017 at 07:47:55PM -0500, Bob Tracy wrote: > > Both my 4.10 and recent 4.11-rc5 builds fail to boot/run properly. > > Just built 4.10.9 for DP264 and that boots fine on the XP1000. > > I have the following four extra commits on my kernel: > > commit 058948f6ea70eb910c7602c9ce96c075913ff924 > Author: Michael Cree <mc...@orcon.net.nz> > Date: Fri Feb 10 21:17:22 2017 +1300 > > alpha: Shift library routines into own alphalib linker section Still running with this one locally. > commit 635891340c8a48f2746f501530454df8881f1ec9 > Author: Michael Cree <mc...@orcon.net.nz> > Date: Mon Apr 10 20:58:02 2017 +1200 > > Revert "mm/compaction.c: fix zoneindex in kcompactd()" Never saw this one, which means I probably wasn't paying attention. Found the original patch from 2016 and did a "patch -R": it applied against 4.11.0-rc6 with a small bit of fuzz, but seems correct by inspection. > commit 4ace6a63e6c7a545e5921ec340a7c93a0ac0b02f > Author: Richard Henderson <r...@twiddle.net> > Date: Wed Jul 30 11:42:31 2014 -1000 > > alpha: Remove "strange" OSF/1 fork semantics > > (...) > > commit e811f8794673153858eac86448d827002e50ac0a > Author: Michael Cree <mc...@orcon.net.nz> > Date: Wed Feb 8 13:49:02 2017 +1300 > > alpha: fix link errors in user_copy(). Missed both the above. Trying a 4.11.0-rc6 build with the "mm/compaction.c" revert in place. May have to instrument "arch/alpha/kernel/module.c" to get a better idea what's going on. --Bob
Re: current toolchain on Alpha is crap?
On Sun, Apr 09, 2017 at 07:47:55PM -0500, Bob Tracy wrote: > (...) > Both my 4.10 and recent 4.11-rc5 builds fail to boot/run properly. The > console spews an endless stream of "unix: Unknown relocation: 1" errors > on each attempt to load any module. I think I saw several messages to > the effect of "exec unknown format" as well. More info... The error message originates in "arch/alpha/kernel/module.c", and the "unix" string is due to trying to load the "net/unix/unix.ko" module. Also saw the module load error for "ipv6.ko" and *many* others. I seem to recall upstream messing around with stricter module checking. There are reports of people using the "nvidia" binary driver being stuck at 4.9 because of the associated recent kernel changes. The "file" command doesn't report anything unusual with respect to the relocation type for any of the modules, so I'm feeling a bit better about the integrity of the toolchain used for the builds. I suppose it's possible that Alpha got overlooked when the module handling changes were implemented. --Bob
current toolchain on Alpha is crap?
Well, maybe the subject line is a bit over the top, but there's either an element of truth to it, or the kernel developers have seriously screwed things up in a fundamental way for kernels on Alpha after 4.9. Both my 4.10 and recent 4.11-rc5 builds fail to boot/run properly. The console spews an endless stream of "unix: Unknown relocation: 1" errors on each attempt to load any module. I think I saw several messages to the effect of "exec unknown format" as well. Here's the current toolchain info: gcc version 6.3.0 20170321 (Debian 6.3.0-11) GNU ld (GNU Binutils for Debian) 2.28 Rebooting on 4.9 restores things to a sane state. Is anyone else "out there" having better luck? I'll probably end up having to recreate the 4.9 tree and build it with the current toolchain to eliminate the tools as the cause of what I'm seeing. That being said, a kernel build on the PWS-433au is an overnight-plus-a-bit-longer proposition that I'd like to avoid if there's another way to figure out what's going on. Thanks. --Bob
Re: nodejs package issues
On Sun, Feb 19, 2017 at 05:21:33PM +1300, Michael Cree wrote: > Thanks for doing that. If there is no action on that by the > maintainer in the near future, which is likely to be the case since > Debian is in hard freeze, I could upload a rebuilt libkf5purpose-bin > with the nodejs dependency removed to debian-ports unreleased. That seems to me to be a good way to go. Besides, does "hard freeze" have any meaning for a non-release architecture anyway? All our updates are from "unstable", "experimental", and "unreleased" as it is. > In the meantime I have found the source of the bug in the binutils > package (but not the fix) and have reported that upstream: > https://sourceware.org/bugzilla/show_bug.cgi?id=21181 Nice work. Going to be interesting to see what the explanation is for the section offset difference. --Bob
Re: nodejs package issues
On Sat, Feb 18, 2017 at 11:49:06AM -0600, Bob Tracy wrote: > On Fri, Feb 17, 2017 at 09:13:51AM +0100, Helge Deller wrote: > > On hppa we will not support jnode short-term (and I assume it's true for > > most other ports too). > > So, if you open a bug, please include hppa to have the nodejs dependency > > removed. > > I'll take care of this... Requested closure of #855259 with probable > status of "won't fix" due to the magnitude of the work required to port > "nodejs". Agreed that removing the inappropriate dependency on "nodejs" > by "libkf5purpose-bin" is the way to go. Done. See Debian bug #855486. "alpha" and "hppa" specifically mentioned. --Bob
Re: nodejs package issues
On Fri, Feb 17, 2017 at 09:13:51AM +0100, Helge Deller wrote: > On 17.02.2017 06:56, Michael Cree wrote: > > On Thu, Feb 16, 2017 at 08:43:28PM -0600, Bob Tracy wrote: > >> On Thu, Feb 16, 2017 at 08:23:05AM +1300, Michael Cree wrote: > >>> On Wed, Feb 15, 2017 at 03:43:02AM -0600, Bob Tracy wrote: > >>> (...) > >>>> Next issue is the "-m32" argument getting passed to the compiler. Not > >>>> appropriate for Alpha. > >>> > >>> That's a bug that should be reported to the package maintainer. > >> > >> Done. See Debian bug #855259 filed against the source package > > > > We have libkf5purpose-bin up to date in the archive but it depends on > > nodejs. We should have filed a bug against libkf5purpose-bin to have > > the nodejs dependency removed as has been already been done for armel > > which also does not have nodejs built. Indeed the bug report should > > probably ask for the dependency to be removed for all ports arches > > without nodejs (looks like alpha, hppa, m68k, powerpcspe, sh4, sparc64 > > and x32 but I am not sure whether there is active work or not to > > support nodejs on any of those.) > > On hppa we will not support jnode short-term (and I assume it's true for > most other ports too). > So, if you open a bug, please include hppa to have the nodejs dependency > removed. I'll take care of this... Requested closure of #855259 with probable status of "won't fix" due to the magnitude of the work required to port "nodejs". Agreed that removing the inappropriate dependency on "nodejs" by "libkf5purpose-bin" is the way to go. --Bob
Re: nodejs package issues
On Fri, Feb 17, 2017 at 06:58:02PM +1300, Michael Cree wrote: > On Thu, Feb 16, 2017 at 09:19:09PM -0600, Bob Tracy wrote: > > On Thu, Feb 16, 2017 at 08:23:05AM +1300, Michael Cree wrote: > > > I've got bigger fish to fry at the moment. In particular a > > > binutils/glibc bug that is causing segfaults in the dynamic symbol > > > resolver. Try this: write a simple "Hello world" program in C. > > > Compile with "-Wl,-z,now" linker option which causes the dynamic > > > loader to resolve all symbols at program invocation, rather than > > > resolving symbols when first used. If compiled with a recent > > > toolchain it segfaults [2]. > > > (...) > > > > > > [2] Toolchain gcc 4:6.1.1-1, binutils 2.27-8 produces a working > > > executable, but toolchains later than gcc 4:6.2.1-1, binutils > > > 2.27.90.20170124-2 are known to be bad. > > > > Verified the broken behavior for a kernel.org kernel version 4.9.0 with > > gcc 6.3.0 20170205 (Debian 6.3.0-6) and binutils 2.27.90.20170205. This > > is *nasty*. > > I don't understand. The toolchain bug is for executables built to run > in userspace, not for the kernel. What's the broken behaviour you are > seeing with the kernel? None whatsoever with the kernel. You spoke of a commit that needed to be backed out prior to building a kernel, but it wasn't clear from context what package that commit was against, and I ASSumed it was against the kernel tree without bothering to check. I went back and checked your earlier message, and I *think* the proper context was a warning not to try building a kernel with the broken tool chain. My kernel, built with the broken tool chain, seems to be working properly as best I can tell. Sorry for the confusion. --Bob
Re: nodejs package issues
On Thu, Feb 16, 2017 at 08:23:05AM +1300, Michael Cree wrote: > I've got bigger fish to fry at the moment. In particular a > binutils/glibc bug that is causing segfaults in the dynamic symbol > resolver. Try this: write a simple "Hello world" program in C. > Compile with "-Wl,-z,now" linker option which causes the dynamic > loader to resolve all symbols at program invocation, rather than > resolving symbols when first used. If compiled with a recent > toolchain it segfaults [2]. > (...) > > [2] Toolchain gcc 4:6.1.1-1, binutils 2.27-8 produces a working > executable, but toolchains later than gcc 4:6.2.1-1, binutils > 2.27.90.20170124-2 are known to be bad. Verified the broken behavior for a kernel.org kernel version 4.9.0 with gcc 6.3.0 20170205 (Debian 6.3.0-6) and binutils 2.27.90.20170205. This is *nasty*. --Bob
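For anyone wanting to repeat the check on their own toolchain, the test Michael describes can be wrapped in a small harness. This is a sketch under assumptions: the compiler is reachable as "gcc", and the helper names (build_command, reproduce) and file names are invented for illustration. Only the "-Wl,-z,now" option comes from the message above.

```python
import os
import signal
import subprocess
import tempfile

# The "Hello world" test program, written out verbatim before compiling.
HELLO_C = (
    "#include <stdio.h>\n"
    "\n"
    "int main(void)\n"
    "{\n"
    '    puts("Hello, world");\n'
    "    return 0;\n"
    "}\n"
)


def build_command(source, output, compiler="gcc"):
    """Compiler invocation that asks the linker for immediate symbol binding."""
    return [compiler, "-Wl,-z,now", "-o", output, source]


def reproduce(compiler="gcc"):
    """Build and run the test program; return the program's exit status."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "hello.c")
        binary = os.path.join(workdir, "hello")
        with open(source, "w") as handle:
            handle.write(HELLO_C)
        subprocess.run(build_command(source, binary, compiler), check=True)
        return subprocess.run([binary]).returncode


if __name__ == "__main__":
    status = reproduce()
    # On POSIX, subprocess reports death by signal N as a return code of -N.
    if status == -signal.SIGSEGV:
        print("segfault: toolchain appears to be affected")
    else:
        print("exit status:", status)
```

An exit status of 0 with the greeting printed suggests the toolchain in use produces a working executable for this case.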
Re: nodejs package issues
On Thu, Feb 16, 2017 at 08:23:05AM +1300, Michael Cree wrote: > On Wed, Feb 15, 2017 at 03:43:02AM -0600, Bob Tracy wrote: > (...) > > Next issue is the "-m32" argument getting passed to the compiler. Not > > appropriate for Alpha. > > That's a bug that should be reported to the package maintainer. Done. See Debian bug #855259 filed against the source package. The "reportbug" script automatically assigned a severity that the package maintainer flagged as inappropriate due to alpha not being a release architecture :-(. Unknown whether this issue will get worked in a timely (for some definition of the word) fashion. --Bob
Re: undone by a dead CR2032 button cell
On Tue, Jan 24, 2017 at 12:03:37AM -0700, Alan Young wrote: > (...) > And should the day ever come, heavens forbid, that a card has passed into > the silicon beyond, remember that SRM can boot headless and has a serial > port console on COM1. > (...) Hmmm... As long as it is possible to do the AlphaBIOS --> SRM switch with the "foreign" video card (i.e., even though AlphaBIOS complains about the PCI configuration), being able to boot SRM headless to set "pci_device_override" is a good thing to know about. Thanks! --Bob
undone by a dead CR2032 button cell
I *knew* there was a reason I hung onto my old TGA video card :-). The batteries in the UPS to which my Alpha was attached required replacing. No way to do that with the machine plugged in and running, so I shut it down and got to work. Upon rebooting, it was obvious that my CMOS settings were gone. The system attempted to boot using AlphaBIOS (appropriate for Windows NT), and the system clock was set to January 1, 1995. Quick trip to the archives to verify I could switch back to SRM from within AlphaBIOS "Setup", and yes, one can do that from the "Advanced Setup" menu (F6 within "Setup"). First though, had to change out the CR2032 button cell. Easy to remove the main system board and change out the button cell. Took advantage of the opportunity to blow out a few years of accumulated dust. Don't judge me :-), but yes, it should never have gotten that bad. Next problem was a PCI configuration issue that was preventing booting. I remembered that I had long ago switched out the stock Alpha video card for an ATI Radeon, and the Alpha was *most* unhappy with the PCI configuration as a result. Another trip back to the archives, and I found this pearl of great price from Jay Estabrook in an e-mail exchange from 11 years ago: >>> set pci_device_override -1 First, had to remove the ATI video card and reinstall the old TGA. Next, boot the machine (still in AlphaBIOS at that point), go into setup and switch to the Digital UNIX console (SRM), save settings, and reboot. Now in SRM, at the ">>> " prompt, set pci_device_override as indicated above. Powered down the machine. Swapped out the video card. Voila! All is well. Definitely felt that cannonball whizzing by overhead. --Bob
Re: [BUG] 4.9.0 build error on Alpha
Credit to all who participated in working this issue.

Matt -- attached is a patch set to be applied against the kernel.org 4.9.0 source tree. If it passes inspection, please approve and forward upstream. Apologies for not including in-line, but I don't trust my mailer to preserve formatting unless I send as an attachment.

Thanks.

--Bob

This series of patches addresses vmlinux ld relocation errors on the Alpha architecture that appeared at some point after the 4.8.0 release of the kernel.org source tree. Fixes are as suggested by Helge Deller <del...@gmx.de> and Michael Cree <mc...@orcon.net.nz>, with significant input from Maciej W. Rozycki <ma...@linux-mips.org> and Matt Turner <matts...@gmail.com>. The patch set is known to apply cleanly to 4.9.0.

Approach is to define a new "alphalib" section for all the Alpha-specific library functions to be linked into the final vmlinux. The "uaccess.h" patch addresses "__copy_user" relocation errors reported elsewhere.

Tested-by: Bob Tracy <r...@frus.com>

--- a/arch/alpha/include/asm/uaccess.h 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/include/asm/uaccess.h 2016-12-31 15:01:41.621694846 -0600
@@ -344,7 +344,7 @@
 /* This little bit of silliness is to get the GP loaded for a function that ordinarily wouldn't. Otherwise we could have it done by the macro directly, which can be optimized the linker. */
-#ifdef MODULE
+#if 1
 #define __module_address(sym) "r"(sym),
 #define __module_call(ra, arg, sym) "jsr $" #ra ",(%" #arg ")," #sym
 #else
--- a/arch/alpha/kernel/vmlinux.lds.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/kernel/vmlinux.lds.S 2016-12-31 09:06:15.548643355 -0600
@@ -20,6 +20,7 @@
 	_text = .; /* Text and read-only data */
 	.text : {
 	HEAD_TEXT
+	*(.alphalib)
 	TEXT_TEXT
 	SCHED_TEXT
 	CPUIDLE_TEXT
--- a/arch/alpha/lib/callback_srm.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/lib/callback_srm.S 2016-12-31 21:24:37.930036466 -0600
@@ -5,7 +5,7 @@
 #include
 #include
-.text
+.section .alphalib,"ax"
 #define HWRPB_CRB_OFFSET 0xc0
 #if defined(CONFIG_ALPHA_SRM) || defined(CONFIG_ALPHA_GENERIC)
--- a/arch/alpha/lib/clear_page.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/lib/clear_page.S 2016-12-31 21:25:11.237910422 -0600
@@ -4,7 +4,7 @@
  * Zero an entire page.
  */
 #include
-	.text
+	.section .alphalib,"ax"
 	.align 4
 	.global clear_page
 	.ent clear_page
--- a/arch/alpha/lib/clear_user.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/lib/clear_user.S 2016-12-31 00:54:48.458611939 -0600
@@ -24,6 +24,8 @@
  * Clobbers:
  * $1,$2,$3,$4,$5,$6
  */
+.section .alphalib,"ax"
+
 #include
 /* Allow an exception for an insn; exit if we get one. */
--- a/arch/alpha/lib/copy_page.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/lib/copy_page.S 2016-12-31 21:25:35.673189381 -0600
@@ -4,7 +4,7 @@
  * Copy an entire page.
  */
 #include
-	.text
+	.section .alphalib,"ax"
 	.align 4
 	.global copy_page
 	.ent copy_page
--- a/arch/alpha/lib/copy_user.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/lib/copy_user.S 2016-12-31 00:57:16.032150766 -0600
@@ -26,6 +26,7 @@
  * $1,$2,$3,$4,$5,$6,$7
  */
+.section .alphalib,"ax"
 #include
 /* Allow an exception for an insn; exit if we get one. */
--- a/arch/alpha/lib/csum_ipv6_magic.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/lib/csum_ipv6_magic.S 2016-12-31 00:57:26.872418989 -0600
@@ -12,6 +12,7 @@
  * added by Ivan Kokshaysky <i...@jurassic.park.msu.ru>
  */
+.section .alphalib,"ax"
 #include
 	.globl csum_ipv6_magic
 	.align 4
--- a/arch/alpha/lib/dbg_current.S 2007-12-03 00:52:36.0 -0600
+++ b/arch/alpha/lib/dbg_current.S 2016-12-31 21:26:09.439047824 -0600
@@ -7,7 +7,7 @@
 #include
-	.text
+	.section .alphalib,"ax"
 	.set noat
 	.globl _mcount
--- a/arch/alpha/lib/dbg_stackcheck.S 2007-12-03 00:52:36.0 -0600
+++ b/arch/alpha/lib/dbg_stackcheck.S 2016-12-31 21:26:21.816347141 -0600
@@ -7,7 +7,7 @@
 #include
-	.text
+	.section .alphalib,"ax"
 	.set noat
 	.align 3
--- a/arch/alpha/lib/dbg_stackkill.S 2007-12-03 00:52:36.0 -0600
+++ b/arch/alpha/lib/dbg_stackkill.S 2016-12-31 21:26:31.401796458 -0600
@@ -8,7 +8,7 @@
 #include
-	.text
+	.section .alphalib,"ax"
 	.set noat
 	.align 5
--- a/arch/alpha/lib/divide.S 2016-11-19 08:26:53.0 -0600
+++ b/arch/alpha/lib/divide.S 2016-12-31 00:58:07.403557879 -0600
@@ -45,6 +45,7 @@
  * $2
Re: [BUG] 4.9.0 build error on Alpha
On Sat, Dec 31, 2016 at 09:32:58PM -0600, Bob Tracy wrote: > On Sun, Jan 01, 2017 at 01:23:06AM +0000, Maciej W. Rozycki wrote: > > You need to *replace* any `.text' pseudo-ops throughout with the said > > `.section' pseudo-op for this to have any effect. > > That makes sense, especially given the fact the *original* error > messages didn't change. Sorry for being "thick". Corrections made. > Build proceeding... That did the trick, along with Michael's "uaccess.h" patch. Success. --Bob
Re: [BUG] 4.9.0 build error on Alpha
On Sun, Jan 01, 2017 at 01:23:06AM +0000, Maciej W. Rozycki wrote: > You need to *replace* any `.text' pseudo-ops throughout with the said > `.section' pseudo-op for this to have any effect. That makes sense, especially given the fact the *original* error messages didn't change. Sorry for being "thick". Corrections made. Build proceeding... --Bob
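The replacement Maciej describes lends itself to a script when many files under "arch/alpha/lib" are involved. A minimal sketch, assuming the directive appears alone on its line (possibly indented); the function name is hypothetical and the section name is the "alphalib" one used in this thread:

```python
import re

# A line that consists solely of the `.text` pseudo-op, optionally indented.
# References such as "(.text+0x60)" inside other lines are left untouched.
TEXT_DIRECTIVE = re.compile(r"^(?P<indent>[ \t]*)\.text[ \t]*$", re.MULTILINE)


def retarget_text_section(source, section=".alphalib"):
    """Replace each bare `.text` directive with `.section <section>,"ax"`."""
    return TEXT_DIRECTIVE.sub(
        lambda m: f'{m.group("indent")}.section {section},"ax"', source
    )


if __name__ == "__main__":
    before = "\t.text\n\t.align 4\n\t.global clear_page\n"
    print(retarget_text_section(before), end="")
```

Files that never contain a ".text" directive are returned unchanged, so those still need the ".section" line added by hand.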
Re: [BUG] 4.9.0 build error on Alpha
On Sun, Jan 01, 2017 at 08:38:51AM +1300, Michael Cree wrote: > On Sat, Dec 31, 2016 at 09:20:37AM -0600, Bob Tracy wrote: > > With '.section .alphalib,"ax"' added to the top of "arch/alpha/lib/*.S" > > (below opening comment block, if present, but prior to any include > > directives), > > and the modified patch to "arch/alpha/kernel/vmlinux.lds.S", I now get > > multiple relocation errors as follows: > > > > LD init/built-in.o > > arch/alpha/lib/lib.a(strcat.o): In function `strcat': > > (.text+0x60): relocation truncated to fit: BRADDR against symbol `__stxcpy' > > defined in .text section in arch/alpha/lib/lib.a(stxcpy.o) > > arch/alpha/lib/lib.a(strncat.o): In function `strncat': > > (.text+0x60): relocation truncated to fit: BRADDR against symbol > > `__stxncpy' defined in .text section in arch/alpha/lib/lib.a(stxncpy.o) > > drivers/built-in.o: In function `radeon_cs_parser_init.part.4': > > drivers/gpu/drm/radeon/radeon_cs.o:(.text+0x119bd0): relocation truncated > > to fit: BRSGP against symbol `__copy_user' defined in .alphalib section in > > arch/alpha/lib/lib.a(copy_user.o) > > Try changing the #ifdef MODULE above __copy_tofrom_user_nocheck to > #if 1 in uaccess.h. That should fix the copy_user relocation errors. Have accumulated enough patches that I'm running a "from scratch" build rather than trusting all the dependency checking. It will be sometime next year before I report back :-). --Bob
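For background on the "relocation truncated to fit" messages: Alpha branch instructions carry a signed 21-bit displacement counted in 4-byte instructions and taken relative to the address following the branch, so a branch reaches roughly 4 MB in either direction. The sketch below only illustrates that range check; the function names and the addresses in the example are made up, and the linker performs the real calculation.

```python
BRANCH_DISP_BITS = 21  # width of the displacement field in a branch instruction
INSN_BYTES = 4         # Alpha instructions are 4 bytes long


def branch_displacement(branch_addr, target_addr):
    """Displacement, in instructions, from the updated PC to the target."""
    delta = target_addr - (branch_addr + INSN_BYTES)
    if delta % INSN_BYTES:
        raise ValueError("branch target is not instruction-aligned")
    return delta // INSN_BYTES


def branch_reaches(branch_addr, target_addr):
    """True if the displacement fits in the signed 21-bit field."""
    limit = 1 << (BRANCH_DISP_BITS - 1)
    return -limit <= branch_displacement(branch_addr, target_addr) < limit


if __name__ == "__main__":
    # Illustrative addresses only: a caller and a routine about 5 MB apart.
    caller = 0x119BD0
    routine = caller + 5 * 1024 * 1024
    print("direct branch possible:", branch_reaches(caller, routine))
```

When the check fails, the call has to go through a register instead (a "jsr"), which is what forcing the "#if 1" path in "uaccess.h" does for "__copy_user".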