Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-15 Thread Christian Kastner
On 2023-03-13 18:28, Christian Kastner wrote:
> [ Impact ]
> The new versions are in far better shape: they've caught missing
> dependencies, added patches, improved the build process, etc.

Apologies, I was only thinking of the more recent releases.

Revision -2 fixed an RC bug in January, but never got the chance to
migrate because of an RC bug in a dependency. Revision -6 fixed another
RC bug.

All releases after -2 were incremental improvements that basically never
got the chance to migrate because of a dependency not migrating.



Bug#1030704: extra-package doesn't work with autopkgtest-virt-server=docker

2023-03-15 Thread Christian Kastner
Hi,

On 2023-02-06 17:33, Shengjing Zhu wrote:
> +------------------------------+
> | Update chroot                |
> +------------------------------+
> 
> Get:1 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ InRelease
> Ign:1 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ InRelease
> Get:2 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ Release [603 B]
> Get:2 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ Release [603 B]
> Get:3 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ Release.gpg
> Ign:3 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ Release.gpg
> Get:4 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ Packages [999 B]
> Err:4 
> file:/build/golang-github-coredhcp-coredhcp-xIXI4Z/resolver-dcoFh4/apt_archive
>  ./ Packages
>   Could not open file 
> /var/lib/apt/lists/partial/_build_golang-github-coredhcp-coredhcp-xIXI4Z_resolver-dcoFh4_apt%5farchive_._Packages
>  - open (13: Permission denied)
> ...
> Fetched 440 kB in 2s (289 kB/s)
> Reading package lists...
> E: Failed to fetch 
> store:/var/lib/apt/lists/partial/_build_golang-github-coredhcp-coredhcp-xIXI4Z_resolver-dcoFh4_apt%5farchive_._Packages
>   Could not open file 
> /var/lib/apt/lists/partial/_build_golang-github-coredhcp-coredhcp-xIXI4Z_resolver-dcoFh4_apt%5farchive_._Packages
>  - open (13: Permission denied)
> E: Some index files failed to download. They have been ignored, or old ones 
> used instead.

Is it possible that you are running with umask 0027? If so, can you try
with umask 0022?

I ran into a similar issue in autopkgtest, and I think they could be
related. See [1].
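
To illustrate the suspicion, here is a minimal Python sketch (not taken from sbuild or autopkgtest) showing how the umask changes the permissions of a newly created file. The file name "Packages" is only a stand-in, and the idea that an unprivileged reader such as apt's sandbox user is what trips over the missing "other" read bit is an assumption.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask):
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "Packages")  # stand-in file name
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # always restore the caller's umask


def world_readable(mode):
    """True if users outside the owner and group may read the file."""
    return bool(mode & stat.S_IROTH)
```

With umask 0027 the new file ends up without read permission for "others", with umask 0022 it keeps it.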

Best,
Christian

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1032487



Bug#1033352: sbuild: autopkgtest-virt-server needs host $HOME

2023-03-23 Thread Christian Kastner


Package: sbuild
Version: 0.85.2
Severity: normal

Hi josch,

Attempting to build a package with the autopkgtest-virt-podman backend
fails because of what I suspect is an issue with $HOME directory
handling. podman needs $HOME on the host to find containers, but it
defaults to /sbuild-nonexistent, which I guess is meant for the target
environment.

So unless I'm misunderstanding something, when
autopkgtest-virt-server=podman, $HOME on the host should just remain
$HOME, and $HOME in the target environment can be cleared as usual.
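
The behaviour proposed above could be sketched as follows. This is a hypothetical illustration in Python, not how sbuild is implemented; the function name and the idea of keeping two separate environment dictionaries are assumptions, and only the /sbuild-nonexistent value comes from the error message below.

```python
TARGET_HOME = "/sbuild-nonexistent"  # placeholder value seen in the error output


def split_environments(base_env):
    """Return (host_env, target_env) derived from the caller's environment.

    Hypothetical helper: host-side commands (such as podman) keep the real
    HOME, while commands inside the target environment get the placeholder.
    """
    host_env = dict(base_env)          # unchanged, HOME stays the user's HOME
    target_env = dict(base_env)
    target_env["HOME"] = TARGET_HOME   # cleared/replaced as usual for the target
    return host_env, target_env
```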

Best,
Christian


Steps to reproduce:

# Create a container. This will be tagged 'autopkgtest/debian:unstable'
$ autopkgtest-build-podman


# Attempt to build a package (src:libocas has no dependencies, so it fails fast)
$ sbuild --chroot-mode=autopkgtest --autopkgtest-virt-server=podman \
    --autopkgtest-virt-server-opt autopkgtest/debian:unstable --dist unstable \
    libocas

> cannot resolve /sbuild-nonexistent: lstat /sbuild-nonexistent: no such file 
> or directory
> E: Error locking chroot session: skipping libocas


# On the host: create a dummy file, repeat build attempt
$ sudo touch /sbuild-nonexistent && sudo chmod 644 /sbuild-nonexistent
$ sbuild --chroot-mode=autopkgtest --autopkgtest-virt-server=podman \
    --autopkgtest-virt-server-opt autopkgtest/debian:unstable --dist unstable \
    libocas

> I: NOTICE: Log filtering will replace 'autopkgtest-virt-dummy-location' with 
> '<>'
> time="2023-03-23T09:26:41+01:00" level=error msg="stat 
> /sbuild-nonexistent/.config/containers/storage.conf: not a directory"

This suggests that podman was the culprit.

# On the host: remove the dummy file, link to $HOME instead, repeat
$ sudo rm /sbuild-nonexistent && sudo ln -s $HOME /sbuild-nonexistent
$ sbuild --chroot-mode=autopkgtest --autopkgtest-virt-server=podman \
    --autopkgtest-virt-server-opt autopkgtest/debian:unstable --dist unstable \
    libocas

> Build Architecture: amd64
> Build Type: binary
> Build-Space: 20496
> Build-Time: 6
> Distribution: unstable
> Host Architecture: amd64
> Install-Time: 16
> Job: libocas
> Lintian: pass
> Machine Architecture: amd64
> Package: libocas
> Package-Time: 34
> Source-Version: 0.97+dfsg-8
> Space: 20496
> Status: successful
> Version: 0.97+dfsg-8

Success!

$ sudo rm /sbuild-nonexistent



Bug#1032677: libamdhip64-5: Missing dependency on libamd-comgr2

2023-03-10 Thread Christian Kastner
Package: libamdhip64-5
Version: 5.2.3-5
Severity: serious

When working on rocrand, I noticed that the rocrand libraries did not
work without libamd-comgr2 installed.

Cordell Bloor pointed out that libamd-comgr2 is essential to any library
that contains GPU kernels, and the calls are likely to be made through
libhipamd64-5.

libamd-comgr2 is dynamically loaded, which is why dh_shlibdeps did not
discover it.
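
As a toy model of why that happens (a sketch only, not how dh_shlibdeps is implemented): a scan of the link-time entries of a binary can only see libraries recorded there, so anything opened at run time is invisible to it. The sonames in the usage note are illustrative assumptions.

```python
def undeclared_runtime_libs(dt_needed, dlopened):
    """Return libraries used at run time that a link-time scan cannot see.

    dt_needed: sonames recorded in the binary's dynamic section
    dlopened:  sonames the code loads at run time (e.g. via dlopen)
    """
    return sorted(set(dlopened) - set(dt_needed))
```

For example, undeclared_runtime_libs(["libc.so.6"], ["libamd_comgr.so.2"]) returns the one library that would need a manually declared dependency.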



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-16 Thread Christian Kastner
Hi Paul,

On 2023-03-16 10:31, Paul Gevers wrote:
> Control: tags -1 moreinfo
>
> On 16-03-2023 00:16, Christian Kastner wrote:
>
> For next time, can you please contact us earlier? We could have solved
> the earlier problems in testing-proposed-updates (in January), then we
> would now be in a better position.

I didn't think of that solution as the RC-blocked dependency was only
available in unstable, and admittedly because I thought this would
resolve itself in time.

But in any case: yes, earlier contact would have been helpful, and I'll
do so in future.

> + * Reduce arch to amd64, arm64, ppc64el
> 
> But it fails on ppc64el; so why this selection?

Because those are the only architectures for which the required amdgpu
kernel driver is available [2].

> Also, as the other architectures FTBFS, we prefer in Debian to *not*
>  limit the architectures, but just let them fail [1]. This eases 
> porter efforts.

Thanks for pointing this out, I thought it was the other way around
(prefer *to* limit to avoid failures). Well, with ppc64el, we followed
that strategy.

> If the packages really don't make sense on some architectures, 
> consider using some of the "properties" provided by 
> bin:architecture-properties in your Build-Depends.

I wasn't aware of this package and I don't think it'll help us here
because we're specifically tracking [2]. But it'll be very useful to
some of my other packages, thanks!

> By the way, I checked, but none of the ci.d.n host will run any of 
> your tests, as none of them has an amdgpu (is that a thing you could 
> expect on non-amd architectures by the way?).

Correct! Tests will be skipped on official infra.

It's not just a matter of the missing hardware (we have it, but DSA has
understandable concerns), it's also about how to even express that a
package needs a GPU to run its tests (build-time or autopkgtest).

I recently initiated a discussion about this [3]. For now, the idea is
to run parallel debci infra with guaranteed GPU presence, gather
experience, and eventually share proposals on how a GPU dependency
could be expressed in d/control and d/tests/control.
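
Until such a mechanism exists, skipping on machines without a GPU can be sketched like this. It is a minimal Python illustration, not the actual d/t/hipcc script; it assumes the test is declared with the "skippable" restriction, in which case autopkgtest treats exit status 77 as a skip, and it assumes that the presence of /dev/kfd is a good enough indicator for a loaded kfd driver.

```python
import os

SKIP = 77  # exit status autopkgtest reports as "skipped" for skippable tests


def exit_status(kfd_path="/dev/kfd"):
    """Return SKIP when the KFD device node is absent, else 0 (run the test)."""
    return 0 if os.path.exists(kfd_path) else SKIP
```

A test wrapper would call sys.exit(exit_status()) before doing any GPU work.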

> One thing I spotted along the way; the (Build-)Depends on llvm 
> related packages use the *versioned* ones. Is there a reason not to 
> use the unversioned ones from src:llvm-defaults? That would make llvm
> transitions a bit easier.

I'd have to check with the co-maintainers who added it, but from what I
gather so far, the ROCm stack needs a very recent llvm because of many
changes being upstreamed there.

> Overall, the diff is a bit long (and has some irrelevant stuff), so 
> I'm hesitant to offer t-p-u now (to avoid waiting for 
> llvm-toolchain-15).

Understood. Yeah, the diff is long, unfortunately, as the packaging
fixes accumulated over time.

Is this something that you could consider at a later point in time, if I
also break down the diff into more reviewable fragments (dependencies,
build, metadata, ...)? Because I do think that most changes are just
fixes of one sort or another - no features added.

Best,
Christian


> [1] https://lists.debian.org/debian-devel/2022/09/msg00105.html and follow-up 

[2] 
https://github.com/torvalds/linux/blob/v6.2/drivers/gpu/drm/amd/amdkfd/Kconfig#L6-L8
[3] https://lists.debian.org/debian-ai/2023/03/msg00038.html



Bug#1032487: autopkgtest-virt-podman: fails early with umask 0027

2023-03-07 Thread Christian Kastner


Package: autopkgtest
Version: 5.28
Severity: normal

When supplying autopkgtest with built binaries (-B) and using the
autopkgtest-virt-podman server, a umask of 0027 will lead to an early
failure, aborting the test.
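
A quick way to check whether a directory of binaries is affected might look like the sketch below. It is an illustration only, not part of autopkgtest; it assumes the failure comes from files or directories that users outside the owner and group cannot read or enter.

```python
import os
import stat
import tempfile


def not_world_readable(root):
    """List entries under root that "other" users cannot read (or enter)."""
    hidden = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            # Directories also need the execute bit to be traversable.
            needed = stat.S_IROTH | (stat.S_IXOTH if stat.S_ISDIR(mode) else 0)
            if mode & needed != needed:
                hidden.append(os.path.relpath(path, root))
    return sorted(hidden)


def demo():
    """Build a throwaway directory with one restricted and one open file."""
    with tempfile.TemporaryDirectory() as root:
        for name, mode in (("a.deb", 0o640), ("b.deb", 0o644)):
            path = os.path.join(root, name)
            open(path, "w").close()
            os.chmod(path, mode)
        return not_world_readable(root)
```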

I'll file an MR fixing this shortly.

Steps to reproduce:

  # Assuming .debs and .dscs from a recently built package
  $ autopkgtest -B *.dsc *.deb -- podman autopkgtest/debian:unstable

  | [...]
  | autopkgtest [23:41:08]: test executables: preparing testbed
  | Reading package lists...
  | Building dependency tree...
  | Reading state information...
  | The following NEW packages will be installed:
  |   apt-utils
  | 0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
  | Need to get 309 kB of archives.
  | After this operation, 1062 kB of additional disk space will be used.
  | Get:1 http://deb.debian.org/debian unstable/main amd64 apt-utils amd64 2.6.0 [309 kB]
  | debconf: delaying package configuration, since apt-utils is not installed
  | Fetched 309 kB in 0s (1692 kB/s)
  | Selecting previously unselected package apt-utils.
  | (Reading database ... 9523 files and directories currently installed.)
  | Preparing to unpack .../apt-utils_2.6.0_amd64.deb ...
  | Unpacking apt-utils (2.6.0) ...
  | Setting up apt-utils (2.6.0) ...
  | Get:1 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  InRelease
  | Ign:1 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  InRelease
  | Get:2 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  Release [816 B]
  | Get:2 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  Release [816 B]
  | Get:3 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  Release.gpg
  | Ign:3 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  Release.gpg
  | Get:4 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  Packages [2210 B]
  | Err:4 file:/tmp/autopkgtest-virt-docker.shared.bt1yg15k/downtmp/binaries  Packages
  |   Could not open file /var/lib/apt/lists/partial/_tmp_autopkgtest-virt-docker.shared.bt1yg15k_downtmp_binaries_Packages - open (13: Permission denied)
  | Reading package lists...
  | E: Failed to fetch store:/var/lib/apt/lists/partial/_tmp_autopkgtest-virt-docker.shared.bt1yg15k_downtmp_binaries_Packages  Could not open file /var/lib/apt/lists/partial/_tmp_autopkgtest-virt-docker.shared.bt1yg15k_downtmp_binaries_Packages - open (13: Permission denied)
  | E: Some index files failed to download. They have been ignored, or old ones used instead.
  | [...]



Bug#1032316: llvm-toolchain-15: is this version intended for Debian 12 'bookworm'?

2023-03-09 Thread Christian Kastner
(debian-ai, apologies for re-sending, I hit the wrong reply button.)

On 2023-03-08 18:21, Simon McVittie wrote:
> There is *a* version of llvm-toolchain-15 in bookworm, version 1:15.0.6-4,
> which is used by the rocm-hipamd_5.2.3-1 and mesa_22.3.3-1 in bookworm.
> I'm not suggesting that 1:15.0.6-4 should be *removed*. What I'm asking
> here is whether it's intended to be upgraded to 1:15.0.7-1 (or presumably
> a later version where #1029010 has been fixed), or kept at 1:15.0.6-4
Are the upstream differences between 15.0.6 and 15.0.7 that big?

I think for rocm-hipamd, the main issue is that 5.2.3-5 depends on
libclang-rt-15-dev, which was introduced to unstable by means of a
package split in 1:15.0.7-1.

I'm pretty sure rocm-hipamd could live with the version in bookworm, but
it would need a new upload to change the dependency to reflect the
pre-split view.

In fact, we almost went so far as to implement just that, but we left it
as-is after all, because we assumed that #1029010 would be resolved in
due time, one way or another.

Best,
Christian



Bug#1032487: MR filed

2023-03-07 Thread Christian Kastner
Control: tags -1 + patch

An MR has been filed under

https://salsa.debian.org/ci-team/autopkgtest/-/merge_requests/219



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-03-13 Thread Christian Kastner
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
X-Debbugs-Cc: debian...@lists.debian.org
Usertags: unblock
Control: affects -1 + src:rocm-hipamd

Please unblock package rocm-hipamd

rocm-hipamd 5.2.3-1 has been in testing for a few months now, so have
the following -2 and -3 revisions.

The three revisions since January were blocked from migrating by its
dependency src:llvm-toolchain-15, where a package split was introduced
to unstable, and one of the new packages was not allowed to migrate
because of an RC bug. This bug was recently fixed.

[ Reason ]
The changes in -2 to -6 are all just added patches, or packaging fixes.

[ Impact ]
The new versions are in far better shape: they've caught missing
dependencies, added patches, improved the build process, etc.

[ Tests ]
Manual tests, on the workstations of multiple maintainers. These
packages cannot be tested on debci because the tests require GPUs to work.

[ Risks ]
Given that there are no upstream changes other than added patches for
fixing this, the risks are minimal.

[ Checklist ]
  [x] all changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in testing

unblock rocm-hipamd/5.2.3-6

diff -Nru rocm-hipamd-5.2.3/debian/changelog rocm-hipamd-5.2.3/debian/changelog
--- rocm-hipamd-5.2.3/debian/changelog  2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/changelog  2023-03-10 23:38:51.0 +0100
@@ -1,3 +1,78 @@
+rocm-hipamd (5.2.3-6) unstable; urgency=medium
+
+  * Reduce arch to amd64, arm64, ppc64el
+  * libamdhip64-5: Add dependency on libamd-comgr2 (Closes: #1032677)
+  * Add myself to Uploaders
+  * Fix Maintainer (same list, different name)
+
+ -- Christian Kastner   Fri, 10 Mar 2023 23:38:51 +0100
+
+rocm-hipamd (5.2.3-5) unstable; urgency=medium
+
+  * d/{libamdhip64-dev,rules}: fix version file
+Closes: #1031264
+  * add d/p/0020-replace-x86_64-with-variables.patch
+to fix build on aarch64
+  * d/control: add file to hipcc dependencies
+  * d/control: add dependencies for find_package(hip)
+Closes: #1031538
+  * add d/p/0021-fix-default-cmake-build-on-unsupported-gpus.patch
+to enable gpu arch autodetection with find_package(hip)
+  * d/not-installed: ignore doxygen docs
+  * d/p/000{4,8,9}*.patch: change hip-lang cmake files,
+to partially fix #1031540
+  * d/copyright: update copyright date
+  * d/control: add self to uploaders
+  * cleanup patch metadata
+
+ -- Cordell Bloor   Sun, 19 Feb 2023 03:51:26 -0700
+
+rocm-hipamd (5.2.3-4) unstable; urgency=medium
+
+  * d/t/hipcc: also skip when no kfd driver is loaded.
+
+ -- Étienne Mollier   Sat, 21 Jan 2023 12:54:49 +0100
+
+rocm-hipamd (5.2.3-3) unstable; urgency=medium
+
+  * d/control: build depends on libclang-rt-15-dev.
+  * d/control: hipcc depends on libclang-rt-15-dev.
+  * d/t/hipcc: add; basic script testing hipcc.
+  * d/t/hipconfig: add; script skipping hipconfig if no amdgpu is available.
+  * d/t/control: add hipcc to superficial autopkgtests.
+  * d/t/control: run the d/t/hipconfig test script instead of the command;
+this allows us to trigger conditions for when hardware is not available
+and the script has to be skipped.
+
+ -- Étienne Mollier   Wed, 18 Jan 2023 20:35:17 +0100
+
+rocm-hipamd (5.2.3-2) unstable; urgency=medium
+
+  [ Cordell Bloor ]
+  * d/patches: add 0020-hipcc-remove-rpath-flags.patch
+Closes: #1021642
+  * d/rules: trim unnecessary rules
+  * d/rules: strip RUNPATH from libamdhip64.so
+  * debian/patches: backport 56b3260 from upstream
+Closes: #1021643
+  * d/rules: disable creation of duplicate files
+  * d/patches: fix search paths when building with g++
+  * d/patches: add 0002-fix-cmake-library-notfound-check.patch
+  * d/libamdhip64-dev.install: install /usr/share/hip/version
+
+  [ Étienne Mollier ]
+  * 0005-clang-15.patch: also adjust llc postfix.
+Thanks to Jakub Jaszewski
+  * d/t/control: check hipconfig doesn't output error messages.
+  * d/control: hipcc depends on rocminfo.
+  * d/control: declare compliance to standards version 4.6.2.
+  * d/copyright: update copyright year.
+  * d/rules: build tests in parallel.
+  * d/rules: set library path to find the freshly built library.
+  * d/rules: force run tests sequentially; avoid bus contention on the GPU.
+
+ -- Étienne Mollier   Sat, 14 Jan 2023 11:16:01 +0100
+
 rocm-hipamd (5.2.3-1) unstable; urgency=medium
 
   * Migrate ROCm 5.2.3 to unstable.
diff -Nru rocm-hipamd-5.2.3/debian/control rocm-hipamd-5.2.3/debian/control
--- rocm-hipamd-5.2.3/debian/control2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/control2023-03-10 23:38:51.0 +0100
@@ -6,12 +6,14 @@
 Section: devel
 Homepage: https://github.com/rocm-developer-tools/hipamd
 Priority: optional
-Standards-Version: 4.6.1
+Standards-Version: 4.6.2
 Vcs-Git: https://salsa.debian.org/rocm-team/rocm-hipamd.git
 Vcs-Browser: https

Bug#1031089: rocr-runtime: Segfault in hsa_init on Debian-derived systems

2023-02-19 Thread Christian Kastner
On 2023-02-19 12:16, Cordell Bloor wrote:
> I think this is ready. Would a team upload be appropriate for this sort
> of change?

(off-list)

[In all of the above, with "uploader" I mean the person whose name is
under the changelog entry (next to the date)]

It's a bit confusing at first, but strictly speaking,
  (1) Any person in Maintainer or Uploaders is allowed to upload
  (2) If Maintainer is a team, any member of that team is allowed to
  upload, but
  (3) If in (2), the team-member is not also in Uploaders, a "team
  upload" changelog is customary.

lintian and git-buildpackage both check this for you.
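
Rules (1) to (3) can be written down as a small decision function. This is only a sketch of the convention described above, not what lintian or git-buildpackage actually run; the function name, the return strings and the example names are made up for illustration.

```python
def changelog_note(uploader, maintainer, uploaders, team_members):
    """Classify an upload according to rules (1)-(3).

    uploader:     person named under the changelog entry
    maintainer:   value of the Maintainer field (a person or a team)
    uploaders:    people listed in the Uploaders field
    team_members: members of the team, if Maintainer is a team
    """
    if uploader == maintainer or uploader in uploaders:
        return "regular upload"            # rule (1)
    if uploader in team_members:
        return "team upload"               # rules (2) and (3)
    return "not covered by rules (1)-(3)"
```

For instance, a team member who is not listed in Uploaders gets "team upload", while someone listed in Uploaders gets "regular upload".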


In practice, Maintainer is frequently a team, and Uploaders are all
members of that team (as is the case for the ROCm ecosystem). That may
seem redundant, but Uploaders is then just used to signify who the
"regular" maintainers/uploaders are.

For example, the Python Team has hundreds of members and thousands of
packages. If you want to reach the person most able to help you with a
team-maintained package python3-foo, that's where Uploaders helps you.


In some teams, it's generally considered polite to ask the regular
Uploaders before introducing a big change in their packages, even if
you're entitled to do so as a team member. I usually do that.

In other teams, like Debian Science, Debian AI, or ROCm, everything is
collaborative, so "just do it" :)

Best,
Christian



Bug#1031089: rocr-runtime: Segfault in hsa_init on Debian-derived systems

2023-02-19 Thread Christian Kastner
On 2023-02-19 12:16, Cordell Bloor wrote:
> I've prepared the rocr-runtime package for upload. I tested it with
> several of the math libraries on Debian Bookworm and Ubuntu Lunar. This
> patch has also gone through significant testing upstream.

LGTM, changes are minimal and targeted fixes, so in accordance with the
soft freeze guidelines, I'd say.

I'll upload as soon as my test build has completed.

> I think this is ready. Would a team upload be appropriate for this sort
> of change?

There's no need -- you prepared the change (as per the changelog entry),
and you're in Uploaders, so you're entitled to introduce this new
version into the official archive (which is what "Uploaders" means here)
anyway.

Best,
Christian



Bug#968388: build-rdeps: with dose-extra, lists too many rdeps

2023-03-04 Thread Christian Kastner
Hi Jakub,

On 2023-01-31 16:12, Jakub Wilk wrote:
> * Christian Kastner , 2020-08-14 10:58:
> Most of them build-depend on python3-sphinx, which depends on
> python3-pygments.

That explains it. I should have checked the man page.

> In my experience, you almost always want --old.

Indeed.

Thanks,
Christian



Bug#1014593: amd64-microcode: Updated version for bullseye/stable?

2023-03-01 Thread Christian Kastner
Thank you for the fast reply!

On 2023-03-01 12:07, Henrique de Moraes Holschuh wrote:
> Microcode updates are somewhat plagued with regressions, so usually I won't 
> push them to stable without a reasonable level of feedback.  And that is a 
> lot harder to come from AMD users than Intel users, for unknown-to-me reasons 
> (I can speculate, but that's not helpful).

Oh, I wasn't aware of this. I admittedly simply assumed that CPU
microcode updates are minimal (targeted fixes for errata, or some such),
and are thoroughly tested by the manufacturer.

> That said, with enough *it works* feedback, yes, we can push amd64-microcode 
> updates to stable.

I'd be happy to serve as a beta-tester.

I guess this could be automated to some degree with the help of
autopkgtests for a subset of packages, e.g. the scientific ones tend to
get really "close" to the CPU with their optimizations, and they usually
come with massive test suites.

> On Wed, Mar 1, 2023, at 07:09, Christian Kastner wrote:
>>> Users seem to be relying on this (as I was just asked about policies
>>> when microcode updates are updated/backported).
> 
> Really, you should rely on updated *firmware* if you can.  It still is the 
> only place where you can actually trust a microcode update (from either AMD 
> or Intel) to actually do all it was supposed to do.  I know for a fact the 
> Intel ones disable sections of the update that cannot be activated when not 
> loaded early enough.  For AMD, I know for a fact several updates of earlier 
> processors were never shipped to users because they *must* be done by the 
> firmware, nowadays maybe they do it like Intel.

Good to know, thanks.

With firmware, you mean BIOS updates, correct?

Makes sense, but that would suck if still true for AMD, as manufacturers
stop providing updates far earlier than the useful life of the product.

>> Since microcode updates are generally fixes, sometimes even important
>> security fixes, I guess updates to stable (rather than going via
>> backports) would be permissible?
> 
> Yes, they usually are.  We can even send them in as security updates when we 
> get enough data to know it is going to fix a security issue **even when 
> loaded by the O.S.* (see remark above) and that it is not causing serious 
> regressions...

Best,
Christian



Bug#1014593: amd64-microcode: Updated version for bullseye/stable?

2023-03-01 Thread Christian Kastner
Hi,

On 2022-07-08 15:36, Michael Prokop wrote:
> https://wiki.debian.org/Microcode#Microcode_update_support_for_current_and_older_Debian_releases:
> 
> | Debian 11, codename "Bullseye" is supported, and will receive
> | updates both through the bullseye-backports official backports
> | repository (faster than point-releases), and through Debian stable
> | point-releases and security updates.
> 
> Users seem to be relying on this (as I was just asked about policies
> when microcode updates are updated/backported).
> 
> Would you please consider updating the package in stable? :)
> Thanks!

I'd like to second this.

This [1] popped up in my newsfeed today. I only then realized that the
amd64-microcode package in stable is from 2019.

Since microcode updates are generally fixes, sometimes even important
security fixes, I guess updates to stable (rather than going via
backports) would be permissible?

Best,
Christian

[1] https://lkml.org/lkml/2023/2/22/33



Bug#630538: Vixie cron PID confusion

2023-04-15 Thread Christian Kastner
Hi Teal,

I'm no longer a maintainer of cron, but I was the one last replying to
the original report (can't believe it's been 12 years...)

On 2023-04-08 12:30, Teal Bauer wrote:
> The same Selective logging patch added a version of the logging in the
> default branch of the fork() switch, so if the -L log levels for "log
> job start" and "log job pid" are set, the starting PID is not logged by
> the child but the parent process instead.
> 
> So basically there is now what seems to me to be a "do things right"
> flag - if log level includes 8 (log PIDs) then both CMD and END messages
> are sent by the same process and contain the same correct PIDs:
> 
>     Apr  8 10:17:56 e02fc37faf65 CRON[27]: (root) CMD ([28]
> /tmp/runner.sh >>/tmp/runner.log)
>     Apr  8 10:19:12 e02fc37faf65 CRON[27]: (root) END ([28]
> /tmp/runner.sh >>/tmp/runner.log)
> 
> (PID 27 is the cron parent, PID 28 is the command child, PID 29 is the
> PID of the actual command).
> If the log level includes only e.g. "log start" and "log end", then the
> PIDs will differ:
> 
>     Apr  8 10:14:06 2d9c73749325 CRON[28]: (root) CMD (/tmp/runner.sh
>>>/tmp/runner.log)
>     Apr  8 10:15:27 2d9c73749325 CRON[27]: (root) END (/tmp/runner.sh
>>>/tmp/runner.log)
> 
> (PID 28 is the command child which sends the CMD message, PID 27 is the
> cron parent which sends the END message, the actual command is PID 29)
> 
> I would like to propose (and intend on submitting a patch soon) to
> always log in the same place.
> Ideally, that would be the child process, so that the PID that openlog()
> uses and the PID that cron would log are the same, but I'm not sure
> that's possible in a reliable way. Doing it in the parent is just as
> well for me, though - my original intent was trying to match CMDs to
> ENDs in the logs of a wildly active system.
> 
> Curious to hear your thoughts!

Sounds good to me!
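
For the use case mentioned in the quoted message (matching CMDs to ENDs on a busy system), a log-side pairing could look like the Python sketch below. It assumes the log format shown in the quoted examples with PID logging enabled, i.e. both messages come from the same process and carry the child PID in brackets; the regular expression and function name are illustrative, not part of cron.

```python
import re

# Matches e.g.: CRON[27]: (root) CMD ([28] /tmp/runner.sh >>/tmp/runner.log)
LINE = re.compile(
    r"CRON\[(?P<logger>\d+)\]: \((?P<user>[^)]+)\) "
    r"(?P<kind>CMD|END) \((?:\[(?P<child>\d+)\] )?(?P<cmd>.*)\)$"
)

SAMPLE = [
    "Apr  8 10:17:56 e02fc37faf65 CRON[27]: (root) CMD ([28] /tmp/runner.sh >>/tmp/runner.log)",
    "Apr  8 10:19:12 e02fc37faf65 CRON[27]: (root) END ([28] /tmp/runner.sh >>/tmp/runner.log)",
]


def pair_jobs(lines):
    """Return (CMD line, END line) pairs that share logger PID, child PID and command."""
    open_jobs = {}
    pairs = []
    for line in lines:
        m = LINE.search(line)
        if not m or m["child"] is None:
            continue  # lines without a logged child PID cannot be paired reliably
        key = (m["logger"], m["child"], m["cmd"])
        if m["kind"] == "CMD":
            open_jobs[key] = line
        elif key in open_jobs:
            pairs.append((open_jobs.pop(key), line))
    return pairs
```

Lines logged without the bracketed PID (the second quoted example) are skipped, since the CMD and END messages then carry different PIDs.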

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-21 Thread Christian Kastner
Control: tags -1 - moreinfo

Hi Paul,

On 2023-04-20 08:58, Paul Gevers wrote:
> Sorry for taking so long to respond (the moreinfo tag was still attached
> to the bug, so it didn't show up in my regular bts view, so please
> remove it when you reply).

done.

> On 16-03-2023 11:40, Christian Kastner wrote:
>>> Overall, the diff is a bit long (and has some irrelevant stuff), so
>>> I'm hesitant to offer t-p-u now (to avoid waiting for
>>> llvm-toolchain-15).
>>
>> Understood. Yeah, the diff is long, unfortunately, as the packaging
>> fixes accumulated over time.
> 
> That's why (especially around the freeze) we expect maintainers to keep
> track of migration and ensure they happen. You got stuck behind
> llvm-toolchain-15, but that's very unlikely to be fixed before the release.

We were actually well aware of the migration issue (it was, after all,
preventing our own migration). But that blocking RC bug appeared like an
isolated issue in llvm-toolchain-15, so we were kind of speculating on
the idea that it would eventually resolve itself in time. That bug got
overlooked out of sheer bad luck, though.

In the event that llvm-toolchain-15 will not be allowed to migrate:
there are some fixes in the current version of rocm-hipamd that really
should get into bookworm, most notably the missing libamd-comgr-dev
dependency, and the added patches.

The only way to do that with llvm-toolchain-15 from testing is by
changing the dependency libclang-rt-15-dev back to
libclang-common-15-dev (the pre-split version).

If that is an option, I could prepare an upload, and also strip out
whatever other changes you don't feel comfortable with in the larger diff.

>> Is this something that you could consider at a later point in time, if I
>> also break down the diff into more reviewable fragments (dependencies,
>> build, metadata, ...)? Because I do think that most changes are just
>> fixes of one sort or another - no features added.
> 
> I checked the diff again and I was about to propose to upload it to tpu,
> but I saw the following:
> 
> diff -Nru rocm-hipamd-5.2.3/debian/rules rocm-hipamd-5.2.3/debian/rules
> --- rocm-hipamd-5.2.3/debian/rules  2022-10-20 19:20:33.0 +
> +++ rocm-hipamd-5.2.3/debian/rules  2023-03-10 22:38:51.0 +
> 
> [...]
> +   -DHIP_PLATFORM=amd
> 
> Is that correct for the arm64 builds?

Thanks for checking! Yes, that refers to the GPU arch, not the CPU arch.
HIP code is portable in the sense that it can work with both AMD and
Nvidia GPUs.

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-27 Thread Christian Kastner
Control: tags -1 - moreinfo

Hi Paul,

sorry this took a while.

On 2023-04-22 13:34, Paul Gevers wrote:
> On 21-04-2023 23:43, Christian Kastner wrote:
>> The only way to do that with llvm-toolchain-15 from testing is by
>> changing the dependency libclang-rt-15-dev back to
>> libclang-common-15-dev (the pre-split version).
> 
> Hmm, so this complicates things. Can you do this change in unstable, or
> would it be broken in unstable?

Luckily, the newer llvm-toolchain-15 is only needed for building tests.
These aren't run (cannot be run) by buildds, so by dropping them for
now, we can drop the problematic build dependency.

And for bin:hipcc, the only binary package affected, I believe the
dependency on libclang-rt-15-dev was wrong anyway: there's a broken
upgrade path for the files that moved in the dependency. The correct
specification should be:

libclang-common-15-dev (<< 1:15.0.6-5~exp1) | libclang-rt-15-dev (>=
1:15.0.6-5~exp1)
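
To show how such an alternative is meant to resolve before and after the package split, here is a simplified Python sketch of the version comparison. It is an approximation of the dpkg ordering written for illustration; real tooling should rely on dpkg --compare-versions or python-apt instead. The installed-package dictionaries in the usage note are hypothetical.

```python
def _order(ch):
    """Sort key for one non-digit character ('' marks the end of a run)."""
    if ch == "~":
        return -1          # '~' sorts before everything, even the end
    if ch == "":
        return 0
    if ch.isalpha():
        return ord(ch)     # letters sort before other symbols
    return ord(ch) + 256


def _cmp_part(a, b):
    """Compare two upstream-version or revision strings."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit runs are compared character by character.
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            ca = a[i] if i < len(a) and not a[i].isdigit() else ""
            cb = b[j] if j < len(b) and not b[j].isdigit() else ""
            if _order(ca) != _order(cb):
                return -1 if _order(ca) < _order(cb) else 1
            i += 1
            j += 1
        # Digit runs are compared numerically.
        i2, j2 = i, j
        while i2 < len(a) and a[i2].isdigit():
            i2 += 1
        while j2 < len(b) and b[j2].isdigit():
            j2 += 1
        na, nb = int(a[i:i2] or 0), int(b[j:j2] or 0)
        if na != nb:
            return -1 if na < nb else 1
        i, j = i2, j2
    return 0


def _split(version):
    """Split 'epoch:upstream-revision' into (int epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare(v1, v2):
    """Return a negative, zero or positive number, like a cmp() function."""
    e1, u1, r1 = _split(v1)
    e2, u2, r2 = _split(v2)
    if e1 != e2:
        return -1 if e1 < e2 else 1
    return _cmp_part(u1, u2) or _cmp_part(r1, r2)


OPS = {"<<": lambda c: c < 0, "<=": lambda c: c <= 0, "=": lambda c: c == 0,
       ">=": lambda c: c >= 0, ">>": lambda c: c > 0}

# The alternative from the message above, as (package, relation, version).
HIPCC_CLANG = [("libclang-common-15-dev", "<<", "1:15.0.6-5~exp1"),
               ("libclang-rt-15-dev", ">=", "1:15.0.6-5~exp1")]


def satisfied(installed, alternatives):
    """True if any alternative is met by the installed {package: version} map."""
    return any(pkg in installed and OPS[op](compare(installed[pkg], ver))
               for pkg, op, ver in alternatives)
```

With only libclang-common-15-dev 1:15.0.6-4 installed (the pre-split state), the first alternative matches; once libclang-rt-15-dev 1:15.0.7-1 is installed, the second one does.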

>> If that is an option, I could prepare an upload, and also reduce out
>> whatever other changes you don't feel comfortable with in the larger
>> diff.
> 
> That would be good. Can you also share the minimal delta with the
> current version in unstable? I'll check if that's acceptable.

I've attached the new diff as 01 (FULL) but its d/changelog is noisy,
reflecting the ongoing development process we had in this younger library.

So I split that diff into 02 (patches) and 03 (NOT-patches), also attached.

02_rocm-hipamd-patches.diff
There were 5 patches added (Jan:4 Feb:1), and these represent fixes that
really must be in the package, but were held up by our dependency. One
patch was dropped. Many others just got DEP3 headers.

03_rocm-NOT-hipamd-patches.diff
The diff is not as large as d/changelog suggests. I've summarized all
the changes below, with (*) marking changes that really should get into
testing, and (+) marking changes that aren't strictly needed.

  * Build Depends added: llvm-15, file
  * (RC #1032677) Depends fixed: bin:libamdhip64-5, bin:libamdhip64-dev
  * Depends fixed: bin:hipcc (as described above)
  * *.install files fixed (+ one d/rules change), not-installed added
  * Build flags fixed in d/rules
  * Another RPATH removed
  * Updates to d/copyright

  + Build Depends added: rocminfo (just for tests)
  + Reduce architectures to amd64, arm64, ppc64el (the only platforms
with the necessary drivers)
  + Update Standards-Version from 4.6.1 to 4.6.2
  + autopkgtest added

Would a package with just the patches and the (*) changes be acceptable?

Best,
Christian

diff -Nru rocm-hipamd-5.2.3/debian/changelog rocm-hipamd-5.2.3/debian/changelog
--- rocm-hipamd-5.2.3/debian/changelog  2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/changelog  2023-04-25 19:50:14.0 +0200
@@ -1,3 +1,85 @@
+rocm-hipamd (5.2.3-7) UNRELEASED; urgency=medium
+
+  * hipcc: Fix Depends to enable transition from split clang package
+  * Drop building of tests, and libclang-rt-15-dev dependency
+
+ -- Christian Kastner   Tue, 25 Apr 2023 19:50:14 +0200
+
+rocm-hipamd (5.2.3-6) unstable; urgency=medium
+
+  * Reduce arch to amd64, arm64, ppc64el
+  * libamdhip64-5: Add dependency on libamd-comgr2 (Closes: #1032677)
+  * Add myself to Uploaders
+  * Fix Maintainer (same list, different name)
+
+ -- Christian Kastner   Fri, 10 Mar 2023 23:38:51 +0100
+
+rocm-hipamd (5.2.3-5) unstable; urgency=medium
+
+  * d/{libamdhip64-dev,rules}: fix version file
+Closes: #1031264
+  * add d/p/0020-replace-x86_64-with-variables.patch
+to fix build on aarch64
+  * d/control: add file to hipcc dependencies
+  * d/control: add dependencies for find_package(hip)
+Closes: #1031538
+  * add d/p/0021-fix-default-cmake-build-on-unsupported-gpus.patch
+to enable gpu arch autodetection with find_package(hip)
+  * d/not-installed: ignore doxygen docs
+  * d/p/000{4,8,9}*.patch: change hip-lang cmake files,
+to partially fix #1031540
+  * d/copyright: update copyright date
+  * d/control: add self to uploaders
+  * cleanup patch metadata
+
+ -- Cordell Bloor   Sun, 19 Feb 2023 03:51:26 -0700
+
+rocm-hipamd (5.2.3-4) unstable; urgency=medium
+
+  * d/t/hipcc: also skip when no kfd driver is loaded.
+
+ -- Étienne Mollier   Sat, 21 Jan 2023 12:54:49 +0100
+
+rocm-hipamd (5.2.3-3) unstable; urgency=medium
+
+  * d/control: build depends on libclang-rt-15-dev.
+  * d/control: hipcc depends on libclang-rt-15-dev.
+  * d/t/hipcc: add; basic script testing hipcc.
+  * d/t/hipconfig: add; script skipping hipconfig if no amdgpu is available.
+  * d/t/control: add hipcc to superficial autopkgtests.
+  * d/t/control: run the d/t/hipconfig test script instead of the command;
+this allows us to trigger conditions for when hardware is not available
+and the script has to be skipped.
+
+ -- Étienne Mollier   Wed, 18 Jan 2023 20:35:17 +0100
+
+rocm-hipamd (5.2.3-2) unstable; urgency=medium
+
+  [ Cordell Bloor ]
+

Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-25 Thread Christian Kastner
Hi Paul,

just wanted to say sorry, this is taking a while.

On 2023-04-22 13:34, Paul Gevers wrote:
>> The only way to do that with llvm-toolchain-15 from testing is by
>> changing the dependency libclang-rt-15-dev back to
>> libclang-common-15-dev (the pre-split version).
> 
> Hmm, so this complicates things. Can you do this change in unstable, or
> would it be broken in unstable?

I did not think of that, and you are right, of course. The build breaks
in unstable; the relevant files have all been moved to libclang-rt-15-dev.

However: unless I'm utterly mistaken, these files are only needed for
building tests -- which we don't run on buildds anyway. The package
builds fine without this dependency if test building is skipped, so this
could be a solution when going through unstable.

However-however: libclang-rt-15-dev is also a dependency of the produced
binary package hipcc. That makes sense, since I may want to compile a
test skipped above on my own machine, for example.

It's this dependency that makes things tricky (I'm pretty sure there's a
versioned Depends missing anyway) and I'd like to be 100% confident
before suggesting any change to this.

I'm leaving the moreinfo tag for now, and I'll remove it once this is
solved and tested thoroughly.

>> If that is an option, I could prepare an upload, and also reduce out
>> whatever other changes you don't feel comfortable with in the larger
>> diff.
> 
> That would be good. Can you also share the minimal delta with the
> current version in unstable? I'll check if that's acceptable.

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-28 Thread Christian Kastner
Hi Paul,

On 2023-04-28 17:48, Paul Gevers wrote:
> On 28-04-2023 00:58, Christian Kastner wrote:
>> So I split that diff into 02 (patches) and 03 (NOT-patches), also
>> attached.
> 
> I think you forgot to add them.

I did, sorry.

>> Would a package with just the patches and the (*) changes be acceptable?
> 
> I asked you to *also* provide the diff between *current* unstable and
> your proposal (via unstable), because "I was about to propose to upload
> it to tpu" (2023-04-20).

Sure, the 04_ attachment is the debdiff between unstable -6 and the
proposed update -7, which removes all of the less important changes that
I marked with (+) in my previous log.

I may be misunderstanding something here. I interpreted your t-p-u hint
for the case where a fix via unstable wouldn't be possible because of
the dependency issue. The proposal, however, would work via unstable.

Best,
Christian
diff -Nru rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch
--- rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch	2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0001-Clang-version-munging.patch	2023-04-25 19:50:14.0 +0200
@@ -1,9 +1,9 @@
 From: Maxime Chambonnet 
 Date: Sat, 11 Feb 2022 11:28:54 +0100
 Subject: Clang version munging
- https://github.com/ROCm-Developer-Tools/HIP/pull/2451
 
-Forwarded: yes
+Forwarded: https://github.com/ROCm-Developer-Tools/HIP/pull/2451
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/HIP/commit/0c443d12011da16a036057e0472ae59c68bc901f
 ---
  hip/bin/hipcc | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
diff -Nru rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch
--- rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch	1970-01-01 01:00:00.0 +0100
+++ rocm-hipamd-5.2.3/debian/patches/0002-fix-cmake-library-notfound-check.patch	2023-04-25 19:50:14.0 +0200
@@ -0,0 +1,47 @@
+From: Cordell Bloor 
+Date: Mon, 24 Oct 2022 00:07:40 -0400
+Subject: fix cmake library notfound check
+
+If find_library does not find the library, the given variable is
+set with a value that has a -NOTFOUND suffix. For example, the
+CLANGRT_BUILTINS variable will be set with the value
+CLANGRT_BUILTINS-NOTFOUND.
+
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/HIP/commit/d12d0ebc578601de138765ee4b1ddd2dcbc79edf
+
+---
+diff --git a/hip-config.cmake.in b/hip-config.cmake.in
+index ba3e75c..a27badc 100755
+--- a/hip-config.cmake.in
++++ b/hip-config.cmake.in
+@@ -287,7 +287,7 @@ if(HIP_COMPILER STREQUAL "clang")
+ ${HIP_CLANG_INCLUDE_PATH}/../lib/linux)
+ 
+ # Add support for __fp16 and _Float16, explicitly link with compiler-rt
+-if(CLANGRT_BUILTINS-NOTFOUND)
++if(NOT CLANGRT_BUILTINS)
+   message(FATAL_ERROR "clangrt builtins lib not found")
+ else()
+   set_property(TARGET hip::host APPEND PROPERTY INTERFACE_LINK_LIBRARIES "${CLANGRT_BUILTINS}")
+diff --git a/hip/hip-lang-config.cmake.in b/hip/hip-lang-config.cmake.in
+index 1a72643..07f24f9 100644
+--- a/hip/hip-lang-config.cmake.in
++++ b/hip/hip-lang-config.cmake.in
+@@ -94,7 +94,7 @@ find_path(HSA_HEADER hsa/hsa.h
+ /opt/rocm/include
+ )
+ 
+-if (HSA_HEADER-NOTFOUND)
++if (NOT HSA_HEADER)
+   message (FATAL_ERROR "HSA header not found! ROCM_PATH environment not set")
+ endif()
+ 
+@@ -136,7 +136,7 @@ set_property(TARGET hip-lang::device APPEND PROPERTY
+ )
+ 
+ # Add support for __fp16 and _Float16, explicitly link with compiler-rt
+-if(CLANGRT_BUILTINS-NOTFOUND)
++if(NOT CLANGRT_BUILTINS)
+ message(FATAL_ERROR "clangrt builtins lib not found")
+ else()
+   set_property(TARGET hip-lang::device APPEND PROPERTY
diff -Nru rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch
--- rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch	2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0003-hip-config.cmake.patch	2023-04-25 19:50:14.0 +0200
@@ -2,6 +2,7 @@
 Date: Thu, 27 Jan 2022 18:47:04 +0100
 Subject: hip-config.cmake
 
+Forwarded: no
 ---
  hip-config.cmake.in | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
diff -Nru rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch
--- rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch	2022-10-20 21:20:33.0 +0200
+++ rocm-hipamd-5.2.3/debian/patches/0004-hip-cmake-install.patch	2023-04-25 19:50:14.0 +0200
@@ -2,6 +2,7 @@
 Date: Tue, 8 Feb 2022 12:41:33 +0100
 Subject: hip cmake install
 
+Applied-Upstream: https://github.com/ROCm-Developer-Tools/hipamd/commit/f892306e227983a7c1943992ba70bf

Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-30 Thread Christian Kastner
Hi Paul,

On 2023-04-30 07:59, Paul Gevers wrote:
> Please go ahead with your 04_ proposal and please remove the moreinfo
> tag once the upload happened.

In -7, there was a typo that broke installability for hipcc,
specifically a version contained a second colon where a dot was expected:

> [...] | libclang-rt-15-dev (>= 1:15:0.6-5~exp1),
                                    ^^^
I went ahead with an -8 upload that changes just that one typo, and I
successfully tested all install/upgrade paths for hipcc:
  bookworm: none -> -8
  bookworm:  -1  -> -8
  unstable: none -> -8
  unstable:  -7  -> -8

I'm sorry for the noise. I'm more than puzzled how this could have snuck
in, as I tested the above upgrade paths before proposing the change.
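
A typo of this kind can be caught mechanically before an upload. The
following is only a heuristic sketch, not part of any Debian tooling: it
assumes that a version constraint with more than one colon is a mistake,
since the epoch separator is normally the only colon in a version. The
function name and the sample field are made up for illustration.

```python
import re
from typing import List

# Matches "(>= 1:2.3-4)"-style constraints and captures the version string.
_CONSTRAINT = re.compile(r"\(\s*(?:<<|<=|=|>=|>>)\s*([^)\s]+)\s*\)")


def suspicious_versions(depends: str) -> List[str]:
    """Return versions in a Depends-style field that contain more than one colon.

    Heuristic only: the epoch separator is normally the sole colon in a
    version, so a second one is most likely a typo for a dot.
    """
    return [v for v in _CONSTRAINT.findall(depends) if v.count(":") > 1]


if __name__ == "__main__":
    # Hypothetical field; only the second alternative mirrors the typo above.
    field = "libfoo-dev (>= 1.2-3) | libclang-rt-15-dev (>= 1:15:0.6-5~exp1)"
    print(suspicious_versions(field))
```

Such a check would only flag the string for a human to look at; whether
the constraint is actually satisfiable still has to be tested by
installing the package.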

Best,
Christian



Bug#1032899: unblock: rocm-hipamd/5.2.3-6

2023-04-30 Thread Christian Kastner
Control: tags -1 - moreinfo

Hi Paul,

On 2023-04-30 07:59, Paul Gevers wrote:
>> I may be misunderstanding something here. I interpreted your t-p-u hint
>> for the case where a fix via unstable wouldn't be possible because of
>> the dependency issue. The proposal, however would work via unstable.
> 
> It was for *me*. I reviewed the version in unstable and had some
> concerns. Due to the size of the debdiff, it's easier to look at the
> proposed delta to unstable (that I reviewed), than to review from scratch.

Got it.

> Please go ahead with your 04_ proposal and please remove the moreinfo
> tag once the upload happened.

Best,
Christian



Bug#1034479: [pre-approval] unblock: rocprim/5.3.3-4

2023-05-01 Thread Christian Kastner
Control: tags -1 - moreinfo

On 2023-04-26 07:46, Paul Gevers wrote:
> Ack. I'm uncomfortable with the changes in -3, in particular the
> arch:any -> arch:all and associated lib -> shared move. I propose you
> upload a version -4 which reverts -3 for now and only adds the missing
> dependency. experimental and later trixie is there to take the current
> changes.
> 
> Please remove the moreinfo tag once the package is in unstable.



Bug#1034476: librocprim-dev: Missing Depends on libamdhip64-dev

2023-04-16 Thread Christian Kastner
Package: librocprim-dev
Version: 5.3.3-3
Severity: serious

librocprim-dev needs libamdhip64-dev, but this dependency is missing
from the package.



Bug#1034479: [pre-approval] unblock: rocprim/5.3.3-4

2023-04-16 Thread Christian Kastner
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
Usertags: unblock
X-Debbugs-Cc: debian...@lists.debian.org
Control: affects -1 + src:rocprim

We'd like to get approval to unblock rocprim/5.3.3-4. It has been
prepared, but not yet uploaded.

This version in itself isn't controversial, but unfortunately the
previous version 5.3.3-3 did not yet migrate to testing, so I'm asking
for pre-approval to upload 5.3.3-4 and allow it to eventually migrate, and
otherwise guidance for an alternative solution.

[ Reason ]
Mainly to close #1034476, a missing dependency on a package key to
librocprim-dev.

The other notable changes (from -2 to -3) are: adding hardening flags,
and changing from arch:any to arch:all, as it always should have been
(header-only library).

[ Impact ]
Users installing this package will not be able to use it, unless they
discover this missing dependency by themselves.

[ Tests ]
autopkgtests are implemented but not part of this package yet. This is
mostly because they require GPU access to work.

An upload with autopkgtest packages enabled (but skipped on systems
lacking the hardware) will go through experimental once this request has
been resolved.
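
The skip-on-missing-hardware behaviour could look roughly like the sketch
below. It assumes the test is declared with the "skippable" restriction,
under which autopkgtest treats exit status 77 as a skip, and it assumes
that the presence of /dev/kfd is a good enough proxy for a usable GPU.
The real test scripts may check differently.

```python
import os

SKIP = 77  # reported as a skip by autopkgtest under "Restrictions: skippable"


def exit_status(kfd_path: str = "/dev/kfd") -> int:
    """Return 77 (skip) if the kfd device node is absent, else 0 (go ahead)."""
    return 0 if os.path.exists(kfd_path) else SKIP


if __name__ == "__main__":
    # A real test script would call sys.exit(exit_status()) before touching
    # the GPU; here the status is only printed.
    print(exit_status())
```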

[ Risks ]
None. The only user-visible changes are the hardening flags, and they
passed on our end.

[ Checklist ]
  [X] all changes are documented in the d/changelog
  [X] I reviewed all changes and I approve them
  [X] attach debdiff against the package in testing

unblock rocprim/5.3.3-4
diff -Nru rocprim-5.3.3/debian/changelog rocprim-5.3.3/debian/changelog
--- rocprim-5.3.3/debian/changelog  2023-01-04 10:09:08.0 +0100
+++ rocprim-5.3.3/debian/changelog  2023-04-16 13:40:58.0 +0200
@@ -1,3 +1,22 @@
+rocprim (5.3.3-4) unstable; urgency=medium
+
+  * Add libamdhip64-dev to Depends (Closes: #1034476)
+  * Fix Maintainer name
+  * Add myself to Uploaders
+
+ -- Christian Kastner   Sun, 16 Apr 2023 13:40:58 +0200
+
+rocprim (5.3.3-3) unstable; urgency=medium
+
+  * Move cmake files to /usr/share.
+  * d/rules: drop override for debug symbols
+  * d/control: update standards version to 4.6.2
+  * d/control: library is arch-independent
+  * d/rules: enable hardening flags for tests
+  * d/rules: enable gfx1010 and gfx1011 for tests
+
+ -- Cordell Bloor   Mon, 06 Mar 2023 00:55:17 -0700
+
 rocprim (5.3.3-2) unstable; urgency=medium
 
   * d/rules: add rules to handle rocm-cmake >= 5.3
diff -Nru rocprim-5.3.3/debian/control rocprim-5.3.3/debian/control
--- rocprim-5.3.3/debian/control  2022-11-17 20:47:12.0 +0100
+++ rocprim-5.3.3/debian/control  2023-04-16 13:40:58.0 +0200
@@ -2,26 +2,27 @@
 Section: devel
 Homepage: https://github.com/rocmsoftwareplatform/rocprim
 Priority: optional
-Standards-Version: 4.6.1
+Standards-Version: 4.6.2
 Vcs-Git: https://salsa.debian.org/rocm-team/rocprim.git
 Vcs-Browser: https://salsa.debian.org/rocm-team/rocprim
-Maintainer: ROCm Team 
+Maintainer: Debian ROCm Team 
 Uploaders: Maxime Chambonnet ,
Cordell Bloor ,
+   Christian Kastner ,
 Build-Depends: debhelper-compat (= 13),
cmake,
hipcc,
libamd-comgr-dev,
libhsa-runtime-dev,
rocminfo,
-   rocm-cmake,
+   rocm-cmake (>= 5.3.0),
libgtest-dev 
 Rules-Requires-Root: no
 
 Package: librocprim-dev
 Section: libdevel
-Architecture: any
-Depends: ${misc:Depends}
+Architecture: all
+Depends: ${misc:Depends}, libamdhip64-dev
 Description: parallel primitives for GPU-accelerated code - headers
  rocPRIM is a header-only library providing HIP parallel primitives for
  developing performant GPU-accelerated code on the AMD ROCm platform.
diff -Nru rocprim-5.3.3/debian/librocprim-dev.install rocprim-5.3.3/debian/librocprim-dev.install
--- rocprim-5.3.3/debian/librocprim-dev.install 2022-10-22 16:53:03.0 +0200
+++ rocprim-5.3.3/debian/librocprim-dev.install 2023-04-16 13:40:58.0 +0200
@@ -1,2 +1,2 @@
 usr/include/rocprim
-usr/lib/cmake/rocprim
+usr/share/cmake/rocprim
diff -Nru rocprim-5.3.3/debian/rules rocprim-5.3.3/debian/rules
--- rocprim-5.3.3/debian/rules  2023-01-04 10:09:08.0 +0100
+++ rocprim-5.3.3/debian/rules  2023-04-16 13:40:58.0 +0200
@@ -1,15 +1,17 @@
 #!/usr/bin/make -f
 export CXX=hipcc
-export DEB_BUILD_MAINT_OPTIONS = hardening=-all
+export DEB_BUILD_MAINT_OPTIONS = hardening=+all
 export VERBOSE=1
 #export AMD_LOG_LEVEL=4
 
-%:
-   dh $@ -Scmake
+# filter incompatible options from affecting device code
+CFLAGS   := $(subst -fstack-protector-strong,-Xarch_host -fstack-protector-strong,$(CFLAGS))
+CXXFLAGS := $(subst -fstack-protector-strong,-Xarch_host -fstack-protector-strong,$(CXXFLAGS))
 
 CMAKE_FLAGS = \
-DCMAKE_BUILD_TYPE=Release \
-   -DCMAKE_INSTALL_LIBDIR=lib \
+   -DCMAKE_INSTALL_LIBDIR=share \
+   -DAMDGPU_TARGETS="gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1

Bug#1037322: amqp-tools: Process leaks authentication data

2023-06-15 Thread Christian Kastner
Control: tag -1 fixed-upstream

On 2023-06-11 12:28, Christian Kastner wrote:
> Package: amqp-tools
> Version: 0.11.0-1
> Severity: grave
> Tags: security
> Forwarded: https://github.com/alanxz/rabbitmq-c/issues/575
> 
> When passing authentication data with either --password or --url, the
> data is exposed in the process list, where it can be seen by any user.
> 
> Example:
>   $ pgrep -a amqp-consume
>   62287 amqp-consume --url amqp://user:pass@192.168.0.1 --queue=myqueue
> 
> This is an upstream issue. I've filed a pull request upstream that adds
> an option --authfile with which authentication data can be read from a file.

A patch for this has been merged upstream:

https://github.com/alanxz/rabbitmq-c/commit/463054383fbeef889b409a7f843df5365288e2a0

Best,
Christian



Bug#1022702: [pkg-gnupg-maint] Bug#1022702: gnupg: Migrating packaging from 2.2.x to "stable" 2.3.x

2023-07-12 Thread Christian Kastner
Hi Daniel,

On 2023-06-12 17:01, Sune Stolborg Vuorela wrote:
> Any chance you can give Andreas a go ahead to push a newer Gnupg2 to at least 
> experimental, or preferably unstable ?

I, too, would appreciate a newer version. It turns out that in versions
prior to 2.3, the 'kdf-setup' option with cards does not work [1]. At
least, that was the case with both Yubikeys and Nitrokeys here on my end.

Best,
Christian

[1] https://dev.gnupg.org/T3891#142195



Bug#1036885: unblock: hipsparse/5.3.3+dfsg-2

2023-06-01 Thread Christian Kastner
control: tags -1 - moreinfo

Hi Paul,

On 2023-06-01 08:54, Paul Gevers wrote:
> Please upload hipsparse to tpu (targeting bookworm in the changelog)
> with no other changes than a changelog entry on top of what you have in
> unstable. Please use the version number 5.3.3+dfsg-2~deb12u1.

Thank you for considering this.

> Remove the moreinfo tag once the upload has happened. The upload window
> is tight. The *migration* needs to happen before Sunday.

Done. I hope I made it into the time window.

Best,
Christian



Bug#1036885: unblock: hipsparse/5.3.3+dfsg-2

2023-05-31 Thread Christian Kastner
Hi Graham,

On 2023-05-31 08:58, Graham Inggs wrote:
> Hi Christian
> 
> On Sun, 28 May 2023 at 18:48, Christian Kastner  wrote:
>> unblock hipsparse/5.3.3+dfsg-2
> 
> The debdiff looks good to me, however the migration of
> hipsparse/5.3.3+dfsg-2 appears to be blocked by rocsparse/5.3.0+dfsg-3
> [1].
>
> Migrates after: rocsparse

I didn't notice this because I didn't expect this, and to be honest I'm
still a bit confused: I can't see why rocsparse 5.3.0+dfsg-3 would block
hipsparse? The Depends and Build-Depends aren't versioned.

> Migration status for hipsparse (5.3.3+dfsg-1 to 5.3.3+dfsg-2):
> BLOCKED: Needs an approval (either due to a freeze, the source suite
> or a manual hint)
> Issues preventing migration:
> ∙ ∙ Not touching package due to block request by freeze (Follow the
> freeze policy when applying for an unblock)
> ∙ ∙ Too young, only 2 of 5 days old
> ∙ ∙ Build-Depends(-Arch): hipsparse rocsparse
> ∙ ∙ Depends: hipsparse rocsparse
> 
> I don't see an unblock request for rocsparse/5.3.0+dfsg-3, would you
> file one please?

I'd be happy to, but the debdiff for rocsparse/5.3.0+dfsg-3 to -2 would
be a bit larger than for hipsparse; this is the changelog:

> * Update patch DEP-3 metadata fields.
> * d/rules: use DWARF 4 debug symbols
> * d/rules: enable hardening flags
> * d/rules: enable gfx1010 and gfx1011
> * Add d/p/0003-fix-oob-access-in-rocsparse-test.patch
>   to fix out-of-bound accesses in test suite.
> * Reduce arch to amd64, arm64, ppc64el

There's nothing dramatic in there, and the changes have been in unstable
for almost 3 months now, so we would be fine with letting that migrate
if that's the call.

I'd also be happy to prepare an upload with some of the changes reduced,
but I'm not sure how that would work on your end, schedule-wise.

Anyway, perhaps there is a simpler resolution to this, namely the
rocsparse block just being a false positive.

Best,
Christian



Bug#1036885: unblock: hipsparse/5.3.3+dfsg-2

2023-05-31 Thread Christian Kastner
On 2023-05-31 19:28, Adam D. Barratt wrote:
> In the versions in testing, both packages only built for amd64. In
> unstable, they have also built for arm64. Migrating the arm64 hipsparse
> binaries from unstable therefore requires migrating a version of
> rocsparse with arm64 binaries.

Oh, that's a good catch, never thought of that, mainly because in
practice, we only look at amd64. This is a rather new ecosystem and
we're still ironing out the kinks.
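
To make the architecture interaction concrete, here is a toy model. It is
not how the migration software works internally; it only assumes that a
binary can migrate on an architecture if each of its dependencies is
available on that architecture in testing. The arch sets mirror the
situation described above, and the function and parameter names are
invented.

```python
from typing import Dict, Set


def blocking_deps(
    candidate_archs: Set[str],
    deps: Set[str],
    testing_archs: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Map each dependency to the architectures on which it is missing in testing."""
    missing = {}
    for dep in sorted(deps):
        gap = candidate_archs - testing_archs.get(dep, set())
        if gap:
            missing[dep] = gap
    return missing


if __name__ == "__main__":
    # hipsparse in unstable built on amd64 and arm64;
    # rocsparse in testing exists on amd64 only.
    print(blocking_deps({"amd64", "arm64"}, {"rocsparse"}, {"rocsparse": {"amd64"}}))
```

In this model the gap disappears either when rocsparse gains arm64
binaries in testing or when hipsparse stops building on arm64.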

A successful build on arm64 is a bit annoying, as we don't expect many
users there -- I'd be surprised if one manages to even get the required
mainboard.

I'm willing to do what it takes to get this fixed in testing, but I'm
not sure which solution, if any, is agreeable to the RT:
  (1) Request an unblock for the rocsparse/5.3.0+dfsg-3 as-is
  (2) Re-upload hipsparse with a reduced arch: amd64
  (3) Prepare new (minimal debdiff) upload for rocsparse, file unblock
  request
  (4) Remove the arm64 binaries (is that even possible?)
  (5) Fix this in the first point release
  (6) Alternatives?

Please let me know what, if any, option you'd prefer.

I'm aware that we are shortly before the release and that this might
limit the available options.

Best,
Christian



Bug#1037322: amqp-tools: Process leaks authentication data

2023-06-11 Thread Christian Kastner
Package: amqp-tools
Version: 0.11.0-1
Severity: grave
Tags: security
Forwarded: https://github.com/alanxz/rabbitmq-c/issues/575

When passing authentication data with either --password or --url, the
data is exposed in the process list, where it can be seen by any user.

Example:
  $ pgrep -a amqp-consume
  62287 amqp-consume --url amqp://user:pass@192.168.0.1 --queue=myqueue

This is an upstream issue. I've filed a pull request upstream that adds
an option --authfile with which authentication data can be read from a file.
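
The exposure is easy to reproduce without amqp-tools. The sketch below is
Linux-specific and assumes /proc is mounted without hidepid restrictions.
It starts a harmless child process with a made-up URL on its command line
and reads the arguments back the way any other local user could.

```python
import subprocess
import sys
from typing import List


def read_cmdline(pid: int) -> List[str]:
    """Return the argument vector of a process as exposed by /proc/<pid>/cmdline."""
    with open(f"/proc/{pid}/cmdline", "rb") as fh:
        raw = fh.read()
    # Arguments are NUL-separated, with a trailing NUL after the last one.
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


def demo(url: str) -> List[str]:
    """Start a sleeping child with `url` as an argument and read its argv back."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)", "--url", url]
    )
    try:
        return read_cmdline(child.pid)
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    # Placeholder credentials, not taken from any real system.
    print(demo("amqp://user:pass@192.0.2.1"))
```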

Best,
Christian



Bug#1036885: unblock: hipsparse/5.3.3+dfsg-2

2023-05-28 Thread Christian Kastner
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
Usertags: unblock
X-Debbugs-Cc: hipspa...@packages.debian.org
Control: affects -1 + src:hipsparse

Please unblock package hipsparse

[ Reason ]
hipsparse is missing explicit dependencies on libamdhip64-dev.

[ Impact ]
Users installing libhipsparse-dev will not be able to use it without
also installing libamdhip64-dev, and it is not immediately made clear
what the actual cause of the error is.

[ Tests ]
This package does not yet have autopkgtests but in this particular case,
the change is minimal and only affects d/control.

[ Risks ]
None, compared to the previous release -1.

[ Checklist ]
  [X] all changes are documented in the d/changelog
  [X] I reviewed all changes and I approve them
  [X] attach debdiff against the package in testing

[ Other info ]
None.

unblock hipsparse/5.3.3+dfsg-2
diff -Nru hipsparse-5.3.3+dfsg/debian/changelog hipsparse-5.3.3+dfsg/debian/changelog
--- hipsparse-5.3.3+dfsg/debian/changelog   2023-01-24 11:35:25.0 +0100
+++ hipsparse-5.3.3+dfsg/debian/changelog   2023-05-28 17:17:36.0 +0200
@@ -1,3 +1,15 @@
+hipsparse (5.3.3+dfsg-2) unstable; urgency=medium
+
+  * Team upload.
+
+  [ Cordell Bloor ]
+  * d/control: explicitly depend on libamdhip64-dev
+hipsparse.h includes hip/hip_complex.h and hip/hip_runtime.h, so users
+must have the hip headers installed to use the hipsparse headers.
+(Closes: #1035789)
+
+ -- Christian Kastner   Sun, 28 May 2023 17:17:36 +0200
+
 hipsparse (5.3.3+dfsg-1) unstable; urgency=medium
 
   * Initial release. (Closes: #1023092)
diff -Nru hipsparse-5.3.3+dfsg/debian/control hipsparse-5.3.3+dfsg/debian/control
--- hipsparse-5.3.3+dfsg/debian/control 2023-01-24 11:35:25.0 +0100
+++ hipsparse-5.3.3+dfsg/debian/control 2023-05-28 17:17:36.0 +0200
@@ -15,6 +15,7 @@
hipcc,
libamd-comgr-dev,
libhsa-runtime-dev,
+   libamdhip64-dev,
librocsparse-dev,
libgtest-dev 
 Rules-Requires-Root: no
@@ -33,7 +34,8 @@
 Package: libhipsparse-dev
 Section: libdevel
 Architecture: any
-Depends: libhipsparse0 (= ${binary:Version}),${misc:Depends}, ${shlibs:Depends},
+Depends: libhipsparse0 (= ${binary:Version}), ${misc:Depends}, ${shlibs:Depends},
+ libamdhip64-dev
 Description: portable interface for sparse linear algebra on the GPU - headers
  hipSPARSE is a wrapper library that provides a common interface to rocSPARSE
  and cuSPARSE. The hipSPARSE library is designed to help applications using


Bug#1036887: unblock: rocrand/5.3.3-4

2023-05-28 Thread Christian Kastner
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
Usertags: unblock
X-Debbugs-Cc: rocr...@packages.debian.org
Control: affects -1 + src:rocrand

Please unblock package rocrand

Note that the changes to bookworm are minimal and the only effective
change is fixing the missing dependencies in d/control, as stated under
Reason below.

However: d/changelog is noisy because we had changes in unstable that I
reverted for this -4 release, so that the fix can go through unstable.
(We had changes in unstable, rather than experimental, due to an
extremely poor judgment call on my end. Sorry.)

[ Reason ]
rocrand is missing explicit dependencies on libamdhip64-dev.

[ Impact ]
Users installing librocrand-dev or libhiprand-dev will not be able to
use them without also installing libamdhip64-dev, and it is not
immediately made clear what the actual cause of the error is.

[ Tests ]
This package does not yet have autopkgtests but in this particular case,
the change is minimal and only affects d/control.

[ Risks ]
None, compared to the previous release in bookworm -1.

[ Checklist ]
  [X] all changes are documented in the d/changelog
  [X] I reviewed all changes and I approve them
  [X] attach debdiff against the package in testing

[ Other info ]
None.

unblock rocrand/5.3.3-4
diff -Nru rocrand-5.3.3/debian/changelog rocrand-5.3.3/debian/changelog
--- rocrand-5.3.3/debian/changelog  2023-02-07 08:06:45.0 +0100
+++ rocrand-5.3.3/debian/changelog  2023-05-28 18:25:03.0 +0200
@@ -1,3 +1,33 @@
+rocrand (5.3.3-4) unstable; urgency=medium
+
+  * Temporarily revert fixes unfit for bookworm.
+Specifically, revert all changes from after 5.3.3-1.
+
+  * Add missing dependency on libamdhip64-dev (Closes: #1035784, #1035787)
+
+ -- Christian Kastner   Sun, 28 May 2023 18:25:03 +0200
+
+rocrand (5.3.3-3) unstable; urgency=medium
+
+  * Upload to unstable.
+
+ -- Christian Kastner   Sun, 16 Apr 2023 22:45:08 +0200
+
+rocrand (5.3.3-3~exp1) experimental; urgency=medium
+
+  * Add myself to Uploaders
+  * Fix Maintainer name
+  * Add packages librocrand1-test, libhiprand1-test providing autopkgtests
+
+ -- Christian Kastner   Thu, 13 Apr 2023 23:41:30 +0200
+
+rocrand (5.3.3-2) unstable; urgency=medium
+
+  * d/rules: enable hardening flags
+  * d/rules: enable gfx1010 and gfx1011
+
+ -- Cordell Bloor   Mon, 06 Mar 2023 00:41:11 -0700
+
 rocrand (5.3.3-1) unstable; urgency=medium
 
   * d/{watch,gbp.conf}: recombine with hiprand as MUT
diff -Nru rocrand-5.3.3/debian/control rocrand-5.3.3/debian/control
--- rocrand-5.3.3/debian/control  2023-02-07 07:22:45.0 +0100
+++ rocrand-5.3.3/debian/control  2023-05-28 18:25:03.0 +0200
@@ -14,6 +14,7 @@
hipcc,
git,
libamd-comgr-dev,
+   libamdhip64-dev,
libhsa-runtime-dev,
patchelf,
rocminfo,
@@ -38,7 +39,10 @@
 Package: librocrand-dev
 Section: libdevel
 Architecture: any
-Depends: librocrand1 (= ${binary:Version}),${misc:Depends}, ${shlibs:Depends},
+Depends: librocrand1 (= ${binary:Version}),
+ libamdhip64-dev,
+ ${misc:Depends},
+ ${shlibs:Depends},
 Description: generate pseudo- and quasi-random numbers - headers
  The rocRAND project provides functions that generate pseudo-random and
  quasi-random numbers.
@@ -64,7 +68,10 @@
 Package: libhiprand-dev
 Section: libdevel
 Architecture: any
-Depends: libhiprand1 (= ${binary:Version}),${misc:Depends}, ${shlibs:Depends},
+Depends: libhiprand1 (= ${binary:Version}),
+ libamdhip64-dev,
+ ${misc:Depends},
+ ${shlibs:Depends},
 Description: wrapper library to port from cuRAND applications to HIP - headers
  The rocRAND project includes a wrapper library called hipRAND which allows
  user to easily port CUDA applications that use cuRAND library to the HIP


Bug#1042945: rabbitmq-server: Please support conf.d drop-in directory

2023-08-03 Thread Christian Kastner
Package: rabbitmq-server
Version: 3.10.8-1.1
Severity: wishlist
Tags: patch

Hi,

it would be nice to have a /etc/rabbitmq/conf.d directory. This is
supported by upstream [1].

I haven't tried this, but postinst looks simple enough, and the attached
patch should accomplish this.

Best,
Christian

[1] https://www.rabbitmq.com/configure.html#config-confd-directory
diff -Nru rabbitmq-server-3.10.8-current/debian/rabbitmq-server.postinst rabbitmq-server-3.10.8/debian/rabbitmq-server.postinst
--- rabbitmq-server-3.10.8-current/debian/rabbitmq-server.postinst	2022-09-28 15:40:58.0 +0200
+++ rabbitmq-server-3.10.8/debian/rabbitmq-server.postinst	2023-08-03 09:57:43.719988265 +0200
@@ -18,8 +18,8 @@
 --disabled-login rabbitmq
 	fi
 
-	mkdir -p /etc/rabbitmq
-	chown rabbitmq:rabbitmq /etc/rabbitmq
+	mkdir -p /etc/rabbitmq/conf.d
+	chown rabbitmq:rabbitmq /etc/rabbitmq /etc/rabbitmq/conf.d
 	if [ -r /usr/share/rabbitmq/rabbitmq-env.conf ] && ! [ -e /etc/rabbitmq/rabbitmq-env.conf ] ; then
 		install -m 0644 -o rabbitmq -g rabbitmq /usr/share/rabbitmq/rabbitmq-env.conf /etc/rabbitmq/rabbitmq-env.conf
 	fi


Bug#944386: autopkgtest: can autopkgtest-build-qemu create a QEMU/KVM image without requiring superuser privileges?

2023-06-26 Thread Christian Kastner
On 2023-06-26 21:20, Johannes Schauer Marin Rodrigues wrote:
> this is not a daydream and I think we have nearly all building blocks in place
> to make all of this happen very soon! Here is a summary:
> 
>  1. You use `debvm-create --arch $foo` to create a filesystem image for any
> architecture supported by QEMU without superuser privileges. The
> debvm-create utility is a thin wrapper around mmdebstrap and passes all
> the right arguments to create this filesystem image.
> 
>  2. You use either autopkgtest-virt-qemu or autopkgtest-virt-ssh (depending on
> whether Simon prefers merge request !236 or !237) to run your autopkgtest
> through qemu with that disk image
> 
>  3. ???
> 
>  4. Profit!

As somebody who uses QEMU images for a lot of things, I'm really glad to
hear this. Finally! Getting rid of root is such a big win, in many ways.

> If you want to upgrade the disk image you can do so by using `debvm-run` which
> gives you a root shell for that disk image where you can just run "apt
> upgrade".
> 
> Now lets get to your "bonus". Once either MR !236 or !237 is accepted I will
> rewrite sbuild-qemu-create to not call autopkgtest-build-qemu (which uses
> vmdb2 which requires root) but debvm-create instead. Similarly, the sbuild-qemu
> script will be adapted to call sbuild with the autopkgtest backend and the
> right option (again depending on whether MR !236 or !237 get merged). And the
> sbuild-qemu-update tool will automate the updating so that you don't have to
> run debvm-run yourself.
> 
> Of course once either MR !236 or !237 get merged, you do not have to wait for
> me to change sbuild-qemu. Even before I have done this work you can always
> manually call sbuild with the autopkgtest backend and do your package builds
> in the QEMU image created by debvm-create without superuser privileges.

FWIW, as stated earlier, I fully support this -- in the sense that
whatever changes are necessary or convenient, just go for it. e.g.: I
chose Python for those scripts, but if Perl is more convenient, just
drop the old stuff. Or if I can help, let me know.

I just have one question about debvm though (hence Helmut in CC), which
is beyond my depth: is that kernel direct boot something akin to EFI, or
BIOS, or of its own kind?

I ask because I've managed to land on a use case where EFI boot seems to be needed
(PCI device passthrough), at least according to the various docs I found.

Best,
Christian



Bug#944386: autopkgtest: can autopkgtest-build-qemu create a QEMU/KVM image without requiring superuser privileges?

2023-06-26 Thread Christian Kastner
On 2023-06-27 00:15, Christian Kastner wrote:
> I just have one question about debvm though (hence Helmut in CC), which
> is beyond my depth: is that kernel direct boot something akin to EFI, or
> BIOS, or of its own kind?
> 
> As I've managed to land on a use case where EFI boot seems to be needed
> (PCI device passthrough), at least according to the various docs I found.

Having slept on this, I realize this isn't a big issue for sbuild-qemu
after all. It's only needed if sbuild-qemu wants to run build-time tests.

That would be a nice-to-have, but we also ship (or will ship) upstream's
extensive test suites as autopkgtests, so we have testing covered, and
that nice-to-have shouldn't be a blocker for anything.

Best,
Christian



Bug#1038139: debci-worker: Process leaks authentication data via amqp-tools

2023-06-16 Thread Christian Kastner
On 2023-06-16 17:56, Antonio Terceiro wrote:
> Note that the variable where you inserted a username and password is
> called debci_amqp_server, and was never supposed to be used for putting a
> password in plain text.

I think this is where the documentation of the --amqp option threw me
off, from debci(1):

--amqp amqp://[user:password@]hostname[:port]

> For the c.d.n deployment we use SSL client certificates for
> authentication, and that's why the variables debci_amqp_cacert,
> debci_amqp_cert, debci_amqp_key are there.

Yeah, I was guessing as much.

I just wanted to make sure that in the case of only the server
certificate + client auth/pass, there's a safer way to do that.

> IMO that is no different from any other program that takes a url as a
> command line parameter: you can pass a URL containing a username and
> password, but then that's on you.

Indeed. I only mentioned it since it's not entirely obvious for a
first-time debci user that the debci_amqp_server config option is passed
on via CLI to some other utility, rather than consumed by a library, or
similar.

Best,
Christian



Bug#1038139: debci-worker: Process leaks authentication data via amqp-tools

2023-06-15 Thread Christian Kastner


Package: debci
Version: 3.6
Severity: serious
Tags: security
X-Debbugs-Cc: Debian Security Team 

Hi,

When using authentication in AMQP connections, the username and password
supplied in the --url option to amqp-consume and amqp-publish are
exposed in the process list, see #1037322:

  $ pgrep -a amqp-consume
  62287 amqp-consume --url amqp://user:pass@192.168.0.1 --queue=myqueue

A patch has been accepted upstream to read the username and password
from a file. I assume this will make its way into amqp-tools soon.

Unless I'm mistaken, debci will need to be updated for this, e.g. by
adding a debci_amqp_pwfile config option + NEWS entry suggesting that
people migrate to this new option. I'd be happy to file an MR for this,
once amqp-tools has been fixed.
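
To make the idea concrete, here is a minimal Python sketch of the
separation such an option would allow: the URL that ends up on a command
line carries no credentials, and the username and password are handled
separately (for example, stored in a file readable only by the debci
user). The function name split_credentials is hypothetical; this is
neither debci nor amqp-tools code, just an illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, username, password).

    Hypothetical helper for illustration only. Note that urlsplit()
    lowercases the hostname and drops IPv6 brackets, so a real
    implementation would need more care.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Rebuild the URL with only host[:port] in the network location.
    clean = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    return clean, parts.username, parts.password
```

With the URL from the example above, only amqp://192.168.0.1 would need
to appear in the process list; the other two return values would go into
the protected file.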

Best,
Christian


-- System Information:
Debian Release: 11.7
  APT prefers oldstable-updates
  APT policy: (500, 'oldstable-updates'), (500, 'oldstable-security'),
(500, 'oldstable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-0.deb11.7-amd64 (SMP w/24 CPU threads; PREEMPT)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8),
LANGUAGE=en_US:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages debci depends on:
ii  adduser                     3.118
pn  amqp-tools
ii  curl                        7.88.1-7~bpo11+2
ii  dctrl-tools                 2.24-3+b1
ii  debian-archive-keyring      2021.1.1+deb11u1
ii  debootstrap                 1.0.128+nmu2~bpo11+1
ii  devscripts                  2.22.2~bpo11+1
pn  distro-info
ii  fonts-font-awesome          5.0.10+really4.7.0~dfsg-4.1
ii  jq                          1.6-2.1
ii  libjs-bootstrap             3.4.1+dfsg-2
ii  libjs-jquery                3.5.1+dfsg+~3.5.5-7
pn  libjs-jquery-flot
pn  moreutils
ii  netcat-openbsd              1.217-3
pn  parallel
ii  patchutils                  0.4.2-1
pn  retry
ii  rsync                       3.2.7-1~bpo11+1
ii  ruby                        1:2.7+2
pn  ruby-activerecord
pn  ruby-bunny
pn  ruby-erubi
pn  ruby-kaminari-activerecord
pn  ruby-pg
pn  ruby-sinatra
pn  ruby-sinatra-contrib
pn  ruby-sqlite3
pn  ruby-thor
pn  sudo

Versions of packages debci recommends:
ii  systemd-timesyncd [time-daemon]  252.5-2~bpo11+1

Versions of packages debci suggests:
pn  apt-cacher-ng  



Bug#1036123: [pre-approval] unblock: libcap2/1:2.66-4

2023-05-15 Thread Christian Kastner
(re-sent, this time to the right recipients. Apologies, it's been a long
day)

On 2023-05-15 21:15, Salvatore Bonaccorso wrote:
>> +libcap2 (1:2.66-4) unstable; urgency=medium
>> +
>> +  * Apply upstream patches for CVE-2023-2602, CVE-2023-2603
>> +
>> + -- Christian Kastner   Mon, 15 May 2023 20:34:57 +0200
> 
> We had I guess a small overlap in bugreporting, can you as well
> include bug closer for #1036114 in your upload?

Thanks for catching this, Salvatore.

Updated debdiff attached.

Best,
Christian
diff -Nru libcap2-2.66/debian/changelog libcap2-2.66/debian/changelog
--- libcap2-2.66/debian/changelog   2022-12-21 21:19:49.0 +0100
+++ libcap2-2.66/debian/changelog   2023-05-15 20:34:57.0 +0200
@@ -1,3 +1,10 @@
+libcap2 (1:2.66-4) unstable; urgency=medium
+
+  * Apply upstream patches for CVE-2023-2602, CVE-2023-2603
+    (Closes: #1036114)
+
+ -- Christian Kastner   Mon, 15 May 2023 20:34:57 +0200
+
 libcap2 (1:2.66-3) unstable; urgency=medium
 
   * Add gcc to autopkgtest for upstream tests.
diff -Nru libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch
--- libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch   1970-01-01 01:00:00.0 +0100
+++ libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch   2023-05-15 20:34:57.0 +0200
@@ -0,0 +1,39 @@
+From: "Andrew G. Morgan" 
+Date: Wed, 3 May 2023 19:18:36 -0700
+Subject: Correct the check of pthread_create()'s return value.
+
+This function returns a positive number (errno) on error, so the code
+wasn't previously freeing some memory in this situation.
+
+Discussion:
+
+  https://stackoverflow.com/a/3581020/14760867
+
+Credit for finding this bug in libpsx goes to David Gstir of
+X41 D-Sec GmbH (https://x41-dsec.de/) who performed a security
+audit of the libcap source code in April of 2023. The audit
+was sponsored by the Open Source Technology Improvement Fund
+(https://ostif.org/).
+
+Audit ref: LCAP-CR-23-01 (CVE-2023-2602)
+
+Signed-off-by: Andrew G. Morgan 
+
+Origin: upstream, https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=bc6b36682f188020ee4770fae1d41bde5b2c97bb
+---
+ psx/psx.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/psx/psx.c b/psx/psx.c
+index d9c0485..65eb2aa 100644
+--- a/psx/psx.c
++++ b/psx/psx.c
+@@ -516,7 +516,7 @@ int __wrap_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
+ pthread_sigmask(SIG_BLOCK, , NULL);
+ 
+ int ret = __real_pthread_create(thread, attr, _psx_start_fn, starter);
+-if (ret == -1) {
++if (ret > 0) {
+   psx_new_state(_PSX_CREATE, _PSX_IDLE);
+   memset(starter, 0, sizeof(*starter));
+   free(starter);
diff -Nru libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch
--- libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch   1970-01-01 01:00:00.0 +0100
+++ libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch   2023-05-15 20:34:57.0 +0200
@@ -0,0 +1,53 @@
+From: "Andrew G. Morgan" 
+Date: Wed, 3 May 2023 19:44:22 -0700
+Subject: Large strings can confuse libcap's internal strdup code.
+
+Avoid something subtle with really long strings: 1073741823 should
+be enough for anybody. This is an improved fix over something attempted
+in libcap-2.55 to address some static analysis findings.
+
+Reviewing the library, cap_proc_root() and cap_launcher_set_chroot()
+are the only two calls where the library is potentially exposed to a
+user controlled string input.
+
+Credit for finding this bug in libcap goes to Richard Weinberger of
+X41 D-Sec GmbH (https://x41-dsec.de/) who performed a security audit
+of the libcap source code in April of 2023. The audit was sponsored
+by the Open Source Technology Improvement Fund (https://ostif.org/).
+
+Audit ref: LCAP-CR-23-02 (CVE-2023-2603)
+
+Signed-off-by: Andrew G. Morgan 
+
+Origin: upstream, https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=422bec25ae4a1ab03fd4d6f728695ed279173b18
+---
+ libcap/cap_alloc.c | 12 +++-
+ 1 file changed, 7 insertions(+), 5 deletions(-)
+
+diff --git a/libcap/cap_alloc.c b/libcap/cap_alloc.c
+index c826e7a..25f9981 100644
+--- a/libcap/cap_alloc.c
++++ b/libcap/cap_alloc.c
+@@ -105,15 +105,17 @@ char *_libcap_strdup(const char *old)
+   errno = EINVAL;
+   return NULL;
+ }
+-len = strlen(old) + 1 + 2*sizeof(__u32);
+-if (len < sizeof(struct _cap_alloc_s)) {
+-  len = sizeof(struct _cap_alloc_s);
+-}
+-if ((len & 0x) != len) {
++
++len = strlen(old);
++if ((len & 0x3ff

Bug#1036123: [pre-approval] unblock: libcap2/1:2.66-4

2023-05-15 Thread Christian Kastner
Package: release.debian.org
Severity: normal
User: release.debian@packages.debian.org
Usertags: unblock
X-Debbugs-Cc: libc...@packages.debian.org
Control: affects -1 + src:libcap2

Please unblock package libcap2

This fixes two minor CVEs for which the fix was published today. The fix
consists of cherry-picking two small patches from upstream.

I'm erring on the side of caution here and asking for pre-approval, as
the issues this fixes were considered to be minor and I'm not sure
whether "CVE" by itself automatically satisfies the threshold for direct
upload.

[ Reason ]
Fix for two security issues.

[ Impact ]
Without this release, users will be left vulnerable to two minor issues.

[ Tests ]
All upstream tests passed, including those requiring root (tested within
a VM).

[ Risks ]
Little to none. The two patches are trivial.

[ Checklist ]
  [X] all changes are documented in the d/changelog
  [X] I reviewed all changes and I approve them
  [X] attach debdiff against the package in testing

unblock libcap2/1:2.66-4

diff -Nru libcap2-2.66/debian/changelog libcap2-2.66/debian/changelog
--- libcap2-2.66/debian/changelog   2022-12-21 21:19:49.0 +0100
+++ libcap2-2.66/debian/changelog   2023-05-15 20:34:57.0 +0200
@@ -1,3 +1,9 @@
+libcap2 (1:2.66-4) unstable; urgency=medium
+
+  * Apply upstream patches for CVE-2023-2602, CVE-2023-2603
+
+ -- Christian Kastner   Mon, 15 May 2023 20:34:57 +0200
+
 libcap2 (1:2.66-3) unstable; urgency=medium
 
   * Add gcc to autopkgtest for upstream tests.
diff -Nru libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch
--- libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch   1970-01-01 01:00:00.0 +0100
+++ libcap2-2.66/debian/patches/Correct-the-check-of-pthread_create-s-return-value.patch   2023-05-15 20:34:57.0 +0200
@@ -0,0 +1,39 @@
+From: "Andrew G. Morgan" 
+Date: Wed, 3 May 2023 19:18:36 -0700
+Subject: Correct the check of pthread_create()'s return value.
+
+This function returns a positive number (errno) on error, so the code
+wasn't previously freeing some memory in this situation.
+
+Discussion:
+
+  https://stackoverflow.com/a/3581020/14760867
+
+Credit for finding this bug in libpsx goes to David Gstir of
+X41 D-Sec GmbH (https://x41-dsec.de/) who performed a security
+audit of the libcap source code in April of 2023. The audit
+was sponsored by the Open Source Technology Improvement Fund
+(https://ostif.org/).
+
+Audit ref: LCAP-CR-23-01 (CVE-2023-2602)
+
+Signed-off-by: Andrew G. Morgan 
+
+Origin: upstream, https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=bc6b36682f188020ee4770fae1d41bde5b2c97bb
+---
+ psx/psx.c | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/psx/psx.c b/psx/psx.c
+index d9c0485..65eb2aa 100644
+--- a/psx/psx.c
++++ b/psx/psx.c
+@@ -516,7 +516,7 @@ int __wrap_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
+ pthread_sigmask(SIG_BLOCK, , NULL);
+ 
+ int ret = __real_pthread_create(thread, attr, _psx_start_fn, starter);
+-if (ret == -1) {
++if (ret > 0) {
+   psx_new_state(_PSX_CREATE, _PSX_IDLE);
+   memset(starter, 0, sizeof(*starter));
+   free(starter);
diff -Nru libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch
--- libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch   1970-01-01 01:00:00.0 +0100
+++ libcap2-2.66/debian/patches/Large-strings-can-confuse-libcap-s-internal-strdup-code.patch   2023-05-15 20:34:57.0 +0200
@@ -0,0 +1,53 @@
+From: "Andrew G. Morgan" 
+Date: Wed, 3 May 2023 19:44:22 -0700
+Subject: Large strings can confuse libcap's internal strdup code.
+
+Avoid something subtle with really long strings: 1073741823 should
+be enough for anybody. This is an improved fix over something attempted
+in libcap-2.55 to address some static analysis findings.
+
+Reviewing the library, cap_proc_root() and cap_launcher_set_chroot()
+are the only two calls where the library is potentially exposed to a
+user controlled string input.
+
+Credit for finding this bug in libcap goes to Richard Weinberger of
+X41 D-Sec GmbH (https://x41-dsec.de/) who performed a security audit
+of the libcap source code in April of 2023. The audit was sponsored
+by the Open Source Technology Improvement Fund (https://ostif.org/).
+
+Audit ref: LCAP-CR-23-02 (CVE-2023-2603)
+
+Signed-off-by: Andrew G. Morgan 
+
+Origin: upstream, https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=422bec25ae4a1ab03fd4d6f728695ed279173b18
+---
+ libcap/cap_alloc.c | 12 +++-
+ 1 file changed, 7 insertions(+), 5 deletions(-)
+
+diff --git a/libcap/cap_al

Bug#1036123: [pre-approval] unblock: libcap2/1:2.66-4

2023-05-16 Thread Christian Kastner
Control: tags -1 - moreinfo

On 2023-05-15 22:12, Sebastian Ramacher wrote:
> Please go ahead and remove the moreinfo tag once the package is
> available in unstable.

Done (this time with the right recipients)



Bug#1034428: Fwd: Bug#1034428: unblock: vmdb2/0.27-1

2023-05-06 Thread Christian Kastner
(re-sending to bug which I forgot to CC)

Hi,

On 2023-05-05 20:05, Helmut Grohne wrote:
> Also sbuild-qemu is a direct reverse dependency of vmdb2. I haven't
> looked deep there.

sbuild-qemu is just a simple wrapper around autopkgtest-build-qemu that
simplifies the image customization process.

sbuild-qemu doesn't need vmdb2 itself; it only depends on it because
autopkgtest only suggests vmdb2. This makes sense, as there are many
backends other than qemu for autopkgtest.

I haven't looked at the code changes in vmdb2 but I'm confident that if
autopkgtest gets updated accordingly, everything should be fine for
sbuild-qemu. Either way, I'm happy to help.

Best,
Christian



Bug#1063349: libamd-comgr2 exports wrong symbol version

2024-02-08 Thread Christian Kastner
On 2024-02-06 14:28, Christian Kastner wrote:
> As discussed in this thread [1], libamd-comgr2 exports
> amd_comgr_get_isa_count@1.8 when upstream is at @2.0.
> 
> This is because the symbol was erroneously not removed from @1.8 when it
> was added to @2.0 when the ABI changed.

Following [1] (bottom example), I updated the patch restoring the old
symbol with the suggested trickery, and together with the restored
export map, this seems to have worked.

unstable:
  $ nm -gD /usr/lib/x86_64-linux-gnu/libamd_comgr.so.2.4.0 | grep isa_count
  005cbd10 T amd_comgr_get_isa_count@@amd_comgr_1.8

experimental+updated patch:
  $ nm -gD /usr/lib/x86_64-linux-gnu/libamd_comgr.so.2.6.0 | grep isa_count
  00491d70 T amd_comgr_get_isa_count@@amd_comgr_2.0
  00491d70 T amd_comgr_get_isa_count@amd_comgr_1.8

To test this, I compiled a minimal test program using the old version.
Then I updated the library, and compiled a new test program. The old
program continued to work fine with @1.8, and the new one used @2.0.
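
For anyone wanting to check this kind of output mechanically: in nm's
output, "@@" marks the default version of a symbol (the one new links
bind to), while a single "@" marks a non-default version kept for
already-built binaries. Below is a small Python sketch that classifies
lines like the ones above. The function name is made up for this
example and it only handles the three-column "address type symbol"
lines shown here.

```python
def parse_versioned_symbols(nm_output):
    """Map symbol name -> list of (version, is_default) tuples.

    Illustrative helper only: expects 'address type symbol' lines as
    printed by `nm -gD` for defined symbols and skips anything else.
    """
    result = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        symbol = fields[2]
        if "@@" in symbol:
            # '@@' marks the default version.
            name, version = symbol.split("@@", 1)
            is_default = True
        elif "@" in symbol:
            # A single '@' marks a non-default (compatibility) version.
            name, version = symbol.split("@", 1)
            is_default = False
        else:
            continue
        result.setdefault(name, []).append((version, is_default))
    return result
```

Fed the experimental output above, this reports amd_comgr_2.0 as the
default and amd_comgr_1.8 as a non-default version of the same symbol.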

I'm not sure how portable this is, but this is localized to us anyway.

Best,
Christian

[1] https://gcc.gnu.org/wiki/SymbolVersioning



Bug#1054606: libcap2-bin: Move /sbin/getcap to /bin

2024-02-10 Thread Christian Kastner
Hi Peter,

On 2023-10-26 19:29, Peter Samuelson wrote:
> Package: libcap2-bin
> Version: 1:2.66-4
> Severity: wishlist
> 
> In my opinion, getcap(8) is useful to run as a non-root user, so it
> should be in /bin rather than /sbin.  This seems analogous to ip(8)
> from iproute2, which was moved to /bin many years ago.

this makes sense in general, but is a bit tricky.

It can't just be moved, as this would break scripts that hard-code
/usr/sbin/getcap.

A symlink wouldn't do much good.

I think what's needed is a move to /bin and a wrapper in /sbin calling
the version in /bin, but emitting a deprecation notice.
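
A rough sketch of what such a wrapper could look like, written in Python
purely for illustration (a real wrapper would more likely be a few lines
of shell, and the wording of the notice is made up):

```python
import os
import sys

NEW_PATH = "/bin/getcap"  # assumed new location, as proposed above


def deprecated_wrapper(argv, new_path=NEW_PATH, stderr=None):
    """Print a deprecation notice and return the command to run instead."""
    if stderr is None:
        stderr = sys.stderr
    print(
        f"warning: {argv[0]} is deprecated, please use {new_path}",
        file=stderr,
    )
    # Keep all original arguments, only swap the program path.
    return [new_path] + list(argv[1:])


def main():
    # Replace the current process with the real binary.
    command = deprecated_wrapper(sys.argv)
    os.execv(command[0], command)
```

Installed as /sbin/getcap with main() as its entry point, existing
scripts would keep working while users see the notice on stderr.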

However, in the end, this would need to be coordinated by upstream, and
we'd want consistency across all distros. I'll ping upstream.

Best,
Christian



Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++

2024-02-10 Thread Christian Kastner
Package: wnpp
Severity: wishlist
Owner: Christian Kastner 
X-Debbugs-Cc: debian-de...@lists.debian.org, debian...@lists.debian.org

* Package name: llama.cpp
  Version : b2116
  Upstream Author : Georgi Gerganov
* URL : https://github.com/ggerganov/llama.cpp
* License : MIT
  Programming Lang: C++
  Description : Inference of Meta's LLaMA model (and others) in pure C/C++

The main goal of llama.cpp is to enable LLM inference with minimal
setup and state-of-the-art performance on a wide variety of hardware -
locally and in the cloud.

* Plain C/C++ implementation without any dependencies
* Apple silicon is a first-class citizen - optimized via ARM NEON,
  Accelerate and Metal frameworks
* AVX, AVX2 and AVX512 support for x86 architectures
* 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for
  faster inference and reduced memory use
* Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD
  GPUs via HIP)
* Vulkan, SYCL, and (partial) OpenCL backend support
* CPU+GPU hybrid inference to partially accelerate models larger than
  the total VRAM capacity

This package will be maintained by the Debian Deep Learning Team.



Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++

2024-02-13 Thread Christian Kastner
Hi Petter,

On 2024-02-13 08:36, Petter Reinholdtsen wrote:
> I tried building the CPU edition on one machine and run it on another,
> and experienced illegal instruction exceptions.  I suspect this mean one
> need to be careful when selecting build profile to ensure it work on all
> supported Debian platforms.

yeah, that was my conclusion from my first experiments as well.

This is a problem though, since one key point of llama.cpp is to make
best use of the current hardware. If we'd target some 15-year-old amd64
lowest common denominator, we'd go against that.

In my first experiments, I've also had problems with ROCm builds on
hosts without a GPU.

I have yet to investigate if/how capabilities can be generally enabled,
and use determined at runtime.
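
As a sketch of what "determined at runtime" could mean on Linux/x86: a
launcher could read the CPU flags from /proc/cpuinfo and pick the most
capable build variant the host supports. The variant names and their
flag requirements below are hypothetical and not taken from llama.cpp;
the flag names (avx, avx2, fma, avx512f) are the ones the kernel
reports.

```python
# Hypothetical build variants, most capable first, with the CPU flags
# each one would require.
VARIANTS = [
    ("avx512", {"avx512f"}),
    ("avx2", {"avx2", "fma"}),
    ("avx", {"avx"}),
]


def cpu_flags(cpuinfo_text):
    """Return the flag set of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_variant(flags):
    """Pick the first variant whose required flags are all present."""
    for name, required in VARIANTS:
        if required <= flags:
            return name
    return "baseline"
```

On a real system one would call
pick_variant(cpu_flags(open("/proc/cpuinfo").read())) and then load or
exec the matching build.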

Another issue is that stable is clearly the wrong distribution for this.
This is a project that is continuously gaining new features, so we'd
need to use stable-updates.

> I would be happy to help getting this up and running.  Please let me
> know when you have published a git repo with the packaging rules.

I'll push a first draft soon, though it will definitely not be
upload-ready for the above reasons.

Best,
Christian



Bug#1060435: sbuild-qemu-create shows error because "zerofree" not installed

2024-02-20 Thread Christian Kastner
Hi Carles,

On 2024-01-11 12:01, Carles Pina i Estany wrote:
> It ended with:
> Exec: ['sh', '-ec', 'export AUTOPKGTEST_BUILD_QEMU=1; /usr/share/sbuild/sbuild-qemu-create-modscript "$ROOT"']
> Exec: ['zerofree', '-v', '/dev/mapper/loop0p1']
> ERROR: [Errno 2] No such file or directory: 'zerofree'
> ERROR: FileNotFoundError(2, 'No such file or directory')
> Exec: ['kpartx', '-dsv', '/srv/sbuild/qemu/unstable-autopkgtest-amd64.img.raw']
> Exec: ['losetup', '--json', '-l', '/dev/loop0']
> All went fine.
> 
> Installing "zerofree" does not show the error.
> 
> I expected "zerofree" to be in Depends or Recommends.

I'm not entirely sure how to best address this bug, as it's not a bug in
sbuild-qemu, but in its dependency vmdb2, and has recently been fixed
there (#1021341).

I suppose it can't hurt to add the dependency to sbuild-qemu, though.

Best,
Christian



Bug#1033352: sbuild: autokpgtest-virt-server needs host $HOME

2024-02-21 Thread Christian Kastner
Hi josch,

On 2024-02-21 08:02, Johannes Schauer Marin Rodrigues wrote:
> Quoting Christian Kastner (2023-03-23 09:53:05)
>> Attempting to build a package with the autopkgtest-virt-podman backend fails
>> because of what I suspect is an issue with $HOME directory handling. podman
>> needs $HOME on the host to find containers, but it defaults to
>> /sbuild-nonexistent, which I guess is meant for the target environment.
> 
> is this a duplicate of #1061388?

I *think* so, but I'm not sure.

The cause definitely seems to be the same: on the host, prior to opening
the chroot, $HOME is set to /sbuild-nonexistent, which triggers these
two bugs.

I'm not sure because I don't know if the $HOME thing above is buggy, or
it's correct (though strict) and autopkgtest-build-podman, incus, or
something else needs fixing.

For example, in #1061388, a possible fix in incus is mentioned.

From my current understanding, this wouldn't work for podman, because
its attempted use of $HOME/.config/local/*.conf is not just legitimate,
I'd actually call it required.

Best,
Christian



Bug#1064070: RFP: cxxheaderparser -- python library for parsing C++ headers

2024-02-18 Thread Christian Kastner
Control: tag -1 pending

I've uploaded this to experimental, though with the Debian Python Team
as maintainer, because although it is a dependency of the ROCm stack, it
seems to be a generally useful Python library.

Best,
Christian

On 2024-02-16 18:57, Cordell Bloor wrote:
> Package: wnpp
> Severity: wishlist
> X-Debbugs-Cc: c...@slerp.xyz, debian...@lists.debian.org
> 
> * Package name: cxxheaderparser
>   Version : 1.3.1
>   Upstream Contact: RobotPy Development Team 
> * URL : https://github.com/robotpy/cxxheaderparser
> * License : BSD-3-Clause
>   Programming Lang: Python
>   Description : Python library for parsing C++ headers
> 
> The cxxheaderparser library is used to parse syntactically valid C++
> code and operate on the results. It provides both a visitor-style
> interface to process the results as they are being parsed or the option
> of a single data structure containing all parsed information.
> 
> This library is a successor to CppHeaderParser, which is a build
> dependency of the AMD ROCm GPU profiling libraries used by PyTorch.
> Specifically, CppHeaderParser is required for roctracer and parts of
> rocm-hipamd. As CppHeaderParser has been deprecated by its authors,
> those libraries will need to be migrated to cxxheaderparser.
> 



Bug#1063349: libamd-comgr2 exports wrong symbol version

2024-02-06 Thread Christian Kastner
Package: libamd-comgr2
Version: 5.2.3-2
Severity: important

As discussed in this thread [1], libamd-comgr2 exports
amd_comgr_get_isa_count@1.8 when upstream is at @2.0.

This is because the symbol was erroneously not removed from @1.8 when it
was added to @2.0 when the ABI changed.

The discrepancy between upstream and our version needs to be resolved.

[1] https://lists.debian.org/debian-ai/2024/01/msg00087.html



Bug#1057265: cron: Uncheck return values of set*id() family functions

2023-12-02 Thread Christian Kastner
Hi Jeffrey,

On 2023-12-02 11:39, Jeffrey Bencteux wrote:
> Hi,
> 
> Both setuid() and setgid() return values are not checked in cron's code used
> to execute user-provided commands:

This issue was reported as CVE-2006-2607 and fixed a long time ago.

Here's the relevant patch:

https://sources.debian.org/src/cron/3.0pl1-162/debian/patches/fixes/Check-privilege-drop-results-CVE-2006-2607.patch/

Are you perhaps looking at the unpatched source?

Best,
Christian



Bug#1056170: libhsa-runtime64-1: ROCr must assume xnack is disabled

2023-11-24 Thread Christian Kastner
Hi Cory,

On 2023-11-23 08:35, Cordell Bloor wrote:
> On 2023-11-22 03:19, Christian Kastner wrote:
>>> The Linux kernel on Debian is built without HSA_AMD_SVM enabled. That is
>>> the KConfig for "Enable HMM-based shared virtual memory manager", which
>>> is required for xnack+ operation. The xnack feature allows some AMD GPUs
>>> to retry memory accesses that fail due to a page fault, which is used as
>>> a mechanism for migrating managed memory automatically from host to
>>> device. With xnack disabled, page faults in device code are not
>>> recoverable [1].
>> I've rebuilt our kernel with this option enabled, and the message indeed
>> went away. Great!
>>
>> This also required DEVICE_PRIVATE (and that one also suggests
>> HMM_MIRROR). I don't see any downside to these; should we request them
>> from the Kernel Team?
> 
> I suppose the downside would be that more code means more bugs. I'm not
> sure what inclusion criteria is used by the maintainers, but it seems

you linked to [1] in one of your replies. Under "Supported Hardware",
the article states:

> Not all GPUs are supported. Most GFX9 GPUs from the GCN series usually 
> support XNACK, but only APU platforms enabled it by default. On dedicated 
> graphics cards, it’s disabled by the Linux amdgpu kernel driver, possibly due 
> to stability concerns as it’s still an experimental feature.
> 
> For users of GFX10/GFX11 GPUs from the RDNA series, unfortunately, XNACK is 
> no longer supported. Only computing cards from the CDNA series has XNACK 
> support, such as Instinct MI100 and MI200 - and they also belong to the 
> GFX900 series.

I don't think the lack of official support is a problem here; evaluating
this is what we have our CI for. We could build an image with a fixed
kernel, and see what happens to tests there.

However, unlikely as it may seem, I'd still like to ask: is there any
risk of negatively affecting the graphics side of this? Can this change
somehow break a regular user's video output?

This is far-fetched, but it's not entirely inconceivable that some
external stack might rely on the current behavior.

As a workaround, I was hoping that setting HSA_XNACK=0 would disable the
check, but it didn't work on my end, unfortunately.

Best,
Christian

> [1]: https://niconiconi.neocities.org/tech-notes/xnack-on-amd-gpus/



Bug#1056667: librocthrust-tests: test failures across all architectures

2023-11-24 Thread Christian Kastner
Package: librocthrust-tests
Version: 5.3.3-5
Severity: important

All CI tests of librocthrust-tests have failed [1], with two different
failure modes.

On gfx803 and gfx906, the tests pass, but do so in under one minute,
with no test output. That cannot be right.

On all other architectures, about two minutes into the test, it seems an
infinite loop is encountered. The test eventually hits the timeout limit
and is then terminated by autopkgtest.

Filing this to track the issue.

[1] https://ci.rocm.debian.net/packages/r/rocthrust/



Bug#1056170: libhsa-runtime64-1: ROCr must assume xnack is disabled

2023-11-24 Thread Christian Kastner
Hi Cory,

thanks for clarifying, I indeed misunderstood a few things.

On 2023-11-24 16:42, Cordell Bloor wrote:
>> However, unlikely as it may seem, I'd still like to ask: is there any
>> risk of negatively affecting the graphics side of this? Can this change
>> somehow break a regular user's video output?
>>
>> This is far-fetched, but it's not entirely inconceivable that some
>> external stack might rely on the current behavior.
> 
> Yes, there is always a risk when enabling a new feature that it will
> introduce bugs. I see there's an issue on the amdgpu bug tracker with a
> user who has both an AMD GPU and an NVIDIA GPU on their system. It seems
> that HSA_AMD_SVM is causing issues with switching the NVIDIA card back
> and forth between the host driver and vfio-pci [2].

In that case, I'd like to postpone asking the Kernel Team for now, at
least until this known issue has been addressed. (I'm not concerned
about this particular use case, but rather that the issue may be a
symptom of an underlying cause with broader reach).

That doesn't mean we can't do our own experiments, in fact I like the
idea of "forking" unstable with a customized kernel more and more, call
it "unstable-amdsvm" or whatever.

The advantage of having our own APT repo is that it's pretty easy to do
things like that.

Best,
Christian



Bug#1056053: debci-worker: QEMU backend arguments cannot contain quoted arguments

2023-11-16 Thread Christian Kastner
Package: debci-worker
Version: 3.7
Severity: normal

3.7 added this nice feature where arguments can be passed to backends. However,
the --qemu-options parameter of the QEMU backend cannot be used with the
current implementation because its argument usually contains spaces, and these
get misinterpreted during expansion of "$@".

For example, given the following setting,

debci_autopkgtest_args_qemu="--ram-size 32768 --cpus 4 --qemu-options='--cpu host'"

running a test will tmpfail with

autopkgtest-virt-qemu: error: unrecognized arguments: host'

because |--qemu-options='--cpu| and |host'| get interpreted as two words,
rather than one.
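
The difference between plain whitespace splitting and quote-aware
splitting can be reproduced outside of the shell. The following Python
snippet is only an illustration of the two behaviours using the standard
shlex module; it is not how debci processes the setting, and it is not a
proposed fix for the shell code.

```python
import shlex

setting = "--ram-size 32768 --cpus 4 --qemu-options='--cpu host'"

# Plain whitespace splitting ignores the quotes, which matches the
# failure described above: the quoted argument ends up as two words.
naive_words = setting.split()

# shlex.split() applies POSIX shell quoting rules, so the quoted part
# stays together as a single word and the quotes are removed.
quoted_words = shlex.split(setting)
```

Here naive_words ends in the two items "--qemu-options='--cpu" and
"host'", while quoted_words ends in the single item
"--qemu-options=--cpu host", which is what autopkgtest-virt-qemu would
need to receive.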


A POSIXly solution doesn't immediately jump to my mind, but I thought I'd
report it for now, just to track the issue.

I don't think the other backends are affected, they don't have similar
parameters.

Best,
Christian



Bug#1056170: libhsa-runtime64-1: ROCr must assume xnack is disabled

2023-11-22 Thread Christian Kastner
Hey Cory,

On 2023-11-21 21:01, Cordell Bloor wrote:
> On 2023-11-18 00:39, Cordell Bloor wrote:
>> Each time a HIP application is executed, the rocr-runtime prints the message:
>>
>> KFD does not support xnack mode query.
>> ROCr must assume xnack is disabled.
>>
>> It is unclear to me whether something is actually wrong or not. This
>> message is emitted from a debug_print statement in amd_topology.cpp. An
>> example of this message can be found in the CI logs [1].
> 
> This is a debug message. It is guarded by NDEBUG, so it would not be
> printed if rocr were built in Release mode. There is a bit of discussion
> upstream as to whether the debug_print should instead be guarded by an
> environment variable rather than a preprocessor definition.

> The Linux kernel on Debian is built without HSA_AMD_SVM enabled. That is
> the KConfig for "Enable HMM-based shared virtual memory manager", which
> is required for xnack+ operation. The xnack feature allows some AMD GPUs
> to retry memory accesses that fail due to a page fault, which is used as
> a mechanism for migrating managed memory automatically from host to
> device. With xnack disabled, page faults in device code are not
> recoverable [1].

I've rebuilt our kernel with this option enabled, and the message indeed
went away. Great!

This also required DEVICE_PRIVATE (and that one also suggests
HMM_MIRROR). I don't see any downside to these; should we request them
from the Kernel Team?

That did remind me of another message I've seen in dmesg, repeated a
few dozen times, when some (but not all) tests are run:

amdgpu: init_user_pages: Failed to get user pages: -1

rocrand is a good example where these occur.

Despite the failure, I did not observe any negative side effects, but
the above change also did not solve this. Have you seen this message in
dmesg as well?

Best,
Christian



Bug#960729: More issues trying to create an Ubuntu focal image

2024-04-11 Thread Christian Kastner
On 2024-04-11 15:25, Paride Legovini wrote:
> On 2024-04-11 08:35, Christian Kastner wrote:
> Ubuntu did indeed switch to something else: that's netplan.io.
> On a Bionic system:
> 
> $ apt show netplan.io
> Package: netplan.io
> Version: 0.99-0ubuntu3~18.04.5
> Priority: important

Oh, that explains ifupdown moving to universe, I guess...

> So we have another option: teach setup-testbed how to configure
> netplan.

> This would be a more realistic setup for a modern Ubuntu system,
> and won't need any extra dependency outside of what debootstrap
> installs automatically.

I think "more realistic" is a strong argument for a test environment.
And if the package is already installed, adding the configuring step
seems the simplest solution.

Best,
Christian



Bug#1068748: autopkgtest-build-qemu: Ubuntu EFI VMs fail to boot

2024-04-10 Thread Christian Kastner
Package: autopkgtest
Version: 5.34
Severity: normal
Block: -1 by 1068746

When building an Ubuntu image with autopkgtest-build-qemu and using
--boot=efi, the resulting image fails to boot when running autopkgtests
on it.

An attempted manual boot with qemu-system-x86_64 shows EFI complaining
about some disk not found.

The cause for this is #1068746, vmdb2 invoking grub-install with flipped
logic on Ubuntu, related to #951766.

I've already filed an MR at vmdb2 upstream that fixes the logic, but I
thought it might be best to track the issue here as well. Please feel
free to close this bug if you think it is superfluous.

Best,
Christian



Bug#1068746: vmdb2: Flipped boolean argument breaks Ubuntu EFI boot

2024-04-10 Thread Christian Kastner
Package: vmdb2
Version: 0.28-2
Severity: normal
Forwarded: https://gitlab.com/larswirzenius/vmdb2/-/merge_requests/143

In #951766, grub's --force-extra-removable (Debian) was conditionalized
for Ubuntu grub's --no-extra-removable. But the logic is flipped: where
Debian uses --force-extra-removable, Ubuntu does this by default, so it
should simply be omitted.
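
To illustrate the intended behaviour, here is a minimal sketch in Python.
It is not the vmdb2 code and not the patch from the MR: the function name
and the `distribution` parameter are hypothetical, and only the
--force-extra-removable flag is taken from the description above.

```python
def grub_install_extra_args(distribution):
    """Return extra grub-install arguments for a distribution.

    Hypothetical helper for illustration; vmdb2's real code differs.
    """
    if distribution == "debian":
        # On Debian, the flag has to be requested explicitly.
        return ["--force-extra-removable"]
    # On Ubuntu this is the default behaviour, so no flag is added.
    return []
```

With this shape, the Ubuntu branch adds nothing at all instead of
passing a negated flag.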

Best,
Christian



Bug#1068748: autopkgtest-build-qemu: Ubuntu EFI VMs fail to boot

2024-04-11 Thread Christian Kastner
On 2024-04-10 15:29, Simon McVittie wrote:
> On Wed, 10 Apr 2024 at 11:54:02 +0200, Christian Kastner wrote:
>> I've already filed an MR at vmdb2 upstream that fixes the logic, but I
>> thought it might be best to track the issue here as well. Please feel
>> free to close this bug if you think it is superfluous.
> 
> If there's no actionable bug in autopkgtest, and a fix in vmdb2 will
> automatically fix autopkgtest without any autopkgtest code changes being
> required or desirable, then the usual way to represent that would be to
> give the vmdb2 bug report an "affects" on autopkgtest.

Good to know, thanks. I've marked vmdb2's bug as such.

How does this affect discoverability? Do "affects" bugs appear in the BTS view
of the affected package? (I'm thinking of the zerofree issue for
example, which was also a vmdb2 issue, but users reported it against
autopkgtest and sbuild-qemu).

I'll leave it open to the maintainers to determine if a code change is
desirable (eg: bump dependency) or if this bug here can be closed.

Best,
Christian



Bug#960729: More issues trying to create an Ubuntu focal image

2024-04-11 Thread Christian Kastner
Control: tags -1 - pending

On 2024-04-08 15:21, Paride Legovini wrote:
> Fixed in master by:
> 
> https://salsa.debian.org/ci-team/autopkgtest/-/merge_requests/315

Sadly, it turns out that this wasn't the fix, at least not in a wider sense.

Yes, images can be built now, but without ifupdown their network
interface is left unconfigured, and thus autopkgtests can't download
packages.

With the move of ifupdown to universe, I was assuming that Ubuntu did
things differently. The cloud images *do* things differently, namely
they have systemd-networkd. But autopkgtest allows for alternative init
systems, so we can't rely on that.

So ifupdown seems to be needed, and this poses an interesting problem
with at least the following possible solutions:

(1) Enable universe in autopkgtest-build-qemu after all, as in [1]

(2) Modify setup-commands/setup-testbed and generally move ifupdown
    installation there. This would require re-ordering stuff, as
    interface configuration is currently performed *before* APT
    configuration, when it needs to be after (otherwise ifupdown can't
    be downloaded from universe).

Note that (1) and (2) don't exclude each other. One could do (1) to fix
things short-term, then implement (2).

And of course, there could be other solutions. And though I'm doing more
and more Ubuntu stuff, I'm not yet familiar enough to propose better ideas.

Best,
Christian

PS: Apologies for not discovering this right away, I was having troubles
getting images to boot, but that's fixed now.

[1] https://salsa.debian.org/ci-team/autopkgtest/-/merge_requests/309



Bug#960729: More issues trying to create an Ubuntu focal image

2024-04-11 Thread Christian Kastner
On 2024-04-11 09:27, Paul Gevers wrote:
>> (2) Modify setup-commands/setup-testbed and generally move ifupdown
>>  installation there. This would require re-ordering stuff, as
>>  interface configuration is currently performed *before* APT
>>  configuration, when it needs to be after (otherwise ifupdown can't
>>  be downloaded from universe).
> 
> I'm against this route. I think we should require images to have
> networking to not grow the setup script even more than needed.
> Networking is nothing special, it should just be there.

I concur. From my first experiments with this, the script just felt like
the wrong place. It involves guessing the testbed type, and it would
require somewhat nuanced restructuring of the script.

And as long as universe is enabled unconditionally in the post-build
image customization by setup-command, limiting the build to main doesn't
seem to gain much.

Best,
Christian



Bug#1071000: sbuild-qemu: runs lintian on .changes file with Distribution != changelog target and lintian complains

2024-05-14 Thread Christian Kastner
Hi,

@josch, sorry, I overlooked that you referenced me earlier in this bug.

On 2024-05-14 12:12, Johannes Schauer Marin Rodrigues wrote:
> In general, you should avoid using -d or --dist with sbuild. But I see that
> sbuild-qemu always adds this option to the sbuild call. I am unable to say why
> it does that. Maybe Christian can clarify?

That's simple: because I wasn't aware of this side effect.

Internally, the distribution is set because some value is needed to
auto-guess the image name. It defaults to 'unstable' but if --dist is
set, its value is used in guessing the name.

This was actually wrong for many reasons. Another one would be that it
was passed on twice if the user used --dist.[1]

I just pushed a fix. Tested with
  $ sbuild-qemu --noexec foo
  $ sbuild-qemu --noexec --dist unstable foo
  $ sbuild-qemu --noexec --dist experimental --image /path/to/img foo

Best,
Christian

[1] Clean-up and proper tests are still on my TODO list :(



Bug#1065329: O: numpy -- Fast array facility to the Python 3 language

2024-03-14 Thread Christian Kastner
Hi Timo,

On 2024-03-14 10:06, Timo Röhling wrote:
>> Having read up on debian-python, I have misread the situation. I think
>> there needs to be a policy resolution first.
> I don't understand what you mean. The orphaning process is not tied to
> DPT policy, is it?
> 
> FWIW, I am a regular user of this package and would also like to help
> maintain it.

I underspecified -- what I meant is resolution of both policy and the
active team dynamics.

In any case, if this isn't resolved soon, I'm also happy to contribute.

Best,
Christian



Bug#1063673: ITP: llama.cpp -- Inference of Meta's LLaMA model (and others) in pure C/C++

2024-03-09 Thread Christian Kastner
Hey Petter,

On 2024-03-08 20:21, Petter Reinholdtsen wrote:
> [Christian Kastner 2024-02-13]
>> I'll push a first draft soon, though it will definitely not be
>> upload-ready for the above reasons.
> 
> Where can I find the first draft?

I've discarded the simple package and now plan another approach: a
package that ships a helper to rebuild the utility when needed, similar
to DKMS. Rationale:
  * Continuously developed upstream, no build suited for stable
  * Build optimized for the current host's hardware, which is a key
    feature. Building for our amd64 ISA standard would be absurd.
I'm open to better ideas, though.

I had to pause this primarily because of ROCm infrastructure work and
our updates to 5.7 in preparation of the gfx1100,gfx1101,gfx1102
architectures, and that is still my focus.

Incidentally, we could use some help with that, see thread at [1].
MIOpen [2] in particular is something that our ROCm stack will eventually need.

Best,
Christian

[1] https://lists.debian.org/debian-ai/2024/03/msg00029.html
[2] https://github.com/ROCm/MIOpen



Bug#1065701: rocm_agent_enumerator: crash on systems without AMD GPU

2024-03-09 Thread Christian Kastner
Control: found -1 5.2.3-3

Hi Cory,

On 2024-03-09 07:20, Cordell Bloor wrote:
> On systems without an AMD GPU, the rocm_agent_enumerator command may
> crash with an error:
> 
> Traceback (most recent call last):
>   File "/usr/bin/rocm_agent_enumerator", line 260, in <module>
> main()
>   File "/usr/bin/rocm_agent_enumerator", line 244, in main
> target_list = readFromKFD()
>   ^
>   File "/usr/bin/rocm_agent_enumerator", line 193, in readFromKFD
> for node in sorted(os.listdir(topology_dir)):
>
> FileNotFoundError: [Errno 2] No such file or directory: 
> '/sys/class/kfd/kfd/topology/nodes/'

I've been seeing this one for a long time in package builds, but it
didn't occur to me that this is a user-visible issue, too.
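
A defensive variant of the directory listing might look like the sketch
below. This is only an illustration, not the upstream fix: the function
name `list_kfd_nodes` is made up, and only the path and the
`sorted(os.listdir(...))` call come from the traceback above.

```python
import os

# Path taken from the traceback; it only exists when the KFD driver is loaded.
TOPOLOGY_DIR = "/sys/class/kfd/kfd/topology/nodes/"


def list_kfd_nodes(topology_dir=TOPOLOGY_DIR):
    """Return the sorted node entries, or an empty list if KFD is absent."""
    try:
        return sorted(os.listdir(topology_dir))
    except FileNotFoundError:
        # No AMD GPU (or no driver): report "no nodes" instead of crashing.
        return []
```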

Seen here [1], for example.

Best,
Christian

[1] 
https://buildd.debian.org/status/fetch.php?pkg=rocblas&arch=amd64&ver=5.3.3%2Bdfsg-2&stamp=1685955323&raw=0



Bug#1068022: Document the Testsuite-Triggers field

2024-03-29 Thread Christian Kastner
Package: debian-policy
Version: 4.6.2.0
Severity: wishlist

Policy 5.6.30 lists the Testsuite field, but it doesn't list the
Testsuite-Triggers field that seems to be part of Sources files and is
generated by dpkg-source >= 1.18.8.

This field is quite useful, as given my package src:foo, I can find out
which packages have autopkgtests that depend on it, and are thus in the
set of reverse dependencies that I could check for breakage.
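
As a rough sketch of that lookup, assuming the text of an uncompressed
Sources file is already at hand: the parser below is a simplified
stand-in for a proper Deb822 parser, and the function names are made up
for this example.

```python
def parse_paragraphs(text):
    """Split Deb822-style text into a list of {field: value} dicts."""
    paragraphs, current, last = [], {}, None
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current paragraph.
            if current:
                paragraphs.append(current)
            current, last = {}, None
        elif line[0] in " \t" and last:
            # Continuation line: append to the previous field.
            current[last] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            last = key.strip()
            current[last] = value.strip()
    if current:
        paragraphs.append(current)
    return paragraphs


def triggered_by(sources_text, package):
    """Return source packages whose Testsuite-Triggers lists `package`."""
    hits = []
    for para in parse_paragraphs(sources_text):
        triggers = [t.strip() for t in para.get("Testsuite-Triggers", "").split(",")]
        if package in triggers:
            hits.append(para["Package"])
    return sorted(hits)
```

Calling `triggered_by(sources_text, "foo")` would then give the source
packages whose autopkgtests depend on foo.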

I'd provide a patch based on the documentation in dsc(5), but I don't
know what the current process is. Does anyone have a link to a doc on
how to submit a change?

Best,
Christian



Bug#1068022: Document the Testsuite-Triggers field

2024-03-30 Thread Christian Kastner
Hi Sean,

On 2024-03-30 02:35, Sean Whitton wrote:
>> I'd provide a patch based on the documentation in dsc(5), but I don't
>> know what the current process is. Does anyone have a link to a doc on
>> how to submit a change?
> 
> There is a chapter of Policy regarding the Policy Changes Process.

Ironically enough, I did find that one before I submitted, but I didn't
see this as a Policy change, instead as documenting a change that had
already happened. Which, in hindsight, was of course an erroneous
assumption.

I'll continue on the proper process after the three-day-weekend here in
Austria.

Best,
Christian



Bug#1061816: mmdebstrap-autopkgtest-build-qemu VM image cannot be updated with sbuild-qemu-update

2024-03-28 Thread Christian Kastner
Hi josch,

On 2024-03-28 11:28, Johannes Schauer Marin Rodrigues wrote:
> I think sbuild-qemu-boot and sbuild-qemu-update should do the same as
> autopkgtest did here:
> 
> https://salsa.debian.org/ci-team/autopkgtest/-/commit/7a4954ded0f24221ac34ca0aaf10f3f9b083afa2

Thanks for pointing this out! Applied (though stupidly, with two commits.)

Though if I understand #1061816 correctly, then this should probably be
a separate bug, right? (Wanted to check before I clone)

Best,
Christian



Bug#1068199: librocfft0: callback test failures on gfx900 and gfx1030

2024-04-02 Thread Christian Kastner
Hey Cory,

thank you for the analysis. I'll try to reproduce and lock this down on
my end, too.

Best,
Christian

On 2024-04-02 00:35, Cordell Bloor wrote:
> I tried to reproduce the rocfft callback bug with a W6800 (gfx1030). I
> used a Debian Unstable docker container on an Ubuntu Noble host, but the
> tests all passed. This made me realize that the test failure pattern on
> the CI is that all the qemu-based workers are failing and all the
> podman-based workers are passing.
> 
> This issue seems to be somehow related to the qemu+rocm autopkgtest
> environment.



Bug#1033352: Not pending

2024-04-07 Thread Christian Kastner
Control: tags -1 - pending

I closed the proposed MR implementing this because, in light of the
recently published workaround, it is too invasive.



Bug#1068588: redesign of how autopkgtest talks to the testbed

2024-04-07 Thread Christian Kastner
Hi Paul,

On 2024-04-07 16:42, Paul Gevers wrote:
> The following issues have come up several times over the years. I
> propose to discuss them in one place (this bug report) to define the
> solution strategy. I haven't gone through all the details myself, so I
> might be thinking in the wrong direction, please correct me if you think
> so. Please also voice agreement, if not on the details, then on the
> general concept.

I'm not a maintainer but I use autopkgtest a lot. I hope it's OK if I
contribute input.

I generally agree with all of what you said, and would add the following:

> * [mostly orthogonal] currently the autopkgtest code has a lot of state
> in a non-Pythonic way. Reasoning about what goes on and debugging
> autopkgtest code flow is non-trivial.

It is indeed very difficult to keep track of what's going on. A lot of
state is kept in/communicated through globals, and it can be challenging
to remember which role the running threads are playing, and in which
relationship.

(smcv put this into historical context once.)

> Solution direction
> ==================
> * handle communication between runner/autopkgtest and the virt servers
> and the ssh driver via Python classes instead of the text based
> protocol. Do this in a "plugin" friendly way such that backends can
> still easily be used without changes to src:autopkgtest.

I would add to this that testbed I/O and test I/O could benefit from
separate communications channels.

Example: the Debian ROCm Team requested the --timeout-poweroff option
for the QEMU backend because the hardware we pass in needs a clean
shutdown procedure. But it is not possible to trigger a shutdown when a
test is running, because the I/O channel is being waited on for
output. So a timeout still ends with a SIGTERM of the testbed.

Best,
Christian



Bug#1068022: Document the Testsuite-Triggers field

2024-04-05 Thread Christian Kastner
Hi again,

On 2024-03-29 20:30, Christian Kastner wrote:
> Policy 5.6.30 lists the Testsuite field, but it doesn't list the
> Testsuite-Triggers field that seems to be part of Sources files and is
> generated by dpkg-source >= 1.18.8.
> 
> This field is quite useful, as given my package src:foo, I can find out
> which packages have autopkgtests that depend on it, and are thus in the
> set of reverse dependencies that I could check for breakage.

I've read up on the change process [1], and I guess my proposal to
submit a patch was too far into the process.

Thus, I take a step back, and seek discussion first.

In addition to what I've said above, I think documenting this field
would not only enhance discoverability, but give more weight to it for
tooling that makes use of these fields.

For discussion context, I'd like to quote dsc(5) on this field:
> Testsuite-Triggers: package-list
> 
> This field declares the comma-separated union of all test dependencies 
> (Depends fields in debian/tests/control file), with all restrictions removed, 
> and OR dependencies flattened (that is, converted to separate AND 
> relationships), except for binaries generated by this source package and its 
> meta-dependency equivalent @.
> 
> Rationale: this field is needed because otherwise to be able to get the test 
> dependencies, each source package would need to be unpacked.
Best,
Christian

[1] https://www.debian.org/doc/debian-policy/ap-process.html



Bug#1033352: sbuild: autokpgtest-virt-server needs host $HOME

2024-04-05 Thread Christian Kastner
On 2024-02-21 09:22, Christian Kastner wrote:
> On 2024-02-21 08:02, Johannes Schauer Marin Rodrigues wrote:
>> is this a duplicate of #1061388?
> 
> I *think* so, but I'm not sure.
> 
> The cause definitely seems to be the same: on the host, prior to opening
> the chroot, $HOME is set to /sbuild-nonexistent, which triggers these
> two bugs.
> 
> I'm not sure because I don't know if the $HOME thing above is buggy, or
> it's correct (though strict) and autopkgtest-build-podman, incus, or
> something else needs fixing.
> 
> For example, in #1061388, a possible fix in incus is mentioned.
> 
> From my current understanding, this wouldn't work for podman, because
> its attempted use of $HOME/.config/local/*.conf is not just legitimate,
> I'd actually call it required.

I'm happy to say that I've found a workaround.

podman needs $HOME for runtime configuration and storage location, but
$HOME cannot be used even when put in ENVIRONMENT_FILTER.

However: podman also looks into the XDG_ directories, which *can* be
added to ENVIRONMENT_FILTER.

So by

(1) adding

  XDG_CACHE_HOME
  XDG_CONFIG_HOME
  XDG_DATA_HOME

to $environment_filter in .sbuildrc, and

(2) assuming that one has created a suitable container image with
autopkgtest-build-podman,

(3) one can run an autopkgtest with the podman backend as follows:

  $ export XDG_DATA_HOME="${XDG_DATA_HOME:-$HOME/.local/share}"
  $ export XDG_CONFIG_HOME="${XDG_CONFIG_HOME:-$HOME/.config}"
  $ export XDG_CACHE_HOME="${XDG_CACHE_HOME:-$HOME/.cache}"

  $ sbuild \
--chroot-mode=autopkgtest \
--autopkgtest-virt-server=podman \
--autopkgtest-virt-server-opt= \
--purge-deps=never \
--apt-update --apt-upgrade \
--dist  \
<...>

If not for one missing variable->path mapping, one could even explicitly
set podman-specific variables, which would make integration much easier
because they could not affect any other part of sbuild.
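
For wrappers that launch sbuild from Python rather than from a shell,
the three exports in step (3) can be mirrored as in the sketch below.
The helper name is hypothetical; the fallback paths are the ones used in
the exports above.

```python
# Fallbacks relative to $HOME, matching the shell exports above.
XDG_DEFAULTS = {
    "XDG_DATA_HOME": ".local/share",
    "XDG_CONFIG_HOME": ".config",
    "XDG_CACHE_HOME": ".cache",
}


def with_xdg_defaults(environ):
    """Return a copy of `environ` with unset or empty XDG_* variables filled in."""
    env = dict(environ)
    home = env.get("HOME", "")
    for name, relative in XDG_DEFAULTS.items():
        # Same semantics as ${VAR:-default}: replace unset *or* empty values.
        if not env.get(name):
            env[name] = home + "/" + relative
    return env
```

The result can be passed as the `env` argument of `subprocess.run` when
invoking sbuild.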



Bug#1068199: librocfft0: callback test failures on gfx900 and gfx1030

2024-04-04 Thread Christian Kastner
On 2024-04-02 00:35, Cordell Bloor wrote:
> I tried to reproduce the rocfft callback bug with a W6800 (gfx1030). I
> used a Debian Unstable docker container on an Ubuntu Noble host, but the
> tests all passed. This made me realize that the test failure pattern on
> the CI is that all the qemu-based workers are failing and all the
> podman-based workers are passing.
> 
> This issue seems to be somehow related to the qemu+rocm autopkgtest
> environment.

The issue is already visible with AMD_LOG_LEVEL=1, it's the lack of PCIe
atomics:

> half epsilon: 0.000977  single epsilon: 3.75e-05  double epsilon: 1e-15
> Random seed: 1392424582
> rocFFT version: 1.0.23.
> Note: Google Test filter = 
> rocfft_UnitTest.default_load_callback_complex_single
> [==] Running 1 test from 1 test suite.
> [--] Global test environment set-up.
> [--] 1 test from rocfft_UnitTest
> [ RUN  ] rocfft_UnitTest.default_load_callback_complex_single
> :1:rocvirtual.cpp   :2949: 1796815625 us: [pid:1917  
> tid:0x7f4a2102c980] Pcie atomics not enabled, hostcall not supported
> :1:rocvirtual.cpp   :3289: 1796816120 us: [pid:1917  
> tid:0x7f4a2102c980] AQL dispatch failed
>
> clients/tests/default_callbacks_test.cpp:280: Failure
> Expected equality of these values:
>   rocfft_execute(plan, _ptr, _ptr, info)
> Which is: 1
>   rocfft_status_success
> Which is: 0
> 
> clients/tests/default_callbacks_test.cpp:310: Failure
> Expected: (diff.l_inf) < (type_epsilon()), actual: 32.230823516845703 
> vs 3.75e-05
> 
> [  FAILED  ] rocfft_UnitTest.default_load_callback_complex_single (907 ms)
> [--] 1 test from rocfft_UnitTest (908 ms total)

(I did not check all 130 failures, so strictly speaking there could be
additional causes, too.)
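
Checking all failures could be automated along the lines of the sketch
below. It assumes a GoogleTest log captured with AMD_LOG_LEVEL=1 and
only relies on the "RUN"/"FAILED" markers and the message text shown
above; the function name and the regular expressions are made up for
this example.

```python
import re

RUN = re.compile(r"\[\s*RUN\s*\]\s+(\S+)")
FAILED = re.compile(r"\[\s*FAILED\s*\]\s+(\S+)")
MARKER = "Pcie atomics not enabled"


def classify_failures(log_text):
    """Split failed tests into (with PCIe atomics message, without it)."""
    current, saw_marker = None, False
    with_marker, without_marker = [], []
    for line in log_text.splitlines():
        run = RUN.search(line)
        if run:
            # A new test starts: remember its name and reset the flag.
            current, saw_marker = run.group(1), False
            continue
        if MARKER in line:
            saw_marker = True
        failed = FAILED.search(line)
        # Only count the FAILED line that closes the currently running test.
        if failed and current is not None and failed.group(1).rstrip(",") == current:
            (with_marker if saw_marker else without_marker).append(current)
            current = None
    return with_marker, without_marker
```

Any test that ends up in the second list failed without the PCIe
atomics message and would need a separate look.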

In an older ROCm ticket, a workaround to enable PCIe atomics in the
guest was discussed [1], but I never got this to work. The relevant bit
is not set after invoking setpci.

I don't know how to best address this. A workaround would be to skip
these tests if the host is a guest VM, but that would reduce coverage.
However, switching everything to podman would reduce coverage even more
if we only use the latest kernel.

Best,
Christian

PS: Full AMD_LOG_LEVEL=4 attached, for reference.

[1] https://github.com/ROCm/ROCK-Kernel-Driver/issues/26#issuecomment-313857180

half epsilon: 0.000977  single epsilon: 3.75e-05  double epsilon: 1e-15
Random seed: 3631874771
rocFFT version: 1.0.23.
Note: Google Test filter = rocfft_UnitTest.default_load_callback_complex_single
[==] Running 1 test from 1 test suite.
[--] Global test environment set-up.
[--] 1 test from rocfft_UnitTest
[ RUN  ] rocfft_UnitTest.default_load_callback_complex_single
:3:rocdevice.cpp:442 : 1761239558 us: [pid:1890  
tid:0x7f13ab583980] Initializing HSA stack.
:3:rocdevice.cpp:208 : 1761720693 us: [pid:1890  
tid:0x7f13ab583980] Numa selects cpu 
agent[0]=0x563d5daf67a0(fine=0x563d4cdd0020,coarse=0x563d5db0c3b0) for gpu 
agent=0x563d5db36800 CPU<->GPU XGMI=0
:3:rocdevice.cpp:1680: 1761721192 us: [pid:1890  
tid:0x7f13ab583980] Gfx Major/Minor/Stepping: 10/3/0
:3:rocdevice.cpp:1682: 1761722377 us: [pid:1890  
tid:0x7f13ab583980] HMM support: 0, XNACK: 0, Direct host access: 0
:3:rocdevice.cpp:1684: 1761722570 us: [pid:1890  
tid:0x7f13ab583980] Max SDMA Read Mask: 0x0, Max SDMA Write Mask: 0x0
:4:rocdevice.cpp:2063: 1761722742 us: [pid:1890  
tid:0x7f13ab583980] Allocate hsa host memory 0x7f13a9dfc000, size 0x38
:4:rocdevice.cpp:2063: 1761723175 us: [pid:1890  
tid:0x7f13ab583980] Allocate hsa host memory 0x7f129570, size 0x101000
:4:rocdevice.cpp:2063: 1761723636 us: [pid:1890  
tid:0x7f13ab583980] Allocate hsa host memory 0x7f129550, size 0x101000
:4:runtime.cpp  :83  : 1761723779 us: [pid:1890  
tid:0x7f13ab583980] init
:3:hip_context.cpp  :48  : 1761723839 us: [pid:1890  
tid:0x7f13ab583980] Direct Dispatch: 1
:3:hip_memory.cpp   :1302: 1761724035 us: [pid:1890  
tid:0x7f13ab583980]  hipMemcpyFromSymbol ( 0x563d4bfa06c8, 0x7fff49c710f8, 
8, 0, hipMemcpyDeviceToHost ) 
:3:devprogram.cpp   :2681: 1761725089 us: [pid:1890  
tid:0x7f13ab583980] Using Code Object V4.
:3:rocdevice.cpp:2230: 1761730456 us: [pid:1890  
tid:0x7f13ab583980] device=0x563d5db5ff10, freeMem_ = 0x3fef8
:3:rocdevice.cpp:2732: 1761730568 us: [pid:1890  
tid:0x7f13ab583980] number of allocated hardware queues with low priority: 0, 
with normal priority: 0, with high priority: 0, maximum per priority is: 4
:3:rocdevice.cpp:2810: 1761734314 us: [pid:1890  
tid:0x7f13ab583980] created hardware queue 0x7f13a9d7c000 with size 16384 with 
priority 1, cooperative: 0
:3:rocdevice.cpp:2902: 1761734476 us: [pid:1890  
tid:0x7f13ab583980] acquireQueue refCount: 0x7f13a9d7c000 (1)
:4:rocdevice.cpp  

Bug#975509: ITP: nbdime -- Jupyter Notebook Diff and Merge tools

2024-04-06 Thread Christian Kastner
Hi Joe,

On 2020-11-23 06:03, Joseph Nahmias wrote:
> Package: wnpp
> Severity: wishlist
> Owner: Joseph Nahmias 
> 
> * Package name: nbdime
>   Version : 2.1.0
>   Upstream Author : Jupyter Development Team 
> * URL : https://nbdime.readthedocs.io/
> * License : BSD
>   Programming Lang: Python
>   Description : Jupyter Notebook Diff and Merge tools
> 
> nbdime provides tools for diffing and merging of Jupyter Notebooks.

Xuanteng (in CC) is working on packages jupyter-cache and myst-nb, both
of which make use of nbdime in their test suites. We were wondering if
you could provide us with a status update for nbdime?

We took a brief look ourselves, after upgrading nbdime to 4.0.1, but the
build fails because npm tries to download source-map-loader:

> INFO:hatch_jupyter_builder.utils:> /usr/bin/npm install
> npm ERR! code ECONNREFUSED
> npm ERR! syscall connect
> npm ERR! errno ECONNREFUSED
> npm ERR! FetchError: request to 
> https://registry.npmjs.org/source-map-loader/-/source-map-loader-4.0.1.tgz 
> failed, reason: connect ECONNREFUSED 127.0.0.1:9

We've added it [1] and node-jupyterlab [2] to B-D as a stab in the dark,
but that didn't fix it. Unfortunately, we don't have experience with
node stuff in Debian.

Do you have an idea what's going on here? Otherwise I could ask on the
debian-js list.

Best,
Christian

[1] https://packages.debian.org/sid/node-source-map-loader
[2] https://packages.debian.org/sid/node-jupyterlab



Bug#1068199: librocfft0: callback test failures on gfx900 and gfx1030

2024-04-04 Thread Christian Kastner
On 2024-04-04 09:05, Christian Kastner wrote:
> The issue is already visible with AMD_LOG_LEVEL=1, it's the lack of PCIe
> atomics:
> 
>> [ RUN  ] rocfft_UnitTest.default_load_callback_complex_single
>> :1:rocvirtual.cpp   :2949: 1796815625 us: [pid:1917  
>> tid:0x7f4a2102c980] Pcie atomics not enabled, hostcall not supported
>> :1:rocvirtual.cpp   :3289: 1796816120 us: [pid:1917  
>> tid:0x7f4a2102c980] AQL dispatch failed
>> clients/tests

> In an older ROCm ticket, a workaround to enable PCIe atomics in the
> guest was discussed [1], but I never got this to work. The relevant bit
> is not set after invoking setpci.

In a more recent issue [2], a lack of PCIe atomics was also discovered
on physical hardware (it can depend on the CPU and/or the PCIe slot).

In that issue, it was stated that updating to ROCm 6.0 (and PyTorch)
resolved the issue.

I just updated rocfft to 6.0.2 and rebuilt it, but the issue is still
present. That was naive, though: there are other < 6.0 components in
the stack that could affect this.

> [1] 
> https://github.com/ROCm/ROCK-Kernel-Driver/issues/26#issuecomment-313857180

[2] https://github.com/ROCm/ROCm/issues/2429



Bug#1064637: lintian: Detect ~exp revision and warn if suite not experimental

2024-02-25 Thread Christian Kastner
Package: lintian
Version: 2.117.0
Severity: wishlist

It would be nice if lintian could warn about packages with ~exp in their
Debian revisions where suite is not experimental.

So these would warn:

  libfoo (1.0-1~exp) unstable; urgency=medium
  libfoo (1.0-1~exp1) unstable; urgency=medium
               ^^^^

But these wouldn't:

  libfoo (1.0-1) unstable; urgency=medium
  libfoo (1.0-1) experimental; urgency=medium

(More generally, anything without ~exp would be ignored.)
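
To make the proposed rule concrete, here is a small sketch of the check.
It is written in Python purely for illustration (lintian itself is
written in Perl), and the function name and regular expression are
assumptions, not existing lintian code.

```python
import re

# Matches the first line of a changelog entry: "source (version) dists; ..."
HEADER = re.compile(r"^(?P<source>\S+) \((?P<version>[^)]+)\) (?P<dists>[^;]+);")


def exp_revision_outside_experimental(header_line):
    """Return True if the Debian revision has ~exp but the suite is not experimental."""
    match = HEADER.match(header_line)
    if not match:
        return False
    version = match["version"]
    # The Debian revision is everything after the last hyphen, if there is one.
    revision = version.rsplit("-", 1)[1] if "-" in version else ""
    suites = match["dists"].split()
    return "~exp" in revision and "experimental" not in suites
```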

Best,
Christian

-- System Information:
Debian Release: trixie/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.5.0-0.deb12.4-amd64 (SMP w/24 CPU threads; PREEMPT)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: unable to detect

Versions of packages lintian depends on:
ii  binutils                        2.42-2
ii  bzip2                           1.0.8-5+b2
ii  diffstat                        1.65-1
ii  dpkg                            1.22.4
ii  dpkg-dev                        1.22.4
ii  file                            1:5.45-2+b1
ii  gettext                         0.21-14+b1
ii  gpg                             2.2.40-1.1+b1
ii  intltool-debian                 0.35.0+20060710.6
ii  iso-codes                       4.16.0-1
ii  libapt-pkg-perl                 0.1.40+b3
ii  libarchive-zip-perl             1.68-1
ii  libberkeleydb-perl              0.64-2+b2
ii  libcapture-tiny-perl            0.48-2
ii  libclass-xsaccessor-perl        1.19-4+b2
ii  libclone-perl                   0.46-1+b1
ii  libconfig-tiny-perl             2.30-1
ii  libconst-fast-perl              0.014-2
ii  libcpanel-json-xs-perl          4.37-1+b1
ii  libdata-dpath-perl              0.59-1
ii  libdata-validate-domain-perl    0.10-1.1
ii  libdata-validate-uri-perl       0.07-2
ii  libdevel-size-perl              0.83-2+b2
pn  libdigest-sha-perl
ii  libdpkg-perl                    1.22.4
ii  libemail-address-xs-perl        1.05-1+b2
ii  libfile-basedir-perl            0.09-2
ii  libfile-find-rule-perl          0.34-3
ii  libfont-ttf-perl                1.06-2
ii  libhtml-html5-entities-perl     0.004-3
ii  libhtml-tokeparser-simple-perl  3.16-4
ii  libio-interactive-perl          1.025-1
ii  libipc-run3-perl                0.049-1
ii  libjson-maybexs-perl            1.004005-1
ii  liblist-compare-perl            0.55-2
ii  liblist-someutils-perl          0.59-1
ii  liblist-utilsby-perl            0.12-2
ii  libmldbm-perl                   2.05-4
ii  libmoo-perl                     2.005005-1
ii  libmoox-aliases-perl            0.001006-2
ii  libnamespace-clean-perl         0.27-2
ii  libpath-tiny-perl               0.144-1
ii  libperlio-gzip-perl             0.20-1+b2
ii  libperlio-utf8-strict-perl      0.010-1+b1
ii  libproc-processtable-perl       0.636-1+b1
ii  libregexp-wildcards-perl        1.05-3
ii  libsereal-decoder-perl          5.004+ds-1+b1
ii  libsereal-encoder-perl          5.004+ds-1+b1
ii  libsort-versions-perl           1.62-3
ii  libsyntax-keyword-try-perl      0.29-1+b1
ii  libterm-readkey-perl            2.38-2+b2
ii  libtext-levenshteinxs-perl      0.03-5+b2
ii  libtext-markdown-discount-perl  0.16-1+b1
ii  libtext-xslate-perl             3.5.9-1+b3
ii  libtime-duration-perl           1.21-2
ii  libtime-moment-perl             0.44-2+b2
ii  libtimedate-perl                2.3300-2
ii  libunicode-utf8-perl            0.62-2+b1
ii  liburi-perl                     5.27-1
ii  libwww-mechanize-perl           2.18-1
ii  libwww-perl                     6.76-1
ii  libxml-libxml-perl              2.0207+dfsg+really+2.0134-1+b2
ii  libyaml-libyaml-perl            0.89+ds-1
ii  lzop                            1.04-2
ii  man-db                          2.12.0-3
ii  patchutils                      0.4.2-1
ii  perl [libencode-perl]           5.38.2-3
ii  plzip [lzip-decompressor]       1.11-1
ii  t1utils                         1.41-4
ii  unzip                           6.0-28
ii  xz-utils                        5.4.5-0.3

lintian recommends no packages.

Versions of packages lintian suggests:
pn  binutils-multiarch 
pn  libtext-template-perl  

-- no debconf information



Bug#1033352: sbuild doesn't support autotpkgest-virt-incus: Error: mkdir /sbuild-nonexistent: permission denied

2024-02-26 Thread Christian Kastner
Hi all,

Adding #1033352, as the below contains information for that bug, too.

On 2024-02-25 19:50, Johannes Schauer Marin Rodrigues wrote:
> in that issue you asked exactly the question I was about to ask you. :)
> 
> Though it seems incus should now be able to deal gracefully with the situation
> and there is nothing else that sbuild needs to do to handle this, correct?
> 
> Note, that the autopkgtest/podman backend has a similar issue, see #1033352
> for details. To fix this, Christian submitted this MR against sbuild:
> 
> https://salsa.debian.org/debian/sbuild/-/merge_requests/55
> 
> Reading how incus worked around this problem, maybe podman can do the same?

I believe there is one difference: unless I'm mistaken, incus has a
centralized location for containers, whereas podman rootless container
images are stored in $HOME.

Also, the podman configuration lives in $HOME.

> On Tue, 23 Jan 2024 14:30:07 + stefa...@debian.org wrote:
>> Filed an incus upstream bug about handling this situation more 
>> gracefully: https://github.com/lxc/incus/issues/422

That issue does mention a practical solution: grabbing $HOME from
/etc/passwd. But it mentions that incus only does that when $HOME is not
in env, so HOME=/sbuild-nonexistent would impede that.

Technically, by providing the right envvars, setting ENVIRONMENT_FILTER,
and other tricks, it should be possible to trick
sbuild+autopkgtest+podman to ignore $HOME. But unfortunately, podman
does not allow this -- similar to the incus issue above.

Another trick I thought of was to hack autopkgtest-virt-podman: when
run, if HOME=/sbuild-nonexistent, ignore it and use the value from
/etc/passwd. This might be less invasive than my MR to sbuild. It still
has a failure mode (if HOME is deliberately set to something else) but
that's a fairly unusual use case.
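
A minimal sketch of that fallback, assuming it would live somewhere in
the podman backend's start-up code. The function name and the
injectable `lookup` parameter are hypothetical; `pwd.getpwuid` is the
standard library call that reads the passwd database.

```python
import os
import pwd

SBUILD_PLACEHOLDER = "/sbuild-nonexistent"


def passwd_home():
    """Home directory of the current user according to the passwd database."""
    return pwd.getpwuid(os.getuid()).pw_dir


def effective_home(environ, lookup=passwd_home):
    """Return $HOME, unless it is unset or sbuild's placeholder value."""
    home = environ.get("HOME")
    if home is None or home == SBUILD_PLACEHOLDER:
        # Ignore the placeholder and fall back to the passwd entry.
        return lookup()
    # Any other value, including a deliberately unusual one, is respected.
    return home
```

A HOME that was deliberately set to some other value is passed through
unchanged, which is the remaining failure mode mentioned above.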

Best,
Christian



Bug#1064809: libhipsparse0-tests: csr2bsr test failures

2024-02-25 Thread Christian Kastner
On 2024-02-26 07:23, Cordell Bloor wrote:
> The update of rocsparse 5.5.1 to 5.7.1 seems to have caused a regression
> in hipsparse. Although, it's also possible that this problem was because
> rocsparse was therefore rebuilt with the updated rocprim 5.7.1.

Interestingly, this passed on gfx900 [1].

(I checked a small sample of tests to check that they weren't just
being skipped)

[1]: https://ci.rocm.debian.net/packages/h/hipsparse/unstable/amd64+gfx900/



Bug#1064811: libhipsparse0-tests: HIPSPARSE_STATUS_INTERNAL_ERROR

2024-02-26 Thread Christian Kastner
Package: libhipsparse0-tests
Version: 5.7.1-1~exp1
Severity: normal

On gfx1031/gfx1032/gfx1034, there are numerous occurrences of
HIPSPARSE_STATUS_INTERNAL_ERROR, see [1] for a full log. Interestingly,
only some of them lead to test failures (some examples below), and
sometimes there is more than one occurrence per test.

These passed on gfx900/gfx1030 so I don't immediately suspect my update
to the optional-test-matrices resp. allow-missing-matrix-data-in-tests
patches to be the cause.

>  63s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  63s ./clients/tests/test_dense_to_sparse_csr.cpp:38: Failure
>  63s Expected equality of these values:
>  63s   status
>  63s Which is: 7
>  63s   HIPSPARSE_STATUS_SUCCESS
>  63s Which is: 0
>  63s 
>  63s [  FAILED  ] dense_to_sparse_csr.dense_to_sparse_csr_i32_i32_float (7 ms)

>  63s [ RUN  ] dense_to_sparse_csc.dense_to_sparse_csc_i64_i32_double
>  63s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  63s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  63s [   OK ] dense_to_sparse_csc.dense_to_sparse_csc_i64_i32_double (3 ms)

>  66s [ RUN  ] bsrmv/parameterized_bsrmv.bsrmv_float/260
>  66s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  66s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  66s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  66s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  66s ./clients/tests/test_bsrmv.cpp:114: Failure
>  66s Expected equality of these values:
>  66s   status
>  66s Which is: 7
>  66s   HIPSPARSE_STATUS_SUCCESS
>  66s Which is: 0
>  66s
>  66s [  FAILED  ] bsrmv/parameterized_bsrmv.bsrmv_float/260, where GetParam() = (500, 842, 3, 1, 9, 0, 0) (1 ms)

[1]: https://ci.rocm.debian.net/data/autopkgtest/unstable/amd64+gfx1031/h/hipsparse/7762/log.gz



Bug#1061208: Please upgrade to llvm-toolchain-17

2024-02-25 Thread Christian Kastner
Hi Sebastian,

writing to you as you bumped the severity to 'serious': could the RT
please give us an extension on the autoremoval for this particular bug?

The transition from first-filing-to-serious was unusually short notice,
and caught us in the middle of our own update of the stack.

This affects a key package of our stack, so it's not just about testing
this particular package, but the entire stack, and everything else in
the archive that depends on it.

A fix is in progress.

Best,
Christian

On 2024-01-20 21:59, Sylvestre Ledru wrote:
> Source: rocm-hipamd
> Severity: important
> 
> Dear Maintainer,
> 
> As part of the effort to limit the number of llvm packages in the
> archive, it would be great if you could upgrade to -17.
> 
> This package depends on 15.



Bug#1064629: libamd-comgr2: segfault in rocfft

2024-03-02 Thread Christian Kastner
Hey Cory,

On 2024-02-28 21:16, Cordell Bloor wrote:
> This segfault does seem to be caused by mixing clang-15 and clang-17 in
> the HIP RTC codepath. When libamdhip64 from ROCm 5.6.1 (built with the
> same clang-17 as rocm-compilersupport 6.0+git20231212.4510c28+dfsg-1) is
> used, the segfault disappeared [1].

I think that this also needs to be fixed in bin:hipcc. It currently has
an unversioned Depends on libamdhip64-dev, making it possible to use
clang-17 hipcc with clang-15 libamdhip64-5.

# should also work with s/podman/docker/, of course
$ podman run --rm -it debian:experimental sh -c 'apt update && apt install -s hipcc/experimental | grep "Inst.*libamdhip64"'
[...]
Inst libamdhip64-5 (5.2.3-13 Debian:unstable [amd64])
Inst libamdhip64-dev (5.2.3-13 Debian:unstable [amd64])

I'd file a bug and fix the dependency in rocm-hipamd myself, but I'm
only 90% confident that I'm not missing something, so wanted to check
first.

If it's indeed missing from bin:hipcc, I guess it should be updated to
libamdhip64-dev (= ${binary:Version})

Discovered when building the newer rocFFT, which only build-depends on
hipcc.

Best,
Christian

> [1]: https://ci.rocm.debian.net/packages/r/rocfft/unstable/amd64+gfx1030/7998/



Bug#1065329: O: numpy -- Fast array facility to the Python 3 language

2024-03-02 Thread Christian Kastner
Control: retitle -1 O: numpy -- Fast array facility to the Python 3 language
Control: tags -1 - pending

Having read up on debian-python, I realize that I misread the situation.
I think there needs to be a policy resolution first.

On 2024-03-02 22:18, Christian Kastner wrote:
> Control: retitle -1 ITA: numpy -- Fast array facility to the Python 3 language
> Control: tags -1 pending



Bug#1065329: O: numpy -- Fast array facility to the Python 3 language

2024-03-02 Thread Christian Kastner
Control: retitle -1 ITA: numpy -- Fast array facility to the Python 3 language
Control: tags -1 pending

I intend to put this under the Debian Python Team.

On 2024-03-02 21:46, Sandro Tosi wrote:
> Package: wnpp
> Severity: normal
> X-Debbugs-Cc: nu...@packages.debian.org, mo...@debian.org
> Control: affects -1 + src:numpy
> 
> I intend to orphan the numpy package.
> 
> The package description is:
>  Numpy contains a powerful N-dimensional array object, sophisticated
>  (broadcasting) functions, tools for integrating C/C++ and Fortran
>  code, and useful linear algebra, Fourier transform, and random number
>  capabilities.
>  .
>  Numpy replaces the python-numeric and python-numarray modules which are
>  now deprecated and shouldn't be used except to support older
>  software.
>  .
>  This package contains Numpy for Python 3.
> 



Bug#1061208: Please upgrade to llvm-toolchain-17

2024-02-29 Thread Christian Kastner
Hi Étienne,

On 2024-02-26 19:29, Étienne Mollier wrote:
> If that helps, the autoremoval timer is reset each time the RC
> critical bug triggering the autoremoval is updated, e.g. when
> reporting an evolution of the situation in a new comment.

Thanks for the info! That's incredibly useful to know.

Funny how I never realized that this was a thing.

Best,
Christian



Bug#1064809: libhipsparse0-tests: csr2bsr test failures

2024-02-26 Thread Christian Kastner
On 2024-02-26 07:23, Cordell Bloor wrote:
> The update of rocsparse 5.5.1 to 5.7.1 seems to have caused a regression
> in hipsparse. Although, it's also possible that this problem was because
> rocsparse was therefore rebuilt with the updated rocprim 5.7.1.

This one looks a bit tricky as it also seems to be GPU-arch dependent.

The following is all going by the logs of gfx1034 [1]:

One issue is that the test suite is often aborted early, so the csr2bsr
tests (from this bug) don't even get run. From the tail of the latest
log in experimental [2]:

>  82s [ RUN  ] dense2csr/parameterized_dense2csr.dense2csr_float/158
>  82s [   OK ] dense2csr/parameterized_dense2csr.dense2csr_float/158 (0 ms)
>  82s [ RUN  ] dense2csr/parameterized_dense2csr.dense2csr_float/159
>  82s [   OK ] dense2csr/parameterized_dense2csr.dense2csr_float/159 (0 ms)
>  82s [ RUN  ] dense2csr/parameterized_dense2csr.dense2csr_float/160
>  82s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  82s hipSPARSE error: HIPSPARSE_STATUS_INTERNAL_ERROR
>  82s double free or corruption (out)
>  82s Aborted
>  82s autopkgtest [00:40:28]: test command1: ---]
>  83s command1 FAIL non-zero exit status 1
>  83s autopkgtest [00:40:29]: test command1:  - - - - - - - - - - results - - - - - - - - - -
>  84s autopkgtest [00:40:30]:  summary
>  84s command1 FAIL non-zero exit status 1

The last fully completed run (no abort) was on 2024-01-27 [3], with a
runtime of 3m38s. The rocsparse 5.7.1 upgrade happened after that, and
indeed, no run has completed successfully since, which suggests that the
upgrade might be a factor.

But, within the preceding 24hrs, we had two other test runs [4,5] abort
early -- with rocsparse=5.5.1-2.

Some of the earlier logs have more informative error messages prior to
the abort, eg:

>  85s Memory access fault by GPU node-1 (Agent handle: 0x5608f9e506c0) on address 0x7fa318408000. Reason: Page not present or supervisor privilege.
>  85s Nearby memory map:
> [...]
>  85s hipsparse-test: ./src/core/runtime/runtime.cpp:1276: static bool rocr::core::Runtime::VMFaultHandler(hsa_signal_value_t, void*): Assertion `false && "GPU memory access fault."' failed.
>  85s ./clients/common/unit.cpp:128: Failure

I think these tests were all in VMs, I'll try to reproduce them on bare
metal just to be sure.

And to make things even more interesting, the test history of gfx1030
[6] suggests that rocsparse indeed was a factor. Tests on gfx1030 passed
until rocsparse=5.7.1-2, then failed, and now pass again with
hipsparse=5.7.1-1~exp1.

Best,
Christian

[1]: https://ci.rocm.debian.net/packages/h/hipsparse/unstable/amd64+gfx1034/
[2]: https://ci.rocm.debian.net/data/autopkgtest/unstable/amd64+gfx1034/h/hipsparse/7767/log.gz
[3]: https://ci.rocm.debian.net/data/autopkgtest/unstable/amd64+gfx1034/h/hipsparse/5234/log.gz
[4]: https://ci.rocm.debian.net/data/autopkgtest/unstable/amd64+gfx1034/h/hipsparse/5075/log.gz
[5]: https://ci.rocm.debian.net/data/autopkgtest/unstable/amd64+gfx1034/h/hipsparse/5003/log.gz



Bug#1071456: autopkgtest-virt-qemu: autopkgtest [15:14:50]: ERROR: testbed failure: sent `auxverb_debug_fail', got `timeout', expected `ok...'

2024-05-20 Thread Christian Kastner
Hi Paride,

On 2024-05-20 18:25, Paride Legovini wrote:
> On 2024-05-20 17:55, Christian Kastner wrote:
>> The test trigger we recorded was "linux-signed-amd64=6.8.9+1" but that
>> could just be coincidental.
> 
> Hi, this seems to be the same of:
> 
> https://bugs.launchpad.net/ubuntu/+source/autopkgtest/+bug/2056461
> 
> which turned out to be a kernel bug. If you want to verify that's
> actually the case, I suggest running autopkgtest --debug and checking
> that the timeout happens during a "copydown" operation.

Yes, that's exactly it.

Great, that just saved me a *lot* of time debugging :)

I was also hitting this earlier in our Ubuntu VMs, but they're still WIP
in our team's CI so I thought it might be something else.

Best,
Christian



Bug#1071533: autopkgtest: timed out waiting for 'command prompt on serial console'

2024-05-20 Thread Christian Kastner
Package: autopkgtest
Version: 5.28
Severity: normal

In the Debian ROCm Team's CI, QEMU workers occasionally tmpfail with the
following error:

> autopkgtest: timed out waiting for 'command prompt on serial console'

From [1], I guess that the official CI hit this, too.

However, this is an output parsing issue, rather than a timeout issue.
When the command preceding this returns fast enough, then it includes
the command prompt as "\n# ", but the code checking for this
expects "\n# ". I suspect that the uuid was an addition at some point.

A merge request fixing this will be filed shortly.

Best,
Christian

[1]: 
https://salsa.debian.org/ci-team/autopkgtest/-/commit/57cd6f0b98695362ce4c13f1115688b87d56a691



Bug#1071456: autopkgtest-virt-qemu: autopkgtest [15:14:50]: ERROR: testbed failure: sent `auxverb_debug_fail', got `timeout', expected `ok...'

2024-05-20 Thread Christian Kastner
Hi,

On 2024-05-19 17:06, Wouter Verhelst wrote:
> Package: autopkgtest
> Version: 5.35
> Severity: normal
> 
>> sudo autopkgtest-build-qemu --architecture amd64 sid 
>> /opt/chroots/autopkgtest-qemu.img
> 
> followed by
> 
>> autopkgtest . --test-name=initrd-boot -- qemu 
>> /opt/chroots/autopkgtest-qemu.img
> 
> in a directory that is a checkout of https://salsa.debian.org/wouter/nbd.git
> 
> It installed the test dependencies, and then failed on:
> 
>> autopkgtest [16:55:00]: Setting up user "user" to sudo without password...
>> qemu-system-x86_64: terminating on signal 15 from pid 150414 
>> (/usr/bin/python3)
>> autopkgtest [17:00:02]: ERROR: testbed failure: sent `auxverb_debug_fail', 
>> got `timeout', expected `ok...'
> 
> which I did not expect...

I've also started seeing this recently in the Debian ROCm Team's CI.

In my case, this happens only with packages that have 2+ tests. When the
testbed is rebooted after the first test concludes, everything works
fine right until the "Setting up user "" to sudo without
password...", and then the timeout occurs.

In our particular case, the tests first started failing on 2024-05-17.
This was still with autopkgtest 5.34, and nothing else changed in our
infra over the past few days.

The test trigger we recorded was "linux-signed-amd64=6.8.9+1" but that
could just be coincidental.

Best,
Christian



Bug#1071456: Possible fix for 9p change breaking autopkgtest-build-qemu

2024-06-16 Thread Christian Kastner
FYI, a contributor has submitted a patch to the kernel Bugzilla [1] and
it indeed fixed the issue for the packages where I was seeing this.

Let's hope it gets recognized and accepted soon.

Best,
Christian

[1]: https://bugzilla.kernel.org/show_bug.cgi?id=218916



Bug#1073509: sbuild-qemu-update works, but increases image allocated size each time

2024-06-20 Thread Christian Kastner
Hi,

On 2024-06-17 08:34, Johannes Schauer Marin Rodrigues wrote:
> Quoting Francesco Poli (wintermute) (2024-06-16 19:09:08)
>> But, the allocated size has significantly grown:
>>
>>   $ cd ~/.cache/sbuild/
>>   $ ls -altrFs --si
>>   total 4.4G
>>   4.1k drwx-- 37 $USER $USER 4.1k May  4 16:03 ../
>>   4.1k drwxrwx---  2 $USER $USER 4.1k May 13 23:19 build/
>>   3.6G -rw-rw  1 $USER $USER  27G Jun  9 22:00 
>> OLD_unstable-autopkgtest-amd64.img
>>   4.1k drwxrwx---  3 $USER $USER 4.1k Jun 16 18:26 ./
>>   832M -rw-rw  1 $USER $USER  27G Jun 16 18:26 
>> unstable-autopkgtest-amd64.img
>>
>> Now the allocated size is 832 MB, instead of 705 MB !!!

In this particular case, one factor would be that the update caused new
APT lists to be downloaded, which have tens of MB.

>> But why does the allocated size increase?
>> Maybe there's something about sparse file support that I do not
>> fully understand.
>>
>> Is there anything that can be done inside sbuild-qemu-update to prevent
>> the allocated size from growing indefinitely?
>> Apart from periodically regenerating the image from scratch, I mean...
> 
> as you suspected this is because of how sparse files work. Whenever you
> upgrade something in your image, data gets deleted and new data gets
> added. The filesystem driver in the kernel does not zero-out those parts
> that it deletes and even if it would, qemu has no idea which blocks of
> the underlying image file it should now mark "sparse".

Exactly this. One has to differentiate between what goes on in the guest
(file deletion from ext4) and what QEMU sees (just blocks being used up).
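
To make the apparent-size vs. allocated-size distinction concrete, here
is a small Python sketch that reports both numbers for a file, roughly
what "ls -ls" shows. It is only an illustration and not part of
sbuild-qemu; the function names are made up, and it assumes Linux, where
st_blocks is counted in 512-byte units.

```python
import os
import tempfile


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as it is on Linux.
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size):
    """Create a file with the given apparent size without writing data."""
    with open(path, "wb") as f:
        # truncate() extends the file; filesystems with sparse file
        # support do not allocate blocks for the resulting hole.
        f.truncate(size)


def demo(size=64 * 1024 * 1024):
    """Create a throwaway sparse file and return its two sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        img = os.path.join(tmp, "demo.img")
        make_sparse_file(img, size)
        return file_sizes(img)
```

On a filesystem with sparse file support, demo() should report the full
apparent size and a much smaller allocated size. Blocks only become
allocated once something writes to them, and deleting files inside the
guest does not give them back to the host by itself.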

I guess clever stuff could be done but honestly, it's probably simpler
to occasionally regenerate an image.

A hacky solution would be to use the --snapshot option to
sbuild-qemu-update on first run. In future runs, you could reset the
image to that snapshot using qemu-img. That would be a tradeoff though
as with time, updates would take longer and longer.

> One tool that should reduce size again is e2image from e2fsprogs:
> 
> $ e2image -rap old.img new.img
> 
> But this requires copying the actual file data. I didn't try it out, but there
> is also the "discard" extended option of e2fsck:
> 
> $ e2fsck -E discard your.img
> 
> Lastly, I do not know if the zerofree tool has support for sparse files? Maybe
> try running it on your FS and see what happens. :)

Best,
Christian



Bug#1073509: sbuild-qemu-update works, but increases image allocated size each time

2024-06-24 Thread Christian Kastner
On 2024-06-22 12:49, Francesco Poli wrote:
> On Fri, 21 Jun 2024 00:42:21 +0200 Christian Kastner wrote:
>> In this particular case, one factor would be that the update caused new
>> APT lists to be downloaded, which have tens of MB.
> 
> How so?
> 
> As I said in the [original] bug report, I tried to run
> sbuild-qemu-update on the image that I had just regenerated from
> scratch. It seems that it found nothing to update, not even the APT
> repository lists (which were already up-to-date, as expected). Please
> take a look at the output of sbuild-qemu-update in the [original] bug
> report...

Indeed. Sorry, I replied on gut instinct here.

> Come on, I am sure you are clever enough to figure out a good strategy
> to automatically prevent the image allocated size from growing
> indefinitely!

Apparently, there is a trick after all.

(1) sbuild-qemu-{boot,update} need to add discard=unmap to the block
device options of the image
(2) In the guest,
  (2a) the root partition needs to be remounted with discard
  (2b) fstrim /
(3) On the host, qemu-img convert -O qcow2 foo.img bar.img

At least on my end, this reduced the image size.

fstrim was the only solution I could find that could trigger TRIMs on a
mounted filesystem.
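
The three steps above can be sketched as follows. This is a hedged
illustration in Python, not the fix that was pushed to sbuild-qemu: the
function names are hypothetical, and the format=qcow2 default and the
file names are assumptions. The sketch only assembles the commands; it
does not run them.

```python
import shlex


def qemu_drive_option(image, fmt="qcow2"):
    """Step (1): value for QEMU's -drive option, with discard enabled.

    discard=unmap lets TRIM requests from the guest release blocks in
    the image file. The fmt default is an assumption.
    """
    return f"file={image},format={fmt},discard=unmap"


# Step (2): commands to run as root inside the guest.
GUEST_COMMANDS = [
    ["mount", "-o", "remount,discard", "/"],  # (2a) remount root with discard
    ["fstrim", "/"],                          # (2b) trim unused blocks
]


def host_convert_command(src, dst):
    """Step (3): host-side command that rewrites the image."""
    return ["qemu-img", "convert", "-O", "qcow2", src, dst]


def as_shell(argv):
    """Render an argument list as a copy-pasteable shell command line."""
    return shlex.join(argv)
```

For example, as_shell(host_convert_command("foo.img", "bar.img")) yields
the command from step (3).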

I've pushed a fix implementing this. It seems safe enough, though I
wonder if I shouldn't have guarded this with a --shrink option.

Best,
Christian



Bug#1073648: zd1211-firmware: diff for NMU version 1:1.5-10.1

2024-07-12 Thread Christian Kastner
Hi Chris,

On 2024-07-08 01:12, Chris Hofstaedtler wrote:
> Dear maintainer,
> 
> I've prepared an NMU for zd1211-firmware (versioned as 1:1.5-10.1) and
> uploaded it to DELAYED/7. Please feel free to tell me if I
> should delay it longer.

It's been a while since I've dealt with a delayed NMU. I just went ahead
and uploaded a new version (crediting you). I hope that was OK?

Is there something I should be doing with the NMU, or will it be gauged
as obsolete based on the version?

Best,
Christian



Bug#1073436: python-xmlschema: FTBFS: AssertionError: 'file://///filer01/MY_HOME/dev/XMLSCHEMA/test.xsd' != 'file:////filer01/MY_HOME/dev/XMLSCHEMA/test.xsd'

2024-06-26 Thread Christian Kastner
On 2024-06-16 15:10, Lucas Nussbaum wrote:
> Source: python-xmlschema
> Version: 3.3.1-1
> Severity: serious
> Justification: FTBFS
> Tags: trixie sid ftbfs
> User: lu...@debian.org
> Usertags: ftbfs-20240615 ftbfs-trixie
> 
> Hi,
> 
> During a rebuild of all packages in sid, your package failed to build
> on amd64.

>> ==
>> FAIL: test_normalize_url_slashes 
>> (tests.test_locations.TestLocations.test_normalize_url_slashes)
>> --
>> Traceback (most recent call last):
>>   File 
>> "/<>/.pybuild/cpython3_3.12_xmlschema/build/tests/test_locations.py",
>>  line 314, in test_normalize_url_slashes
>> self.assertRegex(normalize_url('root/dir1/schema.xsd'),
>> AssertionError: Regex didn't match: 'file:root/dir1/schema.xsd' not 
>> found in 'file://root/dir1/schema.xsd'
>>
>> ==
>> FAIL: test_normalize_url_with_base_unc_path 
>> (tests.test_locations.TestLocations.test_normalize_url_with_base_unc_path)
>> --
>> Traceback (most recent call last):
>>   File 
>> "/<>/.pybuild/cpython3_3.12_xmlschema/build/tests/test_locations.py",
>>  line 283, in test_normalize_url_with_base_unc_path
>> self.assertEqual(url, 'file:////filer01/MY_HOME/dev/XMLSCHEMA/test.xsd')
>> AssertionError: 'file://///filer01/MY_HOME/dev/XMLSCHEMA/test.xsd' != 
>> 'file:////filer01/MY_HOME/dev/XMLSCHEMA/test.xsd'
>> - file://///filer01/MY_HOME/dev/XMLSCHEMA/test.xsd
>> ?      -
>> + file:////filer01/MY_HOME/dev/XMLSCHEMA/test.xsd

This only occurs with Python 3.12. I'll look into it.


