This is an automated email from the ASF dual-hosted git repository.

slawrence pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/daffodil-infrastructure.git


The following commit(s) were added to refs/heads/main by this push:
     new 8d4ed58  Improve checking of RPM reproducibility
8d4ed58 is described below

commit 8d4ed58bf33be3ddf642d278511a44c11be5463d
Author: Steve Lawrence <[email protected]>
AuthorDate: Thu Sep 4 08:17:59 2025 -0400

    Improve checking of RPM reproducibility
    
    When dist RPMs are created, they are signed with an embedded signature.
    This can make reproducibility difficult. To handle this, we currently use
    rpmsign --delsign to delete the embedded signatures before performing
    the diff. But rpmsign --delsign sometimes deletes the signature in a way
    that is technically correct in that the RPM does not have a signature,
    but the RPM is still not identical to the same RPM that was never
    signed.
    
    To allow for checking reproducibility, this replaces the delsign logic
    with a custom function that just copies the signature header from the
    locally built RPM to the dist RPM. This ensures the signature headers
    are exactly the same, and allows us to ensure all other bytes are
    identical.
    
    The rpmsign command is no longer needed, so it is removed from the
    container and command checks.
    
    DAFFODIL-3037
---
 containers/check-release/Dockerfile           |  1 -
 containers/check-release/src/check-release.sh | 43 +++++++++++++++++++++------
 2 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/containers/check-release/Dockerfile b/containers/check-release/Dockerfile
index a1c8d40..ffe3a11 100644
--- a/containers/check-release/Dockerfile
+++ b/containers/check-release/Dockerfile
@@ -19,7 +19,6 @@ RUN \
   dnf -y --quiet --repo=fedora install \
     diff \
     gpg \
-    rpmsign \
     wget
 
 RUN \
diff --git a/containers/check-release/src/check-release.sh b/containers/check-release/src/check-release.sh
index 225f266..4118c52 100755
--- a/containers/check-release/src/check-release.sh
+++ b/containers/check-release/src/check-release.sh
@@ -44,10 +44,6 @@ require_command rpm
 require_command sha1sum
 require_command sha512sum
 require_command wget
-if [ -n "$LOCAL_RELEASE_DIR" ]
-then
-       require_command rpmsign
-fi
 
 WGET="wget --recursive --level=inf -e robots=off --no-parent --no-host-directories --reject=index.html,robots.txt"
 
@@ -134,13 +130,42 @@ then
 fi
 
 # RPM files have an embedded signature which makes reproducibility checking
-# difficult since locally built RPMs will not have the embedded signature.
-# However, the RPMs should be identical if we delete that signature. So we
-# create a backup of the original RPM files, delete the embedded signature,
-# run the diff command, and then restore the backups.
+# difficult since locally built RPMs will not have the embedded signature. The
+# RPMs should be identical if we delete that signature, but unfortunately
+# rpmsign --delsign does not necessarily make RPMs byte for byte identical--
+# sometimes it rebuilds them in slightly different ways that are technically
+# the same but not identical. So we sort of delete the signature header
+# ourselves. This is done by calculating the size of the signature header in
+# the locally built RPM and copying those bytes into the dist RPM. As long as
+# the two signature headers are the same size (which they should always be),
+# this should work. Since we are changing the dist files, we create a backup of
+# them first, replace the signature header, run the diff command, then restore
+# the backups.
+#
+# All signature/checksum data is stored in a "signature header". This header
+# starts immediately after the 96-byte "lead". The header format is:
+#
+#  * magic number: 8 bytes
+#  * index_count: 4 bytes (uint32_t)
+#  * data_length: 4 bytes (uint32_t)
+#  * index: index_count * 16-byte entries
+#  * data: data_length bytes
+#
+# To find the total length of the signature header we read the index_count and
+# data_length fields at a known offset (skipping the lead and magic number),
+# then add together the length of 3 fixed length fields (16 bytes), the length
+# of the index (16 * index_count) and the length of the data (data_length).
 BACKUP_DIR=$(mktemp -d)
 find $DIST_DIR -name '*.rpm' -exec cp --parents {} $BACKUP_DIR \;
-find $DIST_DIR -name '*.rpm' -execdir rpmsign --delsign {} \; &>/dev/null
+for SRC_RPM in `find $LOCAL_RELEASE_DIR -name '*.rpm'`
+do
+       find $DIST_DIR -name "$(basename $SRC_RPM)" -exec bash -c '
+               LEAD_SIZE=96
+               read SIG_INDEX_COUNT SIG_DATA_LENGTH < <(od -An -t u4 -j $((LEAD_SIZE+8)) -N 8 --endian=big "$1")
+               SIG_HEADER_LENGTH=$((16 + SIG_INDEX_COUNT*16 + SIG_DATA_LENGTH))
+               dd if="$1" of="$2" bs=1 skip=$LEAD_SIZE seek=$LEAD_SIZE count=$SIG_HEADER_LENGTH conv=notrunc
+       ' _ "$SRC_RPM" {} \; &> /dev/null
+done
 
 # Reasons for excluding files from the diff check:
 # - The downloaded .rpm file has an embedded signature (which we removed),

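For readers who want to try the signature header arithmetic outside of
check-release.sh, the following is a minimal standalone sketch of the same
idea. It is not part of the commit: the function names sig_header_length and
copy_sig_header are hypothetical, and the sketch assumes GNU od (for
--endian=big) and GNU dd, as the script in the diff does.

```shell
# Hypothetical helper: print the signature header size, in bytes, of the RPM
# file given as $1.
sig_header_length() {
    lead_size=96
    # Skip the 96-byte lead and the 8-byte magic number, then read
    # index_count and data_length as two big-endian uint32 values.
    fields=$(od -An -t u4 -j $((lead_size + 8)) -N 8 --endian=big "$1")
    read -r index_count data_length <<EOF
$fields
EOF
    # 16 bytes of fixed-length fields + 16-byte index entries + data bytes.
    echo $((16 + index_count * 16 + data_length))
}

# Hypothetical helper: overwrite the signature header of $2 with the one
# from $1. Assumes both headers have the same size.
copy_sig_header() {
    length=$(sig_header_length "$1")
    # conv=notrunc leaves every byte after the copied region untouched.
    dd if="$1" of="$2" bs=1 skip=96 seek=96 count="$length" conv=notrunc 2>/dev/null
}
```

With these defined, running copy_sig_header on a locally built RPM and a
backed-up copy of the matching dist RPM, followed by cmp on the two files,
should report no differences if the build is reproducible. Because the copy
overwrites the destination in place, it should only ever be pointed at a
backup or a file that will be restored afterwards.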