Your message dated Sun, 5 Apr 2026 16:16:34 +0200
with message-id <[email protected]>
and subject line Closing
has caused the Debian Bug report #1093304,
regarding swift-container: swift corrupts container database
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
1093304: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1093304
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: swift-container
Version: 2.26.0-10+deb11u1+wmf1
Severity: normal
Hi,
We had an outage due to swift simultaneously quarantining all three
copies of a container database (saying they were corrupt) during a
listing operation. Given that all three databases were corrupt, I think
this is not a case of a disk/fs fault causing corruption, but rather
that swift processed an operation that wrote to the container DB in a
way that corrupted the sqlite file.
Each container-server had a backtrace like this:
Jan 5 07:20:28 ms-be2058 container-server: ERROR __call__ error with GET /sdb3/16503/AUTH_mw/wikipedia-commons-local-thumb.f8 :
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swift/common/db.py", line 475, in get
    yield conn
  File "/usr/lib/python3/dist-packages/swift/container/backend.py", line 1173, in list_objects_iter
    return [transform_func(r) for r in curs]
  File "/usr/lib/python3/dist-packages/swift/container/backend.py", line 1173, in <listcomp>
    return [transform_func(r) for r in curs]
sqlite3.DatabaseError: database disk image is malformed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swift/container/server.py", line 867, in __call__
    res = getattr(self, req.method)(req)
  File "/usr/lib/python3/dist-packages/swift/common/utils.py", line 2007, in _timing_stats
    resp = func(ctrl, *args, **kwargs)
  File "/usr/lib/python3/dist-packages/swift/container/server.py", line 752, in GET
    container_list = src_broker.list_objects_iter(
  File "/usr/lib/python3/dist-packages/swift/container/backend.py", line 1223, in list_objects_iter
    return results
  File "/usr/lib/python3.9/contextlib.py", line 135, in __exit__
    self.gen.throw(type, value, traceback)
  File "/usr/lib/python3/dist-packages/swift/common/db.py", line 483, in get
    self.possibly_quarantine(*sys.exc_info())
  File "/usr/lib/python3/dist-packages/swift/common/db.py", line 436, in possibly_quarantine
    self.quarantine(exc_hint)
  File "/usr/lib/python3/dist-packages/swift/common/db.py", line 414, in quarantine
    raise sqlite3.DatabaseError(detail)
sqlite3.DatabaseError: Quarantined /srv/swift-storage/sdb3/containers/16503/280/4077d9164732d6587761ef101bcbc280 to /srv/swift-storage/sdb3/quarantined/containers/4077d9164732d6587761ef101bcbc280 due to malformed database (txn: tx4d7ef4ae3a434f458e950-00677a32bc)
And, indeed, if I did an integrity check on the quarantined files, each
one showed the same errors:
mvernon@ms-be2073:~$ sqlite3 4077d9164732d6587761ef101bcbc280.db "PRAGMA integrity_check"
row 423322 missing from index ix_object_deleted_name
row 2701219 missing from index ix_object_deleted_name
This is quite surprising, given that rowids generally differ across the
3 databases. One of the reported rows still exists in the table (and
dates from 2016); the other does not.
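For comparing several copies side by side, the same check can be scripted. What follows is a minimal sketch using Python's standard sqlite3 module, not anything taken from swift: the function names `integrity_errors` and `compare_copies` are illustrative, and the sketch assumes the quarantined files have been copied somewhere readable.

```python
import sqlite3
from pathlib import Path


def integrity_errors(db_path):
    """Return the messages from PRAGMA integrity_check, or [] if the file is clean."""
    # Open read-only so that inspecting a quarantined copy cannot modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    except sqlite3.DatabaseError as exc:
        # Damage severe enough that the check itself cannot run.
        rows = [f"integrity_check failed: {exc}"]
    finally:
        conn.close()
    # A healthy database yields the single row "ok".
    return [] if rows == ["ok"] else rows


def compare_copies(paths):
    """Map each database path to its list of integrity errors."""
    return {str(p): integrity_errors(p) for p in paths}
```

Calling `compare_copies` with the paths of the three quarantined copies returns one list of messages per file, which makes it easy to see whether every copy reports the same rows.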
In all 3 cases, the latest object (based on rowid) is an object that was
deleted very shortly before the outage started - 07:19:50 UTC, with the
outage starting at 07:20:28.
It is perhaps significant that the last successful listing used that
object as a prefix in a list request at the same time as the object was
being deleted:
object with highest rowid:
19933856|f/f8/Gascones,_molino_(1988)_02.jpg/300px-Gascones,_molino_(1988)_02.jpg|1736061590.04401|0|application/deleted|noetag|1|0
final successful listing:
Jan 5 07:19:50 ms-fe2010 proxy-server: 10.194.179.98 10.192.16.76 05/Jan/2025/07/19/50 GET /v1/AUTH_mw/wikipedia-commons-local-thumb.f8%3Flimit%3D9000%26prefix%3Df%252Ff8%252FGascones%252C_molino_%25281988%2529_02.jpg%252F%26format%3Djson%26states%3Dlisting HTTP/1.0 200 - wikimedia/multi-http-client%20v1.1 AUTH_tk22395377a... - 511 - txc6028b8aef0d4705aef82-00677a3296 - 0.0301 - - 1736061590.006474018 1736061590.036608696 0
final delete:
ms-fe2011.proxylog.gz:Jan 5 07:19:50 ms-fe2011 proxy-server: 10.194.179.98 10.192.32.36 05/Jan/2025/07/19/50 DELETE /v1/AUTH_mw/wikipedia-commons-local-thumb.f8/f/f8/Gascones%252C_molino_%25281988%2529_02.jpg/300px-Gascones%252C_molino_%25281988%2529_02.jpg HTTP/1.0 204 - wikimedia/multi-http-client%20v1.1 AUTH_tk22395377a... - - - tx8d7b6c325ca54e89a4a08-00677a3296 - 0.1511 - - 1736061590.041912317 1736061590.193058491 0
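To see how close together these events were, the epoch timestamps quoted above can be put on one timeline. This is only a sketch under stated assumptions: that the last two timestamps on each proxy-server line are the request start and end times, and that the third field of the object row is its timestamp. The helper names `as_utc` and `timeline` are illustrative. The two requests were logged by different proxies, so clock skew between hosts could shift the result by a few milliseconds.

```python
from datetime import datetime, timezone
from decimal import Decimal

# Values copied from the log lines and the object row quoted above.
# Decimal keeps the nanosecond digits that a float would round away.
events = {
    "listing start": Decimal("1736061590.006474018"),
    "listing end": Decimal("1736061590.036608696"),
    "delete start": Decimal("1736061590.041912317"),
    "row timestamp": Decimal("1736061590.04401"),
    "delete end": Decimal("1736061590.193058491"),
}


def as_utc(ts):
    """Render an epoch timestamp as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(float(ts), tz=timezone.utc).isoformat()


def timeline(named_timestamps):
    """Return (name, seconds since the first event) pairs in chronological order."""
    ordered = sorted(named_timestamps.items(), key=lambda item: item[1])
    origin = ordered[0][1]
    return [(name, ts - origin) for name, ts in ordered]


if __name__ == "__main__":
    print("first event at", as_utc(min(events.values())))
    for name, offset in timeline(events):
        print(f"{offset * 1000:9.3f} ms  {name}")
```

Read this way, all five timestamps fall within about 190 ms of each other, and the listing's logged end time is roughly 5 ms before the delete's logged start time.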
There is further investigation of the incident at
https://phabricator.wikimedia.org/T383053. I have lots of logs, of which
these seem to be the most pertinent parts, but I can provide other log
extracts if you'd like.
Finally, I should note that this container contains image thumbnails,
generated by a separate service via a 404 handler in swift middleware -
see
https://github.com/wikimedia/operations-puppet/blob/45d5772c846e42269c2f1a19c8784fd9d2deb240/modules/swift/files/python3.9/SwiftMedia/wmf/rewrite.py#L48
Thanks,
Matthew
-- System Information:
Debian Release: 11.11
APT prefers oldstable-updates
APT policy: (500, 'oldstable-updates'), (500, 'oldstable-security'),
(500, 'oldstable-debug'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Kernel: Linux 5.10.0-30-amd64 (SMP w/48 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE
not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
Versions of packages swift-container depends on:
ii init-system-helpers 1.60
ii lsb-base 11.1.0
ii openstack-pkg-tools 117
ii python3 3.9.2-3
ii python3-pastescript 2.0.2-4
ii python3-swift 2.26.0-10+deb11u1+wmf1
ii rsync 3.2.3-4+deb11u1
ii swift 2.26.0-10+deb11u1+wmf1
ii uwsgi-plugin-python3 2.0.19.1-7.1
Versions of packages swift-container recommends:
ii swift-drive-audit 2.26.0-10+deb11u1+wmf1
swift-container suggests no packages.
--- End Message ---
--- Begin Message ---
Hi,
As this was forwarded upstream, there's nothing more I can do, and I
do not wish to keep this bug open forever with no action possible on
my side.
Please follow up at:
https://bugs.launchpad.net/swift/+bug/2141924
Cheers,
Thomas Goirand (zigo)
--- End Message ---