Package: s3ql
Version: 2.11.1+dfsg-1
Severity: critical
Justification: causes serious data loss
Dear Maintainer,

While running rsync to back up data to an s3ql file system mounted from Amazon's S3 service, the internet connection failed, resulting in the following errors from rsync:

rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "<file name removed>": Software caused connection abort (103)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
rsync: connection unexpectedly closed (17298 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]

I attempted to unmount the file system (twice), with the following result:

# umount.s3ql /media/server-external
File system appears to have crashed.

I then forced it to unmount as follows:

# fusermount -u -z /media/server-external

I then attempted to fsck the file system (twice; both runs gave the same result):

fsck.s3ql s3://<bucket name>/<file system prefix>
Enter file system encryption passphrase:
Starting fsck of s3://<bucket name>/<file system prefix>
Using cached metadata. Remote metadata is outdated.
Checking DB integrity...
Creating temporary extra indices...
Checking lost+found...
Checking cached objects...
Committing block 14 of inode 442809 to backend
Committing block 16 of inode 442809 to backend
Committing block 17 of inode 442809 to backend
Committing block 15 of inode 442809 to backend
Committing block 19 of inode 442809 to backend
Committing block 18 of inode 442809 to backend
Checking names (refcounts)...
Checking contents (names)...
Checking contents (inodes)...
Checking contents (parent inodes)...
Checking objects (reference counts)...
Checking objects (backend)...
..processed 100000 objects so far..
Dropping temporary indices...
Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/bin/fsck.s3ql", line 9, in <module>
    load_entry_point('s3ql==2.11.1', 'console_scripts', 'fsck.s3ql')()
  File "/usr/lib/s3ql/s3ql/fsck.py", line 1189, in main
    fsck.check()
  File "/usr/lib/s3ql/s3ql/fsck.py", line 85, in check
    self.check_objects_id()
  File "/usr/lib/s3ql/s3ql/fsck.py", line 848, in check_objects_id
    self.conn.execute('INSERT INTO obj_ids VALUES(?)', (obj_id,))
  File "/usr/lib/s3ql/s3ql/database.py", line 98, in execute
    self.conn.cursor().execute(*a, **kw)
  File "src/cursor.c", line 231, in resetcursor
apsw.ConstraintError: ConstraintError: PRIMARY KEY must be unique

Next I copied the entire Amazon bucket to a new bucket and attempted an fsck on the copy, minus the locally cached file system data:

# fsck.s3ql s3://<new bucket name>/<file system prefix>
Enter file system encryption passphrase:
Starting fsck of s3://<new bucket name>/<file system prefix>
Uncaught top-level exception:
Traceback (most recent call last):
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 381, in _convert_legacy_metadata
    meta_new['data'] = meta['data']
KeyError: 'data'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/bin/fsck.s3ql", line 9, in <module>
    load_entry_point('s3ql==2.11.1', 'console_scripts', 'fsck.s3ql')()
  File "/usr/lib/s3ql/s3ql/fsck.py", line 1111, in main
    param = backend.lookup('s3ql_metadata')
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 72, in lookup
    meta_raw = self._convert_legacy_metadata(meta_raw)
  File "/usr/lib/s3ql/s3ql/backends/comprenc.py", line 383, in _convert_legacy_metadata
    raise CorruptedObjectError('meta key data is missing')
s3ql.backends.common.CorruptedObjectError: meta key data is missing
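Regarding the first fsck traceback (the apsw.ConstraintError): my reading is that check_objects_id() builds a table of object ids with a PRIMARY KEY column and inserts every id it obtains from the backend, so the error means the same object id turned up twice. The following is only a minimal sketch of that failure mode using apsw; it is my own illustration under that assumption, not the actual fsck.s3ql code, and the table layout and duplicate-id listing are hypothetical:

    import apsw

    conn = apsw.Connection(':memory:')
    cursor = conn.cursor()
    # Single-column table with a PRIMARY KEY, analogous to the obj_ids
    # table that check_objects_id() inserts into.
    cursor.execute('CREATE TABLE obj_ids (id INTEGER PRIMARY KEY)')

    # Hypothetical backend listing that yields one object id twice, e.g.
    # an inconsistent S3 listing after the interrupted upload.
    listing = [1, 2, 3, 3]

    try:
        for obj_id in listing:
            cursor.execute('INSERT INTO obj_ids VALUES(?)', (obj_id,))
    except apsw.ConstraintError as exc:
        # Raised on the second "3"; with nothing catching it in fsck.py,
        # the exception propagates and aborts the entire fsck run.
        print(exc)

If the duplicate really does come from the backend listing rather than from genuine corruption, it seems as though fsck ought to be able to tolerate it (for example with INSERT OR IGNORE) instead of aborting outright, but I may well be misreading the check.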
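Regarding the second traceback (on the copied bucket): judging from the two comprenc.py lines shown, _convert_legacy_metadata() expects a 'data' entry in the object's metadata and converts its absence into a CorruptedObjectError before fsck can even read s3ql_metadata. A stripped-down sketch of that control flow, again my own reconstruction from the traceback rather than the s3ql source, with a hypothetical metadata dict:

    class CorruptedObjectError(Exception):
        """Stand-in for s3ql.backends.common.CorruptedObjectError."""

    def convert_legacy_metadata(meta):
        # Mirrors the two lines visible in the traceback: a missing
        # 'data' key is re-raised as CorruptedObjectError.
        meta_new = {}
        try:
            meta_new['data'] = meta['data']
        except KeyError:
            raise CorruptedObjectError('meta key data is missing')
        return meta_new

    # The s3ql_metadata object in the copied bucket apparently arrives
    # here without a 'data' key (hypothetical dict shown below).
    try:
        convert_legacy_metadata({'format': 'unexpected'})
    except CorruptedObjectError as exc:
        print(exc)   # meta key data is missing

If the bucket-to-bucket copy did not preserve all of the user-defined S3 metadata headers, that might at least explain why the copy fails differently from the original, but that is speculation on my part.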
NOTE: I'm not sure about the exact implication of "_convert_legacy_metadata" in the traceback above, but this was NOT a legacy file system; it had just been created with s3ql 2.11.1, as it was cheaper to rebuild it than to pull the 700 GB in the old copy down from Amazon to do the "verify" step specified as part of the upgrade procedure from older versions. At the time of this failure I had uploaded between 200 and 300 GB of deduplicated/compressed data to the new file system.

As things currently stand, unless I have overlooked or misunderstood something (which I consider entirely possible), this network connection failure has resulted in 100% data loss, unless fsck can be fixed in a manner that will allow it to complete correctly and recover the file system data. As I maintain other backups, no actual data has been lost (so far), but this makes s3ql unsafe to use and further attempts to back up my data to S3 pointless.

Regards,

Shannon Dealy

-- System Information:
Debian Release: 7.7
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.13-0.bpo.1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages s3ql depends on:
ii  fuse                   2.9.3-9
ii  libc6                  2.18-4
ii  libjs-sphinxdoc        1.1.3+dfsg-4
ii  libsqlite3-0           3.7.13-1+deb7u1
ii  psmisc                 22.19-1+deb7u1
ii  python3                3.4.2-1
ii  python3-apsw           3.8.6-r1-1
ii  python3-crypto         2.6.1-5+b2
ii  python3-defusedxml     0.4.1-2
ii  python3-dugong         3.3+dfsg-2
ii  python3-llfuse         0.40-2+b2
ii  python3-pkg-resources  5.5.1-1
ii  python3-requests       2.4.3-4

s3ql recommends no packages.

s3ql suggests no packages.

-- debconf-show failed