We use Debian for some embedded devices that use off-the-shelf flash drives for their primary storage. Since upgrading from etch to lenny and tweaking our partition layout, we've started seeing filesystem corruption occur very rapidly after we clone the filesystem (via partimage and resize2fs). While investigating, I've been able to reproduce the corruption with both etch's and lenny's partimage, with both etch's and lenny's e2fsprogs, with both the realtime-patched kernel we used under etch and lenny's stock amd64 kernel, with flash drives of different sizes, with different flash drive partition layouts, and on three different machines: one of our embedded devices, an off-the-shelf lenny server, and an off-the-shelf etch server. This doesn't make any sense to me.
While trying to figure all of this out, I've found that I can reproduce filesystem corruption 100% of the time simply by executing these commands:

    mke2fs -O has_journal,resize_inode,dir_index,filetype,sparse_super,large_file /dev/sdb2
    tune2fs -c 29 /dev/sdb2   # /dev/sdb is an external flash drive
    mount /dev/sdb2 /mnt/image
    cd /mnt/image
    tar xf ~/data.tar         # data.tar is a 71MB archive of the /var partition
    cd
    umount /mnt/image
    e2fsck -f /dev/sdb2

At this point, e2fsck starts complaining with errors like this:

    Symlink /lib/python-support/python2.5/_dbus_glib_bindings.so (inode #113416) is invalid.
    Clear<y>?

Turning off has_journal or adding -o data=journal to the mount command fixes the immediately preceding problem. (I haven't tested either change with our cloning procedure.) However, I don't want to go back to ext2, and data=journal seems to be barely documented. (What exactly does it do?)

We've seen other errors after cloning (subdirectories that point to their parents, "resize inode not valid", etc.), but these particular errors are completely reproducible.

The corruption occurs on more than one flash drive. badblocks -w /dev/sdb reports no errors (although I seem to remember one of the disks being bigger before running badblocks - do flash drives remap bad sectors?).

I can't imagine that Linux or Debian would be released with this sort of potentially severe, reproducible bug, but I'm at a loss to figure out what I might be doing wrong or what's specific to my setup. And I can't figure out why we've only been seeing it since upgrading to lenny when I can currently reproduce the problem under etch.

Any help would be greatly appreciated. Thanks.

Josh Kelley