We are on ESXi, version 5.5, according to our hosting provider. To run the test, I created a new, 40GB virtual disk. It's attached as SCSI ID 1:3, on a VMware Paravirtual SCSI adapter. I set it up as a PV and put ext4 on it the "fast way," which is what I always do:

    pvcreate /dev/sdd
    vgextend mongo03-vg00 /dev/sdd
    lvcreate -n test -l 100%PV mongo03-vg00 /dev/sdd
    mount
    mkfs.ext4 -E lazy_itable_init=1 -O uninit_bg /dev/mongo03-vg00/test

Then I ran the test. Initially it passed, twice:

    /test/db.1 /test
    repro in db.1
    creating f0
    creating f1
    touching files
    hexdump f0:
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    ...

But then, after doing some "multitasking" (we're working on an upgrade to MongoDB 2.6.5) and noticing that it was finished, I ran "sync" three times and checked the f0 files again:

    hexdump db.1/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0001000 cccc cccc cccc cccc cccc cccc cccc cccc
    *
    0007000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.10/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.2/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0001000 cccc cccc cccc cccc cccc cccc cccc cccc
    *
    0007000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.3/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0001000 cccc cccc cccc cccc cccc cccc cccc cccc
    *
    0007000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.4/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    0001000 cccc cccc cccc cccc cccc cccc cccc cccc
    *
    0007000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.5/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.6/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.7/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.8/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *
    hexdump db.9/f0
    0000000 0000 0000 0000 0000 0000 0000 0000 0000
    *
    1000000
    *

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1371591

Title:
  file not initialized to 0s under some conditions

Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux-lts-trusty” package in Ubuntu:
  Confirmed
Status in “linux” source package in Precise:
  Invalid
Status in “linux-lts-trusty” source package in Precise:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Released
Status in “linux-lts-trusty” source package in Trusty:
  Invalid

Bug description:
  SRU Justification:

  [Impact]

  Under some conditions, after fallocate() the file is observed not to
  be completely initialized to 0s: some 4KB pages have left-over data
  from previous files that occupied those pages. Note that in addition
  to causing functional problems for applications expecting files to be
  initialized to 0s, this is a security issue because it allows data to
  "leak" from one file to another, bypassing file access controls.

  The problem has been seen running under the following VMware-based
  virtual environments:

  Fusion 6.0.2
  ESXi 5.1.0

  And under the following versions of Ubuntu:

  Ubuntu 12.04, 3.11.0-26-generic
  Ubuntu 14.04.1, 3.13.0-32-generic
  Ubuntu 14.04.1, 3.13.0-35-generic

  But did not reproduce under the following version:

  Ubuntu 10.04, 2.6.32-38-server

  The problem reproduced under LVM, but did not reproduce without LVM.
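
The check described under [Impact] can also be scripted instead of reading hexdump output by eye. The following is only a sketch, not part of the attached repro: the function name nonzero_pages and the 4096-byte page size are assumptions chosen to match the 4KB pages mentioned in the report. It lists the offset of every page in a file that contains at least one non-zero byte.

```python
import sys

PAGE_SIZE = 4096  # assumption: matches the 4KB pages mentioned in the report


def nonzero_pages(path, page_size=PAGE_SIZE):
    """Return the byte offsets of all pages in `path` that are not entirely zero."""
    bad_offsets = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(page_size)
            if not chunk:
                break  # end of file
            # A clean page consists of zero bytes only.
            if chunk.count(0) != len(chunk):
                bad_offsets.append(offset)
            offset += len(chunk)
    return bad_offsets


if __name__ == "__main__":
    # Usage: python3 check_zero.py db.*/f0
    for path in sys.argv[1:]:
        offsets = nonzero_pages(path)
        if offsets:
            print(path, "non-zero pages at:", " ".join(hex(o) for o in offsets))
        else:
            print(path, "all zero")
```

For a file showing the symptom from the hexdump listings (0xcc from offset 0x1000 up to 0x7000), this would report the six pages at 0x1000 through 0x6000. A file that reads back clean is not proof that the system is unaffected, since the corruption was seen to appear only some time after the test had run.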

  [Test Case]

  I reproduced the problem as follows under VMware Fusion:

  set up custom VM with default disk size (20 GB) and memory size (1 GB)
  attach Ubuntu 14.04.1 ISO to CDROM, set it as boot device, boot up
  select all defaults during installation _including_ LVM
  install gcc
  unpack the attached repro.tgz
  run repro.sh

  what it does:

  * fills the disk with a file containing bytes of 0xcc then deletes it
  * repeatedly runs the repro program which creates two files and
    accesses them in a certain pattern
  * checks the file f0 with hexdump; it should contain all 0s, but if
    pages 0x1000-0x7000 contain 0xcc you have reproduced the problem

  If the problem does not appear to reproduce, please try waiting a bit
  and checking the f0 files with hexdump again. This behavior was
  observed by a customer reproducing the problem under ESXi. I have
  since added a sync after running the repro binary, which I think will
  fix that.

  If you still can't reproduce the problem, please let me know if
  there's anything I can do to help. For example, can we trace the disk
  accesses at the SCSI level to verify whether the appropriate SCSI
  commands are being sent? This may help determine whether the problem
  is in Linux or in VMware.

  [Fix]

  mptfusion: enable no_write_same in scsi_host_template
  commit 4089b71cc820a426d601283c92fcd4ffeb5139c2 upstream
  https://lkml.org/lkml/2014/9/25/482

  (Note this patch may be reverted in the future as there is active
  discussion upstream about a more generic fix)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1371591/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp