We are on ESXi, version 5.5, according to our hosting provider.

To run the test, I created a new, 40GB virtual disk.  It's attached as
SCSI ID 1:3, on a VMware Paravirtual SCSI adapter.  I set it up as a PV
and put ext4 on it the "fast way," which is what I always do:

  pvcreate /dev/sdd
  vgextend mongo03-vg00 /dev/sdd
  lvcreate -n test -l 100%PV mongo03-vg00 /dev/sdd
  mount
  mkfs.ext4 -E lazy_itable_init=1 -O uninit_bg /dev/mongo03-vg00/test

Then I ran the test.  Initially it passed, twice:

/test/db.1 /test
repro in db.1
creating f0
creating f1
touching files
hexdump f0:
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
...

But then, after doing some "multitasking" (we're working on an upgrade
to MongoDB 2.6.5) and noticing that it was finished, I ran "sync" three
times and checked the f0 files again:

hexdump db.1/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000 cccc cccc cccc cccc cccc cccc cccc cccc
*
0007000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.10/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.2/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000 cccc cccc cccc cccc cccc cccc cccc cccc
*
0007000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.3/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000 cccc cccc cccc cccc cccc cccc cccc cccc
*
0007000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.4/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
0001000 cccc cccc cccc cccc cccc cccc cccc cccc
*
0007000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.5/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.6/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.7/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.8/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*
hexdump db.9/f0
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
1000000
*

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1371591

Title:
  file not initialized to 0s under some conditions

Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux-lts-trusty” package in Ubuntu:
  Confirmed
Status in “linux” source package in Precise:
  Invalid
Status in “linux-lts-trusty” source package in Precise:
  Fix Released
Status in “linux” source package in Trusty:
  Fix Released
Status in “linux-lts-trusty” source package in Trusty:
  Invalid

Bug description:
  SRU Justification:

  [Impact]

  Under some conditions, after fallocate() the file is observed not to
  be completely initialized to 0s: some 4KB pages have left-over data
  from previous files that occupied those pages. Note that in addition
  to causing functional problems for applications expecting files to be
  initialized to 0s, this is a security issue because it allows data to
  "leak" from one file to another, bypassing file access controls.

  The problem has been seen running under the following VMWare-based
  virtual environments:
  Fusion 6.0.2
  ESXi 5.1.0

  And under the following versions of Ubuntu:
  Ubuntu 12.04, 3.11.0-26-generic
  Ubuntu 14.04.1, 3.13.0-32-generic
  Ubuntu 14.04.1, 3.13.0-35-generic

  But did not reproduce under the following version:
  Ubuntu 10.04, 2.6.32-38-server

  The problem reproduced under LVM, but did not reproduce without LVM.

  [Test Case]

  I reproduced the problem as follows under VMWare Fusion:
  set up custom VM with default disk size (20 GB) and memory size (1 GB)
  attach Ubuntu 14.04.1 ISO to CDROM, set it as boot device, boot up
  select all defaults during installation _including_ LVM
  install gcc
  unpack the attached repro.tgz
  run repro.sh

  what it does:
  * fills the disk with a file containing bytes of 0xcc then deletes it
  * repeatedly runs the repro program which creates two files and
    accesses them in a certain pattern
  * checks the file f0 with hexdump; it should contain all 0s, but if
    pages 0x1000-0x7000 contain 0xcc you have reproduced the problem
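
  The hexdump check in the last step can also be scripted. The following
  is a minimal sketch, assuming GNU cmp (for the -n option) and the
  db.N/f0 layout used in this report; is_all_zero and check_f0_files are
  hypothetical helper names and are not part of repro.tgz:

```shell
# Hypothetical helper (not part of repro.tgz): succeeds only if every
# byte of the file is 0. Assumes GNU cmp (-n) and a readable /dev/zero.
is_all_zero() {
    [ -r "$1" ] || return 2
    size=$(wc -c < "$1")
    # Compare exactly $size bytes of the file against a stream of zeros.
    cmp -s -n "$size" "$1" /dev/zero
}

# Report the state of f0 in each directory given (e.g. db.1 db.2 ...).
# Returns non-zero if any f0 is unreadable or contains non-zero data.
check_f0_files() {
    status=0
    for dir in "$@"; do
        if [ ! -r "$dir/f0" ]; then
            echo "$dir/f0: cannot read"
            status=1
        elif is_all_zero "$dir/f0"; then
            echo "$dir/f0: all zeros"
        else
            echo "$dir/f0: non-zero data found"
            status=1
        fi
    done
    return $status
}
```

  For example, "check_f0_files db.*" run from the test directory, once
  right after the repro and again after a sync, would flag the files
  that hexdump shows with 0xcc pages. It only says whether a file is
  clean, so hexdump is still needed to see which pages are affected.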

  If the problem does not appear to reproduce, please try waiting a bit
  and checking the f0 files with hexdump again. This behavior was
  observed by a customer reproducing the problem under ESXi. I have
  since added a sync after running the repro binary, which I think
  will fix that.

  If you still can't reproduce the problem please let me know if there's
  anything I can do to help. For example, can we trace the disk accesses
  at the SCSI level to verify whether the appropriate SCSI commands are
  being sent? This may help determine whether the problem is in Linux or
  in VMWare.

  [Fix]

  mptfusion: enable no_write_same in scsi_host_template
  commit 4089b71cc820a426d601283c92fcd4ffeb5139c2 upstream

  https://lkml.org/lkml/2014/9/25/482

  (Note this patch may be reverted in the future as there is active
  discussion upstream about a more generic fix)
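
  One way to see whether the kernel is prepared to send WRITE SAME to a
  given disk is the write_same_max_bytes queue attribute in sysfs, where
  a value of 0 means WRITE SAME is not used for that device. With
  no_write_same set by the driver, the attribute should read 0. The
  following is a minimal sketch, assuming the running kernel exposes the
  attribute; write_same_limit and the SYSFS_ROOT override are
  hypothetical names introduced only for illustration:

```shell
# Hypothetical helper: print the WRITE SAME limit the kernel advertises
# for a block device such as "sdd". SYSFS_ROOT is an assumed override
# used for illustration and testing; it defaults to /sys.
write_same_limit() {
    dev=$1
    root=${SYSFS_ROOT:-/sys}
    attr="$root/block/$dev/queue/write_same_max_bytes"
    if [ -r "$attr" ]; then
        # 0 means the kernel will not issue WRITE SAME to this device.
        echo "$dev: write_same_max_bytes=$(cat "$attr")"
    else
        echo "$dev: write_same_max_bytes not available"
        return 1
    fi
}
```

  For example, "write_same_limit sdd" on the affected guest, before and
  after installing a fixed kernel, would show whether the setting
  changed for the disk backing the LVM volume.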

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1371591/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
