** Changed in: linux-aws (Ubuntu)
Status: Fix Committed => Fix Released
** Changed in: linux-aws (Ubuntu Xenial)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpa
> This bug is still present on 14.04 using linux-generic-lts-xenial
> kernel 4.4.0-87-generic.
That's correct, and there is no planned change for the standard kernel.
Only the linux-aws kernel is being changed to address this issue, by
disabling Xen memory ballooning, as described in comment 50.
A
> This bug is still present on 14.04 using linux-generic-lts-xenial
> kernel 4.4.0-87-generic.
Sorry, I misread your statement - unless you have edited your udev rule
to enable the hotplug memory, you should not encounter this issue using
Trusty with either the 3.13 or 4.4 kernel. If you are, I sug
This bug is still present on 14.04 using linux-generic-lts-xenial kernel
4.4.0-87-generic.
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1668129
Title:
Amazon I3 Instance Buffer I/O error on dev nvm
> Is there any intention of backporting either linux-aws or any of the NVMe bug
> fixes from linux-aws into linux-image-virtual-lts-xenial for Ubuntu 14.04
> since they're both 4.4.0 kernels?
the fix for this bug is to change the kernel config param
CONFIG_XEN_BALLOON from y to n, disabling
** Also affects: linux-lts-xenial (Ubuntu)
Importance: Undecided
Status: New
** No longer affects: linux-lts-xenial (Ubuntu)
** No longer affects: linux-lts-xenial (Ubuntu Xenial)
Is there any intention of backporting either linux-aws or any of the
NVMe bug fixes from linux-aws into linux-image-virtual-lts-xenial for
Ubuntu 14.04 since they're both 4.4.0 kernels?
I'm not sure when a new AMI build is scheduled, but you can 'sudo apt
install linux-aws' currently in an existing xenial instance to upgrade
to the AWS-specific kernel that has xen ballooning disabled, which fixes
this problem.
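After installing the package and rebooting, it may be worth confirming that the instance actually booted the AWS-specific kernel. A minimal sketch, assuming the linux-aws kernel release string ends in `-aws` (the helper name `is_aws_kernel` is made up for illustration):

```shell
# Return success if the given kernel release (default: the running kernel)
# looks like a linux-aws kernel. Assumption: its release string ends in "-aws".
is_aws_kernel() {
    case "${1:-$(uname -r)}" in
        *-aws) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, `is_aws_kernel && echo "running linux-aws"` prints nothing on a generic kernel such as 4.4.0-87-generic.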
Hi, is there any ETA for the fix release?
** Changed in: linux-aws (Ubuntu)
Status: Fix Released => Fix Committed
I think I changed the status by mistake (Did not know I could do that),
and I'm unable to revert it :/
** Changed in: linux-aws (Ubuntu)
Status: Fix Committed => Fix Released
** Description changed:
On the AWS i3 instance class - when putting the new NVME storage disks
under high IO load - seeing data corruption and errors in dmesg
-
[ 662.884390] blk_update_request: I/O error, dev nvme0n1, sector 120063912
[ 662.887824] Buffer I/O error on dev nvme0n1, l
Thanks for figuring that out! This was using the 16.04 HVM image in
us-east-1 (ami-2757f631) plus hardware enablement (linux-generic-hwe-16.04).
Ok, I figured out the problem.
You're using the yakkety kernel, 4.8. In the Xenial 4.4 kernel, memory
hotplug auto-onlining is disabled; however in the 4.8 kernel, memory
hotplug auto-onlining is enabled, so disabling the udev rule with the
4.8 kernel does nothing - the kernel's already onlined t
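For anyone wanting to check which behaviour their own kernel has, here is a rough sketch. It assumes the kernel exposes its auto-onlining policy in `/sys/devices/system/memory/auto_online_blocks`; on kernels without that file the helper just reports "unsupported". The helper name is made up for illustration.

```shell
# Print the kernel's memory hotplug auto-onlining policy, if exposed.
# Assumption: the policy lives in the sysfs file below; older kernels
# that lack the file are reported as "unsupported".
auto_online_policy() {
    f="${1:-/sys/devices/system/memory/auto_online_blocks}"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unsupported"
    fi
}
```

A result of `online` would mean hotplugged memory is onlined by the kernel itself, regardless of any udev rule.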
Patrick,
can you attach your /proc/zoneinfo file? Also, which image type is
this?
Just to be thorough:
# find / -xdev -type f -name '*.rules' -print0 | xargs -0 fgrep memory
/lib/udev/rules.d/40-vm-hotadd.rules:# On Hyper-V and Xen Virtual Machines we want to add memory and cpus as soon as they appear
/lib/udev/rules.d/40-vm-hotadd.rules:#SUBSYSTEM=="memory", ACTION=="add",
$ grep 0 /sys/devices/system/memory/memory*/online
/sys/devices/system/memory/memory504/online:0
$ grep memory /lib/udev/rules.d/* /etc/udev/rules.d/*
/lib/udev/rules.d/40-vm-hotadd.rules:# On Hyper-V and Xen Virtual Machines we want to add memory and cpus as soon as they appear
/lib/udev/rules.
And, rebuild your initramfs, to make sure it doesn't have a stale udev
rule in it (although mine doesn't contain the memory hotadd udev rule):
$ sudo update-initramfs -u
Also,
$ grep memory /lib/udev/rules.d/* /etc/udev/rules.d/*
Patrick, does this command return any results:
$ grep 0 /sys/devices/system/memory/memory*/online
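If a plain list of block names is easier to paste into the bug, the same check can be sketched as a small helper. This is only an illustration (the function name is made up); it reads the same `online` files as the grep above and prints the names of blocks whose state is exactly `0`.

```shell
# List memory blocks that exist but are not online.
# The optional argument overrides the sysfs directory, which is handy
# for trying the function against a copy of the tree.
offline_memory_blocks() {
    base="${1:-/sys/devices/system/memory}"
    for f in "$base"/memory*/online; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "0" ]; then
            # Print the block directory name, e.g. memory504
            basename "$(dirname "$f")"
        fi
    done
}
```

No output means every memory block the kernel knows about has been onlined.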
I've applied the udev rules change and it doesn't seem to make a
difference on the instances I'm testing with:
(after applying the change and reloading udev, rebooting, etc)
ubuntu@hot-i3-muguasak:~$ cat /proc/zoneinfo
...
Node 0, zone Normal
pages free 14714755
min 7663
** Tags removed: kernel-key
** Changed in: linux-aws (Ubuntu)
Status: In Progress => Fix Committed
** Changed in: linux-aws (Ubuntu Xenial)
Status: In Progress => Fix Committed
> Excellent. Sanity check here: This also means that trusty is not affected
> because the udev rules don't match.
...
> in xenial...
> and includes ATTR{[dmi/id]sys_vendor}=="Xen", GOTO="vm_hotadd_apply" which
> does trigger the bug.
that's correct, on 14.04 the Xen balloon memory is not switc
Excellent. Sanity check here: This also means that trusty is not
affected because the udev rules don't match.
I have /lib/udev/rules.d/40-hyperv-hotadd.rules:
# On Hyper-V Virtual Machines we want to add memory and cpus as soon as they appear
ATTR{[dmi/id]sys_vendor}!="Microsoft Corporation", GO
Great, thanks very much!
> * When would you (roughly) expect an AMI to be available?
It's too early to say; there are various steps before the fix lands in
an AMI.
> * How high is your confidence in the 40-vm-hotadd.rules change
> workaround? Sounds like very high?
100%. If it doesn't work for you, please let me know.
>
Note on above; once hotadd is disabled, the xen balloon driver will
still perform the memory hotplug, but the added pages won't be available
for use. So you can check /proc/zoneinfo, and look at the Normal zone,
e.g.:
with hotadd enabled (the default in Ubuntu):
Node 0, zone Normal
pages free
@ddstreet a few quick questions
* When would you (roughly) expect an AMI to be available?
* How high is your confidence in the 40-vm-hotadd.rules change workaround?
Sounds like very high?
* For those of us who are not knowledgeable about this subsystem, are there any
drawbacks or things to watc
** Also affects: linux-aws (Ubuntu)
Importance: Undecided
Status: New
** Changed in: linux-aws (Ubuntu)
Assignee: (unassigned) => Dan Streetman (ddstreet)
** Changed in: linux-aws (Ubuntu Xenial)
Assignee: (unassigned) => Dan Streetman (ddstreet)
** Changed in: linux-aws (Ubu
For those watching this bug, to work around this until there is an AMI
available that fixes it, you can disable udev memory hotadd by changing
the /lib/udev/rules.d/40-vm-hotadd.rules file to comment out the memory
hotadd rule, like this:
--- /lib/udev/rules.d/40-vm-hotadd.rules.old 2017-03-01
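
The same edit can be sketched as a one-line sed call. This is an illustration, not the exact patch from the comment: it assumes the rule to disable starts at the beginning of a line with `SUBSYSTEM=="memory", ACTION=="add"`, and the function name is made up.

```shell
# Comment out the memory hotadd rule, keeping a .bak copy of the file.
# Assumption: the rule line begins exactly with
#   SUBSYSTEM=="memory", ACTION=="add"
# Lines that are already commented out are left untouched.
disable_memory_hotadd() {
    rules="${1:-/lib/udev/rules.d/40-vm-hotadd.rules}"
    sed -i.bak 's/^SUBSYSTEM=="memory", ACTION=="add"/#&/' "$rules"
}
```

The function has to run as root to modify the real file. Afterwards, `sudo udevadm control --reload-rules` makes udev pick up the change, and `sudo update-initramfs -u` followed by a reboot makes sure no stale copy of the rule is used at boot.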
>> FYI: RHEL 7.3 does not suffer from this problem and appears to have
>> ballooning enabled:
> I imagine CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is set for the Ubuntu kernel?
yes exactly, CONFIG_XEN_BALLOON_MEMORY_HOTPLUG must be enabled for it to
actually increase the physical memory region, which is
I imagine CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is set for the Ubuntu
kernel?
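
One way to answer that on any given machine is to read the installed kernel config, as in the RHEL check quoted in this thread. A small sketch (the helper name `kconfig_value` is made up; it assumes the config is installed as `/boot/config-$(uname -r)`):

```shell
# Print the value of a kernel config option: the configured value,
# "n" if the file says "is not set", or "unknown" if it is not mentioned.
# Usage: kconfig_value OPTION [CONFIG_FILE]
kconfig_value() {
    opt="$1"
    cfg="${2:-/boot/config-$(uname -r)}"
    if grep -q "^${opt}=" "$cfg" 2>/dev/null; then
        grep "^${opt}=" "$cfg" | cut -d= -f2
    elif grep -q "^# ${opt} is not set" "$cfg" 2>/dev/null; then
        echo "n"
    else
        echo "unknown"
    fi
}
```

Running `kconfig_value CONFIG_XEN_BALLOON` and `kconfig_value CONFIG_XEN_BALLOON_MEMORY_HOTPLUG` on each distro gives a direct comparison.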
FYI: RHEL 7.3 does not suffer from this problem and appears to have
ballooning enabled:
$ grep CONFIG_XEN_BALLOON /boot/config-3.10.0-514.el7.x86_64
CONFIG_XEN_BALLOON=y
# CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is not set
$ uname -a
Linux cassandra-a-2 3.10.0-514.el7.x86_64 #1 SMP Wed Oct 19 11:24:13
Yes, ballooning has been a constant source of problems which is why it
is disabled in Amazon Linux AMI.
We do not currently support DMA to/from guest physical addresses outside
of the E820 map for ENA networking or NVMe storage interfaces. This
effectively means that ballooning needs to be disable
> We see an address of 0xfc7ffb000
Hi Matt,
I don't think you're accounting for the additional pages due to the Xen
balloon, are you? That increases physical memory, after boot. If you
check the /proc/zoneinfo file, look at the Normal zone's spanned pages
and start pfn, e.g.:
Node 0, zone No
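
To turn the Normal zone's start pfn and spanned pages into an address that can be compared with a DMA address, something like the following could be used. It is a sketch under stated assumptions: 4 KiB pages, the `spanned` and `start_pfn:` field names as they appear in /proc/zoneinfo, and only the last Normal zone listed is considered. The function name is made up.

```shell
# Print the first physical address past the end of the Normal zone,
# i.e. (start_pfn + spanned) * 4096, in hex.
# Assumptions: 4 KiB pages; if several nodes have a Normal zone,
# the last one listed wins.
normal_zone_end_addr() {
    zi="${1:-/proc/zoneinfo}"
    # awk only extracts the two numbers; the arithmetic is done by the
    # shell so large values are not truncated by the awk implementation.
    set -- $(awk '
        /^Node/ { in_normal = ($NF == "Normal") }
        in_normal && $1 == "spanned"    { spanned = $2 }
        in_normal && $1 == "start_pfn:" { start = $2 }
        END { print start, spanned }
    ' "$zi")
    printf '0x%x\n' $(( ($1 + $2) * 4096 ))
}
```

An address below this value but above the top of the E820 map would fall in memory that was added after boot.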
Dan,
It appears that the requests that are being submitted refer to DMA
addresses that exceed the guest physical memory range, and this is why
the requests are being failed. The address seen is outside the E820 map:
[ 0.00] e820: BIOS-provided physical RAM map:
[ 0.00] BIOS-e820: [mem 0x0
On an i3 instance in east-1, where I can reproduce fairly easily, the
errors I'm getting unfortunately don't help. The nvme controller is
failing some requests, but it isn't providing any useful info about why
it doesn't like the requests. For example, here is some debug I added:
[ 1464.634709] nv
This is reproducible with the latest upstream kernel as well (4.10), so
this isn't a bug in the Ubuntu kernel; it will require an upstream fix
and a backport of that fix into xenial/yakkety.
I have a quick reproducer now:
NCPUS=...whatever...
for n in $( seq 1 $NCPUS ) ; do ( dd if=/dev/zero of=/mnt/test/out$n bs=1024k count=1024k ) & done
** Changed in: linux (Ubuntu Xenial)
Assignee: (unassigned) => Dan Streetman (ddstreet)
** Changed in: linux (Ubuntu)
Assignee: (unassigned) => Dan Streetman (ddstreet)
FYI - it doesn't occur on RHEL 7.3 nor Amazon Linux.
@ddstreet I can replicate the issue in us-west-2. All i3.2xls using ext4
as the filesystem I'm testing on.
ami-4e98182e (precise) - doesn't recognize device
ami-17ac2c77 (trusty) - no error
ami-edf6758d (xenial) - I/O errors
ami-a49b1bc4 (yakkety) - I/O errors
** Changed in: linux (Ubuntu)
Importance: High => Critical
** Changed in: linux (Ubuntu Xenial)
Importance: High => Critical
(This was in eu-west-1, with ami-405f7226)
I find it is easy to reproduce by using "dd if=/dev/zero of=big bs=4096"
-- I generally get errors within a few minutes.
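
To see whether a run has hit the problem, the kernel log can be scanned for the two messages from the bug description. A minimal sketch (the function name is made up; it matches only the `blk_update_request: I/O error` and `Buffer I/O error` lines for nvme devices):

```shell
# Count kernel log lines that report NVMe I/O errors.
# Reads the named file, or standard input when no file is given.
count_nvme_io_errors() {
    grep -cE 'blk_update_request: I/O error, dev nvme|Buffer I/O error on dev nvme' "${1:--}"
}
```

For example, `dmesg | count_nvme_io_errors` prints 0 while the instance is still healthy.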
** Changed in: linux (Ubuntu)
Importance: Undecided => High
** Also affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu Xenial)
Importance: Undecided => High
** Changed in: linux (Ubuntu Xenial)
Status: New => Triaged
** Changed i
I don't know that I do -- I'm finding these errors when rsync'ing a
larger database from another machine.
I'm using ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-20170221
(ami-a58d0dc5)
@matt-e do you have a quicker reproducer I can use? I've been trying
bonnie++ from the description, but that doesn't repro at all in west-2
for me, and only once in east-1 so far. Also what AMI are you using in
west-2 that shows the problem?
I've had this issue on 4 different instances in us-west-2 -- two I still
have running -- can I help?
@ddstreet us-east-1
@rram can you try to reproduce with us-west-2? I'm able to repro in
east-1 but not in west-2.
@rram what region are you using?
I tested this on a few i3.2xlarge instances using bonnie++ and an ext4
mounted filesystem
ami-cc10c1da (precise) - does not recognize the ephemeral device
ami-822bfa94 (trusty) - does NOT appear to be affected
ami-1ac0120c (xenial) - is affected
ami-e600d1f0 (yakkety) - is affected
apport information
** Tags added: apport-collected ec2-images xenial
** Description changed:
On the AWS i3 instance class - when putting the new NVME storage disks
under high IO load - seeing data corruption and errors in dmesg
[ 662.884390] blk_update_request: I/O error, dev nvme0n