> We see an address of 0xfc7ffb000

Hi Matt,

I don't think you're accounting for the additional pages added by the
Xen balloon, are you?  The balloon increases physical memory after boot.
In /proc/zoneinfo, look at the Normal zone's spanned pages and
start_pfn, e.g.:

Node 0, zone   Normal
  pages free     15116671
        min      7661
        low      22873
        high     38085
   node_scanned  0
        spanned  15499264
        present  15499264
        managed  15212161
...
  start_pfn:           1048576


and so,
$ printf "%x\n" $(( 1048576 + 15499264 ))
fc8000

meaning that the address you see (pfn fc7ffb, assuming 4 KiB pages) is
just below the end of the zone, among the pages in the balloon memory
region...
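
To spell that arithmetic out, here is a rough sketch in Python.  It
assumes 4 KiB pages (so pfn = address >> 12), and the helper names are
made up for illustration; the numbers are the ones quoted above.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages, as on x86_64


def addr_to_pfn(addr):
    """Convert a physical address to its page frame number."""
    return addr >> PAGE_SHIFT


def zone_end_pfn(start_pfn, spanned):
    """First pfn past the end of a zone (exclusive upper bound)."""
    return start_pfn + spanned


# Values from the /proc/zoneinfo output above.
start_pfn = 1048576
spanned = 15499264

pfn = addr_to_pfn(0xfc7ffb000)
end = zone_end_pfn(start_pfn, spanned)

print(f"pfn      = {pfn:x}")   # fc7ffb
print(f"zone end = {end:x}")   # fc8000
print("inside Normal zone:", start_pfn <= pfn < end)
print("pages below zone end:", end - pfn)
```

So the page sits only a handful of pages below the very top of the
Normal zone.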

I disabled Ubuntu's memory hotadd (commented it out in
/lib/udev/rules.d/40-vm-hotadd.rules) and rebooted.  The Normal zone's
present pages were reduced so that the zone now ends at fc0000, matching
the boot time max pfn.  I then tried to reproduce the problem and it
seems gone!
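
If you want to compare the zone end before and after a change like
this, a small parser along these lines might help.  It is only a
sketch: the function name is hypothetical, it looks at the first zone
with the given name, and it runs on a pasted sample rather than the
live file (on a Linux box you could pass in the contents of
/proc/zoneinfo instead).

```python
import re

# Trimmed sample in the /proc/zoneinfo layout quoted earlier.
SAMPLE = """\
Node 0, zone   Normal
  pages free     15116671
        spanned  15499264
        present  15499264
        managed  15212161
  start_pfn:           1048576
"""

BOOT_MAX_PFN = 0xfc0000  # boot time max pfn reported in this thread
PAGE_SIZE = 4096         # assumption: 4 KiB pages


def zone_end_pfn_from_text(text, zone="Normal"):
    """Return start_pfn + spanned for the first zone named `zone`."""
    in_zone = False
    spanned = start_pfn = None
    for line in text.splitlines():
        header = re.match(r"Node\s+\d+,\s+zone\s+(\S+)", line)
        if header:
            in_zone = header.group(1) == zone
            continue
        if not in_zone:
            continue
        fields = line.split()
        if len(fields) == 2 and fields[0] == "spanned":
            spanned = int(fields[1])
        elif len(fields) == 2 and fields[0] == "start_pfn:":
            start_pfn = int(fields[1])
        if spanned is not None and start_pfn is not None:
            return start_pfn + spanned
    raise ValueError(f"zone {zone!r} not found or incomplete")


end = zone_end_pfn_from_text(SAMPLE)
extra_pages = end - BOOT_MAX_PFN
print(f"zone end: {end:x}")
print("pages past boot max pfn:", extra_pages)
print("MiB past boot max pfn:", extra_pages * PAGE_SIZE // 2**20)
```

With the numbers from this thread, the zone extends 0x8000 pages
(128 MiB at 4 KiB per page) past the boot time max pfn while hotadd is
enabled.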

So I think that must be the issue; the hypervisor's NVMe driver isn't
expecting any pages from the Xen ballooned region.  I checked on Amazon
Linux, and saw why it isn't affected:

$ grep XEN_BALLOON /boot/config-4.4.41-36.55.amzn1.x86_64 
# CONFIG_XEN_BALLOON is not set

I suspect that avoids quite a lot of problems for Amazon Linux, as the
Xen ballooning is quite annoying (see bug 1518457 comment 126, for
example).

Maybe Ubuntu should disable Xen ballooning for AWS also?  If not, then
this seems to be a hypervisor bug: it needs to allow pages from the
ballooned region as well.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1668129

Title:
  Amazon I3 Instance Buffer I/O error on dev nvme0n1

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1668129/+subscriptions
