This bug was fixed in the package linux - 4.4.0-51.72 --------------- linux (4.4.0-51.72) xenial; urgency=low
[ Luis Henriques ] * Release Tracking Bug - LP: #1644611 * 4.4.0-1037-snapdragon #41: kernel panic on boot (LP: #1644596) - Revert "dma-mapping: introduce the DMA_ATTR_NO_WARN attribute" - Revert "powerpc: implement the DMA_ATTR_NO_WARN attribute" - Revert "nvme: use the DMA_ATTR_NO_WARN attribute" linux (4.4.0-50.71) xenial; urgency=low [ Luis Henriques ] * Release Tracking Bug - LP: #1644169 * xenial 4.4.0-49.70 kernel breaks LXD userspace (LP: #1644165) - Revert "UBUNTU: SAUCE: (namespace) fuse: Allow user namespace mounts by default" - Revert "UBUNTU: SAUCE: (namespace) fs: Don't remove suid for CAP_FSETID for userns root" - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Don't remove suid for CAP_FSETID in s_user_ns"" - Revert "UBUNTU: SAUCE: (namespace) fs: Allow superblock owner to change ownership of inodes" - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Allow superblock owner to change ownership of inodes with unmappable ids"" - Revert "UBUNTU: SAUCE: (namespace) security/integrity: Harden against malformed xattrs" - Revert "(namespace) Revert "UBUNTU: SAUCE: ima/evm: Allow root in s_user_ns to set xattrs"" - Revert "(namespace) dquot: For now explicitly don't support filesystems outside of init_user_ns" - Revert "(namespace) quota: Handle quota data stored in s_user_ns in quota_setxquota" - Revert "(namespace) quota: Ensure qids map to the filesystem" - Revert "(namespace) Revert "UBUNTU: SAUCE: quota: Convert ids relative to s_user_ns"" - Revert "(namespace) Revert "UBUNTU: SAUCE: quota: Require that qids passed to dqget() be valid and map into s_user_ns"" - Revert "(namespace) vfs: Don't create inodes with a uid or gid unknown to the vfs" - Revert "(namespace) vfs: Don't modify inodes with a uid or gid unknown to the vfs" - Revert "UBUNTU: SAUCE: (namespace) fuse: Translate ids in posix acl xattrs" - Revert "UBUNTU: SAUCE: (namespace) posix_acl: Export posix_acl_fix_xattr_userns() to modules" - Revert "(namespace) vfs: Verify acls are valid within 
superblock's s_user_ns." - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Update posix_acl support to handle user namespace mounts"" - Revert "(namespace) fs: Refuse uid/gid changes which don't map into s_user_ns" - Revert "(namespace) Revert "UBUNTU: SAUCE: fs: Refuse uid/gid changes which don't map into s_user_ns"" - Revert "(namespace) mnt: Move the FS_USERNS_MOUNT check into sget_userns" linux (4.4.0-49.70) xenial; urgency=low [ Luis Henriques ] * Release Tracking Bug - LP: #1640921 * Infiniband driver (kernel module) needed for Azure (LP: #1641139) - SAUCE: RDMA Infiniband for Windows Azure - [Config] CONFIG_HYPERV_INFINIBAND_ND=m - SAUCE: Makefile RDMA infiniband driver for Windows Azure - [Config] Add hv_network_direct.ko to generic inclusion list - SAUCE: RDMA Infiniband for Windows Azure is dependent on amd64 linux (4.4.0-48.69) xenial; urgency=low [ Luis Henriques ] * Release Tracking Bug - LP: #1640758 * lxc-attach to malicious container allows access to host (LP: #1639345) - Revert "UBUNTU: SAUCE: (noup) ptrace: being capable wrt a process requires mapped uids/gids" - (upstream) mm: Add a user_ns owner to mm_struct and fix ptrace permission checks * take 'P' command from upstream xmon (LP: #1637978) - powerpc/xmon: Add xmon command to dump process/task similar to ps(1) * zfs: importing zpool with vdev on zvol hangs kernel (LP: #1636517) - SAUCE: (noup) Update zfs to 0.6.5.6-0ubuntu15 * I2C touchpad does not work on AMD platform (LP: #1612006) - pinctrl/amd: Configure GPIO register using BIOS settings - pinctrl/amd: switch to using a bool for level * [LTCTest] vfio_pci not loaded on Ubuntu 16.10 by default (LP: #1636733) - [Config] CONFIG_VFIO_PCI=y for ppc64el * QEMU throws failure msg while booting guest with SRIOV VF (LP: #1630554) - KVM: PPC: Always select KVM_VFIO, plus Makefile cleanup * Allow fuse user namespace mounts by default in xenial (LP: #1634964) - (namespace) mnt: Move the FS_USERNS_MOUNT check into sget_userns - (namespace) Revert 
"UBUNTU: SAUCE: fs: Refuse uid/gid changes which don't map into s_user_ns" - (namespace) fs: Refuse uid/gid changes which don't map into s_user_ns - (namespace) Revert "UBUNTU: SAUCE: fs: Update posix_acl support to handle user namespace mounts" - (namespace) vfs: Verify acls are valid within superblock's s_user_ns. - SAUCE: (namespace) posix_acl: Export posix_acl_fix_xattr_userns() to modules - SAUCE: (namespace) fuse: Translate ids in posix acl xattrs - (namespace) vfs: Don't modify inodes with a uid or gid unknown to the vfs - (namespace) vfs: Don't create inodes with a uid or gid unknown to the vfs - (namespace) Revert "UBUNTU: SAUCE: quota: Require that qids passed to dqget() be valid and map into s_user_ns" - (namespace) Revert "UBUNTU: SAUCE: quota: Convert ids relative to s_user_ns" - (namespace) quota: Ensure qids map to the filesystem - (namespace) quota: Handle quota data stored in s_user_ns in quota_setxquota - (namespace) dquot: For now explicitly don't support filesystems outside of init_user_ns - (namespace) Revert "UBUNTU: SAUCE: ima/evm: Allow root in s_user_ns to set xattrs" - SAUCE: (namespace) security/integrity: Harden against malformed xattrs - (namespace) Revert "UBUNTU: SAUCE: fs: Allow superblock owner to change ownership of inodes with unmappable ids" - SAUCE: (namespace) fs: Allow superblock owner to change ownership of inodes - (namespace) Revert "UBUNTU: SAUCE: fs: Don't remove suid for CAP_FSETID in s_user_ns" - SAUCE: (namespace) fs: Don't remove suid for CAP_FSETID for userns root - SAUCE: (namespace) fuse: Allow user namespace mounts by default * [Feature] KBL - New device ID for Kabypoint(KbP) (LP: #1591618) - SAUCE: mfd: lpss: Fix Intel Kaby Lake PCH-H properties * hio: SSD data corruption under stress test (LP: #1638700) - SAUCE: hio: set bi_error field to signal an I/O error on a BIO - SAUCE: hio: splitting bio in the entry of .make_request_fn * Module sha1-mb fails to load (LP: #1637165) - crypto: sha-mb - Fix load failure - 
crypto: mcryptd - Fix load failure * please include mlx5_core modules in linux-image-generic package (LP: #1635223) - [Config] Include mlx5 in main package * xgene i2c slimpro driver fails to load (LP: #1625232) - mailbox: Add support for APM X-Gene platform mailbox driver - mailbox/xgene-slimpro: Checking for IS_ERR instead of NULL - mailbox: xgene-slimpro: Fix wrong test for devm_kzalloc - [Config] Enabled XGENE_SLIMPRO_MBOX as a module * [Dell][XPS]Touchscreen fails to function after resume from s3 by Lid close/open (LP: #1632527) - gpio/pinctrl: sunxi: stop poking around in private vars - pinctrl: intel: Only restore pins that are used by the driver * Xenial update to v4.4.30 stable release (LP: #1638272) - Revert "x86/mm: Expand the exception table logic to allow new handling options" - Revert "fix minor infoleak in get_user_ex()" - Linux 4.4.30 * Xenial update to v4.4.29 stable release (LP: #1638267) - drm/prime: Pass the right module owner through to dma_buf_export() - drm/amdgpu: fix IB alignment for UVD - drm/amdgpu/dce10: disable hpd on local panels - drm/amdgpu/dce8: disable hpd on local panels - drm/amdgpu/dce11: disable hpd on local panels - drm/amdgpu/dce11: add missing drm_mode_config_cleanup call - drm/amdgpu: change vblank_time's calculation method to reduce computational error. - drm/radeon: narrow asic_init for virtualization - drm/radeon/si/dpm: fix phase shedding setup - drm/radeon: change vblank_time's calculation method to reduce computational error. - drm/vmwgfx: Limit the user-space command buffer size - drm/i915/gen9: fix the WaWmMemoryReadLatency implementation - Revert "drm/i915: Check live status before reading edid" - drm/i915: Account for TSEG size when determining 865G stolen base - drm/i915: Unalias obj->phys_handle and obj->userptr - mm/hugetlb: fix memory offline with hugepage size > memory block size - posix_acl: Clear SGID bit when setting file permissions - ipip: Properly mark ipip GRO packets as encapsulated. 
- powerpc/eeh: Null check uses of eeh_pe_bus_get - perf stat: Fix interval output values - genirq/generic_chip: Add irq_unmap callback - uio: fix dmem_region_start computation - ARM: clk-imx35: fix name for ckil clk - spi: spi-fsl-dspi: Drop extra spi_master_put in device remove function - mwifiex: correct aid value during tdls setup - crypto: gcm - Fix IV buffer size in crypto_gcm_setkey - crypto: arm/ghash-ce - add missing async import/export - hwrng: omap - Only fail if pm_runtime_get_sync returns < 0 - ASoC: topology: Fix error return code in soc_tplg_dapm_widget_create() - ASoC: dapm: Fix possible uninitialized variable in snd_soc_dapm_get_volsw() - ASoC: dapm: Fix value setting for _ENUM_DOUBLE MUX's second channel - ASoC: dapm: Fix kcontrol creation for output driver widget - staging: r8188eu: Fix scheduling while atomic splat - power: bq24257: Fix use of uninitialized pointer bq->charger - dmaengine: ipu: remove bogus NO_IRQ reference - x86/mm: Expand the exception table logic to allow new handling options - s390/cio: fix accidental interrupt enabling during resume - s390/con3270: fix use of uninitialised data - s390/con3270: fix insufficient space padding - clk: qoriq: fix a register offset error - clk: divider: Fix clk_divider_round_rate() to use clk_readl() - perf hists browser: Fix event group display - perf symbols: Check symbol_conf.allow_aliases for kallsyms loading too - perf symbols: Fixup symbol sizes before picking best ones - mpt3sas: Don't spam logs if logging level is 0 - powerpc/nvram: Fix an incorrect partition merge - ARM: pxa: pxa_cplds: fix interrupt handling - Linux 4.4.29 * KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (LP: #1632045) - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA * Xenial update to v4.4.28 stable release (LP: #1637510) - gpio: mpc8xxx: Correct irq handler function - mei: me: add kaby point device ids - regulator: tps65910: Work around silicon erratum SWCZ010 - clk: imx6: initialize GPU clocks - PM / 
devfreq: event: remove duplicate devfreq_event_get_drvdata() - rtlwifi: Fix missing country code for Great Britain - mmc: block: don't use CMD23 with very old MMC cards - mmc: sdhci: cast unsigned int to unsigned long long to avoid unexpeted error - PCI: Mark Atheros AR9580 to avoid bus reset - platform: don't return 0 from platform_get_irq[_byname]() on error - cpufreq: intel_pstate: Fix unsafe HWP MSR access - parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels - parisc: Fix kernel memory layout regarding position of __gp - parisc: Increase initial kernel mapping size - pstore/ramoops: fixup driver removal - pstore/core: drop cmpxchg based updates - pstore/ram: Use memcpy_toio instead of memcpy - pstore/ram: Use memcpy_fromio() to save old buffer - perf intel-pt: Fix snapshot overlap detection decoder errors - perf intel-pt: Fix estimated timestamps for cycle-accurate mode - perf intel-pt: Fix MTC timestamp calculation for large MTC periods - dm: mark request_queue dead before destroying the DM device - dm: return correct error code in dm_resume()'s retry loop - dm mpath: check if path's request_queue is dying in activate_path() - dm crypt: fix crash on exit - powerpc/vdso64: Use double word compare on pointers - powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear() - powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag() - powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data() - ubi: Deal with interrupted erasures in WL - zfcp: fix fc_host port_type with NPIV - zfcp: fix ELS/GS request&response length for hardware data router - zfcp: close window with unblocked rport during rport gone - zfcp: retain trace level for SCSI and HBA FSF response records - zfcp: restore: Dont use 0 to indicate invalid LUN in rec trace - zfcp: trace on request for open and close of WKA port - zfcp: restore tracing of handle for port and LUN with HBA records - zfcp: fix D_ID field with actual value on tracing SAN 
responses - zfcp: fix payload trace length for SAN request&response - zfcp: trace full payload of all SAN records (req,resp,iels) - scsi: zfcp: spin_lock_irqsave() is not nestable - fbdev/efifb: Fix 16 color palette entry calculation - ovl: Fix info leak in ovl_lookup_temp() - ovl: copy_up_xattr(): use strnlen - mb86a20s: fix the locking logic - mb86a20s: fix demod settings - cx231xx: don't return error on success - cx231xx: fix GPIOs for Pixelview SBTVD hybrid - ALSA: hda - Fix a failure of micmute led when having multi adcs - MIPS: Fix -mabi=64 build of vdso.lds - MIPS: ptrace: Fix regs_return_value for kernel context - lib: move strtobool() to kstrtobool() - lib: update single-char callers of strtobool() - lib: add "on"/"off" support to kstrtobool - Input: i8042 - skip selftest on ASUS laptops - Input: elantech - force needed quirks on Fujitsu H760 - Input: elantech - add Fujitsu Lifebook E556 to force crc_enabled - sunrpc: fix write space race causing stalls - NFSv4: Don't report revoked delegations as valid in nfs_have_delegation() - NFSv4: nfs4_copy_delegation_stateid() must fail if the delegation is invalid - NFSv4: Open state recovery must account for file permission changes - NFSv4.2: Fix a reference leak in nfs42_proc_layoutstats_generic - scsi: Fix use-after-free - metag: Only define atomic_dec_if_positive conditionally - mm: filemap: don't plant shadow entries without radix tree node - ipc/sem.c: fix complex_count vs. 
simple op race - arc: don't leak bits of kernel stack into coredump - fs/super.c: fix race between freeze_super() and thaw_super() - cifs: Limit the overall credit acquired - fs/cifs: keep guid when assigning fid to fileinfo - Clarify locking of cifs file and tcon structures and make more granular - Display number of credits available - Set previous session id correctly on SMB3 reconnect - SMB3: GUIDs should be constructed as random but valid uuids - Do not send SMB3 SET_INFO request if nothing is changing - Cleanup missing frees on some ioctls - blkcg: Unlock blkcg_pol_mutex only once when cpd == NULL - x86/e820: Don't merge consecutive E820_PRAM ranges - kvm: x86: memset whole irq_eoi - irqchip/gicv3: Handle loop timeout proper - sd: Fix rw_max for devices that report an optimal xfer size - hpsa: correct skipping masked peripherals - PKCS#7: Don't require SpcSpOpusInfo in Authenticode pkcs7 signatures - bnx2x: Prevent false warning for lack of FC NPIV - net/mlx4_core: Allow resetting VF admin mac to zero - acpi, nfit: check for the correct event code in notifications - mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page() - mm: filemap: fix mapping->nrpages double accounting in fuse - Using BUG_ON() as an assert() is _never_ acceptable - s390/mm: fix gmap tlb flush issues - irqchip/gic-v3-its: Fix entry size mask for GITS_BASER - isofs: Do not return EACCES for unknown filesystems - memstick: rtsx_usb_ms: Runtime resume the device when polling for cards - memstick: rtsx_usb_ms: Manage runtime PM when accessing the device - arm64: percpu: rewrite ll/sc loops in assembly - arm64: kernel: Init MDCR_EL2 even in the absence of a PMU - ceph: fix error handling in ceph_read_iter - powerpc/mm: Prevent unlikely crash in copro_calculate_slb() - mmc: core: Annotate cmd_hdr as __le32 - mmc: rtsx_usb_sdmmc: Avoid keeping the device runtime resumed when unused - mmc: rtsx_usb_sdmmc: Handle runtime PM while changing the led - ext4: do not 
advertise encryption support when disabled - jbd2: fix incorrect unlock on j_list_lock - ubifs: Fix xattr_names length in exit paths - target: Re-add missing SCF_ACK_KREF assignment in v4.1.y - target: Make EXTENDED_COPY 0xe4 failure return COPY TARGET DEVICE NOT REACHABLE - target: Don't override EXTENDED_COPY xcopy_pt_cmd SCSI status code - Linux 4.4.28 * Xenial update to v4.4.27 stable release (LP: #1637501) - serial: 8250_dw: Check the data->pclk when get apb_pclk - btrfs: assign error values to the correct bio structs - drivers: base: dma-mapping: page align the size when unmap_kernel_range - fuse: listxattr: verify xattr list - fuse: invalidate dir dentry after chmod - fuse: fix killing s[ug]id in setattr - i40e: avoid NULL pointer dereference and recursive errors on early PCI error - brcmfmac: fix memory leak in brcmf_fill_bss_param - ASoC: Intel: Atom: add a missing star in a memcpy call - reiserfs: Unlock superblock before calling reiserfs_quota_on_mount() - reiserfs: switch to generic_{get,set,remove}xattr() - async_pq_val: fix DMA memory leak - scsi: arcmsr: Simplify user_len checking - ext4: enforce online defrag restriction for encrypted files - ext4: reinforce check of i_dtime when clearing high fields of uid and gid - ext4: fix memory leak in ext4_insert_range() - ext4: allow DAX writeback for hole punch - ext4: release bh in make_indexed_dir - dlm: free workqueues after the connections - vfs: move permission checking into notify_change() for utimes(NULL) - cfq: fix starvation of asynchronous writes - Linux 4.4.27 * Xenial update to v4.4.26 stable release (LP: #1637500) - x86/build: Build compressed x86 kernels as PIE - Linux 4.4.26 * ISST-LTE:pVM nvme 0000:a0:00.0: iommu_alloc failed on NVMe card (LP: #1633128) - dma-mapping: introduce the DMA_ATTR_NO_WARN attribute - powerpc: implement the DMA_ATTR_NO_WARN attribute - nvme: use the DMA_ATTR_NO_WARN attribute * CVE-2016-8658 - brcmfmac: avoid potential stack overflow in brcmf_cfg80211_start_ap() * 
Hotkey doesn't work on HP x360 (LP: #1620979) - gpiolib: Make it possible to exclude GPIOs from IRQ domain - pinctrl: cherryview: Do not mask all interrupts in probe - pinctrl: cherryview: Do not add all southwest and north GPIOs to IRQ domain * Bad page state in process genwqe_gunzip pfn:3c275 in the genwqe device driver (LP: #1559194) - SAUCE: (noup) Bad page state in process genwqe_gunzip pfn:3c275 in the genwqe device driver * CVE-2016-7425 - scsi: arcmsr: Buffer overflow in arcmsr_iop_message_xfer() * Add ipvlan module to 16.04 kernel (LP: #1634705) - [Config] Add ipvlan to the generic inclusion list * kernel generates ACPI Exception: AE_NOT_FOUND, Evaluating _DOD incorrectly (LP: #1634607) - ACPI / video: skip evaluating _DOD when it does not exist * BT still shows off after resume by wireless hotkey (LP: #1634380) - Bluetooth: btusb: Fix atheros firmware download error * ghash-clmulni-intel module fails to load (LP: #1633058) - crypto: ghash-clmulni - Fix load failure - crypto: cryptd - Assign statesize properly * Xenial update to v4.4.25 stable release (LP: #1634153) - timekeeping: Fix __ktime_get_fast_ns() regression - ALSA: ali5451: Fix out-of-bound position reporting - ALSA: usb-audio: Extend DragonFly dB scale quirk to cover other variants - ALSA: usb-line6: use the same declaration as definition in header for MIDI manufacturer ID - mfd: rtsx_usb: Avoid setting ucr->current_sg.status - mfd: atmel-hlcdc: Do not sleep in atomic context - mfd: 88pm80x: Double shifting bug in suspend/resume - mfd: wm8350-i2c: Make sure the i2c regmap functions are compiled - KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register - KVM: MIPS: Drop other CPU ASIDs on guest MMU changes - KVM: PPC: BookE: Fix a sanity check - x86/boot: Fix kdump, cleanup aborted E820_PRAM max_pfn manipulation - x86/irq: Prevent force migration of irqs which are not in the vector domain - x86/dumpstack: Fix x86_32 kernel_stack_pointer() previous stack access - ARM: dts: mvebu: 
armada-390: add missing compatibility string and bracket - ARM: dts: MSM8064 remove flags from SPMI/MPP IRQs - ARM: cpuidle: Fix error return code - ima: use file_dentry() - tpm: fix a race condition in tpm2_unseal_trusted() - tpm_crb: fix crb_req_canceled behavior - Linux 4.4.25 * backport fwts UEFI test driver to Xenial (LP: #1633506) - efi: Add efi_test driver for exporting UEFI runtime service interfaces - [Config] CONFIG_EFI_TEST=m * Fix alps driver for multitouch function. (LP: #1633321) - HID: alps: fix multitouch cursor issue * xgene merlin crashes when running as iperf server (LP: #1632739) - drivers: net: xgene: optimizing the code - xgene: get_phy_device() doesn't return NULL anymore - drivers: net: xgene: Get channel number from device binding - drivers: net: xgene: constify xgene_cle_ops structure - drivers: net: xgene: Fix error handling - drivers: net: xgene: fix IPv4 forward crash - drivers: net: xgene: fix sharing of irqs - drivers: net: xgene: fix ununiform latency across queues - drivers: net: xgene: fix statistics counters race condition - drivers: net: xgene: fix register offset - drivers: net: xgene: Separate set_speed from mac_init - drivers: net: xgene: Fix module unload crash - hw resource cleanup - drivers: net: xgene: Fix module unload crash - change sw sequence - drivers: net: xgene: Fix module unload crash - clkrst sequence - drivers: net: phy: xgene: Add MDIO driver - drivers: net: xgene: Add backward compatibility - drivers: net: xgene: Enable MDIO driver - drivers: net: xgene: Use exported functions - drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset - dtb: xgene: Add MDIO node - MAINTAINERS: xgene: Add driver and documentation path - [Config] Enable MDIO_XGENE as a modules * Add support for KabeLake i219-LOM chips (LP: #1632578) - e1000e: Initial support for KabeLake -- Luis Henriques <luis.henriq...@canonical.com> Thu, 24 Nov 2016 17:56:21 +0000 ** Changed in: linux (Ubuntu Xenial) Status: Fix Committed => Fix Released ** 
CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2016-7425

** CVE added: http://www.cve.mitre.org/cgi-bin/cvename.cgi?name=2016-8658

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1632045

Title:
KVM: PPC: Book3S HV: Migrate pinned pages out of CMA

Status in linux package in Ubuntu: Fix Released
Status in linux source package in Xenial: Fix Released
Status in linux source package in Yakkety: Fix Released
Status in linux source package in Zesty: Fix Released

Bug description:

---Problem Description---
https://github.com/open-power/supermicro-openpower/issues/59

SW/HW Configuration

PNOR image version: 5/3/2016
BMC image version: 0.25
CPLD Version: B2.81.01
Host OS version: Ubuntu 16.04 LTS
UbuntuKVM Guest OS version: Ubuntu 14.04.4 LTS
HTX version: 394
Processor: 00UL865 * 2
Memory: SK hynix 16GB 2Rx4 PC4-2133P * 16

Summary of Issue

Two UbuntuKVM guests are each configured with 8 processors, 64 GB of memory, 1 disk of 128 GB, 1 network interface, and 1 GPU (pass-through'd from the Host OS's K80).

The two guests are each put into a Create/Destroy loop, with HTX running on each of the guests (NOT HOST) in between its creation and destruction. The mdt.bu profile is used, and the processors, memory, and the GPU are put under load. The HTX session lasts 9 minutes.

While this is running, the amount of available memory (free memory) in the Host OS slowly decreases, and this can continue until the point where there is no more free memory for the Host OS to do anything, including creating the two VM guests. It seems that after every cycle, a small portion of the memory that was allocated to the VM guest does not get released back to the Host OS, and eventually this can and will add up to take up all the available memory in the Host OS.
At some point, the VM guest(s) might get disconnected and will display the following error:

error: Disconnected from qemu:///system due to I/O error
error: One or more references were leaked after disconnect from the hypervisor

Then, when the Host OS tries to start the VM guest again, the following error shows up:

error: Failed to create domain from guest2_trusty.xml
error: internal error: early end of file from monitor, possible problem:
Unexpected error in spapr_alloc_htab() at /build/qemu-c3ZrbA/qemu-2.5+dfsg/hw/ppc/spapr.c:1030:
2016-05-23T16:18:16.871549Z qemu-system-ppc64: Failed to allocate HTAB of requested size, try with smaller maxmem

The Host OS syslog also contains quite a few errors. To list just a few:

May 13 20:27:44 191-136 kernel: [36827.151228] alloc_contig_range: [3fb800, 3fd8f8) PFNs busy
May 13 20:27:44 191-136 kernel: [36827.151291] alloc_contig_range: [3fb800, 3fd8fc) PFNs busy
May 13 20:27:44 191-136 libvirtd[19263]: *** Error in `/usr/sbin/libvirtd': realloc(): invalid next size: 0x000001000a780400 ***
May 13 20:27:44 191-136 libvirtd[19263]: ======= Backtrace: =========
May 13 20:27:44 191-136 libvirtd[19263]: /lib/powerpc64le-linux-gnu/libc.so.6(+0x8720c)[0x3fffaf6a720c]
May 13 20:27:44 191-136 libvirtd[19263]: /lib/powerpc64le-linux-gnu/libc.so.6(+0x96f70)[0x3fffaf6b6f70]
May 13 20:27:44 191-136 libvirtd[19263]: /lib/powerpc64le-linux-gnu/libc.so.6(realloc+0x16c)[0x3fffaf6b87fc]
May 13 20:27:44 191-136 libvirtd[19263]: /usr/lib/powerpc64le-linux-gnu/libvirt.so.0(virReallocN+0x68)[0x3fffaf90ccc8]
May 13 20:27:44 191-136 libvirtd[19263]: /usr/lib/libvirt/connection-driver/libvirt_driver_qemu.so(+0x8ef6c)[0x3fff9346ef6c]
May 13 20:27:44 191-136 libvirtd[19263]: /usr/lib/libvirt/connection-driver/libvirt_driver_qemu.so(+0xa826c)[0x3fff9348826c]
May 13 20:27:44 191-136 libvirtd[19263]: /usr/lib/powerpc64le-linux-gnu/libvirt.so.0(virEventPollRunOnce+0x8b4)[0x3fffaf9332b4]
May 13 20:27:44 191-136 libvirtd[19263]: /usr/lib/powerpc64le-linux-gnu/libvirt.so.0(virEventRunDefaultImpl+0x54)[0x3fffaf931334]
May 13 20:27:44 191-136 libvirtd[19263]: /usr/lib/powerpc64le-linux-gnu/libvirt.so.0(virNetDaemonRun+0x1f0)[0x3fffafad2f70]
May 13 20:27:44 191-136 libvirtd[19263]: /usr/sbin/libvirtd(+0x15d74)[0x52e45d74]
May 13 20:27:44 191-136 libvirtd[19263]: /lib/powerpc64le-linux-gnu/libc.so.6(+0x2319c)[0x3fffaf64319c]
May 13 20:27:44 191-136 libvirtd[19263]: /lib/powerpc64le-linux-gnu/libc.so.6(__libc_start_main+0xb8)[0x3fffaf6433b8]
May 13 20:27:44 191-136 libvirtd[19263]: ======= Memory map: ========
May 13 20:27:44 191-136 libvirtd[19263]: 52e30000-52eb0000 r-xp 00000000 08:02 65540510 /usr/sbin/libvirtd
May 13 20:27:44 191-136 libvirtd[19263]: 52ec0000-52ed0000 r--p 00080000 08:02 65540510 /usr/sbin/libvirtd
May 13 20:27:44 191-136 libvirtd[19263]: 52ed0000-52ee0000 rw-p 00090000 08:02 65540510 /usr/sbin/libvirtd
May 13 20:27:44 191-136 libvirtd[19263]: 1000a730000-1000a830000 rw-p 00000000 00:00 0 [heap]
May 13 20:27:44 191-136 libvirtd[19263]: 3fff60000000-3fff60030000 rw-p 00000000 00:00 0
May 13 20:27:44 191-136 libvirtd[19263]: 3fff60030000-3fff64000000 ---p 00000000 00:00 0
May 13 20:50:33 191-136 kernel: [38196.502926] audit: type=1400 audit(1463197833.497:4025): apparmor="DENIED" operation="open" profile="libvirt-d3ade785-c1c1-4519-b123-9d28704c2ad4" name="/sys/devices/pci0003:00/0003:00:00.0/0003:01:00.0/0003:02:08.0/0003:03:00.0/devspec" pid=24887 comm="qemu-system-ppc" requested_mask="r" denied_mask="r" fsuid=110 ouid=0
May 13 20:50:33 191-136 virtlogd[3727]: End of file while reading data: Input/output error

Notes

The Host OS's free memory will also slowly decrease when HTX is NOT executed at all on the guests between guest Create/Destroy, but at a much slower pace. VM guests can also still fail to be created, with the same error message, even though the Host OS might still have plenty of free memory left:

error: Failed to create domain from guest2_trusty.xml
error:
internal error: early end of file from monitor, possible problem:
Unexpected error in spapr_alloc_htab() at /build/qemu-c3ZrbA/qemu-2.5+dfsg/hw/ppc/spapr.c:1030:
2016-05-23T16:18:16.871549Z qemu-system-ppc64: Failed to allocate HTAB of requested size, try with smaller maxmem

However, this has happened only once so far, after about 3924 Create/Destroy cycles had completed. The other guest that was running the same test concurrently did NOT have any issues and went on to 4,600+ cycles.

---uname output---
Host OS version: Ubuntu 16.04 LTS
UbuntuKVM Guest OS version: Ubuntu 14.04.4 LTS

Machine Type = SMC

I do not see any actual evidence of all memory being used. Here is what I see:

1. "Failed to allocate HTAB" - this happens because we run out of _contiguous_ chunks of CMA memory, not just any RAM.
2. libvirtd[19263]: *** Error in `/usr/sbin/libvirtd': realloc(): invalid next size: 0x000001000a780400 *** - this looks more like memory corruption than insufficient memory.

I suggest collecting statistics using something like this shell script:

#!/bin/sh
while true
do
    <here you put guest start/stop>
    # append MemFree and CmaFree as one tab-separated line per cycle
    grep -e "\(CmaFree:\|MemFree:\)" /proc/meminfo | paste -d "\t" - - >> mymemorylog
done

and attaching the resulting mymemorylog to this bug.

It would also be interesting to know whether the issue can be reproduced without the NVIDIA driver loaded in the guest, or even without passing the NVIDIA GPU to the guest.

Meanwhile I am running my own tests to see if I can reproduce this behavior.

Ok, located the problem, will post a patch tomorrow to the public lists. Basically, when QEMU dies, its DMA pages are unpinned only when its memory context is destroyed. That was expected to happen when the QEMU process exits, but it may actually happen a lot later: if some kernel thread was executed on this same context and referenced it, the very last memory context release will not happen until that thread is scheduled again.
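To make a mymemorylog file from the shell loop above easier to read, the samples can be reduced to a total and a per-cycle change. The following is only a rough sketch: it assumes each log line holds the MemFree and CmaFree fields exactly as /proc/meminfo prints them, joined by a tab, and that one line is written per Create/Destroy cycle. The names parse_log and summarize and the sample numbers are made up for illustration and are not taken from the bug.

```python
import re

# Matches one "Name:   12345 kB" field as printed in /proc/meminfo.
FIELD = re.compile(r"(\w+):\s+(\d+)\s+kB")


def parse_log(text):
    """Return a list of {field: kB} dicts, one per non-empty log line."""
    samples = []
    for line in text.splitlines():
        fields = {name: int(kb) for name, kb in FIELD.findall(line)}
        if fields:
            samples.append(fields)
    return samples


def summarize(samples):
    """Return the total and average per-cycle change (kB) for each field."""
    first, last = samples[0], samples[-1]
    cycles = max(len(samples) - 1, 1)
    return {
        name: {
            "total_kb": last[name] - first[name],
            "per_cycle_kb": (last[name] - first[name]) / cycles,
        }
        for name in first
        if name in last
    }


# Made-up sample data in the format the shell loop is assumed to produce.
SAMPLE = (
    "MemFree:        50000000 kB\tCmaFree:         3000000 kB\n"
    "MemFree:        49990000 kB\tCmaFree:         2900000 kB\n"
    "MemFree:        49980000 kB\tCmaFree:         2800000 kB\n"
)

if __name__ == "__main__":
    for name, stats in summarize(parse_log(SAMPLE)).items():
        print(f"{name}: {stats['total_kb']} kB total, "
              f"{stats['per_cycle_kb']:.0f} kB per cycle")
```

A steadily negative CmaFree change with a roughly flat MemFree would point at CMA exhaustion rather than a general memory leak, which is the distinction drawn in the comment above.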
== Comment: #15 - Leonardo Augusto Guimaraes Garcia <lagar...@br.ibm.com> - 2016-08-24 08:15:00 ==

(In reply to comment #14)
> On my host, I have 10 guests running. Sum of all 10 guests memory will come
> up to 69GB.

Ok... So, this is quite different from what is in the bug description. In the bug description, I read:

"Two UbuntuKVM guests are each configured with 8 processors, 64 GB of memory, 1 disk of 128 GB, 1 network interface, and 1 GPU (pass-through'd from the Host OS's K80).

The two guests are each put into a Create/Destroy loop, with HTX running on each of the guests (NOT HOST) in between its creation and destruction. The mdt.bu profile is used, and the processors, memory, and the GPU are put under load. The HTX session lasts 9 minutes."

What is the scenario being worked on in this bug? I suggest you open a new bug for your issue if needed, and we continue to investigate the original issue here.

>
> I am trying to bring up 11th guest which is having 5Gb memory and it fails:
>
> root@lotkvm:~# virsh start --console lotg12
> error: Failed to start domain lotg12
> error: internal error: process exited while connecting to monitor:
> 5076802818bda30000000000003f2,format=raw,if=none,id=drive-virtio-disk0
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,
> id=virtio-disk0,bootindex=1 -drive
> file=/dev/disk/by-id/wwn-0x6005076802818bda30000000000003f4,format=raw,
> if=none,id=drive-virtio-disk1 -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,
> id=virtio-disk1 -netdev tap,fd=41,id=hostnet0 -device
> virtio-net,netdev=hostnet0,id=net0,mac=52:54:00:9b:53:77,bus=pci.0,addr=0x1,
> bootindex=2 -chardev pty,id=charserial0 -device
> spapr-vty,chardev=charserial0,reg=0x30000000 -device
> virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x2 -msg timestamp=on
> 2016-08-24T12:00:50.375315Z qemu-system-ppc64: Failed to allocate KVM HPT of
> order 26 (try smaller maxmem?): Cannot allocate memory

This is not because you don't have available memory. This is because you don't have CMA memory available. Please take a look at LTC bug 145072 comment 5 and subsequent comments.

>
>
> I waited for an hour and retried guest start.. It fails still..
>
> Current memory on host :
> -----------
> root@lotkvm:~# free -g
>               total        used        free      shared  buff/cache   available
> Mem:            127          73           0           0          53          53
> Swap:            11           4           6

I think there are actually two separate problems here.

(A) Pages in the CMA zone are getting pinned and causing fragmentation of the CMA zone, leading to the messages saying "qemu-system-ppc64: Failed to allocate HTAB of requested size, try with smaller maxmem". This happens because the guest is doing PCI passthrough with DDW enabled and hence pins all its memory. If guest pages happen to be allocated in the CMA zone, they get pinned there and then can't be moved for a future HPT allocation. Balbir was looking at the possibility of moving the pages out of the CMA zone before pinning them, but this work was dependent on some upstream refactoring which seems to be stalled.

(B) On VM destruction, the pages are not getting unpinned and freed in a timely fashion. Alexey debugged this issue and has posted two patches to fix the problem: "powerpc/iommu: Stop using @current in mm_iommu_xxx" and "powerpc/mm/iommu: Put pages on process exit". These patches touch two maintainers' areas (powerpc and vfio) and hence need two maintainers' concurrence, and thus haven't gone anywhere yet.

(Of course, issue (B) exacerbates issue (A).)

After moving the host and guests to the 4.8 kernel, almost the whole memory is still getting used on the host.

Are there any updates here, or any patches that we can expect soon? Please let us know.

Thanks,
Manju

4.8 does not yet have the fix for the pinned page migrations. I am not sure of the status of https://patchwork.kernel.org/patch/9238861/ upstream. I checked to see if I could find it in any git tree, but could not. I suspect we need this fix in first.
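Problem (A) can be illustrated with a toy model of why a handful of pinned pages is enough to make the HPT allocation fail while almost all of the CMA zone is free. This is only a sketch under stated assumptions, not the kernel's actual allocator: it assumes the "order 26" in the QEMU message is the base-2 logarithm of the HPT size in bytes (64 MiB), a 64 KiB page size, and a hypothetical 256 MiB CMA region. It treats unpinned pages as movable and ignores alignment requirements. The function names are invented for illustration.

```python
def largest_free_run(pinned, total_pages):
    """Length of the longest run of pages that contains no pinned page."""
    best = 0
    prev = -1  # index of the previous pinned page (-1 before the region start)
    for page in sorted(pinned) + [total_pages]:
        best = max(best, page - prev - 1)
        prev = page
    return best


def can_allocate(order, page_size, pinned, total_pages):
    """True if 2**order bytes fit into one pin-free contiguous run."""
    pages_needed = (1 << order) // page_size
    return largest_free_run(pinned, total_pages) >= pages_needed


PAGE_SIZE = 64 * 1024   # assumption: 64 KiB pages
CMA_PAGES = 4096        # assumption: a 256 MiB CMA region
HPT_ORDER = 26          # assumed to mean 2**26 bytes = 64 MiB = 1024 pages

if __name__ == "__main__":
    # With no pinned pages the 1024-page request fits easily.
    print(can_allocate(HPT_ORDER, PAGE_SIZE, set(), CMA_PAGES))
    # Four pinned pages, spread out, leave no run of 1024 pages.
    scattered = {500, 1500, 2500, 3500}
    print(can_allocate(HPT_ORDER, PAGE_SIZE, scattered, CMA_PAGES))
```

In this model four pinned pages out of 4096 leave the region more than 99.9% free, yet the longest pin-free run is 999 pages, just short of the 1024 needed. That is why "free" memory as reported by free -g says little about whether a new guest's HPT can be allocated.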
> Balbir - Is this fixed in the latest 4.8 kernel out today?

My patch is in powerpc-next
https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/commit/?h=next&id=2e5bbb5461f138cac631fe21b4ad956feabfba22

Should hit 4.9 and we can backport it. I am also trying to work on improvements to the patch for the future. Not sure of aik's patch status.

Balbir Singh.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1632045/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp