[Group.of.nepali.translators] [Bug 1744300] Re: bt_iter() crash due to NULL pointer

2018-01-19 Thread Guilherme G. Piccoli
** No longer affects: linux (Ubuntu Bionic)

** Changed in: linux (Ubuntu Xenial)
   Status: New => Fix Released

** Changed in: linux (Ubuntu Artful)
   Status: New => In Progress

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1744300

Title:
  bt_iter() crash due to NULL pointer

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Artful:
  In Progress

Bug description:
  SRU Justification:

  
  [Impact]
  The following crash was observed on Ubuntu 16.04 running the linux-gcp
  kernel version 4.13 (specifically 4.13.0-1006.9):

  [ 10.972644] BUG: unable to handle kernel NULL pointer dereference at 
0030 
  [ 10.980708] IP: bt_iter+0x31/0x50 
  [ 10.984310] PGD 0 
  [ 10.984310] P4D 0 
  [ 10.986439] 
  [ 10.990190] Oops:  [#1] SMP PTI 
  [ 11.016282] Workqueue: kblockd blk_mq_timeout_work 
  [ 11.021196] task: 8e7c2e70 task.stack: b8d4c67a8000 
  [ 11.027234] RIP: 0010:bt_iter+0x31/0x50 
  [ 11.031187] RSP: 0018:b8d4c67abda0 EFLAGS: 00010206 
  [ 11.037730] RAX: b8d4c67abdd0 RBX: 0180 RCX: 
 
  [ 11.045172] RDX: 8e7c34c8d280 RSI:  RDI: 
8e7c32dd8000 
  [ 11.053321] RBP: b8d4c67abe20 R08:  R09: 
2100 
  [ 11.060582] R10: 0130 R11: fffee5bf R12: 
8e7c3572c790 
  [ 11.068094] R13: 8e7c3572c780 R14: 0008 R15: 
8e7c35e7c180 
  [ 11.075522] FS: () GS:8e7c3a4c() 
knlGS: 
  [ 11.083721] CS: 0010 DS:  ES:  CR0: 80050033 
  [ 11.089593] CR2: 0030 CR3: 9e20a003 CR4: 
001606e0 
  [ 11.096871] Call Trace: 
  [ 11.099468] ? blk_mq_queue_tag_busy_iter+0xe2/0x1f0 
  [ 11.104558] ? blk_mq_rq_timed_out+0x70/0x70 
  [ 11.109130] ? blk_mq_rq_timed_out+0x70/0x70 
  [ 11.114933] blk_mq_timeout_work+0xbb/0x170 
  [ 11.119408] process_one_work+0x156/0x410 
  [ 11.123641] worker_thread+0x4b/0x460 
  [ 11.127827] kthread+0x109/0x140 
  [ 11.131186] ? process_one_work+0x410/0x410 
  [ 11.135499] ? kthread_create_on_node+0x70/0x70 
  [ 11.140408] ret_from_fork+0x1f/0x30 
  [ 11.144110] Code: 89 d0 48 8b 3a 0f b6 48 18 48 8b 97 30 01 00 00 84 c9 75 
03 03 72 04 48 8b 92 80 00 00 00 89 f6 48 8b 34 f2 48 8b 97 c0 00 00 00 <48> 39 
56 30 74 06 b8 01 00 00 00 c3 55 48 8b 50 10 48 89 e5 ff 
  [ 11.167573] RIP: bt_iter+0x31/0x50 RSP: b8d4c67abda0 
  [ 11.173028] CR2: 0030 
  [ 11.176515] ---[ end trace 2f8e5b1cf4139fec ]--- 
  [ 11.182589] Kernel panic - not syncing: Fatal exception 

  Basically, we have a NULL pointer dereference in the bt_iter()
  function. Since the blk-mq scheduler capability was merged into the
  Linux kernel, the tags->rqs[] array has been dynamically assigned, and
  there is a small window of time in which the tag bit is set but the
  corresponding tags->rqs[] entry has not been assigned yet. This was
  reported to happen in about 5% of test runs (more details in the test
  section).

  
  [Fix]
  The fix is small and simple, and it is already upstream. It adds a NULL
  pointer check to the bt_iter() and bt_tags_iter() functions.

  The fix is: 7f5562d5ecc4 ("blk-mq-tag: check for NULL rq when iterating
  tags"), by Jens Axboe.
  (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7f5562d5ecc4)

  
  [Testcase] 
  Since the problem manifests in a small, non-deterministic time window,
  there is no easy way to reproduce it. In our case, it was observed while
  testing with a large number of CPUs and attached disks (>200 disks, >150
  cores), trying to exercise all CPUs and disks (the disks with quick dd
  commands). In this test scenario, as already mentioned, the issue
  occurred in about 5% of the runs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1744300/+subscriptions

___
Mailing list: https://launchpad.net/~group.of.nepali.translators
Post to : group.of.nepali.translators@lists.launchpad.net
Unsubscribe : https://launchpad.net/~group.of.nepali.translators
More help   : https://help.launchpad.net/ListHelp


[Group.of.nepali.translators] [Bug 1750013] Re: systemd-logind: memory leaks on session's connections (trusty-only)

2018-02-16 Thread Guilherme G. Piccoli
** Changed in: systemd (Ubuntu Trusty)
   Importance: Undecided => Medium

** Changed in: systemd (Ubuntu Trusty)
   Status: New => In Progress

** Changed in: systemd (Ubuntu Trusty)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: systemd (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: systemd (Ubuntu Xenial)
   Status: New => Fix Released

** Changed in: systemd (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: systemd (Ubuntu Bionic)
   Status: In Progress => Fix Released

** Changed in: systemd (Ubuntu Artful)
   Importance: Undecided => Medium

** Changed in: systemd (Ubuntu Artful)
   Status: New => Fix Released

** Changed in: systemd (Ubuntu Artful)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1750013

Title:
  systemd-logind: memory leaks on session's connections (trusty-only)

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Trusty:
  In Progress
Status in systemd source package in Xenial:
  Fix Released
Status in systemd source package in Artful:
  Fix Released
Status in systemd source package in Bionic:
  Fix Released

Bug description:
  It was observed that systemd-logind leaks memory on each session that
  connects. The issue happens in systemd from Trusty (14.04), whose latest
  version as of Feb/2018 is 204-5ubuntu20.26 (which still reproduces
  the bug).

  The basic test-case is to run the following loop from a remote
  machine:

  while true; do ssh  "whoami"; done

  and watch the increase in memory consumption from "systemd-logind" process
  in the target machine. One can use the "ps uax" command to verify the
  RSS of the process, or count the anon pages from /proc//smaps.
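
  As an illustration of the smaps approach mentioned above, here is a
  minimal sketch that sums the "Anonymous:" fields of an smaps-format
  file. The function name sum_anon_kb is hypothetical; it is not part of
  systemd or procps.

```shell
#!/bin/sh
# Hypothetical helper (not part of systemd or procps): sum the "Anonymous:"
# fields, in kB, of an smaps-format file passed as the first argument.
# The pattern is anchored, so "AnonHugePages:" lines are not counted.
sum_anon_kb() {
    awk '/^Anonymous:/ { sum += $2 } END { print sum + 0 }' "$1"
}
```

  Running it periodically against the smaps file of the systemd-logind
  process while the ssh loop is active should show the total growing if
  sessions are leaking.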

  To clarify how a session works, the following "stack trace" shows
  which function calls happen when an SSH connection is opened, from the
  point of view of Trusty's systemd-logind:

  
  main() 
  manager_startup()
  manager_run() [event-loop]
  bus_loop_dispatch() 
  dbus_watch_handle() -> bus_manager_message_handler()
  bus_manager_create_session()
  manager_add_session() 
  session_new() 
  session_create_fifo()
  session_start()
  session_create_cgroup()
  session_save()
  session_bus_path()
  [...]

  After each session is closed, it was observed that session_free() isn't
  called, which keeps the sessions alive. This can be verified with the
  command "loginctl list-sessions": each session that once connected is
  still listed there "forever".

  The memory leak can eventually lead to an OOM situation for this process.
  Debug progress will be tracked here, in this LP.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1750013/+subscriptions



[Group.of.nepali.translators] [Bug 1771557] Re: NVMe boot drives not supported - failing in generating initramfs

2018-05-16 Thread Guilherme G. Piccoli
** Changed in: initramfs-tools (Ubuntu)
   Status: In Progress => Fix Released

** Changed in: initramfs-tools (Ubuntu)
   Importance: High => Medium

** Changed in: initramfs-tools (Ubuntu)
Milestone: trusty-updates => None

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1771557

Title:
  NVMe boot drives not supported - failing in generating initramfs

Status in initramfs-tools:
  Fix Released
Status in initramfs-tools package in Ubuntu:
  Fix Released
Status in initramfs-tools source package in Trusty:
  In Progress
Status in initramfs-tools source package in Xenial:
  Fix Released

Bug description:
  [Impact] 
  The initramfs-tools hook-functions script cannot translate nvmeXnYpZ to
  the nvmeXnY block device, so it fails and does not build the initramfs.

  The upstream solution is composed of at least 2 patches (it's a series,
  but the 2 below are the ones really needed):

  commit 3cb744c9
  Author: Ben Hutchings 
  hook-functions: Rewrite block device sysfs lookup to be generic

  commit 8ac52dc0
  Author: Ben Hutchings 
  hook-functions: Include modules for all components of a multi-disk device

  Instead of doing the backport, which is huge, we added another sed
  substitution: currently the script has substitutions for sdX and hdX, in
  order to convert sda1 to sda, for example. The new substitution converts
  nvmeXnYpZ to nvmeXnY.
  It's less intrusive than the full backport, since this is an SRU to Trusty
  only.
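
  To illustrate the kind of substitution described above, here is a
  minimal sketch. It is not the actual hook-functions code: the function
  name strip_partition and both sed expressions are assumptions written
  for this example.

```shell
#!/bin/sh
# Hypothetical helper, NOT the real hook-functions code: map a partition
# name to its parent block device name.
#   nvmeXnYpZ -> nvmeXnY   (the new substitution)
#   sdXN, hdXN -> sdX, hdX (the kind of substitution that already existed)
# Names that match neither pattern are printed unchanged.
strip_partition() {
    printf '%s\n' "$1" | sed \
        -e 's/^\(nvme[0-9][0-9]*n[0-9][0-9]*\)p[0-9][0-9]*$/\1/' \
        -e 's/^\([sh]d[a-z][a-z]*\)[0-9][0-9]*$/\1/'
}

strip_partition nvme0n1p1   # prints nvme0n1
strip_partition sda1        # prints sda
```

  Anchoring both expressions with ^ and $ keeps the nvme rule from
  touching sdX/hdX names and vice versa, which is the regression the
  template worries about.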

  
  [Test Case]
  To be added.

  
  [Regression Potential] 
  If the sed expression were somehow broken, we could have an issue
  generating the initramfs for generic block devices, like regular HDDs.

  
  [Other Info]
  This issue is based on Debian bug #785147: 
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=785147

To manage notifications about this bug go to:
https://bugs.launchpad.net/initramfs-tools/+bug/1771557/+subscriptions



[Group.of.nepali.translators] [Bug 1771557] Re: NVMe boot drives not supported - failing in generating initramfs

2018-05-16 Thread Guilherme G. Piccoli
** Package changed: linux (Ubuntu) => initramfs-tools (Ubuntu)

** Changed in: initramfs-tools (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: initramfs-tools (Ubuntu Xenial)
   Status: New => Fix Released

** Changed in: initramfs-tools (Ubuntu Trusty)
   Importance: Undecided => High

** Changed in: initramfs-tools (Ubuntu Trusty)
   Status: New => In Progress

** Changed in: initramfs-tools (Ubuntu Trusty)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Trusty)
Milestone: None => trusty-updates

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1771557

Title:
  NVMe boot drives not supported - failing in generating initramfs

Status in initramfs-tools:
  Fix Released
Status in initramfs-tools package in Ubuntu:
  In Progress
Status in initramfs-tools source package in Trusty:
  In Progress
Status in initramfs-tools source package in Xenial:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/initramfs-tools/+bug/1771557/+subscriptions



[Group.of.nepali.translators] [Bug 1857616] Re: Cannot collect dump due to "Can't get a valid pmd_pte" error

2019-12-27 Thread Guilherme G. Piccoli
This proved not to be the case; in fact, the issue was caused by a
kernel change that modified the section flag bits, and hence the vmcore
collection failed.

The culprit was kernel commit 326e1b8f83a4 ("mm/sparsemem: introduce a
SECTION_IS_EARLY flag"), introduced in kernel 5.3.
The makedumpfile fix for this comes in commit 7bdb468c2c ("Increase
SECTION_MAP_LAST_BIT to 4").

The PPA launchpad.net/~gpiccoli/+archive/ubuntu/lp1857616 contains a build
with this fix for testing purposes.
Cheers,

Guilherme




** Also affects: makedumpfile (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Focal)
   Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

** Also affects: makedumpfile (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: makedumpfile (Ubuntu Xenial)
   Status: New => Invalid

** Changed in: makedumpfile (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Disco)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1857616

Title:
  Cannot collect dump due to "Can't get a valid pmd_pte" error

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Invalid
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Disco:
  Invalid
Status in makedumpfile source package in Eoan:
  New
Status in makedumpfile source package in Focal:
  Confirmed

Bug description:
  Due to an issue in the kaslr address resolution in makedumpfile, we
  may get the following error when collecting a dump:

  Excluding unnecessary pages : [ 46.3 %] / __vtop4_x86_64[ 39.341233]: Can't get a valid pmd_pte.
  readmem: Can't convert a virtual address(e05cb400) to physical address.
  readmem: type_addr: 0, addr:e05cb400, size:32768
  __exclude_unnecessary_pages: Can't read the buffer of struct page.
  create_2nd_bitmap: Can't exclude unnecessary pages.

  This is believed to be fixed by commit: 3222d4ad04c6 ("x86_64: fix
  get_kaslr_offset_x86_64() to return kaslr_offset correctly"). This LP
  keeps track of Test/SRU for this commit.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1857616/+subscriptions



[Group.of.nepali.translators] [Bug 1844455] Re: Memory leak on libvirt 1.3.1

2020-01-22 Thread Guilherme G. Piccoli
We validated that commit 38816336a5 ("node_device_conf: Don't leak
@physical_function in virNodeDeviceGetPCISRIOVCaps") [0] indeed fixes
the leak under investigation. Valgrind reports more definitely-lost
memory, but those leaks are ultimately glibc-related and are not
considerable (at most 8K/24H). Given that, and given that the report
was on Trusty, our focus will be to fix the PCI-related leak in all
libvirt releases.

SRU template and debdiffs will be added here soon.
Cheers,


Guilherme


[0] libvirt.org/git/?p=libvirt.git;a=commit;h=38816336a5

** Also affects: libvirt (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: libvirt (Ubuntu Focal)
   Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

** Also affects: libvirt (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: libvirt (Ubuntu Eoan)
   Status: New => Confirmed

** Changed in: libvirt (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: libvirt (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: libvirt (Ubuntu Eoan)
   Importance: Undecided => High

** Changed in: libvirt (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: libvirt (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: libvirt (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: libvirt (Ubuntu Eoan)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1844455

Title:
  Memory leak on libvirt 1.3.1

Status in Ubuntu Cloud Archive:
  Confirmed
Status in Ubuntu Cloud Archive mitaka series:
  Confirmed
Status in libvirt package in Ubuntu:
  Confirmed
Status in libvirt source package in Xenial:
  Confirmed
Status in libvirt source package in Bionic:
  Confirmed
Status in libvirt source package in Eoan:
  Confirmed
Status in libvirt source package in Focal:
  Confirmed

Bug description:
  It was reported that libvirt 1.3.1 running on Trusty (through
  UCA/Mitaka) is getting OOM'ed after a while. In our reports it took 2
  years for the leak to trigger an out-of-memory situation, but this may
  vary according to the memory available to the user.

  Valgrind was executed in a similar environment, and we were able to
  collect information about the "definitely lost" memory of the libvirt
  process (attached below).

  The leaks are detailed in the next comments.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1844455/+subscriptions



[Group.of.nepali.translators] [Bug 1851663] Re: Consistent autopkgtest failures on ppc64el/s390x

2020-01-28 Thread Guilherme G. Piccoli
** Branch linked: lp:~gpiccoli/britney/hints-ubuntu-xenial

** Changed in: makedumpfile (Ubuntu Disco)
   Status: Confirmed => Won't Fix

** Changed in: makedumpfile (Ubuntu Disco)
   Importance: Medium => Undecided

** Changed in: makedumpfile (Ubuntu Disco)
 Assignee: Thadeu Lima de Souza Cascardo (cascardo) => (unassigned)

** Also affects: makedumpfile (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: makedumpfile (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: makedumpfile (Ubuntu Xenial)
 Assignee: (unassigned) => Thadeu Lima de Souza Cascardo (cascardo)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1851663

Title:
  Consistent autopkgtest failures on ppc64el/s390x

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Confirmed
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Disco:
  Won't Fix
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed

Bug description:
  Recently autopkgtest started to fail consistently on ppc64el / s390x.
  When testing manually, the kernel dump is collected and everything
  works fine. After discussing with the package maintainer (Cascardo), it
  seems the failures started after a change in the instance type of the
  tests, to m1.large.

  This seems to be either a problem in the test infrastructure itself or,
  more likely, a "timing" problem in the test (i.e., it requires some
  pause or delay that is currently not long enough).
  This demands investigation because those failures delay the package
  release for all arches, and those releases usually contain important
  fixes.

  Also, the recurrent "workaround" is to mark the tests as "force-
  badtest" in ubuntu-hints for the 2 arches, which in practice skips the
  test, so those 2 arches are getting releases untested by the automatic
  infrastructure.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1851663/+subscriptions



[Group.of.nepali.translators] [Bug 1844455] Re: Memory leak of struct _virPCIDeviceAddress on libvirt

2020-02-21 Thread Guilherme G. Piccoli
** Summary changed:

- Memory leak on libvirt 1.3.1
+ Memory leak of struct _virPCIDeviceAddress on libvirt

** Changed in: libvirt (Ubuntu Focal)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1844455

Title:
  Memory leak of struct _virPCIDeviceAddress on libvirt

Status in Ubuntu Cloud Archive:
  Confirmed
Status in Ubuntu Cloud Archive mitaka series:
  Confirmed
Status in libvirt package in Ubuntu:
  Fix Released
Status in libvirt source package in Xenial:
  Confirmed
Status in libvirt source package in Bionic:
  Confirmed
Status in libvirt source package in Eoan:
  Confirmed
Status in libvirt source package in Focal:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1844455/+subscriptions



[Group.of.nepali.translators] [Bug 1864918] [NEW] libvirt for Xenial failing to build due to gnutls SHA1 restriction

2020-02-26 Thread Guilherme G. Piccoli
Public bug reported:

This is a placeholder for now, while I work on the SRU patch to make the
tests pass on the Xenial build.
I will enhance the description soon.

** Affects: libvirt (Ubuntu)
 Importance: Undecided
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Fix Released

** Affects: libvirt (Ubuntu Xenial)
 Importance: Medium
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed


** Tags: seg

** Also affects: libvirt (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: libvirt (Ubuntu Xenial)
   Status: New => Incomplete

** Changed in: libvirt (Ubuntu Xenial)
   Status: Incomplete => Confirmed

** Changed in: libvirt (Ubuntu)
   Status: Confirmed => Fix Released

** Changed in: libvirt (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: libvirt (Ubuntu Xenial)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1864918

Title:
  libvirt for Xenial failing to build due to gnutls SHA1 restriction

Status in libvirt package in Ubuntu:
  Fix Released
Status in libvirt source package in Xenial:
  Confirmed

Bug description:
  This is a placeholder for now, while I work on the SRU patch to make the
  tests pass on the Xenial build.
  I will enhance the description soon.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1864918/+subscriptions



[Group.of.nepali.translators] [Bug 1845048] Re: /etc/default/kdump-tools KDUMP_SYSCTL does not set sysctl params

2020-02-29 Thread Guilherme G. Piccoli
** Also affects: makedumpfile (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Focal)
   Importance: Low
   Status: Triaged

** Also affects: makedumpfile (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: makedumpfile (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: makedumpfile (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: makedumpfile (Ubuntu Eoan)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: makedumpfile (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: makedumpfile (Ubuntu Focal)
   Importance: Low => Medium

** Changed in: makedumpfile (Ubuntu Eoan)
   Importance: Undecided => Medium

** Changed in: makedumpfile (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: makedumpfile (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: makedumpfile (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Eoan)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Focal)
   Status: Triaged => Confirmed

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1845048

Title:
  /etc/default/kdump-tools KDUMP_SYSCTL does not set sysctl params

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Confirmed
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed

Bug description:
  [impact]

  Documentation, and past behavior, for kdump-tools was that the
  KDUMP_SYSCTL variable in the /etc/default/kdump-tools file would be
  applied to the system kernel params at kdump 'load'.  However this is
  no longer true, and those params are no longer applied to the system's
  kernel param settings.

  [test case]

  install linux-crashdump (and kdump-tools).

  Edit the /etc/default/kdump-tools file to set the KDUMP_SYSCTL param
  to something other than default, e.g.:

  KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"

  reboot, or unload/reload kdump, to pick up the changes to the file.

  Check if the panic_on_warn param is set:

  $ cat /proc/sys/kernel/panic_on_warn
  0

  the problem does not seem to be with sysctl, as manually calling it
  does work:

  $ KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"
  $ cat /proc/sys/kernel/panic_on_warn
  0
  $ sudo sysctl -w $KDUMP_SYSCTL
  kernel.panic_on_oops = 1
  kernel.panic_on_warn = 1
  $ cat /proc/sys/kernel/panic_on_warn
  1
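
  The manual check above can be sketched as a small script. This is only
  an illustration under stated assumptions: the function name
  check_sysctl and its optional second argument (an alternative root
  directory, defaulting to /proc/sys) are hypothetical and not part of
  kdump-tools.

```shell
#!/bin/sh
# Hypothetical helper, not part of kdump-tools.
# check_sysctl SETTINGS [ROOT]
#   SETTINGS: space-separated key=value pairs, as in KDUMP_SYSCTL.
#   ROOT:     directory holding the sysctl tree (default: /proc/sys).
# Prints one MISMATCH line per setting whose current value differs and
# returns non-zero if any setting does not match.
# Limitation: every dot in a key is treated as a path separator.
check_sysctl() {
    settings=$1
    root=${2:-/proc/sys}
    rc=0
    for pair in $settings; do
        key=${pair%%=*}
        want=${pair#*=}
        path="$root/$(printf '%s' "$key" | tr . /)"
        have=$(cat "$path" 2>/dev/null)
        if [ "$have" != "$want" ]; then
            echo "MISMATCH $key: want=$want have=${have:-unreadable}"
            rc=1
        fi
    done
    return "$rc"
}
```

  Calling check_sysctl "$KDUMP_SYSCTL" after a reboot or a kdump reload
  would then report whether the configured values actually reached the
  running kernel.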

  [regression potential]

  TBD

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1845048/+subscriptions



[Group.of.nepali.translators] [Bug 1844455] Re: Memory leak of struct _virPCIDeviceAddress on libvirt

2020-03-31 Thread Guilherme G. Piccoli
** Changed in: cloud-archive/mitaka
   Status: Fix Committed => Fix Released

** Changed in: cloud-archive
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1844455

Title:
  Memory leak of struct _virPCIDeviceAddress on libvirt

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive mitaka series:
  Fix Released
Status in Ubuntu Cloud Archive queens series:
  Fix Released
Status in libvirt package in Ubuntu:
  Fix Released
Status in libvirt source package in Xenial:
  Fix Released
Status in libvirt source package in Bionic:
  Fix Released
Status in libvirt source package in Eoan:
  Fix Released
Status in libvirt source package in Focal:
  Fix Released

Bug description:
  [Impact]
  * There's a long-term memory leak in libvirt related to the gathering of
  PCI information from sysfs in Linux, especially for SR-IOV devices. This
  was fixed by commit 38816336 ("node_device_conf: Don't leak
  @physical_function in virNodeDeviceGetPCISRIOVCaps")
  [ libvirt.org/git/?p=libvirt.git;a=commit;h=38816336 ].

  * In comment #9 there is a detailed explanation of what's going on,
  but the summary is that the variable physical_function (member of a
  PCI structure), of type _virPCIDeviceAddress, is allocated on
  virPCIGetDeviceAddressFromSysfsLink() and should be freed before reuse
  in virNodeDeviceGetPCISRIOVCaps(), but it wasn't before the fix was
  introduced.

  * The impact of the issue is a memory leak that is usually small but may
  grow depending on the number of PCI devices and how/when they are
  enumerated by libvirt; if some user of those functions is actively
  exercising the leak path, it may become a problem (OOM situation).

  [Test Case]
  * The basic test to exercise the memory leak path was to run the virsh
  tool in a loop to generate the XML output of an SR-IOV PCI device, like:

  while true; do virsh nodedev-dumpxml pci__08_12_0 >/dev/null; done

  * This was executed while Valgrind was used to debug libvirtd, in
  order to collect the signature of the leak. Without the patch we get
  the "definitely lost" type of leak with the PCI backtrace (on comment
  #9), whereas with the patch we don't see the leak anymore.

  [Regression Potential]
  * The potential for regressions is really low: the fix has been upstream
  for a while and is in the Focal package, and it is self-contained and not
  intrusive. Considering hypothetical scenarios, if there's an issue with
  the fix it should come in the form of unused memory or a double-free
  (which is usually harmless), and only in the PCI enumeration (or PCI XML
  generation) paths.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1844455/+subscriptions



[Group.of.nepali.translators] [Bug 1869948] [NEW] Multiple Kexec in AWS Nitro instances fail

2020-03-31 Thread Guilherme G. Piccoli
Public bug reported:

Placeholder
To be improved

** Affects: linux (Ubuntu)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Xenial)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Bionic)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Eoan)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Focal)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed


** Tags: sts

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Eoan)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Eoan)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Eoan)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Xenial)
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1869948

Title:
  Multiple Kexec in AWS Nitro instances fail

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Eoan:
  Confirmed
Status in linux source package in Focal:
  Confirmed

Bug description:
  Placeholder
  To be improved

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1869948/+subscriptions



[Group.of.nepali.translators] [Bug 1869948] Re: Multiple Kexec in AWS Nitro instances fail

2020-04-10 Thread Guilherme G. Piccoli
Regarding Disco, despite the "Fix Committed" status I didn't find the
patch in the latest tags from either the generic tree
(Ubuntu-5.0.0-46.50) or the AWS tree (Ubuntu-aws-5.0.0-1024.27), so I
think the patch wasn't merged (which is not a big deal, given Bionic HWE
is now based on 5.3).

I've reverted the "Fix Committed" status; let me know if we should
de-nominate Disco.

Thanks,


Guilherme

** Changed in: linux (Ubuntu Disco)
   Status: Fix Committed => Opinion

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1869948

Title:
  Multiple Kexec in AWS Nitro instances fail

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Disco:
  Opinion
Status in linux source package in Eoan:
  Fix Committed
Status in linux source package in Focal:
  Fix Committed

Bug description:
  [Impact]
  * Currently, users cannot perform multiple kernel kexec loads on AWS Nitro 
instances (KVM-based); after the 2nd or 3rd kexec, an initrd corruption is 
observed, with the following signature:

   Initramfs unpacking failed: junk within compressed archive
  [...]
   Kernel panic - not syncing: No working init found.
  Try passing init= option to kernel. See Linux 
Documentation/admin-guide/init.rst for guidance.
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc7-gpiccoli+ #26  Hardware 
name: Amazon EC2 t3.large/, BIOS 1.0 10/16/2017
  Call Trace:
dump_stack+0x6d/0x9a
? csum_partial_copy_generic+0x150/0x170
panic+0x101/0x2e3
? do_execve+0x25/0x30
? rest_init+0xb0/0xb0
kernel_init+0xfb/0x100
ret_from_fork+0x35/0x40

  * After investigation (see comment 2), we noticed that the Amazon ena
  network driver doesn't provide a shutdown() handler, hence it could
  still be performing a DMA transaction to a previously valid address
  during boot, which would then corrupt kernel memory. The following
  patch was proposed and fixed the issue, allowing 1000 kexecs to be
  executed successfully with no issues observed: 428c491332bc ("net:
  ena: Add PCI shutdown handler to allow safe kexec") [
  git.kernel.org/linus/428c491332bc ].

  * Hence, we are hereby requesting an SRU for this patch. It was tested
  successfully in all supported series (4.4, 4.15 and 5.3) on Amazon
  Nitro instances, and reviewed/acked by the ena driver team and a kexec
  developer from another distro. It is worth mentioning that we proposed
  an upstream multi-vendor discussion about this issue:
  marc.info/?l=kexec&m=158299605013194

  [Test case]

  * The basic test procedure consists of performing multiple kexecs
  sequentially; AWS does not provide a full console, so in case of
  failures one could check the instance screenshot or use pstore/ramoops
  to collect the dmesg after a crash from a preserved memory area.
  The commands used to perform kexec are:

  kexec -l  --initrd  --reuse-cmdline
  systemctl kexec

  Alternatively, one could use "--append=" instead of "--reuse-cmdline"
  if a change in the kexec command line is desired; also, to execute the
  kexec-loaded kernel, both "kexec -e" and "systemctl kexec" are equally
  valid.

  * In comment 3 we proposed a script/approach to auto-test kexecs, used
  here to perform 1000 kexecs with the proposed patch.
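
As a rough illustration of the load step described above, the sketch below only prints the kexec command line for a given kernel and initrd instead of executing it. The function name kexec_load_cmd and the example paths are assumptions for illustration; this is not the script from comment 3.

```shell
# Print (do not execute) the kexec load command for a kernel and an initrd.
# Usage: kexec_load_cmd KERNEL INITRD [APPEND]
kexec_load_cmd() {
    kernel=$1
    initrd=$2
    if [ "$#" -ge 3 ]; then
        # Pass an explicit command line instead of reusing the current one.
        printf 'kexec -l %s --initrd=%s --append="%s"\n' "$kernel" "$initrd" "$3"
    else
        printf 'kexec -l %s --initrd=%s --reuse-cmdline\n' "$kernel" "$initrd"
    fi
}

# Hypothetical paths, for illustration only.
kexec_load_cmd /boot/vmlinuz-example /boot/initrd.img-example
```

After running the printed command as root, either "kexec -e" or "systemctl kexec" would switch to the loaded kernel, as noted above.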

  [Regression Potential]

  * Although the patch proposed here introduces a PCI handler, it keeps
  the remove handler identical and bases shutdown strongly on
  ena_remove(), changing just the netdev handling, following other
  upstream drivers. It was extensively tested and presented no issues.
  Also, it's self-contained and affects only one driver, so other cloud
  providers or non-cloud environments wouldn't even be affected by the
  patch.

  * A potential regression could manifest as a delay or an issue in the
  reboot/shutdown path, and only if the ena driver is in use.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1869948/+subscriptions



[Group.of.nepali.translators] [Bug 1869948] Re: Multiple Kexec in AWS Nitro instances fail

2020-04-29 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu Focal)
   Status: Fix Committed => Fix Released

** Changed in: linux (Ubuntu Disco)
   Importance: Medium => Low

** Changed in: linux (Ubuntu Disco)
   Status: Opinion => Won't Fix

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1869948

Title:
  Multiple Kexec in AWS Nitro instances fail

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Disco:
  Won't Fix
Status in linux source package in Eoan:
  Fix Released
Status in linux source package in Focal:
  Fix Released



[Group.of.nepali.translators] [Bug 1749961] Re: xhci_hcd: TRB DMA errors reported with ASMedia ASM1142 USB 3.1 Controller

2020-05-04 Thread Guilherme G. Piccoli
Hi John and all, thanks for updating the logs and enhancing the report!
Due to a lot of other work, and the nature of this problem (a FW issue
that we'd try to alleviate with a hack in Linux), I wasn't able to work
on the tentative hack approach yet.

The issue was "resolved" through a FW update by ASMedia, but this was
only worked out with a specific motherboard vendor, not as a general
release. This is the problem with FW fixes: they are quite scattered
and vendor-dependent. So, could you, John, or any of the other
reporters try to reproduce the problem with:

(a) Ubuntu 20.04, just released?
(b) Ubuntu 18.04 running the current HWE kernel (5.3)?

Those would be good data points. Also, if you could try Ubuntu 20.04
with the latest mainline kernel (from [0]), that would be a gigantic help!
Thanks in advance,


Guilherme


[0] https://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D

** No longer affects: linux (Ubuntu Artful)

** Changed in: linux (Ubuntu Trusty)
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1749961

Title:
  xhci_hcd: TRB DMA errors reported with ASMedia ASM1142 USB 3.1
  Controller

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Trusty:
  Won't Fix
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in linux package in Debian:
  Confirmed

Bug description:
  It was observed that when trying to use a 4K USB webcam connected to
  a USB port provided by an ASMedia ASM1142 USB 3.1 Controller, the
  webcam does not work and the kernel log shows the following messages:

  [431.928016] xhci_hcd :12:00.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 13
  [431.928021] xhci_hcd :12:00.0: Looking for event-dma 003f3330e020 
trb-start 003f3330e000 trb-end 003f3330e000 seg-start 003f3330e000 
seg-end 003f3330eff0
  [431.928024] xhci_hcd :12:00.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 13
  [431.928026] xhci_hcd :12:00.0: Looking for event-dma 003f3330e030 
trb-start 003f3330e000 trb-end 003f3330e000 seg-start 003f3330e000 
seg-end 003f3330eff0
  [431.928027] xhci_hcd :12:00.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 13
  [431.928029] xhci_hcd :12:00.0: Looking for event-dma 003f3330e050 
trb-start 003f3330e000 trb-end 003f3330e000 seg-start 003f3330e000 
seg-end 003f3330eff0
  [431.928386] xhci_hcd :12:00.0: ERROR Transfer event TRB DMA ptr not part 
of current TD ep_index 2 comp_code 13

  A similar issue was already reported on Launchpad:
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1667750

  The fix to this issue seems to be the following patch:
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9da5a109

  Tests in our scenario showed that the problem persists even with this
  patch. Our next approach is to modify the patch a bit and re-test.

  This LP will be used to document our progress in the investigation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1749961/+subscriptions



[Group.of.nepali.translators] [Bug 1749961] Re: xhci_hcd: TRB DMA errors reported with ASMedia ASM1142 USB 3.1 Controller

2020-05-05 Thread Guilherme G. Piccoli
** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1749961

Title:
  xhci_hcd: TRB DMA errors reported with ASMedia ASM1142 USB 3.1
  Controller

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Trusty:
  Won't Fix
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Focal:
  Confirmed
Status in linux package in Debian:
  Confirmed



[Group.of.nepali.translators] [Bug 1817918] Re: Hard lockups due to unrestricted lapic timer delay

2020-05-10 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1817918

Title:
  Hard lockups due to unrestricted lapic timer delay

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  [Impact]

  * There is a long-standing report of an issue with the TSC delay
  present in wait_lapic_expire(): basically, the guest could have an
  expiration timer configured in a way that induces the host to wait a
  long time (with preemption disabled), so there's a potential scenario
  for host lockups.

  * The stack trace we have access to (from a user report of this issue)
  is summarized below:

  NMI watchdog: Watchdog detected hard LOCKUP on cpu 16
  [...]
  CPU: 16 PID: 3024910 Comm: CPU 0/KVM Not tainted 4.4.0-139-generic #165-Ubuntu
  RIP: 0010:[]  [] delay_tsc+0x20/0x60
  [...]
   __delay+0x15/0x20
  wait_lapic_expire+0xc3/0x150 [kvm]
  vcpu_enter_guest+0x743/0x11d0 [kvm]
  kvm_arch_vcpu_ioctl_run+0xe6/0x410 [kvm]
  kvm_vcpu_ioctl+0x33d/0x620 [kvm]
  do_vfs_ioctl+0x2af/0x4b0
  ? __do_page_fault+0x1c1/0x410
  ? fire_user_return_notifiers+0x3e/0x50
  SyS_ioctl+0x79/0x90
  entry_SYSCALL_64_fastpath+0x22/0xc1

  This matches the reported problem in the KVM mailing-list:
  https://marc.info/?l=kvm&m=146374488028339

  * A fix was proposed in the above thread, but discarded in favor of the
  following approach: https://marc.info/?l=kvm&m=146647260109315
  The patch was merged in Linus tree, hence we hereby request the SRU:
  b606f189c7d5 ("KVM: LAPIC: cap __delay at lapic_timer_advance_ns").
  There's one additional patch needed, which is just the header adjustment
  for exporting a necessary function.

  * The patch is missing only in 4.4 kernel series; Bionic (4.15) and
  the other newer releases have the patch already.

  [Test Case]

  * Unfortunately this is a hard-to-reproduce issue; we have reports of
  this lockup from a user, hence the SRU request here.
  Also, the patch was originally introduced in kernel 4.7, approx. 2.5
  years ago, so we are confident that the community has been running
  this code long enough without errors reported. We also checked Linus'
  tree and no fixes for this code were introduced since kernel 4.7.

  [Regression Potential]

  * The code modification requested here affects the amount of delay in
  a specific timer; the patch introduces a maximum time for the delay,
  preventing unbounded delays in the host.
  The regression potential is considered low and, given the nature of
  the modification, latency issues in guests are the most likely form a
  regression would take.
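
The idea of capping a delay can be illustrated with plain integer arithmetic. The sketch below is not the kernel code from b606f189c7d5; the function name capped_delay and the numbers are assumptions for illustration, with both values expressed in the same unit (for example TSC cycles).

```shell
# Print the smaller of the requested delay and the allowed maximum.
# Usage: capped_delay REQUESTED LIMIT
capped_delay() {
    requested=$1
    limit=$2
    if [ "$requested" -gt "$limit" ]; then
        echo "$limit"
    else
        echo "$requested"
    fi
}

capped_delay 500 1000       # below the cap: the request is kept
capped_delay 9000000 1000   # far above the cap: the limit is used
```

With a cap in place, a very large guest-provided value can no longer translate into an equally large busy-wait on the host.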

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1817918/+subscriptions



[Group.of.nepali.translators] [Bug 1845048] Re: /etc/default/kdump-tools KDUMP_SYSCTL does not set sysctl params

2020-05-30 Thread Guilherme G. Piccoli
I've reported a Debian bug with the proposed fixes; the merge request
has the information about the approach used to deal with sysctl in kdump:
https://salsa.debian.org/debian/makedumpfile/-/merge_requests/2

Cheers,


Guilherme

** Bug watch added: Debian Bug tracker #961880
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=961880

** Also affects: makedumpfile (Debian) via
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=961880
   Importance: Unknown
   Status: Unknown

** Summary changed:

- /etc/default/kdump-tools KDUMP_SYSCTL does not set sysctl params
+ Improve sysctl handling on kdump-tools

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1845048

Title:
  Improve sysctl handling on kdump-tools

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Confirmed
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed
Status in makedumpfile package in Debian:
  Unknown

Bug description:
  [impact]

  Documentation, and past behavior, for kdump-tools was that the
  KDUMP_SYSCTL variable in the /etc/default/kdump-tools file would be
  applied to the system kernel params at kdump 'load'.  However this is
  no longer true, and those params are no longer applied to the system's
  kernel param settings.

  [test case]

  install linux-crashdump (and kdump-tools).

  Edit the /etc/default/kdump-tools file to set the KDUMP_SYSCTL param
  to something other than default, e.g.:

  KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"

  reboot, or unload/reload kdump, to pick up the changes to the file.

  Check if the panic_on_warn param is set:

  $ cat /proc/sys/kernel/panic_on_warn
  0

  the problem does not seem to be with sysctl, as manually calling it
  does work:

  $ KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"
  $ cat /proc/sys/kernel/panic_on_warn
  0
  $ sudo sysctl -w $KDUMP_SYSCTL
  kernel.panic_on_oops = 1
  kernel.panic_on_warn = 1
  $ cat /proc/sys/kernel/panic_on_warn
  1
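
To check every parameter in KDUMP_SYSCTL rather than a single one, the key=value pairs can be mapped to their /proc/sys paths. This is a sketch under assumptions: the helper names are made up for illustration, and the simple dot-to-slash mapping ignores keys whose components themselves contain dots (such as some network interface names).

```shell
# Map a sysctl key such as kernel.panic_on_warn to its /proc/sys path.
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print "path expected-value" for each key=value pair in a
# KDUMP_SYSCTL-style string (pairs separated by whitespace).
list_expected_sysctls() {
    # $1 is intentionally unquoted so the shell splits it into pairs.
    for pair in $1; do
        key=${pair%%=*}
        value=${pair#*=}
        printf '%s %s\n' "$(sysctl_key_to_path "$key")" "$value"
    done
}

KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"
list_expected_sysctls "$KDUMP_SYSCTL"
```

Each printed path can then be read with cat and compared against the expected value next to it.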

  [regression potential]

  TBD

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1845048/+subscriptions



[Group.of.nepali.translators] [Bug 1874444] Re: Bionic ubuntu ethtool doesn't check ring parameters boundaries

2020-06-10 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/187

Title:
  Bionic ubuntu ethtool doesn't check ring parameters boundaries

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  [Impact]
  * There's a bad behavior in the ena driver ringparam setting on kernels 4.4 
and 4.15, if an invalid ringparam is provided to ethtool.

  * Upstream Linux kernel implemented ring parameter boundaries check in 
commit: 37e2d99b59c4 ("ethtool: Ensure new ring parameters are within bounds 
during SRINGPARAM") [ git.kernel.org/linus/37e2d99b59c4 ].
  Due to this commit, the community doesn't usually allow ring parameter 
boundary checks in driver code.

  * Xenial/Bionic kernels don't include this patch, and some network
  drivers (like ena) rely on this patch for boundary checking of ring
  params. So, we are hereby requesting the commit inclusion in these
  kernel versions.

  [Test case]
  1. In AWS, create a new c5.4xlarge instance with the Ubuntu 18.04 official 
ami (uses the ENA network driver) and update to latest kernel/reboot.

  2. Run ethtool -g ens5
  output:
  Ring parameters for ens5:
  Pre-set maximums:
  RX:   16384
  RX Mini:  0
  RX Jumbo: 0
  TX:   1024
  Current hardware settings:
  RX:   1024
  RX Mini:  0
  RX Jumbo: 0
  TX:   1024

  3. Change the TX/RX ring size to a legal number within boundaries -
  works!

  4. Change the TX/RX ring size to an illegal number (such as 2048 for
  TX) with the command - "sudo ethtool -G ens5 tx 2048".

  Expected behavior - "Cannot set device ring parameters: Invalid argument"
  Actual behavior - causes a driver hang since boundaries are not checked by 
ethtool, effectively hanging the instance (given that AWS has no console to 
allow system manipulation).
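
The kind of boundary check that step 4 expects can be sketched in a few lines. This is only an illustration of the comparison, not the kernel implementation; the function name ring_size_ok is an assumption, and the numbers come from the ethtool -g output above.

```shell
# Succeed (exit status 0) only if the requested ring size does not
# exceed the pre-set maximum reported by the driver.
# Usage: ring_size_ok REQUESTED MAX
ring_size_ok() {
    requested=$1
    max=$2
    [ "$requested" -le "$max" ]
}

# The TX pre-set maximum is 1024, so a request for 2048 must be refused.
if ring_size_ok 2048 1024; then
    echo "accepted"
else
    echo "Invalid argument"
fi
```

Doing this comparison once in the common ethtool path means individual drivers do not each have to repeat it.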

  [Regression Potential]

  Since the commit is present in kernels v4.16+ (including Ubuntu) and
  is quite small and self-contained, the regression risk is very low.

  One potential "regression" would be if some driver has bugs and
  provides bad values on get_ringparams; then the validation would be
  broken (allowing illegal values or refusing legal ones). This wouldn't
  be a regression in the hereby proposed patch itself, though; it'd only
  be exposed by the patch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/187/+subscriptions



[Group.of.nepali.translators] [Bug 1877858] Re: Improve TSC refinement (and calibration) reliability

2020-06-10 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1877858

Title:
  Improve TSC refinement (and calibration) reliability

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released

Bug description:
  [Impact]
  * We received a report recently of a missing TSC refinement across multiple 
reboots of a server, in an Intel Skylake-based processor. This was only 
reproducible in Bionic pre-5.0.

  * After checking kernel commits, we came up with 2 commits that
  largely improve the situation: a786ef152cdc ("x86/tsc: Make
  calibration refinement more robust")
  [git.kernel.org/linus/a786ef152cdc] and 604dc9170f24 ("x86/tsc: Use
  CPUID.0x16 to calculate missing crystal frequency")
  [git.kernel.org/linus/604dc9170f24]. We hereby request SRU for both of
  them.

  * The first commit contains improvements to comments and to an offset
  (to match more recent, fast machines), but the important part is a
  retry mechanism in the TSC refinement, in case it fails due to some
  disturbance on the TSC read, like NMIs/SMIs.

  * The second commit is an improvement in TSC calibration for Skylake
  (and some other models), by checking a register instead of relying on
  table-based hardcoded values.

  * A note for Xenial (kernel 4.4): the second patch would require the
  inclusion of more commits, so given the "maturity" of this release
  (and the fact that kernel 4.15 is an HWE for Xenial), I've kept it out
  of Xenial, backporting only the first and more important patch for 4.4.
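
The retry idea from the first commit can be illustrated generically. The sketch below is not how the kernel schedules its refinement; the names retry and disturbed_measurement are assumptions, and the "measurement" simply fails on its first two attempts to stand in for a disturbed TSC read.

```shell
# Run a command up to MAX times, stopping at the first success.
# Usage: retry MAX COMMAND [ARGS...]
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Hypothetical measurement that is disturbed on its first two attempts.
tries=0
disturbed_measurement() {
    tries=$((tries + 1))
    [ "$tries" -ge 3 ]
}

retry 5 disturbed_measurement && echo "succeeded after $tries tries"
```

Without a retry, a single disturbed attempt would leave the refinement missing until the next boot, which matches the symptom in the report.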

  [Test case]
  * Unfortunately there's no easy way to test the effectiveness of the
  commits, especially the refinement improvement.

  * The user that reported the missing refinements to us was able to
  test 300 reboots with a regular Bionic kernel (which reproduced the
  issue at least once), whereas when they tested with the Bionic kernel
  plus both hereby proposed commits, the problem didn't happen.

  * Regarding the calibration commit, it was well-tested by community
  using multiple machines and checking the TSC calibration read vs.
  tables present in instlatx64.atw.hu .

  [Regression potential]
  * We consider the regression potential low, especially due to the
  nature of the patches: the first is basically a retry mechanism (and
  some improvement in an offset to reflect more recent machines), and
  the 2nd is an improvement for TSC calibration on some platforms (that
  are currently hardcoded in a table-based way in the kernel). Also, the
  patches have been present upstream for a while and I couldn't find any
  fixes for them.

  * A hypothetical regression from the 2nd patch could be in the TSC
  precision calculation, which the refinement itself might as well
  circumvent. For the first patch, a bug in the code is the only
  hypothetical regression I could think of.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1877858/+subscriptions



[Group.of.nepali.translators] [Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1795659

Title:
  kernel panic using CIFS share in smb2_push_mandatory_locks()

Status in linux package in Ubuntu:
  Fix Released
Status in linux-azure package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Invalid
Status in linux-azure source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux-azure source package in Bionic:
  Invalid
Status in linux source package in Cosmic:
  Won't Fix
Status in linux-azure source package in Cosmic:
  Invalid
Status in linux source package in Disco:
  Fix Released
Status in linux-azure source package in Disco:
  Invalid

Bug description:
  NOTICE: The new patch merge is being worked on
  https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1856949 - if you
  face this issue, please report there!

  
  [Impact]

  * We got reports of a kernel crash in cifs module with the following
  signature:

  BUG: unable to handle kernel NULL pointer dereference at 0038
  IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  PGD 0 P4D 0
  RIP: 0010:smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  Call Trace:
   cifs_oplock_break+0x12f/0x3d0 [cifs]
   process_one_work+0x14d/0x410
   worker_thread+0x4b/0x460
   kthread+0x105/0x140
  [...]

  * Low-level analysis (decodecode script output and the objdump of the
  function) revealed that we are crashing in a NULL ptr dereference when
  trying to access "cfile->tlink"; below, a snippet of the objdump at
  function smb2_push_mandatory_locks():

  [...]
  mov    0x10(%r14),%r15   # %r15 = cifsFileInfo *cfile
  mov    0x18(%r14),%rbx   # %rbx = cifsLockInfo *li = (fdlocks->locks)
  lea    0x18(%r14),%r12
  mov    0x90(%r15),%rax   # %rax = struct tcon_link *tlink (cfile->tlink)
  cmp    %r12,%rbx
  mov    0x38(%rax),%rax   # <--- TRAP [trying to get cifs_tcon *tl_tcon]
  [...]

  * After discussing the issue with CIFS maintainers (Steve French and
  Pavel Shilovsky) they suggested commit b98749cac4a6 ("CIFS: keep
  FileInfo handle live during oplock break")
  [http://git.kernel.org/linus/b98749cac4a6] as a fix for multiple
  reports of this kind of crash.

  * The fix was sent to stable kernels and is present in Ubuntu kernels
  5.0 and newer. We are requesting the SRU for this patch here in order
  to fix the crashes, after reports of successful testing with the patch
  (see below section) and since the patch is restricted to the cifs
  module scope and accepted on linux stable.

  * Alternatively, the issue is known to be avoided when oplocks are
  disabled using the "cifs.enable_oplocks=N" module parameter.
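
  As an illustration of the workaround above, a minimal sketch of making the
  setting persistent through modprobe.d. The helper name and the file name
  cifs-oplocks.conf are assumptions chosen for this example; only the
  enable_oplocks parameter itself comes from the text above.

```shell
# Hypothetical helper: print the modprobe.d line that disables CIFS oplocks.
cifs_disable_oplocks_conf() {
    printf 'options cifs enable_oplocks=N\n'
}

# To persist the setting (needs root; the file name is an arbitrary choice):
#   cifs_disable_oplocks_conf | sudo tee /etc/modprobe.d/cifs-oplocks.conf
# To inspect the value used by an already loaded module:
#   cat /sys/module/cifs/parameters/enable_oplocks
```

  The modprobe.d entry only takes effect the next time the cifs module is
  loaded.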

  [Test case]

  * Unfortunately we cannot reproduce the issue. We validated the patch
  proposed here with xfstests (following the instructions from
  https://wiki.samba.org/index.php/Xfstesting-cifs) and fio. We also
  have a user report of validation using LISA
  (https://github.com/LIS/LISAv2).

  * Using xfstests with the exclusions proposed in the link above, we
  got the same results as with a non-patched kernel: the same tests
  failed on both kernels, so the patch did not make results worse. Fio
  also didn't show a noticeable performance regression with the patch.

  [Regression potential]

  * The patch was validated by the cifs filesystem maintainers (in fact
  they suggested its inclusion in Ubuntu) and by the aforementioned
  tests; also, the scope is restricted to cifs only so the likelihood of
  regressions is considered low.

  * Due to the nature of the code modification (taking a new reference on a
  file handle and manipulating it in different places), I consider that
  a regression, if any, would manifest as deadlocks/blocked tasks rather
  than something more serious like crashes or data corruption.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1795659/+subscriptions



[Group.of.nepali.translators] [Bug 1814095] Re: bnxt_en_po: TX timed out triggering Netdev Watchdog Timer

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Confirmed => Fix Released

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Nivedita Singhvi (niveditasinghvi)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1814095

Title:
  bnxt_en_po: TX timed out triggering Netdev Watchdog Timer

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  [Impact]

  The bnxt_en_bpo driver experienced tx timeouts causing the system to
  experience network stalls and fail to send data and heartbeat packets.

  The following 25Gb Broadcom NIC error was seen on Xenial
  running the 4.4.0-141-generic kernel on an amd64 host
  seeing moderate-heavy network traffic (just once):

  * The bnxt_en_bpo driver froze on a "TX timed out" error
    and triggered the Netdev Watchdog timer under load.

  * From kernel log:
    "NETDEV WATCHDOG: eno2d1 (bnxt_en_bpo): transmit queue 0 timed out"
    See attached kern.log excerpt file for full excerpt of error log.

  * Release = Xenial
    Kernel = 4.4.0-141-generic #167
    eno2d1 = Product Name: Broadcom Adv. Dual 25Gb Ethernet

  * This caused the driver to reset in order to recover:

    "bnxt_en_bpo :19:00.1 eno2d1: TX timeout detected, starting
  reset task!"

    driver: bnxt_en_bpo
    version: 1.8.1
    source: ubuntu/bnxt/bnxt.c: bnxt_tx_timeout()

  * The loss of connectivity and softirq stall caused other failures
    on the system.

  * The bnxt_en_bpo driver is the imported Broadcom driver
    pulled in to support newer Broadcom HW (specific boards),
    while the bnxt_en module continues to support the older
    HW. The current Linux upstream driver does not compile
    easily with the 4.4 kernel (too many changes).

  * This fix, present upstream and in the bnxt_en driver, is a likely solution:
     "bnxt_en: Fix TX timeout during netpoll"
     commit: 73f21c653f930f438d53eed29b5e4c65c8a0f906

    This fix has not been applied to the bnxt_en_bpo driver
    version, but review of the code indicates that it is
    susceptible to the bug, and the fix would be reasonable.

  [Test Case]

  * Unfortunately, this is not easy to reproduce. Also, it is only seen
  on 4.4 kernels with newer Broadcom NICs supported by the bnxt_en_bpo
  driver.

  [Regression Potential]

  * The patch is restricted to the bpo driver, with very constrained
  scope - just the newest Broadcom NICs being used by the Xenial 4.4
  kernel (as opposed to the hwe 4.15 etc. kernels, which would have the
  in-tree fixed driver).

  * The patch is very small and backport is fairly minimal and simple.

  * The fix has been running in the in-tree driver in upstream mainline
  as well as in the Ubuntu in-tree driver. Although the Broadcom
  driver has a lot of lower-level code that is different, this piece is
  still the same.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1814095/+subscriptions



[Group.of.nepali.translators] [Bug 1764956] Re: Guests using IBRS incur a large performance penalty

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: In Progress => Fix Released

** Changed in: linux (Ubuntu Trusty)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Trusty)
 Assignee: (unassigned) => Gavin Guo (mimi0213kimo)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1764956

Title:
  Guests using IBRS incur a large performance penalty

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  [Impact]
  IBRS can be mistakenly left enabled in the host after switching from
  an IBRS-enabled VM, which causes performance overhead in the host.
  Conversely, IBRS can be mistakenly disabled in the VM when
  context-switching from the host, which could be considered a security
  (CVE) issue.

  [Fix]
  The patch fixes the logic inside x86_virt_spec_ctrl() so that it checks
  ibrs_enabled and ORs hostval with SPEC_CTRL_IBRS, because
  x86_spec_ctrl_base is zero by default. The upstream implementation
  differs from Xenial's here: upstream doesn't use IBRS as the formal
  fix, so the base value defaults to zero.

  In addition, after a VM exit the SPEC_CTRL register needs to be
  saved manually by reading the SPEC_CTRL MSR, because the MSR
  intercept is disabled by default in hardware_setup() (v4.4) and
  vmx_init() (v3.13). Guest accesses to the SPEC_CTRL MSR are direct and
  don't trigger a trap, so the vmx_set_msr() function isn't called.

  The v3.13 kernel hasn't been tested. However, the patch can be viewed
  at:
  http://kernel.ubuntu.com/git/gavinguo/ubuntu-trusty-amd64.git/log/?h=sf00191076-sru

  The v4.4 patch:
  http://kernel.ubuntu.com/git/gavinguo/ubuntu-xenial.git/log/?h=sf00191076-spectre-v2-regres-backport-juerg

  [Test]

  The patch has been tested on the 4.4.0-140.166 and works fine.

  The reproducing environment:
  Guest kernel version: 4.4.0-138.164
  Host kernel version: 4.4.0-140.166

  (host IBRS, guest IBRS)

  - 1). (0, 1).
  The case can be reproduced by the following instructions:
  guest$ echo 1 | sudo tee /proc/sys/kernel/ibrs_enabled
  1

  

  host$ cat /proc/sys/kernel/ibrs_enabled
  0
  host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
  110001001010

  Some of the IBRS bits inside the SPEC_CTRL MSR are mistakenly
  enabled.

  host$ taskset -c 5 stress-ng -c 1 --cpu-ops 2500
  stress-ng: info:  [11264] defaulting to a 86400 second run per stressor
  stress-ng: info:  [11264] dispatching hogs: 1 cpu
  stress-ng: info:  [11264] cache allocate: default cache size: 35840K
  stress-ng: info:  [11264] successful run completed in 33.48s

  The host kernel doesn't notice that the IBRS bit is enabled, so the
  situation is the same as "echo 2 > /proc/sys/kernel/ibrs_enabled" in the
  host. Since stress-ng here is a pure userspace CPU computation,
  performance drops to about 1/3: without IBRS enabled, the run takes
  about 10s.

  - 2). (1, 1) disables IBRS in host -> (0, 1) actually it becomes (0, 0).
  The guest IBRS has been mistakenly disabled.

  guest$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
  guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
  

  host$ echo 2 | sudo tee /proc/sys/kernel/ibrs_enabled
  host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
  
  host$ echo 0 | sudo tee /proc/sys/kernel/ibrs_enabled
  host$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
  

  guest$ for i in {0..55}; do sudo rdmsr 0x48 -p $i; done
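
  The rdmsr readings in the cases above come from MSR 0x48 (SPEC_CTRL),
  where bit 0 is the IBRS bit. A minimal sketch of decoding one reading
  follows; the helper name ibrs_bit is hypothetical, and it assumes rdmsr
  prints the value in hexadecimal without a "0x" prefix, which is its
  default output format.

```shell
# Hypothetical helper: print 1 if bit 0 (IBRS) is set in a SPEC_CTRL value,
# 0 otherwise. $1 is the hex string as printed by rdmsr (no "0x" prefix).
ibrs_bit() {
    echo $(( 0x$1 & 1 ))
}

# Example (needs root and msr-tools; CPU 5 as in the stress-ng run):
#   ibrs_bit "$(sudo rdmsr 0x48 -p 5)"
```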
  

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1764956/+subscriptions



[Group.of.nepali.translators] [Bug 1807393] Re: nvme - Polling on timeout

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1807393

Title:
  nvme - Polling on timeout

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  [Impact]
  * NVMe controllers could potentially fail to send an interrupt, especially
  due to bugs in virtual devices (which are common these days: qemu has its
  own NVMe virtual device, and so does AWS). This would be a difficult
  situation to debug, because the NVMe driver only reports the request
  timeout, not the reason.

  * The upstream patch proposed for SRU here, 7776db1ccc12
  ("NVMe/pci: Poll CQ on timeout"), was designed to provide more information
  in these cases by proactively polling the CQEs on request timeouts, to
  check whether the specific request was completed and some issue (probably a
  missed interrupt) prevented the driver from noticing, or whether the request
  really wasn't completed, which indicates more severe issues.

  * Although mainly useful for debugging, this patch could also help to mitigate
  issues in cloud environments like AWS, in case of jitter in
  request completion with the I/O timeout set to low values, or even
  in case of atypical bugs in the virtual NVMe controller. With this patch,
  if polling succeeds the NVMe driver will continue working instead of
  trying a controller reset procedure, which may lead to failures in the
  rootfs - refer to https://launchpad.net/bugs/1788035.

  
  [Test Case]

  * It's a bit tricky to artificially create a missed-interrupt situation;
  one idea was to implement a small hack in the NVMe qemu virtual device
  that, given a trigger in the guest kernel, induces the virtual device to
  skip an interrupt. The hack patch is present in comment #1 below.

  * To trigger this hack from the guest kernel, all that is needed is to issue a
  raw admin command from nvme-cli: "nvme admin-passthru -o 0xff /dev/nvme0".
  After that, just perform some I/O to see one request aborting - one could
  use dd for this, like "dd if=/dev/zero of=/dev/nvme0n1 count=5".

  
  [Regression Potential]

  * There are no clear risks in adding such a polling mechanism to the NVMe
  driver; one bad thing that was never reported but could happen with this
  patch is that the device could be in a bad state IRQ-wise that a reset would
  fix, but the patch could cause all requests to be completed via polling,
  which prevents the adapter reset. This is however a very hypothetical
  situation, which would also happen in the mainline kernel (since it has the
  patch).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1807393/+subscriptions



[Group.of.nepali.translators] [Bug 1845048] Re: Improve sysctl handling on kdump-tools

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: makedumpfile (Ubuntu)
   Status: Confirmed => In Progress

** Changed in: makedumpfile (Ubuntu Xenial)
   Status: Confirmed => Opinion

** Changed in: makedumpfile (Ubuntu Bionic)
   Status: Confirmed => In Progress

** Changed in: makedumpfile (Ubuntu Eoan)
   Status: Confirmed => In Progress

** Changed in: makedumpfile (Ubuntu Focal)
   Status: Confirmed => In Progress

** Also affects: makedumpfile (Ubuntu Groovy)
   Importance: Medium
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: In Progress

** Tags added: sts

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1845048

Title:
  Improve sysctl handling on kdump-tools

Status in makedumpfile package in Ubuntu:
  In Progress
Status in makedumpfile source package in Xenial:
  Opinion
Status in makedumpfile source package in Bionic:
  In Progress
Status in makedumpfile source package in Eoan:
  In Progress
Status in makedumpfile source package in Focal:
  In Progress
Status in makedumpfile source package in Groovy:
  In Progress
Status in makedumpfile package in Debian:
  New

Bug description:
  [impact]

  According to the documentation, and to past behavior, the KDUMP_SYSCTL
  variable in the /etc/default/kdump-tools file is applied to the
  system's kernel parameters at kdump 'load'. However, this is no longer
  true: those parameters are no longer applied to the system's kernel
  parameter settings.

  [test case]

  install linux-crashdump (and kdump-tools).

  Edit the /etc/default/kdump-tools file to set the KDUMP_SYSCTL param
  to something other than default, e.g.:

  KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"

  reboot, or unload/reload kdump, to pick up the changes to the file.

  Check if the panic_on_warn param is set:

  $ cat /proc/sys/kernel/panic_on_warn
  0

  the problem does not seem to be with sysctl, as manually calling it
  does work:

  $ KDUMP_SYSCTL="kernel.panic_on_oops=1 kernel.panic_on_warn=1"
  $ cat /proc/sys/kernel/panic_on_warn
  0
  $ sudo sysctl -w $KDUMP_SYSCTL
  kernel.panic_on_oops = 1
  kernel.panic_on_warn = 1
  $ cat /proc/sys/kernel/panic_on_warn
  1
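
  The manual check above can be wrapped in a small helper. This is only a
  sketch of the expected behavior, not the actual kdump-config code: the
  function name kdump_sysctl_cmd is hypothetical, and it only prints the
  sysctl invocation that would apply the settings instead of running it.

```shell
# Hypothetical helper, not part of kdump-tools.
# $1: contents of KDUMP_SYSCTL (space-separated key=value settings).
# Prints the sysctl command that would apply them; prints nothing when
# no settings are given.
kdump_sysctl_cmd() {
    [ -n "$1" ] || return 0
    # $1 is intentionally unquoted so each key=value becomes its own argument.
    set -- $1
    printf 'sysctl -w'
    for kv in "$@"; do
        printf ' %s' "$kv"
    done
    printf '\n'
}

# Example, using the value from the test case:
#   kdump_sysctl_cmd "kernel.panic_on_oops=1 kernel.panic_on_warn=1"
# To actually apply the settings (needs root):
#   sudo $(kdump_sysctl_cmd "$KDUMP_SYSCTL")
```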

  [regression potential]

  TBD

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1845048/+subscriptions



[Group.of.nepali.translators] [Bug 1851663] Re: Consistent autopkgtest failures on ppc64el/s390x

2020-07-14 Thread Guilherme G. Piccoli
** No longer affects: makedumpfile (Ubuntu Disco)

** Also affects: makedumpfile (Ubuntu Groovy)
   Importance: Medium
 Assignee: Thadeu Lima de Souza Cascardo (cascardo)
   Status: Confirmed

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1851663

Title:
  Consistent autopkgtest failures on ppc64el/s390x

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Confirmed
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed
Status in makedumpfile source package in Groovy:
  Confirmed

Bug description:
  Recently autopkgtest started to consistently fail in ppc64el / s390x.
  When testing manually, the kernel dump is collected and all works
  fine. By discussing with the package maintainer (Cascardo), it seems
  it started after a change in the instance type of the tests, to
  m1.large.

  This seems to be either a problem in the test infrastructure itself or,
  more likely, a "timing" problem in the test (i.e., a pause or delay that
  is currently not long enough).
  This demands investigation because those failures delay the package release
  for all arches, and those releases usually contain important fixes.

  Also, the recurrent "workaround" is to mark the tests as
  "force-badtest" in ubuntu-hints for the 2 arches, which in practice
  skips the test, so those 2 arches are getting releases untested by
  the automatic infrastructure.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1851663/+subscriptions



[Group.of.nepali.translators] [Bug 1800562] Re: Remove obsolete "nousb" option in kdump command-line for newer kernels

2020-07-14 Thread Guilherme G. Piccoli
** No longer affects: makedumpfile (Ubuntu Cosmic)

** No longer affects: makedumpfile (Ubuntu Disco)

** Also affects: makedumpfile (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Groovy)
   Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: In Progress

** Changed in: makedumpfile (Ubuntu Focal)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Eoan)
   Status: In Progress => Confirmed

** Changed in: makedumpfile (Ubuntu Groovy)
   Status: In Progress => Confirmed

** Changed in: makedumpfile (Ubuntu Focal)
   Importance: Undecided => Medium

** Changed in: makedumpfile (Ubuntu Groovy)
   Importance: High => Medium

** Changed in: makedumpfile (Ubuntu Eoan)
   Importance: High => Medium

** Changed in: makedumpfile (Ubuntu Bionic)
   Importance: High => Medium

** Changed in: makedumpfile (Ubuntu Xenial)
   Importance: High => Medium

** Changed in: makedumpfile (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1800562

Title:
  Remove obsolete "nousb" option in kdump command-line for newer kernels

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Won't Fix
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed
Status in makedumpfile source package in Groovy:
  Confirmed

Bug description:
  [Impact]
  * The kdump command-line includes an obsolete "nousb" parameter by default,
  which can give a false impression: users will think they are not booting
  with USB, but they are.

  * Since kernel v4.5, the correct parameter to disable USB subsystem
  initialization is "usbcore.nousb" always (instead of "nousb" in case
  the subsystem is built-in). This was changed by commit 097a9ea0e48
  ("usb: make "nousb" a clear module parameter").

  * USB may be essential if, for example, kdump users need to
  decrypt a disk under LUKS and there's only a USB keyboard connected
  to the system. Given the option has had no effect since Bionic, we should
  just drop it to prevent confusion.

  
  [Test Case]

  1) Deploy a Disco VM e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test` and check for the 'nousb' parameter:

  $ kdump-config test
  ...
  kexec command to be used:
    /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic 
root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 
systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" 
/var/lib/kdump/vmlinuz
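
  The check in step 3 can be scripted. A minimal sketch, assuming the kernel
  command-line string has already been extracted (for example from
  /proc/cmdline); has_bare_nousb is a hypothetical helper name. It matches
  only the standalone "nousb" word, so "usbcore.nousb" is not reported.

```shell
# Hypothetical helper: succeed (exit 0) when the command line in $1 contains
# the obsolete bare "nousb" parameter, fail (exit 1) otherwise.
# Parameters are assumed to be separated by single spaces.
has_bare_nousb() {
    case " $1 " in
        *" nousb "*) return 0 ;;
    esac
    return 1
}

# Example:
#   if has_bare_nousb "$(cat /proc/cmdline)"; then
#       echo "obsolete nousb parameter present"
#   fi
```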

  
  [Regression Potential]

  The regression potential is extremely low, since the "nousb" parameter
  has had no effect since Bionic even though it is still there. Any bugs
  that could appear after this change already exist with the option in
  place: the semantics with or without "nousb" have been the same since
  Bionic.

  NOTICE: we won't change Xenial, since it can use kernel 4.4, which does
  disable USB when given the "nousb" parameter.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1800562/+subscriptions



[Group.of.nepali.translators] [Bug 1816743] Re: Add systemd's kdump service command-line regardless if user provides or not KDUMP_CMDLINE_APPEND

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: makedumpfile (Ubuntu Xenial)
   Status: Confirmed => Won't Fix

** No longer affects: makedumpfile (Ubuntu Cosmic)

** No longer affects: makedumpfile (Ubuntu Disco)

** Also affects: makedumpfile (Ubuntu Groovy)
   Importance: Low
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1816743

Title:
  Add systemd's kdump service command-line regardless if user provides
  or not KDUMP_CMDLINE_APPEND

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Won't Fix
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed
Status in makedumpfile source package in Groovy:
  Confirmed

Bug description:
  Since the Xenial release, Ubuntu relies on systemd as its init system;
  there's a kdump service that prevents some other services from
  starting unnecessarily in the kdump environment.

  Problem: if we add something to KDUMP_CMDLINE_APPEND, the entry for the
  kdump service, "systemd.unit=kdump-tools.service", is removed from the
  command-line. The user needs to add it back manually, and this seems
  highly error-prone.

  We propose here to decouple the "systemd.unit=kdump-tools.service"
  parameter from KDUMP_CMDLINE_APPEND, so if users really want to remove
  this option, they should use KDUMP_CMDLINE instead.
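
  The proposed decoupling could look roughly like the sketch below. It is
  an illustration of the idea only, not the kdump-config implementation:
  the function name kdump_effective_append is hypothetical, and the sketch
  assumes the unit parameter is simply placed before whatever the user set
  in KDUMP_CMDLINE_APPEND.

```shell
# Hypothetical helper: given the user's KDUMP_CMDLINE_APPEND in $1, print the
# string that would really be appended, always including the kdump unit.
kdump_effective_append() {
    unit="systemd.unit=kdump-tools.service"
    case " $1 " in
        *" $unit "*) printf '%s\n' "$1" ;;             # user already set it
        "  ")        printf '%s\n' "$unit" ;;          # nothing to append
        *)           printf '%s %s\n' "$unit" "$1" ;;  # add the unit first
    esac
}

# Example:
#   kdump_effective_append "nr_cpus=1 irqpoll"
```

  Keeping the unit parameter outside the user-editable variable means a
  change to KDUMP_CMDLINE_APPEND can no longer drop it by accident.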

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1816743/+subscriptions



[Group.of.nepali.translators] [Bug 1800873] Re: Add KDUMP_CMDLINE_REMOVE option to remove portions of kernel command-line

2020-07-14 Thread Guilherme G. Piccoli
** No longer affects: makedumpfile (Ubuntu Trusty)

** Changed in: makedumpfile (Ubuntu Xenial)
   Status: Confirmed => Won't Fix

** No longer affects: makedumpfile (Ubuntu Cosmic)

** No longer affects: makedumpfile (Ubuntu Disco)

** Also affects: makedumpfile (Ubuntu Groovy)
   Importance: Low
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

** Also affects: makedumpfile (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Changed in: makedumpfile (Ubuntu Eoan)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Focal)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Eoan)
   Importance: Undecided => Low

** Changed in: makedumpfile (Ubuntu Focal)
   Importance: Undecided => Low

** Changed in: makedumpfile (Ubuntu Eoan)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: makedumpfile (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1800873

Title:
  Add KDUMP_CMDLINE_REMOVE option to remove portions of kernel command-
  line

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Won't Fix
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed
Status in makedumpfile source package in Groovy:
  Confirmed

Bug description:
  Currently kdump has a useful option to append parameters to the kdump
  kernel command-line, "KDUMP_CMDLINE_APPEND". It would be useful to have a
  complementary option that users could use to remove parameters without
  needing to rewrite the whole line.

  The option name proposed here is KDUMP_CMDLINE_REMOVE, which would
  tentatively "sed"-out the options from the kernel command-line before
  appending the new ones from KDUMP_CMDLINE_APPEND.
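
  A rough sketch of the removal step follows. It is not the proposed
  implementation: the function name cmdline_remove is hypothetical, and the
  sketch uses a plain word loop instead of sed, to avoid having to escape
  regular-expression characters in parameter values. Only exact matches of
  whole parameters are removed.

```shell
# Hypothetical helper.
# $1: kernel command line
# $2: space-separated parameters to drop (a KDUMP_CMDLINE_REMOVE value)
# Prints the command line without the dropped parameters.
cmdline_remove() {
    result=""
    # $1 and $2 are intentionally unquoted to split them into words.
    for word in $1; do
        keep=1
        for drop in $2; do
            if [ "$word" = "$drop" ]; then
                keep=0
            fi
        done
        if [ "$keep" -eq 1 ]; then
            result="${result:+$result }$word"
        fi
    done
    printf '%s\n' "$result"
}

# Example: drop "quiet" and "splash" before appending new parameters.
#   cmdline_remove "ro quiet splash nr_cpus=1" "quiet splash"
```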

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1800873/+subscriptions



[Group.of.nepali.translators] [Bug 1794877] Re: Crash in ixgbe, during tx packet xmit (while potentially changing queues number)

2020-07-14 Thread Guilherme G. Piccoli
We couldn't reproduce the bug and the reporter cannot help by providing data,
so we're marking it as invalid. If anybody ever reproduces it, please ping
here and reopen.
Thanks,

Guilherme

** Changed in: linux (Ubuntu)
   Status: In Progress => Invalid

** Changed in: linux (Ubuntu Xenial)
   Status: Confirmed => Invalid

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1794877

Title:
  Crash in ixgbe, during tx packet xmit (while potentially changing
  queues number)

Status in linux package in Ubuntu:
  Invalid
Status in linux source package in Xenial:
  Invalid

Bug description:
  It was reported that the ixgbe driver may crash with the following stack
  trace while changing the interrupt/queue configuration (probably using
  ethtool --set-channels):

  [28661.949147] init: irqbalance main process (19397) killed by TERM signal 
  [28662.381154] ixgbe :04:00.0: removed PHC on eth4 
  [28662.502142] ixgbe :04:00.0: Multiqueue Enabled: Rx Queue count = 18, 
Tx Queue count = 18 
  [28662.588634] ixgbe :04:00.0: registered PHC device on eth4 
  [28662.689789] br-iscsi-left: port 1(eth4.4011) entered disabled state 
  [28662.689951] br-sio-bel: port 1(eth4.4015) entered disabled state 
  [28662.690039] br-sio-fel: port 1(eth4.4017) entered disabled state 
  [28662.694227] ixgbe :04:00.0 eth4: NIC Link is Up 10 Gbps, Flow Control: 
RX/TX 
  [28662.694506] br-iscsi-left: port 1(eth4.4011) entered forwarding state 
  [28662.694519] br-iscsi-left: port 1(eth4.4011) entered forwarding state 
  [28662.694596] br-sio-bel: port 1(eth4.4015) entered forwarding state 
  [28662.694604] br-sio-bel: port 1(eth4.4015) entered forwarding state 
  [28662.694651] br-sio-fel: port 1(eth4.4017) entered forwarding state 
  [28662.694658] br-sio-fel: port 1(eth4.4017) entered forwarding state 
  [28662.709921] ixgbe :04:00.1: removed PHC on eth5 
  [28662.834289] ixgbe :04:00.1: Multiqueue Enabled: Rx Queue count = 18, 
Tx Queue count = 18 
  [28662.915121] ixgbe :04:00.1: registered PHC device on eth5 
  [28663.018209] ixgbe :04:00.1 eth5: NIC Link is Up 10 Gbps, Flow Control: 
RX/TX 
  [28663.018356] BUG: unable to handle kernel NULL pointer dereference at 
0058 
  [28663.026266] IP: [] ixgbe_xmit_frame_ring+0x81/0xf50 
[ixgbe] 
  [28663.033491] PGD 800046bcc067 PUD 46bcd067 PMD 0 
  [28663.038562] Oops:  [#1] SMP 
  [28663.328921] Call Trace: 
  [28663.334598]  
  [28663.336551] [] ixgbe_xmit_frame+0x42/0x90 [ixgbe] 
  [28663.349627] [] dev_hard_start_xmit+0x23d/0x400 
  [28663.358854] [] sch_direct_xmit+0xe4/0x1f0 
  [28663.367602] [] __qdisc_run+0x9b/0x1c0 
  [28663.376110] [] net_tx_action+0x15e/0x240 
  [28663.384673] [] __do_softirq+0xe6/0x2a0 
  [28663.392944] [] irq_exit+0x95/0xa0 
  [28663.400720] [] do_IRQ+0x56/0xe0 
  [28663.408338] [] common_interrupt+0xbf/0xbf 
  [28663.416733]  
  [28663.418680] [] ? worker_thread+0x18c/0x480 
  [28663.430363] [] ? rescuer_thread+0x310/0x310 
  [28663.438870] [] kthread+0xd8/0xf0 
  [28663.446368] [] ? kthread_park+0x60/0x60 
  [28663.454385] [] ret_from_fork+0x55/0x80 
  [28663.462286] [] ? kthread_park+0x60/0x60 
  [28663.470488] Code: 2a 41 83 e8 01 31 c0 45 0f b7 c0 49 83 c0 01 49 c1 e0 04 
8b 74 07 3c 48 83 c0 10 8d 96 ff 3f 00 00 c1 ea 0e 01 d1 4c 39 c0 75 e8 <0f> b7 
43 58 0f b7 73 5a 83 c1 03 31 d2 66 39 f0 66 0f 43 53 54 
  [28663.498992] RIP [] ixgbe_xmit_frame_ring+0x81/0xf50 
[ixgbe] 
  [28663.512112] RSP  
  [28663.518217] CR2: 0058

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1794877/+subscriptions

___
Mailing list: https://launchpad.net/~group.of.nepali.translators
Post to : group.of.nepali.translators@lists.launchpad.net
Unsubscribe : https://launchpad.net/~group.of.nepali.translators
More help   : https://help.launchpad.net/ListHelp


[Group.of.nepali.translators] [Bug 1869948] Re: Multiple Kexec in AWS Nitro instances fail

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Fix Committed => Fix Released

** No longer affects: linux (Ubuntu Disco)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1869948

Title:
  Multiple Kexec in AWS Nitro instances fail

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Eoan:
  Fix Released
Status in linux source package in Focal:
  Fix Released

Bug description:
  [Impact]
  * Currently, users cannot perform multiple kernel kexec loads on AWS Nitro 
instances (KVM-based); after the 2nd or 3rd kexec, an initrd corruption is 
observed, with the following signature:

   Initramfs unpacking failed: junk within compressed archive
  [...]
   Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
   CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.5.0-rc7-gpiccoli+ #26
   Hardware name: Amazon EC2 t3.large/, BIOS 1.0 10/16/2017
   Call Trace:
    dump_stack+0x6d/0x9a
    ? csum_partial_copy_generic+0x150/0x170
    panic+0x101/0x2e3
    ? do_execve+0x25/0x30
    ? rest_init+0xb0/0xb0
    kernel_init+0xfb/0x100
    ret_from_fork+0x35/0x40

  * After investigation (see comment 2), it was noticed that the Amazon
  ena network driver doesn't provide a shutdown() handler, so the device
  could still be performing a DMA transaction to a previously valid
  address while the new kernel boots, which would then corrupt kernel
  memory. The following patch was proposed and fixed the issue, allowing
  1000 kexecs to be executed successfully with no issues observed:
  428c491332bc ("net: ena: Add PCI shutdown handler to allow safe kexec")
  [ git.kernel.org/linus/428c491332bc ].

  * Hence, we are hereby requesting SRU for this patch. It was tested in
  all supported series (4.4, 4.15 and 5.3) in Amazon Nitro instances
  with success, and reviewed/acked by the ena driver team and a kexec
  developer from another distro. It is worth mentioning that we proposed
  an upstream multi-vendor discussion about this issue:
  marc.info/?l=kexec&m=158299605013194

  [Test case]

  * The basic test procedure consists of performing multiple kexecs
  sequentially; AWS does not provide a full console, so in case of
  failures one could check the instance screenshot or use pstore/ramoops
  in order to collect dmesg after a crash in a preserved memory area.
  The commands used to perform kexec are:

  kexec -l <kernel> --initrd <initrd> --reuse-cmdline
  systemctl kexec

  Alternatively, one could use "--append=" instead of "--reuse-cmdline"
  if a change in the kexec command line is desired; also, to execute the
  kexec-loaded kernel both "kexec -e" and "systemctl kexec" are equally
  valid.

  * On comment 3 we proposed a script/approach to auto-test kexecs, used
  here to perform 1000 kexecs with the proposed patch.
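
The script from comment 3 is not reproduced in this notification. As a rough illustration only, a repeated-kexec harness could look like the sketch below. The function names, the counter file and the idea of re-running it from a boot-time service are assumptions made for this example; only the kexec and systemctl invocations come from the test case above.

```python
import subprocess
from pathlib import Path


def build_kexec_commands(kernel: str, initrd: str) -> list:
    """Return the two commands from the test case: load the kernel, then kexec."""
    return [
        ["kexec", "-l", kernel, f"--initrd={initrd}", "--reuse-cmdline"],
        ["systemctl", "kexec"],
    ]


def next_iteration(counter_file: Path, limit: int):
    """Bump a counter that survives reboots; return None once the limit is hit."""
    done = int(counter_file.read_text()) if counter_file.exists() else 0
    if done >= limit:
        return None
    counter_file.write_text(str(done + 1))
    return done + 1


def run_once(kernel, initrd, counter_file, limit=1000, dry_run=True):
    """Perform one kexec cycle, or nothing if `limit` cycles were already done.

    With dry_run=True (the default) the commands are only returned, not run.
    """
    if next_iteration(counter_file, limit) is None:
        return []
    commands = build_kexec_commands(kernel, initrd)
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands
```

Something along these lines would have to be started on every boot (for example from a systemd unit, which is again an assumption) so that each freshly kexec'ed kernel triggers the next cycle until the counter reaches the limit.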

  [Regression Potential]

  * Although the patch proposed here introduces a PCI shutdown handler,
  it keeps the remove handler identical and bases shutdown strongly on
  ena_remove(), changing just the netdev handling, following other
  upstream drivers. It was extensively tested and presented no issues.
  Also, it is self-contained and affects only one driver, so other cloud
  providers and non-cloud environments are not affected by the patch.

  * A potential regression could manifest as a delay or issue on the
  reboot/shutdown path, and only if the ena driver is in use.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1869948/+subscriptions



[Group.of.nepali.translators] [Bug 1724614] Re: [KVM] Lower the default for halt_poll_ns to 200000 ns

2020-07-14 Thread Guilherme G. Piccoli
** No longer affects: linux (Ubuntu)

** No longer affects: linux (Ubuntu Xenial)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1724614

Title:
  [KVM] Lower the default for halt_poll_ns to 200000 ns

Status in linux source package in Zesty:
  Won't Fix
Status in linux source package in Groovy:
  New

Bug description:
  [Environment]

  Distributor ID:   Ubuntu
  Description:  Ubuntu 16.04.3 LTS
  Release:  16.04
  Codename: xenial

  Linux porygon 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36
  UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  [Description]

  We've identified a constantly high (~90%) system time load at the host
  level when a VCPU in a KVM guest repeatedly enters and leaves the
  halt/idle state at a constant frequency, usually idling for slightly
  less time than the default polling period.

  The halt polling mechanism is intended to reduce latency in the cases
  in which the guest is quickly resumed, saving a call to the scheduler.
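
To see why idle periods just under the polling window are the worst case, a deliberately simplified model may help. It assumes the host busy-polls for at most halt_poll_ns during each guest idle period and ignores the kernel's adaptive growing and shrinking of the window; the function names and the timing values are illustrative, not taken from the report.

```python
def polled_ns(idle_ns: int, halt_poll_ns: int) -> int:
    """Host time spent busy-polling during one guest idle period.

    Simplified model: poll until the guest wakes up or the window expires.
    """
    return min(idle_ns, halt_poll_ns)


def polling_share(idle_ns: int, run_ns: int, halt_poll_ns: int) -> float:
    """Fraction of one run+idle cycle that the host spends polling."""
    return polled_ns(idle_ns, halt_poll_ns) / (idle_ns + run_ns)


# Hypothetical workload: 50 us of guest work, then 350 us of idle, repeated.
share_with_larger_window = polling_share(350_000, 50_000, 400_000)
share_with_smaller_window = polling_share(350_000, 50_000, 200_000)
```

In this model an idle period slightly shorter than the window is polled in full, so nearly the whole cycle shows up as host system time, while a smaller window caps the polled portion.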

  We've performed some testing by adjusting the
  /sys/module/kvm/parameters/halt_poll_ns value, which defines the
  maximum time that should be spent polling before calling the scheduler
  to allow it to run other tasks (it defaults to 400000 ns in Ubuntu).

  With the default value, the tests show that the load remains at nearly
  90% on a VCPU that has a single task in the run queue.

  We've also tested altering the halt_poll_ns value to 200000 ns, and the
  results seem to drop the system time usage from 90% to ~25%.

  root@porygon:/home/ubuntu# echo 200000 > /sys/module/kvm/parameters/halt_poll_ns
  root@porygon:/home/ubuntu# mpstat 1 -P 6 5
  Linux 4.4.0-112-generic (porygon) 01/24/2018  _x86_64_ (64 CPU)

  02:06:08 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  02:06:09 PM    6    0.00    0.00    4.85    0.00    0.00    0.00    0.00   16.50    0.00   78.64
  [...]
  Average:       6    0.00    0.00    4.26    0.00    0.00    0.00    0.00   17.83    0.00   77.91

  root@porygon:/home/ubuntu# echo 400000 > /sys/module/kvm/parameters/halt_poll_ns
  root@porygon:/home/ubuntu# mpstat 1 -P 6 5
  Linux 4.4.0-112-generic (porygon) 01/24/2018  _x86_64_ (64 CPU)

  02:06:20 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  02:06:21 PM    6    0.00    0.00   87.13    0.00    0.00    0.00    0.00   11.88    0.00    0.99
  [...]
  Average:       6    0.00    0.00   89.59    0.00    0.00    0.00    0.00    8.45    0.00    1.96

  [Reproducer]

  1) Configure a KVM guest with a single pinned VCPU.
  2) Run the following program (http://pastebin.ubuntu.com/25731919/) in the KVM guest:
  $ gcc test.c -lpthread -o test && ./test 250 0
  3) Run mpstat on the host for the pinned CPU and compare the stats:
  $ sudo mpstat 1 -P 6 5
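
When comparing runs, it can be handy to pull the "Average:" row out of the mpstat output programmatically. The following is a small sketch under the assumption that the output uses the column layout shown in this report (a header row containing CPU followed by percentage columns); the function name is made up for this example.

```python
def parse_mpstat_average(output: str) -> dict:
    """Map column names to values for the first data row starting with 'Average:'."""
    header = None
    for line in output.splitlines():
        fields = line.split()
        if header is None and "CPU" in fields:
            # First header row, e.g. "02:06:20 PM  CPU  %usr  %nice ..."
            header = fields[fields.index("CPU"):]
        elif header and fields and fields[0] == "Average:" and fields[1] != "CPU":
            values = fields[1:]
            return {
                name: (value if name == "CPU" else float(value))
                for name, value in zip(header, values)
            }
    raise ValueError("no 'Average:' data row found")
```

Comparing the `%sys` entry of the resulting dictionaries for two halt_poll_ns settings gives the same comparison made by eye in the description.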

  [Fix]

  Change the halt polling max time to half of the current value.

  In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
  increase heavily even in cases where the performance improvement was
  small.  In particular, bandwidth divided by CPU usage was as much as
  60% lower.

  To some extent this is the expected effect of the patch, and the
  additional CPU utilization is only visible when running the
  benchmarks.  However, halving the threshold also halves the extra
  CPU utilization (from +30-130% to +20-70%) and has no negative
  effect on performance.

  Signed-off-by: Paolo Bonzini 

  * https://github.com/torvalds/linux/commit/b401ee0b85a53e89739ff68a5b1a0667d664afc9

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/zesty/+source/linux/+bug/1724614/+subscriptions



[Group.of.nepali.translators] [Bug 1724614] Re: [KVM] Lower the default for halt_poll_ns to 200000 ns

2020-07-14 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu Xenial)
   Status: New => Fix Released

** No longer affects: linux (Ubuntu Groovy)

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1724614

Title:
  [KVM] Lower the default for halt_poll_ns to 200000 ns

Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Zesty:
  Won't Fix

Bug description:
  [Environment]

  Distributor ID:   Ubuntu
  Description:  Ubuntu 16.04.3 LTS
  Release:  16.04
  Codename: xenial

  Linux porygon 4.4.0-112-generic #135-Ubuntu SMP Fri Jan 19 11:48:36
  UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

  [Description]

  We've identified a constantly high (~90%) system time load at the host
  level when a VCPU in a KVM guest repeatedly enters and leaves the
  halt/idle state at a constant frequency, usually idling for slightly
  less time than the default polling period.

  The halt polling mechanism is intended to reduce latency in the cases
  in which the guest is quickly resumed, saving a call to the scheduler.

  We've performed some testing by adjusting the
  /sys/module/kvm/parameters/halt_poll_ns value, which defines the
  maximum time that should be spent polling before calling the scheduler
  to allow it to run other tasks (it defaults to 400000 ns in Ubuntu).

  With the default value, the tests show that the load remains at nearly
  90% on a VCPU that has a single task in the run queue.

  We've also tested altering the halt_poll_ns value to 200000 ns, and the
  results seem to drop the system time usage from 90% to ~25%.

  root@porygon:/home/ubuntu# echo 200000 > /sys/module/kvm/parameters/halt_poll_ns
  root@porygon:/home/ubuntu# mpstat 1 -P 6 5
  Linux 4.4.0-112-generic (porygon) 01/24/2018  _x86_64_ (64 CPU)

  02:06:08 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  02:06:09 PM    6    0.00    0.00    4.85    0.00    0.00    0.00    0.00   16.50    0.00   78.64
  [...]
  Average:       6    0.00    0.00    4.26    0.00    0.00    0.00    0.00   17.83    0.00   77.91

  root@porygon:/home/ubuntu# echo 400000 > /sys/module/kvm/parameters/halt_poll_ns
  root@porygon:/home/ubuntu# mpstat 1 -P 6 5
  Linux 4.4.0-112-generic (porygon) 01/24/2018  _x86_64_ (64 CPU)

  02:06:20 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
  02:06:21 PM    6    0.00    0.00   87.13    0.00    0.00    0.00    0.00   11.88    0.00    0.99
  [...]
  Average:       6    0.00    0.00   89.59    0.00    0.00    0.00    0.00    8.45    0.00    1.96

  [Reproducer]

  1) Configure a KVM guest with a single pinned VCPU.
  2) Run the following program (http://pastebin.ubuntu.com/25731919/) in the KVM guest:
  $ gcc test.c -lpthread -o test && ./test 250 0
  3) Run mpstat on the host for the pinned CPU and compare the stats:
  $ sudo mpstat 1 -P 6 5

  [Fix]

  Change the halt polling max time to half of the current value.

  In some fio benchmarks, halt_poll_ns=400000 caused CPU utilization to
  increase heavily even in cases where the performance improvement was
  small.  In particular, bandwidth divided by CPU usage was as much as
  60% lower.

  To some extent this is the expected effect of the patch, and the
  additional CPU utilization is only visible when running the
  benchmarks.  However, halving the threshold also halves the extra
  CPU utilization (from +30-130% to +20-70%) and has no negative
  effect on performance.

  Signed-off-by: Paolo Bonzini 

  * https://github.com/torvalds/linux/commit/b401ee0b85a53e89739ff68a5b1a0667d664afc9

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/xenial/+source/linux/+bug/1724614/+subscriptions



[Group.of.nepali.translators] [Bug 1775326] Re: The kernel NULL pointer dereference happens when accessing the task_struct by task_cpu() in function cpuacct_charge()

2020-07-15 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1775326

Title:
  The kernel NULL pointer dereference happens when accessing the
  task_struct by task_cpu() in function cpuacct_charge()

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  [Impact]

  In function cpuacct_charge(), a NULL pointer dereference happens
  because the stack pointer inside the task_struct is zero when
  task_cpu() tries to access the cpu member of the struct thread_info
  that lives inside the stack. It is a use-after-free: the task_struct
  is released almost concurrently, just before task_struct->stack is
  accessed.

  void cpuacct_charge(struct task_struct *tsk, u64 cputime)
  {
          struct cpuacct *ca;
          int cpu;

          cpu = task_cpu(tsk);

          rcu_read_lock();

          ca = task_ca(tsk);

          while (true) {
                  u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu);
                  *cpuusage += cputime;

                  ca = parent_ca(ca);
                  if (!ca)
                          break;
          }

          rcu_read_unlock();
  }

  
  BUG: unable to handle kernel NULL pointer dereference at 0010
  IP: [] cpuacct_charge+0x14/0x40
  PGD 0 
  Oops:  [#1] SMP  
  CPU: 10 PID: 148614 Comm: qemu-system-x86 Tainted: PW  OE   
4.4.0-45-generic #66~14.04.1-Ubuntu
  Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.7 06/16/2016
  task: 881ff0f01b80 ti: 88018fd7 task.ti: 88018fd7
  RIP: 0010:[]  [] cpuacct_charge+0x14/0x40
  RSP: 0018:88018fd73d10  EFLAGS: 00010246
  RAX:  RBX: 8801931e8000 RCX: 88010caff200
  RDX: 880124508000 RSI: 0066f757398831d6 RDI: 8801931e7fa0
  RBP: 88018fd73d10 R08: c04b8320 R09: 0001
  R10: 0001 R11:  R12: 0066f757398831d6
  R13: 0066f757398b8997 R14: 8801931e7fa0 R15: 0001
  FS:  7f162aaf7700() GS:881ffe74() knlGS:
  CS:  0010 DS:  ES:  CR0: 80050033
  CR2: 0010 CR3: 00011d86e000 CR4: 003426e0
  DR0:  DR1:  DR2: 
  DR3:  DR6: fffe0ff0 DR7: 0400
  Stack:
   88018fd73d28 810b1a9f 8801931e8000 88018fd73d40
   c069df72 8801931e8000 88018fd73da8 c069f121
   881ff0f01b80  881ff0f01b80 810bddc0
  Call Trace:
   [] update_curr+0xdf/0x170
   [] kvm_vcpu_check_block+0x12/0x60 [kvm]
   [] kvm_vcpu_block+0x191/0x2d0 [kvm]
   [] ? prepare_to_wait_event+0xf0/0xf0
   [] kvm_arch_vcpu_ioctl_run+0x17e/0x3d0 [kvm]
   [] kvm_vcpu_ioctl+0x2ab/0x640 [kvm]
   [] ? perf_event_context_sched_in+0x87/0xa0
   [] do_vfs_ioctl+0x2dd/0x4c0
   [] ? __audit_syscall_entry+0xaf/0x100
   [] ? do_audit_syscall_entry+0x66/0x70
   [] SyS_ioctl+0x79/0x90
   [] entry_SYSCALL_64_fastpath+0x16/0x75
  Code: 9a 11 00 5b 48 c7 c0 f4 ff ff ff 5d eb df 66 0f 1f 84 00 00 00 00 00 0f 
1f 44 00 00 55 48 8b 47 08 48 8b 97 78 07 00 00 48 89 e5 <48> 63 48 10 48 8b 52 
60 48 8b 82 b8 00 00 00 48 03 04 cd c0 7a
  RIP  [] cpuacct_charge+0x14/0x40
   RSP 
  CR2: 0010
  ---[ end trace 419a30375d0e4622 ]---


  [Fix]

  The patch uses this_cpu_ptr() instead of obtaining the CPU number with
  task_cpu() and then looking up the usage counter with per_cpu_ptr().
  This avoids accessing the thread_info inside the stack.
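
The shape of the accounting loop can be illustrated with a toy model. This is not kernel code: the class, the fixed CPU count and the explicit `this_cpu` argument are assumptions made for the sketch. It only shows the idea of charging the CPU the code is currently running on, rather than deriving a CPU number from the task being charged.

```python
from dataclasses import dataclass, field
from typing import List, Optional

NR_CPUS = 4  # hypothetical CPU count for the model


@dataclass
class AcctGroup:
    """Toy stand-in for an accounting group with per-CPU usage counters."""
    name: str
    parent: Optional["AcctGroup"] = None
    cpuusage: List[int] = field(default_factory=lambda: [0] * NR_CPUS)


def charge(group: AcctGroup, cputime: int, this_cpu: int) -> None:
    """Add cputime to the group and every ancestor, on the current CPU only."""
    ca = group
    while ca is not None:
        ca.cpuusage[this_cpu] += cputime
        ca = ca.parent


root = AcctGroup("root")
child = AcctGroup("child", parent=root)
charge(child, 100, this_cpu=2)
```

Because the CPU index no longer comes from the task, nothing in the walk depends on task state that may already have been freed.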

  commit 73e6aafd9ea81498d31361f01db84a0118da2d1c
  Author: Zhao Lei 
  Date:   Thu Mar 17 12:19:43 2016 +0800

  sched/cpuacct: Simplify the cpuacct code
  
   - Use for() instead of while() loop in some functions
 to make the code simpler.
  
   - Use this_cpu_ptr() instead of per_cpu_ptr() to make the code
 cleaner and a bit faster.
  
  Suggested-by: Peter Zijlstra 
  Signed-off-by: Zhao Lei 
  Signed-off-by: Peter Zijlstra (Intel) 
  Cc: Linus Torvalds 
  Cc: Tejun Heo 
  Cc: Thomas Gleixner 
  Link: 
http://lkml.kernel.org/r/d8a7ef9592f55224630cb26dea239f05b6398a4e.1458187654.git.zhao...@cn.fujitsu.com
  Signed-off-by: Ingo Molnar 


  [Test]
  The test kernel was tested with QEMU and the crash could no longer be reproduced.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1775326/+subscriptions


[Group.of.nepali.translators] [Bug 1848739] Re: [linux] Patch to prevent possible data corruption

2020-07-15 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Fix Committed => Fix Released

** Changed in: linux-azure (Ubuntu)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1848739

Title:
  [linux] Patch to prevent possible data corruption

Status in linux package in Ubuntu:
  Fix Released
Status in linux-azure package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Invalid
Status in linux-azure source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux-azure source package in Bionic:
  Invalid

Bug description:
  There are three patches that prevent possible data corruption.  The
  three commits are:

  aef1897cd36d ("blk-mq: insert rq with DONTPREP to hctx dispatch list when requeue")
  c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")
  923218f6166a ("blk-mq: don't allocate driver tag upfront for flush rq")

  18.04 has all three of these patches.  16.04 has two out of the three,
  but it is missing commit c616cbee97ae.

  We would like to request commit c616cbee97ae be included in the 16.04 kernel:
  c616cbee97ae ("blk-mq: punt failed direct issue to dispatch list")

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1848739/+subscriptions



[Group.of.nepali.translators] [Bug 1840043] Re: bcache: Performance degradation when querying priority_stats

2020-07-15 Thread Guilherme G. Piccoli
** Changed in: linux
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1840043

Title:
  bcache: Performance degradation when querying priority_stats

Status in Linux:
  Fix Released
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Disco:
  Fix Released
Status in linux source package in Eoan:
  Fix Released

Bug description:
  [Impact]
  Querying bcache's priority_stats attribute in sysfs causes severe
  performance degradation for read/write workloads and occasional system
  stalls.

  [Test Case]
  Note: As the sorting step has the most noticeable performance impact,
  the test case below pins a workload and the sysfs query to the same
  CPU. CPU contention issues still occur without any pinning; pinning
  just removes the scheduling factor of the two landing on different
  CPUs and affecting different tasks.

  1) Start a read/write workload on the bcache device with e.g. fio or dd, pinned to a certain CPU:
  # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress

  2) Start a sysfs query loop for the priority_stats attribute pinned to the same CPU:
  # for i in {1..10}; do taskset 0x10 cat /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done

  3) Monitor the read/write workload for any performance impact
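
The `0x10` passed to taskset in the steps above is a CPU bitmask, not a CPU number. A short sketch of how such a mask decodes is shown here; the helper names are made up, and `os.sched_setaffinity` is mentioned only as the Linux-specific Python counterpart of taskset.

```python
import os


def mask_to_cpus(mask: int) -> set:
    """Decode a taskset-style hex mask into the set of CPU indexes it selects."""
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}


def pin_current_process(mask: int) -> None:
    """Pin the calling process to the CPUs in `mask` (Linux only)."""
    os.sched_setaffinity(0, mask_to_cpus(mask))


cpus = mask_to_cpus(0x10)  # bit 4 is set, so both commands run on CPU 4
```

Both the dd workload and the sysfs query loop therefore share CPU 4, which is what makes the contention easy to observe.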

  [Fix]
  To fix CPU contention and performance impact, a cond_resched() call is 
introduced in the priority_stats sort comparison.

  [Regression Potential]
  Regression potential is low, as the change is confined to the priority_stats 
sysfs query. In cases where frequent queries to bcache priority_stats take 
place (e.g. node_exporter), the impact should be more noticeable as those could 
now take a bit longer to complete. A regression due to this patch would most 
likely show up as a performance degradation in bcache-focused workloads.

  --

  [Description]
  In the latest bcache drivers, there's a sysfs attribute that calculates 
bucket priority statistics in /sys/fs/bcache/*/cache0/priority_stats. Querying 
this file has a big performance impact on tasks that run in the same CPU, and 
also affects read/write performance of the bcache device itself.

  This is due to the way the driver calculates the stats: the bcache
  buckets are locked and iterated through, collecting information about
  each individual bucket. An array of nbucket elements is constructed
  and sorted afterwards, which can cause very high CPU contention in
  cases of larger bcache setups.
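
The collect-then-sort pattern described above can be sketched as follows. This is an illustration of the cost structure only: the function name, the quantile selection and the input values are assumptions, and the real attribute's output format is not shown in this report.

```python
def priority_quantiles(bucket_prios, nquantiles):
    """Sort one priority value per bucket and sample evenly spaced quantiles.

    The sort over all buckets is the O(n log n) step that dominates for
    large caches.
    """
    if not bucket_prios or nquantiles <= 0:
        return []
    ordered = sorted(bucket_prios)
    n = len(ordered)
    return [ordered[(i * n) // nquantiles] for i in range(nquantiles)]


# Hypothetical cache with 100 buckets whose priorities are 100 down to 1.
quartiles = priority_quantiles(list(range(100, 0, -1)), 4)
```

With one array element per bucket, the amount of work grows with the cache size, which matches the observation that larger bcache setups suffer more.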

  From our tests, the sorting step of the priority_stats query causes
  the most significant performance reduction, as it can hinder tasks
  that are not even doing any bcache IO. If a task is "unlucky" enough
  to be scheduled on the same CPU as the sysfs query, its performance
  will be severely reduced as both compete for CPU time. We've had users
  report system stalls of up to ~6s due to this, caused by monitoring
  tools that query priority_stats periodically (e.g. Prometheus Node
  Exporter from [0]). These system stalls have triggered several other
  issues such as ceph-mon re-elections, problems in percona-cluster and
  general network stalls, so the impact is not isolated to bcache IO
  workloads.

  An example benchmark can be seen in [1], where the read performance on
  a bcache device suffered quite heavily (going from ~40k IOPS to ~4k
  IOPS due to priority_stats). Other comparison charts are found under
  [2].

  [0] https://github.com/prometheus/node_exporter
  [1] 
https://people.canonical.com/~halves/priority_stats/read/4k-iops-2Dsmooth.png
  [2] https://people.canonical.com/~halves/priority_stats/

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1840043/+subscriptions



[Group.of.nepali.translators] [Bug 1657281] Re: Ubuntu xenial - 4.4.0-59-generic i3 I/O performance issue

2020-07-15 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1657281

Title:
  Ubuntu xenial - 4.4.0-59-generic i3 I/O performance issue

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released

Bug description:
  
  [Impact]

   * When running block device I/O on an Amazon i3 system there
 is a performance degradation. A patch with git change id
 87c279e613f848c69b29 that is in Ubuntu-lts-4.8.0 kernel
 series increases I/O performance.

   * Fix should be backported to xenial 4.4 series to avoid more
 support issues being filed.

   * This change gets rid of unnecessary work being performed.

  [Test Case]

   * Steps to reproduce below
     1) partition ephemeral disks
        /sbin/parted -s --align optimal /dev/nvme0n1 mklabel gpt mkpart primary 0% 100%
        /sbin/parted -s --align optimal /dev/nvme1n1 mklabel gpt mkpart primary 0% 100%
        /sbin/parted -s --align optimal /dev/nvme2n1 mklabel gpt mkpart primary 0% 100%
        /sbin/parted -s --align optimal /dev/nvme3n1 mklabel gpt mkpart primary 0% 100%

     2) create raid array
        /sbin/mdadm --create /dev/md0 --assume-clean --chunk=2048 --level=10 --raid-devices=4 /dev/nvme0n1p1 /dev/nvme1n1p1 /dev/nvme2n1p1 /dev/nvme3n1p1

     3) create pv
        /sbin/pvcreate --force --metadatasize 4092k /dev/md0

     4) create volume group
        /sbin/vgcreate RDSVG /dev/md0

     5) create lv
        /sbin/lvcreate -L 2.5T -n RDSRAIDLV RDSVG

     6) test I/O
        sudo dd if=/dev/zero of=/dev/RDSVG/RDSRAIDLV bs=8k count=1000000 && sync

  [Regression Potential]

   * No known regression potential.

  [Original Description]

  When we were doing testing on i3, we noticed that it takes
  significantly longer to perform operations when using software RAID
  than without. We believe this is resolved in an upstream commit:
  http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/commit/?id=87c279e613f848c69b29d49de8df3f4f56da

  So stock 4.4.0-59-generic performs ok:
  $ sudo dd if=/dev/zero of=/dev/RDSVG/RDSRAIDLV bs=8k count=1000000 && sync
  sudo: unable to resolve host ip-10-0-85-167
  1000000+0 records in
  1000000+0 records out
  8192000000 bytes (8.2 GB, 7.6 GiB) copied, 54.3711 s, 151 MB/s

  and the patch you originally provided works a little bit better (as expected):
  $ uname -a
  Linux ip-10-0-85-167 4.4.0-57-generic #78hf00v20170110b0h3199a6e718db-Ubuntu SMP Tue Jan 10 02:53: x86_64 x86_64 x86_64 GNU/Linux
  ubuntu@ip-10-0-85-167:~$ sudo dd if=/dev/zero of=/dev/RDSVG/RDSRAIDLV bs=8k count=1000000 && sync
  sudo: unable to resolve host ip-10-0-85-167
  1000000+0 records in
  1000000+0 records out
  8192000000 bytes (8.2 GB, 7.6 GiB) copied, 31.4108 s, 261 MB/s
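
The two dd rates can be cross-checked with a little arithmetic. The sketch below takes the transfer size to be the 8.2 GB that dd reports, computed here as 8 KiB blocks times one million (the block count is an assumption of this example), and uses the two elapsed times from the runs above.

```python
def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second, the unit dd prints."""
    return total_bytes / seconds / 1e6


TOTAL_BYTES = 8 * 1024 * 1_000_000  # 8 KiB blocks x 1,000,000 = 8,192,000,000 bytes

stock = throughput_mb_s(TOTAL_BYTES, 54.3711)    # stock 4.4.0-59-generic run
patched = throughput_mb_s(TOTAL_BYTES, 31.4108)  # run with the test kernel
speedup = patched / stock
```

Rounded, this reproduces the 151 MB/s and 261 MB/s figures and puts the improvement at roughly 1.7x.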

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1657281/+subscriptions



[Group.of.nepali.translators] [Bug 1879987] Re: machine get stuck at boot if specified 'console=ttyS* ' doesn't exist.

2020-07-23 Thread Guilherme G. Piccoli
** This bug is no longer a duplicate of bug 1573095
   Cloud images fail to boot when a serial port is not available

** No longer affects: linux (Ubuntu)

** Changed in: initramfs-tools (Ubuntu)
   Status: Confirmed => In Progress

** Also affects: initramfs-tools (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Trusty)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Groovy)
   Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: In Progress

** Changed in: initramfs-tools (Ubuntu Eoan)
   Status: New => Won't Fix

** Changed in: initramfs-tools (Ubuntu Trusty)
   Status: New => Won't Fix

** Changed in: initramfs-tools (Ubuntu Xenial)
   Status: New => In Progress

** Changed in: initramfs-tools (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: initramfs-tools (Ubuntu Focal)
   Status: New => In Progress

** Changed in: initramfs-tools (Ubuntu Trusty)
   Importance: Undecided => High

** Changed in: initramfs-tools (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: initramfs-tools (Ubuntu Trusty)
   Importance: High => Low

** Changed in: initramfs-tools (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: initramfs-tools (Ubuntu Focal)
   Importance: Undecided => High

** Changed in: initramfs-tools (Ubuntu Eoan)
   Importance: Undecided => Low

** Changed in: initramfs-tools (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Eoan)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Trusty)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1879987

Title:
  machine get stuck at boot if specified 'console=ttyS* ' doesn't exist.

Status in initramfs-tools package in Ubuntu:
  In Progress
Status in initramfs-tools source package in Trusty:
  Won't Fix
Status in initramfs-tools source package in Xenial:
  In Progress
Status in initramfs-tools source package in Bionic:
  In Progress
Status in initramfs-tools source package in Eoan:
  Won't Fix
Status in initramfs-tools source package in Focal:
  In Progress
Status in initramfs-tools source package in Groovy:
  In Progress

Bug description:
  The kernel gets stuck at boot if console=ttyS* is specified in the
  kernel cmdline and that serial HW isn't available on the system.

  Reproduced with:
  4.4 (from Xenial), 4.15 (from Bionic), 5.4 (native, Focal) and 5.7-next (mainline)

  Removing the non-existent 'console=ttyS*' parameter fixes the
  situation.
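
To check which serial consoles a given command line requests, the console= parameters can be extracted as in the sketch below. The function names and the sample command line are assumptions for illustration; the sketch does not probe whether the serial hardware actually exists.

```python
def console_args(cmdline: str) -> list:
    """Return the device part of every console= parameter, in order."""
    devices = []
    for token in cmdline.split():
        if token.startswith("console="):
            value = token[len("console="):]
            # Drop options such as ",115200n8" that may follow the device name.
            devices.append(value.split(",", 1)[0])
    return devices


def serial_consoles(cmdline: str) -> list:
    """Only the ttyS* consoles, i.e. the ones relevant to this bug."""
    return [dev for dev in console_args(cmdline) if dev.startswith("ttyS")]


# Hypothetical command line; on a live system it could be read from /proc/cmdline.
example = "BOOT_IMAGE=/boot/vmlinuz root=/dev/vda1 ro console=tty1 console=ttyS0,115200n8"
requested = serial_consoles(example)
```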

  I tested it using KVM/qemu, but it has been brought to my attention
  that it was reproducible in VMware as well.

  I think it is safe to say that it is unlikely to be specific to a
  certain virtualization technology type.

  Didn't test on bare metal yet.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/initramfs-tools/+bug/1879987/+subscriptions



[Group.of.nepali.translators] [Bug 1813873] Re: Userspace break as a result of missing patch backport

2019-02-18 Thread Guilherme G. Piccoli
Hi Edgar, I've changed it back =)
It should be Fix Committed if the patch is present/merged in the kernel but 
the kernel hasn't been released yet. Once it gets released, it'll be changed 
to Fix Released.

Cheers,


Guilherme

** Changed in: linux (Ubuntu Bionic)
   Status: Fix Released => Fix Committed

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1813873

Title:
  Userspace break as a result of missing patch backport

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Fix Released

Bug description:
  Hi,

  The most recent set of Ubuntu kernels applied a variety of tty patches
  including:
  https://github.com/torvalds/linux/commit/c96cf923a98d1b094df9f0cf97a83e118817e31b

  But they have not applied the more recent patch:
  https://github.com/torvalds/linux/commit/d3736d82e8169768218ee0ef68718875918091a0

  This second patch is required to prevent a rather serious regression
  where userspace applications reading from stdin can receive EAGAIN
  when they should not.
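
  To illustrate the symptom, here is a minimal sketch (not taken from the
  mailing list thread; the helper names are hypothetical). A read on a
  descriptor in blocking mode is expected to wait for data, so getting
  EAGAIN there is the unexpected case, while EAGAIN on a non-blocking
  descriptor is normal:

```python
import errno
import os


def read_once(fd, size=4096):
    """Read once from fd; return (data, None) or (None, errno) on failure."""
    try:
        return os.read(fd, size), None
    except OSError as exc:
        return None, exc.errno


def is_unexpected_eagain(fd, err):
    """EAGAIN is only expected when the descriptor is in non-blocking mode."""
    return err == errno.EAGAIN and os.get_blocking(fd)


if __name__ == "__main__":
    data, err = read_once(0)  # fd 0 is stdin
    if err is not None and is_unexpected_eagain(0, err):
        print("read from stdin returned EAGAIN although stdin is blocking")
```

  An application hitting the regression would see the second helper
  return True for a read on stdin.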

  I will try to link correspondence from the mailing list archives once
  it is available, but for now, if you have access to the linux-console
  mailing list, you can find the discussion under the thread
  "Userspace break? read from STDIN returns EAGAIN if tty is "touched"".

  I would appreciate it if this could be examined soon, as it is a
  userspace regression.

  Thanks
  Michael

  
  Good:
  4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018

  
  Bad:
  4.15.0-44-generic #47-Ubuntu SMP Mon Jan 14 11:26:59 UTC 2019

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1813873/+subscriptions



[Group.of.nepali.translators] [Bug 1817918] [NEW] Hard lockups due to unrestricted lapic timer delay

2019-02-27 Thread Guilherme G. Piccoli
Public bug reported:

There is a report of a hard lockup induced by a long delay in the lapic
expiration timer.
We'll provide an SRU request here for merging the fixes into the 4.4 kernel.

** Affects: linux (Ubuntu)
 Importance: Undecided
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Xenial)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Bionic)
 Importance: Low
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Fix Released


** Tags: sts

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Bionic)
   Status: New => Fix Released

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Low

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
https://bugs.launchpad.net/bugs/1817918

Title:
  Hard lockups due to unrestricted lapic timer delay

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  Confirmed
Status in linux source package in Bionic:
  Fix Released

Bug description:
  There is a report of a hard lockup induced by a long delay in the lapic
  expiration timer.
  We'll provide an SRU request here for merging the fixes into the 4.4 kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1817918/+subscriptions



[Group.of.nepali.translators] [Bug 1802073] Re: No network in AWS (EC-Classic) after stopping and starting instance

2019-03-20 Thread Guilherme G. Piccoli
The problem is resolved in Bionic's latest version of cloud-init, released
yesterday:

$ dpkg -l | grep cloud-init
ii cloud-init 18.5-45-g3554ffe8-0ubuntu1~18.04.1

I manually upgraded the package after bringing up my EC2 Classic instance;
note that the AWS image doesn't include the latest cloud-init version yet.

Thanks,


Guilherme

** Changed in: cloud-init (Ubuntu Bionic)
   Status: Confirmed => Fix Released

** Changed in: cloud-init (Ubuntu Xenial)
   Status: Confirmed => Fix Released

** Changed in: cloud-init (Ubuntu Cosmic)
   Status: Confirmed => Fix Released

-- 
https://bugs.launchpad.net/bugs/1802073

Title:
  No network in AWS (EC-Classic) after stopping and starting instance

Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Fix Released
Status in cloud-init source package in Bionic:
  Fix Released
Status in cloud-init source package in Cosmic:
  Fix Released

Bug description:
  I don't know whether this is cloud-init, netplan, or something else,
  but this is not good.

  Background:
  # lsb_release -rd
  Description:    Ubuntu 18.04.1 LTS
  Release:        18.04
  # apt-cache policy cloud-init
  cloud-init:
    Installed: 18.4-0ubuntu1~18.04.1
    Candidate: 18.4-0ubuntu1~18.04.1
    Version table:
   *** 18.4-0ubuntu1~18.04.1 500
          500 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages
          100 /var/lib/dpkg/status
       18.2-14-g6d48d265-0ubuntu1 500
          500 http://eu-west-1.ec2.archive.ubuntu.com/ubuntu bionic/main amd64 Packages

  
  1. Get newest image to use

  $ aws --region eu-west-1 ec2 describe-images --owners 099720109477 \
      --filters Name=root-device-type,Values=ebs \
                Name=architecture,Values=x86_64 \
                Name=name,Values='*hvm-ssd/ubuntu-bionic-18.04*' \
      --query 'sort_by(Images, &Name)[-1].ImageId'

  "ami-08596fdd2d5b64915"

  2. Start an instance in EC2-Classic with that image.

  3. Try to SSH. Everything is ok.

  # cat /var/log/cloud-init-output.log
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init-local' at Wed, 07 Nov 2018 08:12:16 +0000. Up 10.51 seconds.
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init' at Wed, 07 Nov 2018 08:12:21 +0000. Up 15.50 seconds.
  ci-info: ++++++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++++
  ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
  ci-info: | Device |  Up  |           Address           |       Mask      | Scope  |     Hw-Address    |
  ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
  ci-info: |  eth0  | True |         10.74.200.25        | 255.255.255.192 | global | 22:00:0a:4a:c8:19 |
  ci-info: |  eth0  | True | fe80::2000:aff:fe4a:c819/64 |        .        |  link  | 22:00:0a:4a:c8:19 |
  ci-info: |   lo   | True |          127.0.0.1          |    255.0.0.0    |  host  |         .         |
  ci-info: |   lo   | True |           ::1/128           |        .        |  host  |         .         |
  ci-info: +--------+------+-----------------------------+-----------------+--------+-------------------+
  ...
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'modules:config' at Wed, 07 Nov 2018 08:12:41 +0000. Up 35.63 seconds.
  Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'modules:final' at Wed, 07 Nov 2018 08:12:44 +0000. Up 38.98 seconds.
  Cloud-init v. 18.4-0ubuntu1~18.04.1 finished at Wed, 07 Nov 2018 08:12:45 +0000. Datasource DataSourceEc2Local.  Up 39.38 seconds

  4. Stop the instance.

  5. Start the instance.

  6. Try to SSH.
  Expected to happen: Instance has network and is working.
  What happens: Instance has no working network

  Getting instance log we can see:
  [   11.342357] cloud-init[412]: Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init-local' at Wed, 07 Nov 2018 08:21:07 +0000. Up 10.77 seconds.
  [  OK  ] Started Initial cloud-init job (pre-networking).
  [  OK  ] Reached target Network (Pre).
           Starting Network Service...
  [  OK  ] Started Network Service.
           Starting Network Name Resolution...
           Starting Wait for Network to be Configured...
  [  OK  ] Started Wait for Network to be Configured.
           Starting Initial cloud-init job (metadata service crawler)...
  [  OK  ] Started Network Name Resolution.
  [  OK  ] Reached target Host and Network Name Lookups.
  [  OK  ] Reached target Network.
  [   13.036207] cloud-init[637]: Cloud-init v. 18.4-0ubuntu1~18.04.1 running 'init' at Wed, 07 Nov 2018 08:21:08 +0000. Up 12.55 seconds.
  [   13.052849] cloud-init[637]: ci-info: +++Net device info
  [   13.100325] cloud-init[637]: ci-info: ++---+---+---+---+---

[Group.of.nepali.translators] [Bug 1879980] Re: Fail to boot with LUKS on top of RAID1 if the array is broken/degraded

2020-08-04 Thread Guilherme G. Piccoli
** Changed in: mdadm (Ubuntu)
   Status: Confirmed => Opinion

** Changed in: initramfs-tools (Ubuntu)
   Status: Confirmed => In Progress

** Also affects: mdadm (Ubuntu Groovy)
   Importance: Medium
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Opinion

** Also affects: cryptsetup (Ubuntu Groovy)
   Importance: Medium
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: In Progress

** Also affects: initramfs-tools (Ubuntu Groovy)
   Importance: Medium
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: In Progress

** Also affects: mdadm (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: cryptsetup (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: mdadm (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: cryptsetup (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: mdadm (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: cryptsetup (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: initramfs-tools (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Changed in: mdadm (Ubuntu Xenial)
   Status: New => Opinion

** Changed in: mdadm (Ubuntu Bionic)
   Status: New => Opinion

** Changed in: cryptsetup (Ubuntu Xenial)
   Status: New => Opinion

** Changed in: cryptsetup (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: cryptsetup (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: cryptsetup (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: cryptsetup (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: cryptsetup (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: cryptsetup (Ubuntu Xenial)
   Status: Opinion => Won't Fix

** Changed in: cryptsetup (Ubuntu Focal)
   Importance: Undecided => Medium

** Changed in: cryptsetup (Ubuntu Focal)
   Status: New => In Progress

** Changed in: cryptsetup (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: initramfs-tools (Ubuntu Xenial)
   Status: New => Won't Fix

** Changed in: initramfs-tools (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: initramfs-tools (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: initramfs-tools (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: initramfs-tools (Ubuntu Focal)
   Importance: Undecided => Medium

** Changed in: initramfs-tools (Ubuntu Focal)
   Status: New => In Progress

** Changed in: initramfs-tools (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: mdadm (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: mdadm (Ubuntu Xenial)
   Status: Opinion => Won't Fix

** Changed in: mdadm (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: mdadm (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: mdadm (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: mdadm (Ubuntu Focal)
   Importance: Undecided => Medium

** Changed in: mdadm (Ubuntu Focal)
   Status: New => Opinion

** Changed in: mdadm (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
https://bugs.launchpad.net/bugs/1879980

Title:
  Fail to boot with LUKS on top of RAID1 if the array is broken/degraded

Status in cryptsetup package in Ubuntu:
  In Progress
Status in initramfs-tools package in Ubuntu:
  In Progress
Status in mdadm package in Ubuntu:
  Opinion
Status in cryptsetup source package in Xenial:
  Won't Fix
Status in initramfs-tools source package in Xenial:
  Won't Fix
Status in mdadm source package in Xenial:
  Won't Fix
Status in cryptsetup source package in Bionic:
  In Progress
Status in initramfs-tools source package in Bionic:
  In Progress
Status in mdadm source package in Bionic:
  Opinion
Status in cryptsetup source package in Focal:
  In Progress
Status in initramfs-tools source package in Focal:
  In Progress
Status in mdadm source package in Focal:
  Opinion
Status in cryptsetup source package in Groov

[Group.of.nepali.translators] [Bug 1914283] Re: Enable CONFIG_PCI_MSI in the linux-kvm derivative

2021-02-03 Thread Guilherme G. Piccoli
** Description changed:

- To be filled - I'm just reserving the LP number for now.
+ [Impact]
+ * Currently the linux-kvm derivative doesn't have CONFIG_PCI_MSI (and its
+ dependency options) enabled. The goal of this derivative is to be minimal
+ and to boot as fast as possible in virtual environments, hence most config
+ options were dropped.
+ 
+ * It happens that MSI/MSI-X are the de facto standard for drivers with
+ regard to interrupts, and as such the hot path is optimized for MSIs. Boot
+ testing with that config enabled showed improvements in boot time
+ (details in the next section).
+ 
+ * Performance-wise, MSIs are a good idea too, since they usually allow
+ multiple queues in network devices, and KVM is better optimized for MSIs
+ than for regular IRQs - tests (detailed in the next section) showed
+ performance improvements in virtio devices with MSIs.
+ 
+ * Based on these findings, we are hereby enabling MSIs for the linux-kvm
+ derivatives in all series (Bionic / Focal / Groovy / Hirsute) - notice
+ that Xenial already has that config option enabled.
+ 
+ [Test Case]
+ * All tests below were performed in an x86-64 KVM guest with 2 VCPUs and
+ 2GB of RAM, running on a Focal host. Three runs of each test were
+ performed, and we took the average.
+ 
+ * The boot time test (measured by dmesg timestamp) showed an improvement
+ of ~21%; the following chart shows the data:
+ https://kernel.ubuntu.com/~gpiccoli/MSI/boot_time.svg
+ We also timed the full boot until the login prompt is available and saw
+ a decrease of ~1 second.
+ 
+ * The storage test was performed with the fio tool, using an empty
+ virtio-blk disk. The following arguments were used:
+ fio --filename /dev/vdc --rw=rw --runtime 600 --loops 100 --ioengine libaio --numjobs 2 --group_reporting
+ 
+ On average we had a ~4.5% speedup in both reads and writes; the
+ following chart represents the data:
+ https://kernel.ubuntu.com/~gpiccoli/MSI/fio_storage.svg
+ 
+ * From the network perspective, we used iPerf with the following
+ arguments: iperf -c  -t 300 (server was the host machine). On
+ average, the performance improvement was ~8%, as per the following
+ chart: https://kernel.ubuntu.com/~gpiccoli/MSI/iperf_network.svg
+ 
+ [Where problems could occur]
+ * Given that the main linux package (generic) and basically all other
+ derivatives already enable this option, and given that MSIs are the
+ standard with regard to interrupts from the drivers' point of view, it's
+ safe to say the risks are minimal, likely smaller than not enabling MSIs
+ (since the hot path is usually more tested/exercised).
+ 
+ * That said, problems could occur if there are bugs in MSI-related code
+ in drivers or in the PCI MSI core code; those potential problems, which
+ would already affect all other derivatives, would then begin to affect
+ linux-kvm with this change.

** Changed in: linux-kvm (Ubuntu Xenial)
   Status: In Progress => Invalid

-- 
https://bugs.launchpad.net/bugs/1914283

Title:
  Enable CONFIG_PCI_MSI in the linux-kvm derivative

Status in linux-kvm package in Ubuntu:
  In Progress
Status in linux-kvm source package in Xenial:
  Invalid
Status in linux-kvm source package in Bionic:
  In Progress
Status in linux-kvm source package in Focal:
  In Progress
Status in linux-kvm source package in Groovy:
  In Progress
Status in linux-kvm source package in Hirsute:
  In Progress

Bug description:
  [Impact]
  * Currently the linux-kvm derivative doesn't have CONFIG_PCI_MSI (and its
  dependency options) enabled. The goal of this derivative is to be minimal
  and to boot as fast as possible in virtual environments, hence most config
  options were dropped.

  * It happens that MSI/MSI-X are the de facto standard for drivers with
  regard to interrupts, and as such the hot path is optimized for MSIs.
  Boot testing with that config enabled showed improvements in boot time
  (details in the next section).

  * Performance-wise, MSIs are a good idea too, since they usually allow
  multiple queues in network devices, and KVM is better optimized for MSIs
  than for regular IRQs - tests (detailed in the next section) showed
  performance improvements in virtio devices with MSIs.

  * Based on these findings, we are hereby enabling MSIs for the linux-kvm
  derivatives in all series (Bionic / Focal / Groovy / Hirsute) - notice
  that Xenial already has that config option enabled.
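
  To confirm whether a given kernel has the option enabled, the kernel
  config file can be inspected. The following is a minimal sketch, not
  part of the SRU itself: the helper name config_enabled() is hypothetical,
  and the /boot/config-<release> path is an assumption about where the
  config of the running kernel is installed:

```python
import os


def config_enabled(config_text, option):
    """Return True if `option` is built in (=y) or modular (=m)."""
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as "# CONFIG_FOO is not set"; skip them.
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name == option:
            return value in ("y", "m")
    return False


if __name__ == "__main__":
    # Assumed location of the running kernel's config file.
    path = "/boot/config-" + os.uname().release
    with open(path) as f:
        print("CONFIG_PCI_MSI enabled:", config_enabled(f.read(), "CONFIG_PCI_MSI"))
```

  On a linux-kvm kernel without this change, the check would be expected
  to report the option as not enabled.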

  [Test Case]
  * All tests below were performed in an x86-64 KVM guest with 2 VCPUs and
  2GB of RAM, running on a Focal host. Three runs of each test were
  performed, and we took the average.

  * The boot time test (measured by dmesg timestamp) showed an improvement
  of ~21%; the following chart shows the data:
  https://kernel.ubuntu.com/~gpiccoli/MSI/boot_time.svg
  We also timed the full boot until the login prompt is availa

[Group.of.nepali.translators] [Bug 1797990] Re: kdump fail due to an IRQ storm

2018-11-23 Thread Guilherme G. Piccoli
Thanks Maurício for submitting the patches and taking care of the bug
while I was out.

I've verified all 3 releases (in fact, I've also verified Trusty HWE) with
a test similar to the one used by Maurício, "dmesg -t | sort", and the
kernels are running fine.
During kdump, with the "pci=clearmsi" option, we can see the message:

"Clearing MSI/MSI-X enable bits early in boot (quirk)"
which shows that the quirk is working.

I'll attach the logs for documentation purposes.
Cheers,


Guilherme

** Changed in: linux (Ubuntu Trusty)
   Status: Confirmed => Won't Fix

** Tags removed: verification-needed-bionic verification-needed-cosmic verification-needed-xenial
** Tags added: verification-done-bionic verification-done-cosmic verification-done-xenial

-- 
https://bugs.launchpad.net/bugs/1797990

Title:
  kdump fail due to an IRQ storm

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Trusty:
  Won't Fix
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Fix Committed

Bug description:
  [Impact]

   * A kexec/crash kernel might get stuck and fail to boot
     (for crash kernel, kdump fails to collect a crashdump)
     if a PCI device is buggy/stuck/looping and triggers a
     continuous flood of MSI(X) interrupts (that the kernel
     does not yet know about).

   * This fix made it possible to obtain crashdumps when
     debugging a heavy-load scenario, in which a (heavily
     loaded) network adapter wouldn't stop triggering MSI-X
     interrupts even after panic()->kdump kicked in.

   * This fix disables MSI(X) in all PCI devices on early
     boot (this is OK as it's (re-)enabled normally later)
     with a kernel cmdline parameter (disabled by default).

  [Test Case]

   * A synthetic test-case is not yet available, however,
     this particular system/workload triggered the problem
     consistently, and it was used for development/testing.

   * We'll update this bug once a synthetic test-case is
     available; we're working on patching QEMU for this.

   * $ cat /proc/cmdline
     <...> pci=clearmsi

     $ dmesg | grep 'Clearing MSI'
     [0.00] Clearing MSI/MSI-X enable bits early in boot (quirk)

   * The comparison of 'dmesg -t | sort' has been reviewed
     between option disabled/enabled on boot & kexec modes,
     and only expected differences found (MHz, PIDs, MIPS).
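
  The checks above can be sketched in a few lines. This is only an
  illustration under assumptions, not the procedure actually used for the
  verification: the helper names quirk_active() and unexpected_differences()
  are hypothetical, and the inputs are the captured text of /proc/cmdline
  and of 'dmesg -t':

```python
QUIRK_MESSAGE = "Clearing MSI/MSI-X enable bits early in boot (quirk)"


def quirk_active(cmdline, dmesg):
    """True if pci=clearmsi was passed and the quirk message was logged."""
    requested = False
    for token in cmdline.split():
        # pci= accepts a comma-separated list, e.g. pci=clearmsi,noaer
        if token.startswith("pci=") and "clearmsi" in token[4:].split(","):
            requested = True
    return requested and QUIRK_MESSAGE in dmesg


def unexpected_differences(dmesg_a, dmesg_b):
    """Lines present in only one of two `dmesg -t` outputs, sorted."""
    return sorted(set(dmesg_a.splitlines()) ^ set(dmesg_b.splitlines()))
```

  The second helper mirrors the comparison of 'dmesg -t | sort' outputs:
  whatever it returns still has to be reviewed by hand to tell expected
  differences (MHz, PIDs, MIPS) from real ones.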

  [Regression Potential]

   * The potential area for regressions is early boot,
     particularly effects of applying quirks during PCI
     bus scan, which is changed/broader w/ these patches.

   * However, all quirks are applied based on PCI ID
     matching, so would only apply if actually targeting
     a new device.

   * Moreover, the new quirk is only applied based on
     a kernel cmdline parameter that is disabled by
     default, which further constrains when it is
     actually in effect.

  [Other Info]

   * The patch series is still under review/discussion
     upstream, but it's relatively important for Ubuntu
     users at this point, and after internal discussions
     we decided to submit it for SRU.

   * These are links to the linux-pci archive with the
     patches [1, 2, 3]

     [1] [PATCH 1/3] x86/quirks: Scan all busses for early PCI quirks
     https://lore.kernel.org/linux-pci/20181018183721.27467-1-gpicc...@canonical.com/

     [2] [PATCH 2/3] x86/PCI: Export find_cap() to be used in early PCI code
     https://lore.kernel.org/linux-pci/20181018183721.27467-2-gpicc...@canonical.com/

     [3] [PATCH 3/3] x86/quirks: Add parameter to clear MSIs early on boot
     https://lore.kernel.org/linux-pci/20181018183721.27467-3-gpicc...@canonical.com/

  [Original Description]

  We have reports of a kdump failure in Ubuntu (on an x86 machine) that was
  narrowed down to an MSI IRQ storm coming from a PCI network device.

  The bug manifests as a lack of progress in the boot process of the
  kdump kernel, and a storm of kernel messages like:

  [...]
  [  342.265294] do_IRQ: 0.155 No irq handler for vector
  [  342.266916] do_IRQ: 0.155 No irq handler for vector
  [  347.258422] do_IRQ: 14053260 callbacks suppressed
  [...]

  The root cause of the issue is that the kdump kernel kexec process
  does not ensure PCI devices are reset and/or MSI capabilities are
  disabled, so a PCI device could produce a huge number of PCI IRQs,
  which would take all the processing time of the CPU (especially since
  we restrict the kdump kernel to use a single CPU only).

  This was tested using upstream kernel version 4.18, and the problem
  reproduces.
  In the specific test scenario, the PCI NIC was an "Intel 82599ES 10-Gigabit
  [8086:10fb]" that was used in SR-IOV PCI passthrough mode (vfio_pci), under
  h

[Group.of.nepali.translators] [Bug 1810781] Re: mpt3sas - driver using the wrong register to update a queue index in FW

2019-01-07 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu Cosmic)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => Critical

** Changed in: linux (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Cosmic)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Disco)
   Status: Confirmed => Fix Released

** Changed in: linux (Ubuntu Cosmic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
https://bugs.launchpad.net/bugs/1810781

Title:
  mpt3sas - driver using the wrong register to update a queue index in
  FW

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  Fix Released

Bug description:
  [Impact]

  * The mpt3sas driver relies on a FW queue (Reply Post Descriptor
  Queue) in the I/O completion path; there's an MMIO register, called
  Reply Post Host Index, that the driver uses to flag an empty entry in
  that queue. This value is updated during the driver interrupt routine
  [in the _base_interrupt() function].

  * It happens that there are 2 registers representing the Reply Post
  Host Index, according to the type of the adapter. They are
  differentiated in the driver through the "ioc->combined_reply_queue"
  check. By the MPI specification (vendor spec), the driver should use
  this combined reply queue according to the maximum number of MSI-X
  vectors that the adapter exposes and the spec version (SAS 3.0 vs
  SAS 3.5).

  * Currently, this is checked incorrectly for a class of adapters, which
  was fixed in the upstream kernel commit 2b48be65685a [0]. Without this
  commit, we can observe spontaneous resets in the driver due to queue
  overflow (FW is not aware that there are free entries in the Reply Post
  Descriptor Queue). The dmesg log will show the following output in case
  of this error:

  mpt3sas_cm0: fault_state(0x2100)!
  mpt3sas_cm0: sending diag reset !!
  mpt3sas_cm0: diag reset: SUCCESS
  [followed by a lot of driver messages as result of the reset procedure]

  * During these resets, I/O is stalled, so performance may be affected.
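
  To check a saved dmesg log for this signature, something like the
  following could be used. It is a minimal sketch, not part of the driver
  or of the fix; the helper name find_fault_resets() is hypothetical and
  the pattern is derived only from the three log lines quoted above:

```python
import re

# Matches e.g. "mpt3sas_cm0: fault_state(0x2100)!"
FAULT_RE = re.compile(r"(mpt3sas_cm\d+): fault_state\((0x[0-9a-fA-F]+)\)!")


def find_fault_resets(log_text):
    """Return (controller, fault_code) pairs for each fault_state line."""
    return [match.groups() for match in FAULT_RE.finditer(log_text)]
```

  A non-empty result on a machine with an affected adapter suggests the
  spontaneous resets described above are happening.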

  
  [Test Case]

  * It's not trivial to test the problem, but given a machine with an
  affected device, an I/O benchmark like FIO could be used to exercise
  the I/O path heavily and trigger the issue. We have reports that the
  adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by
  the issue.

  
  [Regression Potential]

  * This is a long-standing issue in the mpt3sas driver, affecting only a
  class of adapters of this vendor. Since it's a clear bug, the fix is
  necessary. The potential for regressions is unknown, but likely low -
  the fix changes the register used for the index updates only for
  adapters with a certain set of characteristics (according to the spec.),
  which restricts the scope of this patch even more.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1810781/+subscriptions



[Group.of.nepali.translators] [Bug 1810781] Re: mpt3sas - driver using the wrong register to update a queue index in FW

2019-01-07 Thread Guilherme G. Piccoli
Xenial has no support for the SAS 3.5 class, so we won't backport the
patch - it's only needed in Bionic (4.15 / Xenial HWE) and Cosmic kernel
(4.18).

** Changed in: linux (Ubuntu Xenial)
   Status: Confirmed => Won't Fix

** Changed in: linux (Ubuntu Xenial)
   Importance: Critical => Medium

** Changed in: linux (Ubuntu Disco)
   Importance: Critical => Medium

** Changed in: linux (Ubuntu Xenial)
     Assignee: Guilherme G. Piccoli (gpiccoli) => Mauricio Faria de Oliveira (mfo)

** Changed in: linux (Ubuntu Bionic)
     Assignee: Guilherme G. Piccoli (gpiccoli) => Mauricio Faria de Oliveira (mfo)

** Changed in: linux (Ubuntu Cosmic)
     Assignee: Guilherme G. Piccoli (gpiccoli) => Mauricio Faria de Oliveira (mfo)

** Changed in: linux (Ubuntu Disco)
     Assignee: Guilherme G. Piccoli (gpiccoli) => Mauricio Faria de Oliveira (mfo)

-- 
https://bugs.launchpad.net/bugs/1810781

Title:
  mpt3sas - driver using the wrong register to update a queue index in
  FW

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Won't Fix
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  Fix Released



[Group.of.nepali.translators] [Bug 1791758] Re: ldisc crash on reopened tty

2019-01-07 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu Trusty)
   Status: Confirmed => Won't Fix

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Cosmic)
   Importance: Undecided => High

-- 
https://bugs.launchpad.net/bugs/1791758

Title:
  ldisc crash on reopened tty

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Trusty:
  Won't Fix
Status in linux source package in Xenial:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed

Bug description:
  [Impact]

  The following Oops was reported by a user:

  [684766.39] BUG: unable to handle kernel paging request at 
2268
  [684766.667642] IP: [] n_tty_receive_buf_common+0x6a/0xae0
  [684766.668487] PGD 8019574fe067 PUD 19574ff067 PMD 0
  [684766.669194] Oops:  [#1] SMP
  [684766.669687] Modules linked in: xt_nat dccp_diag dccp tcp_diag udp_diag 
inet_diag unix_diag xt_connmark ipt_REJECT nf_reject_ipv4 nf_conntrack_netlink 
nfnetlink veth ip6table_filter ip6_tables xt_tcpmss xt_multiport xt_conntrack 
iptable_filter xt_CHECKSUM xt_tcpudp iptable_mangle xt_CT iptable_raw 
ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_comment iptable_nat ip_tables x_tables 
target_core_mod configfs softdog scini(POE) ib_iser rdma_cm iw_cm ib_cm ib_sa 
ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 
openvswitch(OE) nf_nat_ipv6 nf_nat_ipv4 nf_nat gre kvm_intel kvm irqbypass ttm 
crct10dif_pclmul drm_kms_helper crc32_pclmul ghash_clmulni_intel drm 
aesni_intel aes_x86_64 i2c_piix4 lrw gf128mul fb_sys_fops syscopyarea 
glue_helper sysfillrect ablk_helper cryptd sysimgblt joydev
  [684766.679406]  input_leds mac_hid serio_raw 8250_fintek br_netfilter bridge 
stp llc nf_conntrack_proto_gre nf_conntrack_ipv6 nf_defrag_ipv6 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack xfs raid10 raid456 
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c raid1 raid0 psmouse multipath floppy pata_acpi linear dm_multipath
  [684766.683585] CPU: 15 PID: 7470 Comm: kworker/u40:1 Tainted: P   OE 
  4.4.0-124-generic #148~14.04.1-Ubuntu
  [684766.684967] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
Bochs 01/01/2011
  [684766.686062] Workqueue: events_unbound flush_to_ldisc
  [684766.686703] task: 88165e5d8000 ti: 88170dc2c000 task.ti: 
88170dc2c000
  [684766.687670] RIP: 0010:[]  [] 
n_tty_receive_buf_common+0x6a/0xae0
  [684766.688870] RSP: 0018:88170dc2fd28  EFLAGS: 00010202
  [684766.689521] RAX:  RBX: 88162c895000 RCX: 
0001
  [684766.690488] RDX:  RSI: 88162c895020 RDI: 
8819c2d3d4d8
  [684766.691518] RBP: 88170dc2fdc0 R08: 0001 R09: 
81ec2ba0
  [684766.692480] R10: 0004 R11:  R12: 
8819c2d3d400
  [684766.693423] R13: 8819c45b2670 R14: 8816a358c028 R15: 
8819c2d3d400
  [684766.694390] FS:  () GS:8819d73c() 
knlGS:
  [684766.695484] CS:  0010 DS:  ES:  CR0: 80050033
  [684766.696182] CR2: 2268 CR3: 00195752 CR4: 
00360670
  [684766.697141] DR0:  DR1:  DR2: 

  [684766.698114] DR3:  DR6: fffe0ff0 DR7: 
0400
  [684766.699079] Stack:
  [684766.699412]   8819c2d3d4d8  
8819c2d3d648
  [684766.700467]  8819c2d3d620 8819c9c10400 88170dc2fd68 
8106312e
  [684766.701501]  88170dc2fd78 0001  
88162c895020
  [684766.702534] Call Trace:
  [684766.702905]  [] ? kvm_sched_clock_read+0x1e/0x30
  [684766.703685]  [] n_tty_receive_buf2+0x14/0x20
  [684766.704505]  [] flush_to_ldisc+0xd5/0x120
  [684766.705269]  [] process_one_work+0x156/0x400
  [684766.706008]  [] worker_thread+0x11a/0x480
  [684766.706686]  [] ? rescuer_thread+0x310/0x310
  [684766.707386]  [] kthread+0xd8/0xf0
  [684766.707993]  [] ? kthread_park+0x60/0x60
  [684766.708664]  [] ret_from_fork+0x55/0x80
  [684766.709335]  [] ? kthread_park+0x60/0x60
  [684766.709998] Code: 85 70 ff ff ff e8 97 5f 33 00 49 8d 87 20 02 00 00 c7 
45 b4 00 00 00 00 48 89 45 88 49 8d 87 48 02 00 00 48 89 45 80 48 8b 45 b8 <48> 
8b b0 68 22 00 00 48 8b 08 89 f0 29 c8 41 f6 87 30 01 00 00
  [684766.713290] RIP  [] n_tty_receive_buf_common+0x6a/0xae0
  [684766.714105]  RSP 
  [684766.714609] CR2: 2268

  The issue happened in a VM. Kdump was configured, so a full kernel
  crash dump was created.

  The user runs Ubuntu Trusty with kernel 4.4.0-124 in the VM.

  [Test Case]

  * Deploy a Trusty KVM instance 

[Group.of.nepali.translators] [Bug 1800566] Re: Make the reset_devices parameter default for kdump kernels

2019-07-04 Thread Guilherme G. Piccoli
** Changed in: makedumpfile (Ubuntu)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Xenial)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Bionic)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Disco)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Trusty)
   Status: Confirmed => Won't Fix

** Changed in: makedumpfile (Ubuntu Cosmic)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Trusty)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Also affects: makedumpfile (Ubuntu Eoan)
   Importance: High
     Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1800566

Title:
  Make the reset_devices parameter default for kdump kernels

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Trusty:
  Won't Fix
Status in makedumpfile source package in Xenial:
  Confirmed
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Cosmic:
  Confirmed
Status in makedumpfile source package in Disco:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed

Bug description:
  [Impact]
  Crash kernels do not advise some subsystems to perform a reset by default.

  [Description]
  The kernel has a "reset_devices" parameter that drivers can opt in to, 
performing special handling when this parameter is parsed from the command 
line. For example, in kdump kernels it hints to the drivers that they may be 
booting from an unhealthy state and need to issue a reset to the adapter. 
Current users (as of kernel v4.19) are: hpsa, ipr, megaraid_sas, mpt3sas, 
smartpqi, xenbus.

  This should be enabled by default in the kdump config file, so that it
  is added to the kdump kernel command line for all versions.

  [Test Case]
  1) Deploy a Disco VM e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test` and check for the 'reset_devices' parameter:

  $ kdump-config test
  ...
  kexec command to be used:
    /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic 
root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 
systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" 
/var/lib/kdump/vmlinuz
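
  The check in step 3 can be scripted. The sketch below is only an
  illustration: has_kernel_param is a hypothetical helper, not part of
  kdump-tools, and the sample string is the command line quoted above
  with the BOOT_IMAGE entry left out.

```shell
# Hypothetical helper (not part of kdump-tools): succeed if the given
# kernel parameter appears as a whole word in a command-line string.
has_kernel_param() (
    set -f                      # no pathname expansion while word-splitting
    cmdline=$1
    param=$2
    for word in $cmdline; do
        # Match "param" or "param=value", but not e.g. "usbcore.param".
        case $word in
            "$param"|"$param"=*) exit 0 ;;
        esac
    done
    exit 1
)

# Command line from the `kdump-config test` output quoted above.
sample='root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0'

if has_kernel_param "$sample" reset_devices; then
    echo "reset_devices present"
else
    echo "reset_devices missing"
fi
```

  For the sample above this reports the parameter as missing; with the
  change requested in this bug applied, the same check should report it
  as present.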

  [Regression Potential]
  The regression potential is very low, since it doesn't need any changes in 
makedumpfile code and we're only adding a parameter to the crashkernel cmdline.
  The fix will be tested with autopkgtests and normal kdump use-case scenarios.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1800566/+subscriptions



[Group.of.nepali.translators] [Bug 1800562] Re: Remove obsolete "nousb" option in kdump command-line for newer kernels

2019-07-04 Thread Guilherme G. Piccoli
** Also affects: makedumpfile (Ubuntu Eoan)
   Importance: High
 Assignee: Heitor Alves de Siqueira (halves)
   Status: Opinion

** Changed in: makedumpfile (Ubuntu Xenial)
   Status: In Progress => Won't Fix

** Changed in: makedumpfile (Ubuntu Disco)
   Status: Opinion => Confirmed

** Changed in: makedumpfile (Ubuntu Eoan)
   Status: Opinion => Confirmed

** Changed in: makedumpfile (Ubuntu Xenial)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Bionic)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Cosmic)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Disco)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

** Changed in: makedumpfile (Ubuntu Eoan)
 Assignee: Heitor Alves de Siqueira (halves) => Guilherme G. Piccoli 
(gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1800562

Title:
  Remove obsolete "nousb" option in kdump command-line for newer kernels

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Won't Fix
Status in makedumpfile source package in Bionic:
  In Progress
Status in makedumpfile source package in Cosmic:
  In Progress
Status in makedumpfile source package in Disco:
  Confirmed
Status in makedumpfile source package in Eoan:
  Confirmed

Bug description:
  [Impact]
  Crash kernels include an obsolete "nousb" parameter by default, which can 
cause confusion since it's been deprecated in newer kernel versions.

  [Description]
  Since kernel v4.5, the correct parameter to disable USB subsystem 
initialization is always "usbcore.nousb" (previously, plain "nousb" was used 
when the subsystem was built-in). This was changed by commit 097a9ea0e48 
("usb: make "nousb" a clear module parameter").

  We need to take this into account in kdump-tools, or else we may boot
  with USB in kdump even though the command line appears to say the
  opposite.

  This affects Xenial onwards, since the system may be running an HWE or
  other supported v4.5+ kernel.

  [Test Case]
  1) Deploy a Disco VM e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test` and check for the 'nousb' parameter:

  $ kdump-config test
  ...
  kexec command to be used:
    /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic 
root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 
systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" 
/var/lib/kdump/vmlinuz
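
  One way to picture the version dependency described above is a small
  helper that picks the parameter from a kernel release string. This is
  a sketch under assumptions: usb_disable_param is a hypothetical name,
  this is not how kdump-tools implements it, and it assumes the release
  string starts with "major.minor" as in `uname -r` output.

```shell
# Hypothetical helper: print the parameter that disables USB for a given
# kernel release string (such as the output of `uname -r`).
usb_disable_param() (
    release=$1
    major=${release%%.*}          # "4.15.0-45-generic" -> "4"
    rest=${release#*.}            # -> "15.0-45-generic"
    minor=${rest%%[!0-9]*}        # -> "15"
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 5 ]; }; then
        echo "usbcore.nousb"      # v4.5 and later
    else
        echo "nousb"              # older kernels with built-in USB
    fi
)

usb_disable_param "4.15.0-45-generic"   # prints usbcore.nousb
usb_disable_param "4.4.0-124-generic"   # prints nousb
```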

  [Regression Potential]
  The regression potential is extremely low, since it doesn't need any changes 
in makedumpfile code and we're only removing an already ineffective parameter 
from the crashkernel cmdline. Nonetheless, patches will be tested with 
autopkgtests and normal kdump use-case scenarios.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1800562/+subscriptions



[Group.of.nepali.translators] [Bug 1836760] [NEW] ixgbe{vf} - Physical Function gets IRQ when VF checks link state

2019-07-16 Thread Guilherme G. Piccoli
Public bug reported:

TBA

** Affects: linux (Ubuntu)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Xenial)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Bionic)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Cosmic)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Disco)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Eoan)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: linux (Ubuntu Ff-series)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Fix Released


** Tags: sts

** Summary changed:

- ixgbe - Physical Function gets IRQ when VF checks link state 
+ ixgbe{vf} - Physical Function gets IRQ when VF checks link state

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Cosmic)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Eoan)
   Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

** Also affects: linux (Ubuntu Ff-series)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Ff-series)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Disco)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Cosmic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Bionic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Ff-series)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Disco)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Cosmic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Disco)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Cosmic)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Ff-series)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Ff-series)
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1836760

Title:
  ixgbe{vf} - Physical Function gets IRQ when VF checks link state

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  Confirmed
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Cosmic:
  Confirmed
Status in linux source package in Disco:
  Confirmed
Status in linux source package in Eoan:
  Confirmed
Status in linux source package in FF-Series:
  Fix Released

Bug description:
  TBA

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836760/+subscriptions



[Group.of.nepali.translators] [Bug 1836760] Re: ixgbe{vf} - Physical Function gets IRQ when VF checks link state

2019-07-16 Thread Guilherme G. Piccoli
SRU request was sent to kernel-team mailing list:
https://lists.ubuntu.com/archives/kernel-team/2019-July/102282.html

Cheers,


Guilherme

** Changed in: linux (Ubuntu Xenial)
   Status: Confirmed => In Progress

** Changed in: linux (Ubuntu Bionic)
   Status: Confirmed => In Progress

** Changed in: linux (Ubuntu Cosmic)
   Status: Confirmed => Won't Fix

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1836760

Title:
  ixgbe{vf} - Physical Function gets IRQ when VF checks link state

Status in linux package in Ubuntu:
  Confirmed
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Confirmed
Status in linux source package in Eoan:
  Confirmed
Status in linux source package in FF-Series:
  Fix Released

Bug description:
  [Impact]

  * Intel NICs that are SR-IOV capable and managed by the ixgbe driver 
present a potentially harmful behavior when the ixgbevf-managed VFs (Virtual 
Functions) perform an ethtool link check. The ixgbevf driver issues a mailbox 
command in the ethtool link state handler, which induces one IRQ in the PF 
(Physical Function) per link check.

  * This was reported as a sort of "denial-of-service" from a guest: due to 
a link check loop running inside a guest with PCI-PT of an ixgbevf-managed 
VF, the host received a huge number of IRQs, causing soft lockups.

  * The patch proposed in this SRU request fixes this behavior by relying on 
the saved link state (obtained in ixgbevf's watchdog routine) instead of 
issuing a mailbox command to the PF on every link state check request. The 
commit is available in Linus' tree: 1e1b0c658d9b ("ixgbevf: Use cached link 
state instead of re-reading the value for ethtool")
  http://git.kernel.org/linus/1e1b0c658d9b

  [Test case]

  Reproducing the behavior is pretty simple; on a machine with an
  Intel NIC managed by ixgbe, proceed with the following steps:

  a) Create one or more VFs
  (echo 1 > /sys/class/net//device/sriov_numvfs)

  b) In a different terminal, monitor the non-TxRx PF IRQs:
  (watch -n1 "cat /proc/interrupts | grep  | grep -v Tx")

  c) Run "ethtool " in a loop

  Without the proposed patch, the PF IRQ count will increase.
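
  Steps b) and c) can be turned into a before/after measurement. The
  sketch below is hedged: sum_pf_irqs is a hypothetical helper, the
  interface names in the comments are placeholders, and the interrupt
  table fed to it at the end is made-up sample data in the layout of
  /proc/interrupts.

```shell
# Sum the per-CPU counters of the non-TxRx interrupt lines that mention
# one interface. Reads /proc/interrupts-formatted text on stdin.
sum_pf_irqs() {
    grep -- "$1" | grep -v Tx | awk '
        {
            # Fields 2..N are per-CPU counts until the first non-number.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
                total += $i
        }
        END { print total + 0 }'
}

# On real hardware (PF "enp1s0f0" and VF "enp1s0f0v0" are placeholders):
#   before=$(sum_pf_irqs enp1s0f0 < /proc/interrupts)
#   for n in $(seq 100); do ethtool enp1s0f0v0 > /dev/null; done
#   after=$(sum_pf_irqs enp1s0f0 < /proc/interrupts)
#   echo "PF IRQ delta: $((after - before))"

# Self-contained demonstration on made-up sample data:
sum_pf_irqs eth0 <<'EOF'
           CPU0       CPU1
 45:        100        200   PCI-MSI 524288-edge      eth0-TxRx-0
 46:          3          1   PCI-MSI 524289-edge      eth0
EOF
```

  On an unpatched kernel the delta is expected to grow roughly with the
  number of ethtool calls; with the patch it should stay close to zero.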

  [Regression potential]

  The patch scope is restricted to ixgbevf ethtool link-check procedure,
  and was developed by the vendor itself. Being a self-contained patch
  affecting only this driver's ethtool handler, the worst potential
  regression would be a wrong link state report.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836760/+subscriptions



[Group.of.nepali.translators] [Bug 1836760] Re: ixgbe{vf} - Physical Function gets IRQ when VF checks link state

2019-07-16 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu Ff-series)
   Status: Fix Released => Fix Committed

** Changed in: linux (Ubuntu Eoan)
   Status: Confirmed => In Progress

** Changed in: linux (Ubuntu Disco)
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1836760

Title:
  ixgbe{vf} - Physical Function gets IRQ when VF checks link state

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Xenial:
  In Progress
Status in linux source package in Bionic:
  In Progress
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  In Progress
Status in linux source package in Eoan:
  In Progress
Status in linux source package in FF-Series:
  Fix Committed


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836760/+subscriptions



[Group.of.nepali.translators] [Bug 1681909] Re: kdump is not captured in remote host when kdump over ssh is configured

2019-07-23 Thread Guilherme G. Piccoli
Thanks Eric for sponsoring this LP!

I'm marking Xenial as "Won't Fix" for now, since we had no issue reports
and it'll require a slightly more complex backport than Bionic. We'll
discuss the SRU to Xenial later, especially if we get reports of this
failure in that release.

Cheers,


Guilherme

** Changed in: makedumpfile (Ubuntu Xenial)
   Status: In Progress => Won't Fix

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1681909

Title:
  kdump is not captured in remote host when kdump over ssh is configured

Status in The Ubuntu-power-systems project:
  In Progress
Status in makedumpfile package in Ubuntu:
  Fix Committed
Status in makedumpfile source package in Xenial:
  Won't Fix
Status in makedumpfile source package in Bionic:
  In Progress
Status in makedumpfile source package in Cosmic:
  Won't Fix
Status in makedumpfile source package in Disco:
  In Progress
Status in makedumpfile source package in Eoan:
  Fix Committed

Bug description:
  [Impact]

  * Kdump over network (like NFS mount or SSH dump) relies on the
  network-online target from systemd. Even so, there are some NICs that
  report "Link Up" state but aren't ready to transmit packets. This
  generally bad behavior is probably due to NIC firmware delays and is
  usually not fixable from drivers. Some adapters known to act like this
  are bnx2x, tg3 and ixgbe.

  * Kdump is a mechanism that may be a last resort to debug complex or
  hard-to-reproduce issues, so it's worth increasing its reliability and
  resilience. We therefore propose a quirk for this issue on network
  dumps, adding a retry/delay mechanism: if it's a network dump, kdump
  will retry a few times and sleep between the attempts, in order to
  cover the case of NICs that aren't ready yet but will soon be able to
  transmit packets.

  * Although first reported by IBM on the PowerPC arch, the scope of this
  issue is the NIC, and it was later reported on the x86 arch too.
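
  The retry/delay idea can be sketched in a few lines of shell. This is
  not the code from the kdump-tools script: retry is a hypothetical
  function, and the attempt count used in the comments is an assumption
  (only the 3 second delay is taken from this report).

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY_SECONDS between failed attempts.
retry() {
    _retry_max=$1
    _retry_delay=$2
    shift 2
    _retry_n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$_retry_n" -ge "$_retry_max" ]; then
            return 1
        fi
        _retry_n=$((_retry_n + 1))
        sleep "$_retry_delay"
    done
}

# In a network dump this could wrap the first remote operation, e.g.
#   retry 5 3 ssh <user>@<host> true
# Demonstration with a command that fails twice before succeeding:
tries=0
flaky() { tries=$((tries + 1)); [ "$tries" -ge 3 ]; }
retry 5 0 flaky && echo "succeeded after $tries tries"
```

  Because the loop stops at the first success, a NIC that is ready
  immediately adds no delay to the dump.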

  [Test case]

  Usually it's difficult to naturally reproduce this issue in a deterministic 
way, but we have an artificial test case in comment #24 of this LP.
  Also, a user on this bug managed to reproduce the problem consistently and 
reported it fixed after testing our solution.

  [Regression potential]

  There's no clear regression potential here, since it's just a retry/delay 
mechanism. Some potential problems may come from bad coding in the script.
  The delay between attempts is only 3 sec per iteration, so it shouldn't block 
the kdump progress for long at once.

  [Other information]

  Salsa Debian commit:
  
https://salsa.debian.org/debian/makedumpfile/commit/d63ba95337988be1eac8c8c76d90825ff5c6d17f

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1681909/+subscriptions



[Group.of.nepali.translators] [Bug 1832082] Re: bnx2x driver causes 100% CPU load

2019-07-30 Thread Guilherme G. Piccoli
I've validated the -proposed kernels for Xenial (4.4.0-158), Bionic (4.15.0-56) 
and Disco (5.0.0-22), using the test case mentioned in the description. All 
working fine, the issue is gone.
Also, the patch was released upstream in the 5.3.x series, so I'll mark 
ff-series as Released.

Cheers,


Guilherme

** Changed in: linux (Ubuntu Ff-series)
   Status: Fix Committed => Fix Released

** Tags removed: verification-needed-bionic verification-needed-disco 
verification-needed-xenial
** Tags added: verification-done-bionic verification-done-disco 
verification-done-xenial

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1832082

Title:
  bnx2x driver causes 100% CPU load

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed
Status in linux source package in FF-Series:
  Fix Released

Bug description:
  [Impact]

  * The PTP feature in the bnx2x driver is implemented in a way that if
  the NIC firmware takes some time to perform the timestamping - which is
  observed as a bad register read in bnx2x_ptp_task() - then the ptp
  worker function will reschedule itself indefinitely until the value
  read from the register is meaningful. With that behavior, if a
  userspace tool requests a badly configured RX filter from bnx2x (or if
  the NIC firmware has any other issue in timestamping), the function
  bnx2x_ptp_task() will be rescheduled forever and cause unbounded
  resource consumption. This manifests as a kworker thread consuming
  100% of CPU.

  
  * The dmesg log will show the following message about other packets being 
skipped in the timestamp routine because a packet got stuck in the 
timestamping "pipeline":

  "bnx2x: [bnx2x_start_xmit:3862(eno4)]The device supports only a single
  outstanding packet to timestamp, this packet will not be timestamped"

  Also, by using ftrace, the user can notice that function bnx2x_ptp_task()
  is being called a lot, and by enabling the bnx2x PTP debugging log
  (ethtool -s  msglvl 16777216) it's possible to observe the
  following message flooding the kernel log:

  "bnx2x: [bnx2x_ptp_task:15242(eno4)]There is no valid Tx timestamp
  yet"

  
  * The patch proposed in this SRU request has been accepted upstream and is 
currently (2019-07-03) available in David Miller's linux-net tree:
  git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=3c91f25c2f72
  Besides fixing the issue, it also adds an ethtool statistic for accounting 
the ptp errors and reduces message flooding in case of errors.


  [Test case]

  Reproducing the problem is not difficult; we've used chrony in Bionic
  to trigger the problem. The steps are:

  a) Install chrony on Bionic in a system with a working NIC managed by
  bnx2x;

  b) Edit chrony configuration and add: "hwtimestamp *" to the top of
  its conf file;

  c) Restart chrony service

  Check dmesg for the "[...]single outstanding packet" message and the
  overall CPU workload using a tool like "top" to observe a kthread
  consuming 100% of CPU.
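
  The dmesg part of this check can be scripted as below. This is a
  minimal sketch: count_ptp_skips is a hypothetical helper, and the demo
  input consists of the two messages quoted in this description rather
  than a real kernel log.

```shell
# Count "single outstanding packet" messages in kernel-log text on stdin.
count_ptp_skips() {
    grep -c 'single outstanding packet to timestamp'
}

# On an affected machine:
#   dmesg | count_ptp_skips
#   top -b -n 1 | grep kworker | head -n 5

# Demonstration on the messages quoted in this description:
count_ptp_skips <<'EOF'
bnx2x: [bnx2x_start_xmit:3862(eno4)]The device supports only a single outstanding packet to timestamp, this packet will not be timestamped
bnx2x: [bnx2x_ptp_task:15242(eno4)]There is no valid Tx timestamp yet
EOF
```

  A count that keeps growing between two runs, together with a kworker
  thread near 100% CPU in the top output, matches the behavior
  described in the [Impact] section.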

  
  [Regression potential]

  The patch scope is restricted to bnx2x ptp handler, and was validated
  by the driver maintainer. If there's any possibility of regressions,
  we believe the worst would be an issue affecting the packet
  timestamping, not messing with the regular xmit path for the driver.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832082/+subscriptions



[Group.of.nepali.translators] [Bug 1836760] Re: ixgbe{vf} - Physical Function gets IRQ when VF checks link state

2019-07-30 Thread Guilherme G. Piccoli
I've validated the -proposed kernels for Xenial (4.4.0-158), Bionic (4.15.0-56) 
and Disco (5.0.0-22), using the test case mentioned in the description. All 
working fine, the issue is gone.
Also, the patch was released upstream in the 5.3.x series, so I'll mark 
ff-series as Released.

Cheers,

Guilherme

** Changed in: linux (Ubuntu Ff-series)
   Status: Fix Committed => Fix Released

** Tags removed: verification-needed-bionic verification-needed-disco 
verification-needed-xenial
** Tags added: verification-done-bionic verification-done-disco 
verification-done-xenial

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1836760

Title:
  ixgbe{vf} - Physical Function gets IRQ when VF checks link state

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed
Status in linux source package in FF-Series:
  Fix Released


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836760/+subscriptions

___
Mailing list: https://launchpad.net/~group.of.nepali.translators
Post to : group.of.nepali.translators@lists.launchpad.net
Unsubscribe : https://launchpad.net/~group.of.nepali.translators
More help   : https://help.launchpad.net/ListHelp


[Group.of.nepali.translators] [Bug 1795659] Re: kernel panic using CIFS share in smb2_push_mandatory_locks()

2019-07-31 Thread Guilherme G. Piccoli
** Changed in: linux-azure (Ubuntu)
   Status: Confirmed => Fix Released

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1795659

Title:
  kernel panic using CIFS share in smb2_push_mandatory_locks()

Status in linux package in Ubuntu:
  Confirmed
Status in linux-azure package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Invalid
Status in linux-azure source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Committed
Status in linux-azure source package in Bionic:
  Invalid
Status in linux source package in Cosmic:
  Won't Fix
Status in linux-azure source package in Cosmic:
  Invalid
Status in linux source package in Disco:
  Fix Released
Status in linux-azure source package in Disco:
  Invalid

Bug description:
  [Impact]

  * We got reports of a kernel crash in the cifs module with the
  following signature:

  BUG: unable to handle kernel NULL pointer dereference at 0038
  IP: smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  PGD 0 P4D 0
  RIP: 0010:smb2_push_mandatory_locks+0x10e/0x3b0 [cifs]
  Call Trace:
   cifs_oplock_break+0x12f/0x3d0 [cifs]
   process_one_work+0x14d/0x410
   worker_thread+0x4b/0x460
   kthread+0x105/0x140
  [...]

  * Low-level analysis (decodecode script output and the objdump of the
  function) revealed that we are crashing in a NULL ptr dereference when
  trying to access "cfile->tlink"; below, a snippet of the objdump at
  function smb2_push_mandatory_locks():

  [...]
  mov    0x10(%r14),%r15   # %r15 = cifsFileInfo *cfile
  mov    0x18(%r14),%rbx   # %rbx = cifsLockInfo *li = (fdlocks->locks)
  lea    0x18(%r14),%r12
  mov    0x90(%r15),%rax   # %rax = struct tcon_link *tlink (cfile->tlink)
  cmp    %r12,%rbx
  mov    0x38(%rax),%rax   # <--- TRAP [trying to get cifs_tcon *tl_tcon]
  [...]

  * After discussing the issue with CIFS maintainers (Steve French and
  Pavel Shilovsky) they suggested commit b98749cac4a6 ("CIFS: keep
  FileInfo handle live during oplock break")
  [http://git.kernel.org/linus/b98749cac4a6] as a fix for multiple
  reports of this kind of crash.

  * The fix was sent to stable kernels and is present in Ubuntu kernels
  5.0 and newer. We are requesting the SRU for this patch here in order
  to fix the crashes, after reports of successful testing with the patch
  (see below section) and since the patch is restricted to the cifs
  module scope and accepted on linux stable.

  * Alternatively, the issue is known to be avoided when oplocks are
  disabled using the "cifs.enable_oplocks=N" module parameter.

  [Test case]

  * Unfortunately we cannot reproduce the issue. The patch proposed here
  was validated by us with xfstests (instructions followed from
  https://wiki.samba.org/index.php/Xfstesting-cifs) and fio. Also, we
  have a user report of test validation using LISA
  (https://github.com/LIS/LISAv2).

  * Using xfstests with the exclusions proposed in the link above, we
  got the same results as with a non-patched kernel: the same tests
  failed in both kernels, so the patch did not make results worse. Fio
  also didn't show a noticeable performance regression with the patch.

  [Regression potential]

  * The patch was validated by the cifs filesystem maintainers (in fact
  they suggested its inclusion in Ubuntu) and by the aforementioned
  tests; also, the scope is restricted to cifs only so the likelihood of
  regressions is considered low.

  * Due to the nature of the code modification (adding a new reference
  to a file handle and manipulating it in different places), I consider
  that if we have a regression it'll manifest as deadlock/blocked tasks,
  not something more serious like crashes or data corruption.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1795659/+subscriptions



[Group.of.nepali.translators] [Bug 1844455] [NEW] Memory leak on libvirt 1.3.1

2019-09-17 Thread Guilherme G. Piccoli
Public bug reported:

It was reported that libvirt 1.3.1 running on Trusty (through
UCA/Mitaka) is getting OOM'ed after a while. In our reports it took 2
years for the leak to trigger an out-of-memory situation, but this may
vary according to the memory available to the user.

Valgrind was executed in a similar environment, and we were able to
collect information about the "definitely lost" memory of the libvirt
process (attached).

The leaks are detailed in the next comments.

** Affects: libvirt (Ubuntu)
 Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: Confirmed

** Affects: libvirt (Ubuntu Xenial)
 Importance: Undecided
 Assignee: Guilherme G. Piccoli (gpiccoli)
 Status: New


** Tags: sts

** Also affects: libvirt (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: libvirt (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1844455

Title:
  Memory leak on libvirt 1.3.1

Status in libvirt package in Ubuntu:
  Confirmed
Status in libvirt source package in Xenial:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1844455/+subscriptions



[Group.of.nepali.translators] [Bug 1836760] Re: ixgbe{vf} - Physical Function gets IRQ when VF checks link state

2019-10-28 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu Trusty)
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1836760

Title:
  ixgbe{vf} - Physical Function gets IRQ when VF checks link state

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Trusty:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Bionic:
  Fix Released
Status in linux source package in Cosmic:
  Won't Fix
Status in linux source package in Disco:
  Fix Released
Status in linux source package in Eoan:
  Fix Released
Status in linux source package in Focal:
  Fix Released

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1836760/+subscriptions



[Group.of.nepali.translators] [Bug 1681909] Re: kdump is not captured in remote host when kdump over ssh is configured

2019-11-11 Thread Guilherme G. Piccoli
** Changed in: ubuntu-power-systems
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1681909

Title:
  kdump is not captured in remote host when kdump over ssh is configured

Status in The Ubuntu-power-systems project:
  Fix Released
Status in makedumpfile package in Ubuntu:
  Fix Released
Status in makedumpfile source package in Xenial:
  Won't Fix
Status in makedumpfile source package in Bionic:
  Fix Released
Status in makedumpfile source package in Cosmic:
  Won't Fix
Status in makedumpfile source package in Disco:
  Fix Released
Status in makedumpfile source package in Eoan:
  Fix Released

Bug description:
  [Impact]

  * Kdump over network (like NFS mount or SSH dump) relies on the
  network-online target from systemd. Even so, there are some NICs that
  report "Link Up" state but aren't ready to transmit packets. This is a
  generally bad behavior, probably caused by NIC firmware delays and
  usually not fixable from drivers. Some adapters known to act like this
  are bnx2x, tg3 and ixgbe.

  * Kdump is a mechanism that may be a last resort to debug complex or
  hard-to-reproduce issues, so it's worthwhile to increase its
  reliability and resilience. We therefore propose a solution/quirk for
  this issue on network dumps by adding a retry/delay mechanism: if it's
  a network dump, kdump will retry a few times and sleep between the
  attempts, in order to handle the case of NICs that aren't ready yet
  but will soon be able to transmit packets.

  * Although first reported by IBM on the PowerPC arch, the scope of this
  issue is the NIC, and it was later reported on the x86 arch too.
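
  The retry/delay idea can be sketched as follows. This is only an
  illustration in Python, not the actual kdump-tools script: the
  function name, the attempt_dump callback and the default of 5 retries
  are assumptions, while the 3-second delay is the figure given in this
  report.

```python
import time
from typing import Callable


def dump_with_retry(
    attempt_dump: Callable[[], bool],
    retries: int = 5,            # assumed value, not taken from the report
    delay_s: float = 3.0,        # delay per iteration mentioned in the report
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try a network dump up to `retries` times, sleeping between failed attempts."""
    for attempt in range(1, retries + 1):
        if attempt_dump():
            return True
        # The NIC may report "Link Up" before it can transmit; wait and
        # retry, but do not sleep after the final attempt.
        if attempt < retries:
            sleep(delay_s)
    return False
```

  With these defaults, a NIC that never becomes ready delays the crash
  kernel by at most (retries - 1) * delay_s seconds in total, i.e. 12
  seconds here.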

  [Test case]

  Usually it's difficult to naturally reproduce this issue in a
  deterministic way, but we have an artificial test case in comment #24
  of this LP. Also, we have a report in this bug from a user who managed
  to reproduce the problem consistently; the problem no longer occurred
  when testing our solution.

  [Regression potential]

  There's no clear regression potential here since it's just a
  retry/delay mechanism. Some potential problems may come from bad
  coding in the script. The delay between attempts is only 3 sec per
  iteration, so it shouldn't block the kdump progress for a long time at
  once.

  [Other information]

  Salsa Debian commit:
  https://salsa.debian.org/debian/makedumpfile/commit/d63ba95337988be1eac8c8c76d90825ff5c6d17f

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-power-systems/+bug/1681909/+subscriptions



[Group.of.nepali.translators] [Bug 1855409] Re: qede driver causes 100% CPU load

2019-12-09 Thread Guilherme G. Piccoli
** Also affects: linux (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: Incomplete

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu Xenial)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Bionic)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Disco)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Eoan)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Focal)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: linux (Ubuntu Focal)
   Status: Incomplete => New

** Changed in: linux (Ubuntu Disco)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: linux (Ubuntu Xenial)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1855409

Title:
  qede driver causes 100% CPU load

Status in linux package in Ubuntu:
  New
Status in linux source package in Xenial:
  Invalid
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Disco:
  Confirmed
Status in linux source package in Eoan:
  New
Status in linux source package in Focal:
  New

Bug description:
  This bug is similar to #1832082 (bnx2x driver causes 100% CPU load)
  but applies to the qede driver instead of bnx2x. The symptoms are the
  same:

  With chrony installed and configured with "hwtimestamp *", I observe
  100% CPU load on 2 CPU cores.

  Running perf report shows that the kernel is busy executing the
  qede_ptp_task function in the qede driver.

  A workaround is to disable "hwtimestamp *" in the chrony configuration.
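
  One way to apply the workaround is to comment out the directive in the
  chrony configuration file. The Python sketch below is illustrative
  only: the function name and the sample configuration are assumptions,
  and it presumes the directive is written in lowercase at the start of
  a line. chronyd would then need to be restarted for the change to take
  effect.

```python
def disable_hwtimestamp(conf_text: str) -> str:
    """Comment out every active 'hwtimestamp' directive in a chrony config text."""
    out = []
    for line in conf_text.splitlines():
        words = line.split()
        # Only touch active directives; lines already starting with '#'
        # have '#hwtimestamp' (or '#') as their first word and are kept.
        if words and words[0] == "hwtimestamp":
            out.append("#" + line)
        else:
            out.append(line)
    trailing_newline = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline


# Hypothetical configuration, for illustration only.
SAMPLE_CONF = "pool ntp.ubuntu.com iburst\nhwtimestamp *\n"
print(disable_hwtimestamp(SAMPLE_CONF), end="")
```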

  ---

  $ modinfo qede
  filename:       /lib/modules/4.15.0-72-generic/kernel/drivers/net/ethernet/qlogic/qede/qede.ko
  version:        8.10.10.21
  license:        GPL
  description:    QLogic FastLinQ 4 Ethernet Driver
  srcversion:     D5EC89D815FC81B973EE9F0
  alias:          pci:v1077d8090sv*sd*bc*sc*i*
  alias:          pci:v1077d8070sv*sd*bc*sc*i*
  alias:          pci:v1077d1664sv*sd*bc*sc*i*
  alias:          pci:v1077d1656sv*sd*bc*sc*i*
  alias:          pci:v1077d1654sv*sd*bc*sc*i*
  alias:          pci:v1077d1644sv*sd*bc*sc*i*
  alias:          pci:v1077d1636sv*sd*bc*sc*i*
  alias:          pci:v1077d1666sv*sd*bc*sc*i*
  alias:          pci:v1077d1634sv*sd*bc*sc*i*
  depends:        ptp,qed
  retpoline:      Y
  intree:         Y
  name:           qede
  vermagic:       4.15.0-72-generic SMP mod_unload
  signat:         PKCS#7
  signer:
  sig_key:
  sig_hashalgo:   md4
  parm:           debug: Default debug msglevel (uint)

  
  $ uname -a
  Linux dcn1-clm-inf-1 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  $ lspci | grep -i ether
  19:00.0 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
  19:00.1 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
  19:00.2 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)
  19:00.3 Ethernet controller: QLogic Corp. FastLinQ QL41000 Series 10/25/40/50GbE Controller (rev 02)

  
  # perf report snippet:

    Children      Self  Command       Shared Object
  -   44.76%     0.00%  kworker/16:5  [kernel.kallsyms]
       ret_from_fork
     - kthread
        - 44.74% worker_thread
           - 44.57% process_one_work
              - 42.67% qede_ptp_task
                 - 38.86% qed_ptp_hw_read_tx_ts
                      qed_rd
                 - 3.03% queue_work_on
                    - 2.06% __queue_work
                       - 0.68% get_work_pool
                          - 0.61% radix_tree_lookup
                               __radix_tree_lookup
                0.50% set_work_pool_and_clear_pending

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1855409/+subscriptions



[Group.of.nepali.translators] [Bug 1800566] Re: Make reset_devices parameter default for kdump and decouple kdump systemd service from the KDUMP_CMDLINE_APPEND

2019-12-12 Thread Guilherme G. Piccoli
** Also affects: makedumpfile (Ubuntu Focal)
   Importance: High
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: In Progress

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1800566

Title:
  Make reset_devices parameter default for kdump and decouple kdump
  systemd service from the KDUMP_CMDLINE_APPEND

Status in makedumpfile package in Ubuntu:
  In Progress
Status in makedumpfile source package in Trusty:
  Won't Fix
Status in makedumpfile source package in Xenial:
  Confirmed
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Cosmic:
  Won't Fix
Status in makedumpfile source package in Disco:
  Confirmed
Status in makedumpfile source package in Eoan:
  In Progress
Status in makedumpfile source package in Focal:
  In Progress

Bug description:
  [Impact]

  * By default, kdump does not configure the crash kernel to perform a
  device reset, i.e. it does not pass the "reset_devices" parameter.
  Also, the systemd service "kdump-tools-dump" is tightly coupled with
  KDUMP_CMDLINE_APPEND, and it shouldn't be, to prevent user confusion.

  * The kernel has the "reset_devices" parameter that drivers can opt in
  to, performing special activity when this parameter is parsed from
  the command line. For example, in kdump kernels it hints to the
  drivers that they are booting from a non-healthy condition and need
  to issue some form of reset to the adapter, such as clearing DMA
  mappings in their firmware. Current users (kernel v5.2) are: aacraid,
  hpsa, ipr, megaraid_sas, mpt3sas, smartpqi, xenbus.

  This should be enabled by default in the kdump config file, so that
  it is added to the kdump kernel command line for all versions.

  * The systemd service "kdump-tools-dump" is ultimately responsible for
  triggering the execution of the makedumpfile tool. Kdump on Xenial+
  releases relies on systemd as the init system, so this service is the
  way to trigger the kdump mechanism. Currently it is configured like
  any other parameter in KDUMP_CMDLINE_APPEND, meaning that if the user
  decides to change the line, they need to remember to add the systemd
  service back. It's not really a parameter that should be easily
  manipulated in the kdump line, since there's no use for it except to
  instruct systemd to load kdump; the only reasonable case for removing
  it is to debug kdump itself.

  
  [Test Case]

  1) Deploy a Disco VM e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test` and check for the 'reset_devices' parameter:

  $ kdump-config test
  ...
  kexec command to be used:
    /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" /var/lib/kdump/vmlinuz

  Also, by changing KDUMP_CMDLINE_APPEND we can see that
  "systemd.unit=kdump-tools.service" is removed.
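
  The check in step 3 can also be scripted. The sketch below, in Python,
  extracts the --command-line argument from the `kdump-config test`
  output and looks for a given parameter. The helper names are
  assumptions made for illustration, and the sample text is the output
  quoted in this test case.

```python
import re
from typing import List


def crash_cmdline_params(kdump_test_output: str) -> List[str]:
    """Return the parameters inside --command-line="..." or [] if it is absent."""
    match = re.search(r'--command-line="([^"]*)"', kdump_test_output)
    if match is None:
        return []
    # split() also copes with a command line that was wrapped over several lines.
    return match.group(1).split()


def has_param(kdump_test_output: str, name: str) -> bool:
    """True if `name` appears as a bare flag or as a name=value pair."""
    return any(
        p == name or p.startswith(name + "=")
        for p in crash_cmdline_params(kdump_test_output)
    )


SAMPLE = (
    "kexec command to be used:\n"
    '  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic '
    "root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 "
    'systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" '
    "/var/lib/kdump/vmlinuz\n"
)

print(has_param(SAMPLE, "reset_devices"))
print(has_param(SAMPLE, "systemd.unit"))
```

  For the sample output quoted in this test case, the first check fails
  and the second succeeds, since that command line carries
  "systemd.unit=kdump-tools.service" but not "reset_devices".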

  
  [Regression Potential]

  The regression potential is low, since it doesn't need any changes in
  makedumpfile code and we're only adding a parameter to the crash
  kernel command line. The risks are related to bad kernel behavior
  when using "reset_devices", for instance if a driver has bugs in this
  path. It's considered safer to have the option (and this way prevent
  problems when booting an unhealthy kernel with potentially stuck DMAs
  in the devices) than not to have it.

  Regarding the other change, about the systemd service, it'll only
  affect users that are debugging kdump itself and it has no known
  regression potential.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1800566/+subscriptions



[Group.of.nepali.translators] [Bug 1855409] Re: qede driver causes 100% CPU load

2019-12-17 Thread Guilherme G. Piccoli
** Changed in: linux (Ubuntu Focal)
   Status: Incomplete => Fix Released

** Changed in: linux (Ubuntu Eoan)
   Status: Incomplete => Fix Released

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1855409

Title:
  qede driver causes 100% CPU load

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Invalid
Status in linux source package in Bionic:
  Confirmed
Status in linux source package in Disco:
  Confirmed
Status in linux source package in Eoan:
  Fix Released
Status in linux source package in Focal:
  Fix Released

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1855409/+subscriptions



[Group.of.nepali.translators] [Bug 1816743] Re: Add systemd's kdump service command-line regardless if user provides or not KDUMP_CMDLINE_APPEND

2019-12-20 Thread Guilherme G. Piccoli
I'm un-marking this as a duplicate: LP #1800566 is being worked only for
the reset_devices portion, so I'm decoupling both bugs so that we can
work on this one soon-ish.
Cheers,


Guilherme

** This bug is no longer a duplicate of bug 1800566
   Make reset_devices parameter default for kdump and decouple kdump
   systemd service from the KDUMP_CMDLINE_APPEND

** Also affects: makedumpfile (Ubuntu Eoan)
   Importance: Undecided
   Status: New

** Also affects: makedumpfile (Ubuntu Focal)
   Importance: Undecided
 Assignee: Guilherme G. Piccoli (gpiccoli)
   Status: Confirmed

** Changed in: makedumpfile (Ubuntu Cosmic)
   Status: Confirmed => Won't Fix

** Changed in: makedumpfile (Ubuntu Disco)
   Status: Confirmed => Won't Fix

** Changed in: makedumpfile (Ubuntu Eoan)
 Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)

** Changed in: makedumpfile (Ubuntu Eoan)
   Status: New => Confirmed

** Changed in: makedumpfile (Ubuntu Focal)
   Importance: Undecided => Low

** Changed in: makedumpfile (Ubuntu Eoan)
   Importance: Undecided => Low

** Changed in: makedumpfile (Ubuntu Disco)
   Importance: Undecided => Low

** Changed in: makedumpfile (Ubuntu Cosmic)
   Importance: Undecided => Low

** Changed in: makedumpfile (Ubuntu Bionic)
   Importance: Undecided => Low

** Changed in: makedumpfile (Ubuntu Xenial)
   Importance: Undecided => Low

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1816743

Title:
  Add systemd's kdump service command-line regardless if user provides
  or not KDUMP_CMDLINE_APPEND

Status in makedumpfile package in Ubuntu:
  Confirmed
Status in makedumpfile source package in Xenial:
  Confirmed
Status in makedumpfile source package in Bionic:
  Confirmed
Status in makedumpfile source package in Cosmic:
  Won't Fix
Status in makedumpfile source package in Disco:
  Won't Fix
Status in makedumpfile source package in Eoan:
  Confirmed
Status in makedumpfile source package in Focal:
  Confirmed

Bug description:
  Since the Xenial release, Ubuntu relies on systemd as its init system;
  there's a kdump service to prevent some other services from starting
  unnecessarily in the kdump environment.

  Problem: if we add something to KDUMP_CMDLINE_APPEND, the entry for
  the kdump service, "systemd.unit=kdump-tools.service", is removed from
  the command line. The user needs to add it back manually, which seems
  highly error-prone.

  We propose here to decouple the "systemd.unit=kdump-tools.service"
  parameter from KDUMP_CMDLINE_APPEND, so if the user really wants to
  remove this option, they should use KDUMP_CMDLINE instead.
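
  The proposed decoupling can be pictured with the Python sketch below.
  It is a simplified model and not the kdump-tools implementation: the
  function name and its arguments are hypothetical, and the way
  KDUMP_CMDLINE would let a user drop the option is deliberately not
  modeled. The point is only that the unit parameter is added
  independently of whatever the user puts in KDUMP_CMDLINE_APPEND.

```python
KDUMP_UNIT_PARAM = "systemd.unit=kdump-tools.service"


def build_crash_cmdline(base_cmdline: str, cmdline_append: str) -> str:
    """Join the base command line and the user's append string, then make
    sure the kdump systemd unit parameter is present exactly once."""
    params = base_cmdline.split() + cmdline_append.split()
    # Add the unit parameter regardless of what the user appended,
    # unless some systemd.unit= setting is already there.
    if not any(p.startswith("systemd.unit=") for p in params):
        params.append(KDUMP_UNIT_PARAM)
    return " ".join(params)


# A user-edited append string that forgot the systemd unit entry.
print(build_crash_cmdline("ro console=ttyS0", "irqpoll nousb"))
```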

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1816743/+subscriptions
