Thanks for that check, David. It is good news that we are finally
getting a handle on this.

In git, the current "master" branch (4bb3583c) is at 24.0-1, which is in
neither Debian nor Ubuntu.
Due to the freezes, both distros are on 22.1-1 from branch "debian/master"
(e5651d01) for now.

v22.1 is just a minimal set of fixes on top of v22:
 * Backport fixes:
   * ibacm: Print correct pkey
   * libhns: Bugfix for allocating and freeing sq db buffer
   * verbs: Fix pingpong buffer validation
   * ABI Files

Thanks to your test we know that the fix for your case is most likely
somewhere in v22..head on the master branch.

I also checked whether v24 from 4bb3583c has any major changes in the
packaging itself that could explain it, but no: there are only minor
changes for pyverbs. That is good, as it means one of the actual code
changes is most likely the fix you are looking for.

There are fixes queued in the "stable-v22" branch already (but not yet 
released):
d05900db libhns: Bugfix for filtering zero length sge
f3bb8968 buildlib: Ensure stanza is properly sorted
e02238ea debian: Create empty pyverbs package for builds without pyverbs
90886054 verbs: Fix attribute returning
c7c842a3 build: Fix pyverbs build issues on Debian
8043035f travis: Change SuSE package target due to Travis CI failures
a4bbfc33 verbs: Avoid inline send when using device memory in rc_pingpong
4b38d3cd mlx5: Use copy loop to read from device memory
d38817ea verbs: clear cmd buffer when creating indirection table
9dcfa6cd libhns: Bugfix for using buffer length
23e3a5da mlx5: Fix incorrect error handling when SQ wqe count is 0

With some luck (still unlikely), 23e3a5da might already be your fix, and it
is already backported.
OTOH, none of the other changes in v22..head is obviously the fix: none of
the ib/mlx fixes describes your problem exactly, and no change is made to
uverbs_request_finish or ib_uverbs_ex_create_rwq_ind_table.

Since you fortunately have a setup that can rebuild and retest, could you
test with:
- an rdma-core build on 23e3a5da (the probable fix on stable-v22)
- an rdma-core build on d05900db (current head on stable-v22)
- if neither of the above quick checks helps, we'd need a bisect a la:
  $ git bisect start --term-new fixed --term-old broken
  $ git bisect broken v22
  $ git bisect fixed 4bb3583c
  Bisecting: 86 revisions left to test after this (roughly 7 steps)
  [85cf1829e94585ceb38c5c221b49305866fb4344] Merge pull request #472 from oulijun/lijun-rdma-core
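
For intuition on why a bisect over this range needs only a handful of
rebuilds, here is a minimal Python sketch of the halving search behind a
"broken"/"fixed" bisect. It assumes a linear history in which every revision
before the fix is broken and every revision from the fix onward is fixed;
real history contains merges, which git handles with its own logic. The
revision list, the is_fixed callback and the index 120 are hypothetical
stand-ins for an actual rebuild-and-retest of rdma-core.

```python
import math


def find_first_fixed(revisions, is_fixed):
    """Return (first fixed revision, number of tests run).

    Assumes revisions are ordered oldest to newest, revisions[0] is known
    to be broken and revisions[-1] is known to be fixed (the roles that
    v22 and 4bb3583c play in the bisect above).
    """
    lo, hi = 0, len(revisions) - 1  # lo: known broken, hi: known fixed
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1  # one rebuild + retest per iteration
        if is_fixed(revisions[mid]):
            hi = mid  # the fix is at mid or earlier
        else:
            lo = mid  # the fix is after mid
    return revisions[hi], tests


def max_tests(num_revisions):
    """Upper bound on tests for a linear range with known endpoints."""
    return math.ceil(math.log2(num_revisions - 1))


if __name__ == "__main__":
    # Hypothetical linear history of 175 revisions; the fix lands at index 120.
    history = list(range(175))
    first, tests = find_first_fixed(history, lambda rev: rev >= 120)
    print(f"first fixed revision: {first}, tests run: {tests}")
    print(f"upper bound on tests: {max_tests(len(history))}")
```

Each iteration corresponds to one "git bisect broken" or "git bisect fixed"
answer, so the number of rebuilds grows only logarithmically with the number
of revisions between v22 and 4bb3583c.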

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1823836

Title:
  dpdk app is reporting: net_mlx5: probe of PCI device xxxx aborted
  after encountering an error: Unknown error -95
