Executive summary for kernel team:
Both libvirt and Nova fail on the Cavium Thunder X NIC because their
attempts to read the sysfs node phys_port_id of its virtual functions are
rejected with "Operation not supported".
Example:
'/sys/devices/pci0003:00/0003:00:00
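A minimal sketch of how a caller could tolerate that failure, assuming the read fails with errno EOPNOTSUPP as the summary describes. The function name and the `open_fn` parameter are hypothetical and exist only so the example can be exercised without the hardware; this is not how libvirt or Nova actually handle it.

```python
import errno


def read_phys_port_id(path, open_fn=open):
    """Return the phys_port_id contents, or None if the driver rejects the read.

    `open_fn` is a hypothetical hook for testing; by default the real
    sysfs file at `path` is opened.
    """
    try:
        with open_fn(path) as f:
            return f.read().strip()
    except OSError as exc:
        # "Operation not supported": treat as "no port id available".
        if exc.errno == errno.EOPNOTSUPP:
            return None
        raise  # any other error is unexpected and is passed on
```

Returning None keeps "the attribute is unsupported" distinct from "the device is missing", which would still raise.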
** Changed in: charm-nova-compute
Assignee: (unassigned) => Frode Nordahl (fnordahl)
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1771662
Title:
libvirtError: Node device not found: no node de
1) The 'No compute node record for host phanpy:
ComputeHostNotFound_Remote: Compute host phanpy could not be found.'
message is benign; it appears on first start of the `nova-compute`
service. It keeps appearing in the log here because the service fails to
register available resources. See 3)
2
** Attachment added: "libvirt-debug.log"
https://bugs.launchpad.net/charm-nova-compute/+bug/1771662/+attachment/5157735/+files/libvirt-debug.log
** Changed in: charm-nova-compute
Status: Incomplete => Invalid
** Also affects: nova
Importance: Undecided
Status: New
To be clear, on our lab machines (gigabyte arm64), we don't observe this
issue with Bionic + Queens, hence the request to try to triage on the
specific kit involved. Thanks!
Incomplete in libvirt pending debug from live system by openstack team.
** Changed in: libvirt (Ubuntu)
Status: New => Incomplete
Escalated due to delay in triage and fix given our contract with ARM
@raharper - I concur, there is a workflow gap in the nova-compute charm
with regard to hypervisor registration success with nova, and I've
raised a separate bug to address that generically. However, that won't
fix this bug, it will just make it more visible by blocking the juju
charm unit and juju
In order to make progress from the charm front, I would need access to
at least one machine with the hardware which is specific to this bug,
plus two adjacent machines for control/data plane. Can we arrange that
access for openstack charms engineering?
** Changed in: charm-nova-compute
Sta
I'm not certain we can rule out the charm; the observed behavior is
that the compute nodes do not get enrolled.
Certainly the lack of a nova-compute node being registered has some
touch point to the charms.
The follow-up I think comes from the Openstack team to walk through
where the charm leaves
This defect seems to have stalled somewhat. Is there more information we
can gather for this to move forward again?
If this is a bug on the OpenStack side, it's not in the charm. It would
be in nova proper.
** Changed in: charm-nova-compute
Status: New => Opinion
After comparing the sysfs data, I don't see any differences w.r.t the
physical paths in sysfs for the thunder nic.
I wonder if there is something that detects "xenial" and does one
thing, vs "bionic" despite the xenial host using the same kernel
level.
The apparmor denied on the namespaces only sh
ls -alR /sys on bionic http://paste.ubuntu.com/p/nrxyRGP3By/
The bionic kernel has also bumped:
Linux aurorus 4.15.0-22-generic #24-Ubuntu SMP Wed May 16 12:14:36 UTC
2018 aarch64 aarch64 aarch64 GNU/Linux
On Tue, May 22, 2018 at 7:10 PM, Ryan Harper <1771...@bugs.launchpad.net> wrote:
Looks like the ls -aLR contains more data; we can compare bionic.
On Tue, May 22, 2018 at 6:53 PM, Jason Hobbs wrote:
cd /sys/bus/pci/devices && grep -nr . *
xenial:
http://paste.ubuntu.com/p/F5qyvN2Qrr/
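A rough sketch of how two such sysfs dumps could be compared programmatically instead of by eye. The function names are hypothetical. It assumes it is pointed at a real device directory (for example the `os.path.realpath` of an entry under /sys/bus/pci/devices), since `os.walk` does not descend into symlinked directories by default.

```python
import os


def snapshot_attrs(root):
    """Map relative path -> contents (or an error marker) for files under root."""
    snap = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            try:
                with open(path, "rb") as f:
                    data = f.read(4096)
                snap[rel] = data.decode("utf-8", "replace").strip()
            except OSError as exc:
                # Unreadable attributes are recorded rather than skipped,
                # so a read that fails on only one host shows up in the diff.
                snap[rel] = "<error: %s>" % exc.strerror
    return snap


def diff_snapshots(a, b):
    """Return sorted relative paths that differ or exist on one side only."""
    return sorted(k for k in set(a) | set(b) if a.get(k) != b.get(k))
```

Snapshots taken on the xenial and bionic hosts could be serialized (for instance as JSON) and diffed on a third machine.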
On Tue, May 22, 2018 at 5:27 PM, Jason Hobbs wrote:
Do you really want a tar? How about ls -alR? xenial:
http://paste.ubuntu.com/p/wyQ3kTsyBB/
On Tue, May 22, 2018 at 5:14 PM, Jason Hobbs wrote:
ok; looks like that 4.15.0-22-generic just released and wasn't what I
used in the first reproduction... I doubt that's it.
On Tue, May 22, 2018 at 4:58 PM, Ryan Harper <1771...@bugs.launchpad.net> wrote:
Comparing the kernel logs, on Xenial, the second nic comes up:
May 22 15:00:27 aurorus kernel: [ 24.840500] IPv6:
ADDRCONF(NETDEV_UP): enP2p1s0f2: link is not ready
May 22 15:00:27 aurorus kernel: [ 25.472391] thunder-nicvf
0002:01:00.2 enP2p1s0f2: Link is Up 1 Mbps Full duplex
But on bio
marked new on nova-compute-charm due to rharper's comment #18, and new
on libvirt because I've posted all the requested logs now.
@rharper, here are the logs you requested from the xenial deploy.
** Attachment added: "xenial-logs-1771662.tgz"
https://bugs.launchpad.net/charm-nova-compute/+bug/1771662/+attachment/5142976/+files/xenial-logs-1771662.tgz
** Changed in: charm-nova-compute
Status: Invalid => New
** Ch
Christian, thanks for digging in. Yes, I really just set up base
openstack and hit this condition. I'm not doing anything to set up
devices as passthrough or anything along those lines, and I'm not trying
to start instances.
Newly deployed Cavium System with 18.04 to get my own view onto this
(without openstack/charms in the way)
1. start a basic guest
$ sudo apt install uvtool-libvirt qemu-efi-aarch64
$ uvt-simplestreams-libvirt --verbose sync --source
http://cloud-images.ubuntu.com/daily arch=arm64 label=dail
Thanks for the logs.
I generally don't see anything *fatal* to libvirt. In the nova logs, I
can see that virsh capabilities returns host information. It certainly
is failing to find the VFs on the SRIOV device; it's not clear if that's
because the device is misbehaving (we can see the kernel eve
all of /var/log and /etc from the bionic deploy.
** Attachment added: "bionic-var-log-and-etc.tgz"
https://bugs.launchpad.net/charm-nova-compute/+bug/1771662/+attachment/5141000/+files/bionic-var-log-and-etc.tgz
@rharper here are the logs you asked for from the bionic deploy
** Attachment added: "bionic-logs.tgz"
https://bugs.launchpad.net/charm-nova-compute/+bug/1771662/+attachment/5140998/+files/bionic-logs.tgz
Some package level deltas that may be relevant:
ii linux-firmware 1.173
ii linux-firmware 1.157.18
ii pciutils 1:3.3.1-1.1ubuntu1.2
ii pciutils 1:3.5.2-1ubuntu
libvirt0:arm64 4.0.0-1ubuntu7~cloud0
libvirt0:arm64 4.0.0-1ubuntu8
Les
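Deltas like the ones listed above could be collected mechanically from `dpkg -l` output on each host. The following is only a sketch with hypothetical function names. It assumes the usual `ii <name> <version> ...` column layout and does no Debian version ordering, only equality comparison.

```python
def parse_dpkg_list(text):
    """Map package name -> version for installed ('ii') lines of `dpkg -l` output."""
    versions = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "ii":
            versions[fields[1]] = fields[2]
    return versions


def version_deltas(a, b):
    """Return {name: (version_a, version_b)} for packages whose versions differ.

    A package present on one side only is reported with None for the other.
    """
    deltas = {}
    for name in sorted(set(a) | set(b)):
        va, vb = a.get(name), b.get(name)
        if va != vb:
            deltas[name] = (va, vb)
    return deltas
```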
@rharper still working on getting the other stuff you've asked for, but here is
the uname -a output from xenial vs bionic:
http://paste.ubuntu.com/p/rJDpK5SyW9/
To make it clearer: the hardware SRIOV device is different than
normal.
TL;DR: this special device has VFs that have NO PF associated, and
software doesn't understand this.
Though per comment #3; it seems odd that a Xenial/Queens with the same
kernel (HWE) works OK. So some tracing in libvirt/nova
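One way to check for that condition from sysfs, as a sketch only. It assumes that "no PF associated" means the VF's `physfn` link points at a PCI function that exposes no network interface under `net/`. That reading is an interpretation, not something the thread states, and the function name is hypothetical.

```python
import os


def vfs_without_pf_netdev(devices_dir):
    """Return PCI addresses of VFs whose parent PF exposes no net/ interface.

    `devices_dir` is expected to look like /sys/bus/pci/devices: one entry
    per PCI address, where a VF carries a `physfn` symlink to its PF.
    """
    orphans = []
    for addr in sorted(os.listdir(devices_dir)):
        physfn = os.path.join(devices_dir, addr, "physfn")
        if not os.path.islink(physfn):
            continue  # not a virtual function
        pf_net = os.path.join(physfn, "net")
        if not os.path.isdir(pf_net) or not os.listdir(pf_net):
            orphans.append(addr)
    return orphans
```

Running it against /sys/bus/pci/devices on both the xenial and bionic hosts would show whether the two releases present the VFs differently.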
And for the xenial deployment version, can we get what's in
/etc/network/interfaces* (including the .d)?
I'm generally curious w.r.t what interfaces are managed by the OS, and
which ones are being delegated to the guests.
Please capture:
1) cloud-init collect-logs (writes cloud-init.tar to $CWD)
2) the journal /var/log/journal
3) /etc/netplan and /run/systemd
4) /etc/udev/rules.d
steve captured what I meant in #8 better than I did: 17:46 < slangasek>
one could as accurately say "I'm suspicious this is related to us
replacing the whole networking stack in Ubuntu" ;-)
> I'm suspicious of netplan here.
netplan is only the messenger here, between cloud-init+juju and
networkd. Can you show the complete netplan yaml as it's been laid down
on the system in question?
This looks like it is specific to this hardware and the way it does VFs
and PFs, so I'm removing field-high.
given it works with the same libvirt and kernel on 16.04 but not 18.04,
I'm suspicious of netplan here.
The deploy works fine with juju 2.4 beta 2 and xenial/queens.
package versions: http://paste.ubuntu.com/p/PF7Jb7gxnX/
we do see this in nova-compute.log, but it's not fatal:
http://paste.ubuntu.com/p/Dh4ZGVTtH8/
Further information: Using juju 2.4 beta2 I was able to deploy magpie on
bionic in lxd and baremetal via MAAS.
We think this is an issue in libvirt, related to how it handles the
sriov hardware in these machines.
** Changed in: charm-nova-compute
Status: New => Invalid
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1771662
Title:
libvirtError: Node device not found: no node device with matching name
T
** Description changed:
After deploying openstack on arm64 using bionic and queens, no
hypervisors show up. On my compute nodes, I have an error like:
2018-05-16 19:23:08.165 282170 ERROR nova.compute.manager libvirtError:
Node device not found: no node device with matching name
'ne
What puzzles me is Xenial-Queens working and Bionic showing issues.
It looks as though libvirt is unable to cope with this type of HW, yet
it works on one release and not the other ...
Yet versions are:
- xenial-queens
libvirt 4.0.0-1ubuntu7~cloud0
qemu 1:2.11+dfsg-1ubuntu7~cloud0
- b