Racking my brains over this one, would appreciate any pointers...

Setup: Small test deployment with just 3 compute nodes, Queens on Ubuntu
Bionic. The first compute node is an NFS server for
/var/lib/nova/instances, and the other compute nodes mount that as NFS
clients.

Problem: Sometimes, when launching an instance which is scheduled to one of
the client nodes, nova-compute (in imagebackend.py) gets Permission Denied
(errno 13) when calling utime to touch the timestamp on the instance file.
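For reference, here is a minimal sketch of how the failing call could be exercised outside nova. It assumes the touch boils down to os.utime(path, None); touch_with_retries and its parameters are illustrative names, not nova's code, and the defaults simply mirror a retry of 5 attempts, 0.5s apart.

```python
import os
import time


def touch_with_retries(path, attempts=5, delay=0.5):
    """Try os.utime(path, None) up to `attempts` times.

    Returns (True, None) on success, or (False, errno) from the
    last failed attempt.
    """
    last_errno = None
    for attempt in range(attempts):
        try:
            # times=None sets atime and mtime to the current time
            os.utime(path, None)
            return True, None
        except OSError as exc:
            last_errno = exc.errno
            # no point sleeping after the final attempt
            if attempt < attempts - 1:
                time.sleep(delay)
    return False, last_errno
```

Run as root on a client node against a file under /var/lib/nova/instances, a result of (False, 13) would show the EACCES independently of nova-compute and privsep.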

Through various bits of debugging and hackery, I've established that:

- the problem never seems to occur when the utime call is the one that
bootstraps the privsep helper, but it does occur quite frequently on later
calls

- when the problem occurs, retrying doesn't help (I tried 5 times, 0.5s
apart)

- the instance file does exist, and is owned by root with read/write
permission for root

- the privsep helper is running as root

- the privsep helper receives and executes the request - so it's not a
problem with communication between nova-compute and the helper

- root is uid 0 on both NFS server and client

- NFS setup does not have the root_squash option

- there is some AppArmor setup, on both client and server, and I haven't
yet worked out whether that might be relevant.
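One further check that might be worth making: a process running as uid 0 can still hold a reduced capability set, so "running as root" alone does not guarantee permission overrides. Below is a rough sketch that decodes the CapEff line from /proc/<pid>/status. The helper names are hypothetical, and only the four file-permission capability bits (numbered as in linux/capability.h) are listed.

```python
# Bit positions from linux/capability.h; only file-permission ones are listed.
FILE_CAPS = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    2: "CAP_DAC_READ_SEARCH",
    3: "CAP_FOWNER",
}


def effective_file_caps(status_text):
    """Return names of file-related capabilities set in the CapEff mask."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The value is a hexadecimal bitmask, e.g. 0000003fffffffff
            mask = int(line.split()[1], 16)
            return [name for bit, name in sorted(FILE_CAPS.items())
                    if mask & (1 << bit)]
    raise ValueError("no CapEff line found")


def read_status(pid):
    """Read /proc/<pid>/status for a running process (Linux only)."""
    with open("/proc/%d/status" % pid) as f:
        return f.read()
```

Something like effective_file_caps(read_status(helper_pid)), with helper_pid being the pid of the privsep helper, would show whether that process holds CAP_DAC_OVERRIDE and CAP_FOWNER at the time of the failure.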

Any ideas?

Many thanks,
      Neil
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
