Thank you Fajar.

I have made some progress, but would still value help (fresh detail at the bottom).


On 14/08/18 19:32, Fajar A. Nugraha wrote:
On Tue, Aug 14, 2018 at 1:54 PM, Tony Lewis <t...@lewistribe.com> wrote:

    Apologies in advance for the bump, but does anyone have any
    insights on this?


Did you previously install lxd from source rather than from the snap?

It turns out there were some residual config files left over from the package-based install.  No binaries, just various files in /etc.  I cleaned them up.


What does /var/snap/lxd/common/lxd/logs/lxd.log say? Does it have any error?

Not much of interest that I can see.  Here it is from a reboot today:

lvl=info msg="LXD 3.3 is starting in normal mode" path=/var/snap/lxd/common/lxd t=2018-08-15T01:25:20+0000
lvl=info msg="Kernel uid/gid map:" t=2018-08-15T01:25:20+0000
lvl=info msg=" - u 0 0 4294967295" t=2018-08-15T01:25:20+0000
lvl=info msg=" - g 0 0 4294967295" t=2018-08-15T01:25:20+0000
lvl=info msg="Configured LXD uid/gid map:" t=2018-08-15T01:25:20+0000
lvl=info msg=" - u 0 1000000 1000000000" t=2018-08-15T01:25:20+0000
lvl=info msg=" - g 0 1000000 1000000000" t=2018-08-15T01:25:20+0000
lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." t=2018-08-15T01:25:20+0000
lvl=info msg="Initializing local database" t=2018-08-15T01:25:20+0000
lvl=info msg="Initializing database gateway" t=2018-08-15T01:25:20+0000
address= id=1 lvl=info msg="Start database node" t=2018-08-15T01:25:20+0000
lvl=info msg="Raft: Restored from snapshot 1-23922-1534296032171" t=2018-08-15T01:25:20+0000
lvl=info msg="Raft: Initial configuration (index=1): [{Suffrage:Voter ID:1 Address:0}]" t=2018-08-15T01:25:20+0000
lvl=info msg="Raft: Node at 0 [Leader] entering Leader state" t=2018-08-15T01:25:20+0000
lvl=info msg="LXD isn't socket activated" t=2018-08-15T01:25:20+0000
lvl=info msg="Starting /dev/lxd handler:" t=2018-08-15T01:25:20+0000
lvl=info msg=" - binding devlxd socket" socket=/var/snap/lxd/common/lxd/devlxd/sock t=2018-08-15T01:25:20+0000
lvl=info msg="REST API daemon:" t=2018-08-15T01:25:20+0000
lvl=info msg=" - binding Unix socket" socket=/var/snap/lxd/common/lxd/unix.socket t=2018-08-15T01:25:20+0000
lvl=info msg="Initializing global database" t=2018-08-15T01:25:20+0000
lvl=info msg="Initializing storage pools" t=2018-08-15T01:25:21+0000
lvl=info msg="Initializing networks" t=2018-08-15T01:25:21+0000
lvl=info msg="Loading configuration" t=2018-08-15T01:25:22+0000
lvl=info msg="Connected to MAAS controller" t=2018-08-15T01:25:22+0000
lvl=info msg="Pruning expired images" t=2018-08-15T01:25:22+0000
lvl=info msg="Done pruning expired images" t=2018-08-15T01:25:22+0000
lvl=info msg="Updating instance types" t=2018-08-15T01:25:22+0000
lvl=info msg="Expiring log files" t=2018-08-15T01:25:22+0000
lvl=info msg="Done expiring log files" t=2018-08-15T01:25:22+0000
lvl=info msg="Updating images" t=2018-08-15T01:25:22+0000
lvl=info msg="Done updating images" t=2018-08-15T01:25:22+0000
lvl=warn msg="Unable to update backup.yaml at this time" name=backuptests t=2018-08-15T01:25:23+0000
lvl=info msg="Done updating instance types" t=2018-08-15T01:25:35+0000
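
For anyone checking a similar log, here is a minimal sketch of how I narrowed it down to the non-info lines. The function name lxd_log_problems is made up for illustration, and the level spellings other than "warn" are assumptions, since only "info" and "warn" appear in the log above.

```shell
# Hypothetical helper: print only non-info lines from a logfmt-style log on stdin.
# Only lvl=info and lvl=warn appear in the log above; the other level
# spellings (eror, error, crit) are assumptions.
lxd_log_problems() {
    grep -E '(^| )lvl=(warn|eror|error|crit)( |$)'
}
```

Usage would be something like: lxd_log_problems < /var/snap/lxd/common/lxd/logs/lxd.log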




My GUESS is that you have /usr/bin/lxd and /snap/bin/lxd, which interfere with each other. If that's not it, then my next guess is that there's probably some group issue, like https://github.com/lxc/lxd/issues/1861#issuecomment-206507631 . In any case lxd.log might have more info.

Thank you.  There is no lxd binary anywhere other than three snap versions, and only one of those is running.  The old service was still there, now removed, but it was showing failure because there was no binary to start.

Progress:

I know that lxd is starting, but my containers still don't start. When I try to stop the service I see the following in systemctl:

# systemctl stop snap.lxd.daemon
# systemctl status snap.lxd.daemon
● snap.lxd.daemon.service - Service for snap application lxd.daemon
   Loaded: loaded (/etc/systemd/system/snap.lxd.daemon.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Wed 2018-08-15 11:40:51 AEST; 2s ago
  Process: 6761 ExecStop=/usr/bin/snap run --command=stop lxd.daemon (code=exited, status=0/SUCCESS)
  Process: 5432 ExecStart=/usr/bin/snap run lxd.daemon (code=killed, signal=TERM)
 Main PID: 5432 (code=killed, signal=TERM)

Aug 15 11:40:47 server systemd[1]: Stopping Service for snap application lxd.daemon...
Aug 15 11:40:47 server /usr/bin/snap[6761]: cmd.go:105: DEBUG: restarting into "/snap/core/current/usr/bin/snap"
Aug 15 11:40:47 server snap[6777]: cmd.go:105: DEBUG: restarting into "/snap/core/current/usr/bin/snap"
Aug 15 11:40:47 server snap[6761]: error: no changes found
Aug 15 11:40:50 server snap[6761]: => Stop reason is: host shutdown
Aug 15 11:40:50 server snap[6761]: => Stopping LXD (with container shutdown)
Aug 15 11:40:50 server snap[6761]: lxd: error while loading shared libraries: liblxc.so.1: cannot open shared object file: No such file or directory
Aug 15 11:40:50 server snap[6761]: => Stopping LXCFS
Aug 15 11:40:51 server snap[5432]: => LXD is ready
Aug 15 11:40:51 server systemd[1]: Stopped Service for snap application lxd.daemon.

A key line is: lxd: error while loading shared libraries: liblxc.so.1: cannot open shared object file: No such file or directory

The library is present in what looks to be the right places in the snap directories, but not anywhere else:

# find /snap -name liblxc.so.1 -print
/snap/lxd/7651/lib/liblxc.so.1
/snap/lxd/7792/lib/liblxc.so.1
/snap/lxd/8011/lib/liblxc.so.1

But when being launched, the daemon does not attempt to load from the snap directories:

# strace -f -F -etrace=file /usr/bin/snap run --command=stop lxd.daemon 2>&1 | grep liblxc
[pid  4964] open("/lib/x86_64-linux-gnu/tls/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/lib/x86_64-linux-gnu/tls/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/lib/x86_64-linux-gnu/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/lib/x86_64-linux-gnu/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/x86_64-linux-gnu/tls/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/x86_64-linux-gnu/tls/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/x86_64-linux-gnu/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/x86_64-linux-gnu/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/lib/tls/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/lib/tls/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/lib/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/lib/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/tls/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/tls/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/x86_64/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  4964] open("/usr/lib/liblxc.so.1", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
lxd: error while loading shared libraries: liblxc.so.1: cannot open shared object file: No such file or directory
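
To make the search path easier to read, the strace output can be boiled down to just the directories that were tried. This is a rough sketch only: the function name searched_dirs is made up for illustration, and it assumes strace prints each path in double quotes as it does above.

```shell
# Hypothetical helper: read strace output on stdin and print each directory
# in which the loader looked for the library named in $1.
searched_dirs() {
    lib="$1"
    # Keep only lines where the library is the last component of a quoted
    # path, then strip everything except the directory part of that path.
    # Note: $lib is used as a sed pattern, so its dots match any character.
    grep -F "/$lib\"" | sed -n 's#.*"\(/[^"]*\)/'"$lib"'".*#\1#p'
}
```

Usage would be something like: strace -f -etrace=file /usr/bin/snap run --command=stop lxd.daemon 2>&1 | searched_dirs liblxc.so.1

None of the directories it prints for the trace above is under /snap/lxd.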

Strangely, even if I copy /snap/lxd/8011/lib/liblxc.so.1 into /lib, the file is not found (strace reports no such file or directory).  I can't explain this, and I've checked and rechecked this.

If I kill the daemon itself, I can restart it using systemctl and my containers will start.  However, I can neither gracefully stop containers (lxc stop <container> just hangs) nor gracefully stop lxd (same missing library error).

Any thoughts?

Tony

_______________________________________________
lxc-users mailing list
lxc-users@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-users
