I’ve just discovered a new failure on a different container too:

# lxc move host2:nexus host1:nexus
error: Error transferring container data: checkpoint failed:
(00.355457) Error (files-reg.c:422): Can't dump ghost file /usr/local/sonatype-work/nexus/tmp/jar_cache5838699621686145685.tmp of 1177738 size, increase limit
(00.355477) Error (cr-dump.c:1255): Dump files (pid: 22072) failed with -1
(00.357100) Error (cr-dump.c:1617): Dumping FAILED.
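
For context: as far as I understand it, CRIU calls a file a "ghost" when a process still holds it open after it has been unlinked, and it only embeds such files in the dump up to a size limit. The 1177738 bytes reported above is just over 1 MiB (1048576 bytes), which I believe is CRIU's default limit. Below is a minimal sketch for listing the deleted-but-open files of a process so they can be spotted before a move. The function name list_deleted_fds is made up for illustration and is not part of LXD or CRIU.

```shell
#!/bin/sh
# Hypothetical helper (not an LXD or CRIU command): print the targets of all
# file descriptors of process $1 that point at files which have been unlinked.
list_deleted_fds() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # readlink fails for descriptors that vanish or are unreadable; skip them.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            # The kernel appends " (deleted)" to the path of an unlinked file.
            *" (deleted)") printf '%s\n' "$target" ;;
        esac
    done
}
```

Run as root on the host, something like `list_deleted_fds 22072` should show the jar_cache temp file, assuming the process from the error above is still running under that PID.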




On 06/11/2015, 08:40, "lxc-users on behalf of Jamie Brown" 
<lxc-users-boun...@lists.linuxcontainers.org on behalf of 
jamie.br...@mpec.co.uk> wrote:

>Tycho,
>
>Thanks for your help.
>
>The kernels were in fact different versions, though I’m not sure how I got 
>into that state! So they’re now both running 3.19.0.
>
>Now, at least, I receive the same error when migrating in either direction:
>
># lxc move host2:test host1:test2
>error: Error transferring container data: restore failed:
>(00.008103)      1: Error (mount.c:2030): Can't mount at ./dev/.lxd-mounts: No such file or directory
>
># lxc move host1:test1 host2:test1
>error: Error transferring container data: restore failed:
>(00.008103) 1: Error (mount.c:2030): Can't mount at ./dev/.lxd-mounts: No such file or directory
>
>
>
>
>The backing store is the default (directory-based). However, on host2 the 
>/var/lib/lxd/containers directory is a symlink to an ext3 mount, whereas on 
>host1 the containers are on ext4. Is that likely to cause any issues?
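
To rule the filesystem difference in or out, it may help to confirm what each host actually reports for the containers directory. This is a minimal sketch, assuming a df that accepts -P and -T (GNU coreutils does); fs_type is a name made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem type that backs path $1.
# df resolves symlinks, so a symlinked containers directory reports the
# filesystem it really lives on. -P keeps each entry on a single line.
fs_type() {
    df -PT "$1" | awk 'NR == 2 { print $2 }'
}
```

Running `fs_type /var/lib/lxd/containers` on host1 and host2 should then show whether the two really differ (ext4 versus ext3 in the setup described above).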
>
>The strange thing is that, seemingly at random, the live move DOES sometimes 
>succeed. I’ve definitely migrated a clean, running container about 3 times 
>from host2 to host1, but when I try again with a new container it fails. This 
>even worked before I updated the kernel. However, I can’t find specific steps 
>to reproduce the successful move, and I’ve never succeeded in migrating the 
>same container back from host1 to host2 without stopping it. This is what 
>concerns me most: I would expect either consistent failure or consistent 
>success. I keep gaining false hope, because the first time I migrated a 
>container after updating the kernel it worked, so I thought the problem was 
>solved. But then I couldn’t migrate another :(
>
>-- Jamie
>
>
>
>On 05/11/2015, 16:58, "lxc-users on behalf of Tycho Andersen" 
><lxc-users-boun...@lists.linuxcontainers.org on behalf of 
>tycho.ander...@canonical.com> wrote:
>
>>Hi Jamie,
>>
>>Thanks for trying it out.
>>
>>On Thu, Nov 05, 2015 at 11:39:43AM +0000, Jamie Brown wrote:
>>> Hello again,
>>> 
>>> Oddly, I've now re-installed the old server and configured it identically 
>>> to before (except that it now uses RAID), and when I tried migrating a 
>>> container back I got a different failure:
>>> 
>>> # lxc move host2:test host1:test
>>> 
>>> error: Error transferring container data: restore failed:
>>> (00.007414)      1: Error (mount.c:2030): Can't mount at ./dev/.lxd-mounts: No such file or directory
>>> (00.026443) Error (cr-restore.c:1939): Restoring FAILED.
>>> 
>>> The container appears in the remote container list whilst moving, but then 
>>> after failure it is deleted and it is in the STOPPED state on the source 
>>> host.
>>
>>Right: by the time the restore failed, the dump had already stopped the
>>container, which is why it ended up in the STOPPED state on the source.
>>What we should really do is leave it frozen after the dump and only kill
>>it once the restore succeeds. Hopefully that's something I can implement
>>this cycle.
>>
>>As for the actual error, it sounds like the target LXD didn't have
>>shmounts but the source one did. Are they using different backing
>>stores? Which version of LXD is each host running?
>>
>>> 
>>> Here's the output from the log, not sure how much is relevant to the 
>>> migration attempt.
>>> 
>>> # lxc info --show-log test
>>> ...
>>> lxc 1446723150.396 DEBUG    lxc_start - start.c:__lxc_start:1210 - unknown exit status for init: 9
>>> lxc 1446723150.396 DEBUG    lxc_start - start.c:__lxc_start:1215 - Pushing physical nics back to host namespace
>>> lxc 1446723150.396 DEBUG    lxc_start - start.c:__lxc_start:1218 - Tearing down virtual network devices used by container
>>> lxc 1446723150.396 WARN     lxc_conf - conf.c:lxc_delete_network:2939 - failed to remove interface '(null)'
>>> lxc 1446723150.396 INFO     lxc_error - error.c:lxc_error_set_and_log:55 - child <10499> ended on signal (9)
>>> lxc 1446723150.396 WARN     lxc_conf - conf.c:lxc_delete_network:2939 - failed to remove interface '(null)'
>>> lxc 1446723295.520 WARN     lxc_cgmanager - cgmanager.c:cgm_get:993 - do_cgm_get exited with error
>>> lxc 1446723295.522 WARN     lxc_cgmanager - cgmanager.c:cgm_get:993 - do_cgm_get exited with error
>>> 
>>> 
>>> If I try to migrate a container in the reverse direction, I get a similar 
>>> error:
>>> 
>>> # lxc move host1:test1 host2:test1
>>> error: Error transferring container data: restore failed:
>>> (00.001093) Error (cgroup.c:1204): cg:      Can't mount controller dir .criu.cgyard.aOuQtF/net_cls: No such file or directory
>>
>>This is probably because the kernel on host1 is newer than the
>>kernel on host2 and has net_cls cgroup support, whereas host2's
>>doesn't.
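
One way to check this on each host might be to look for the controller in /proc/cgroups, which lists the cgroup controllers the running kernel knows about. This is a minimal sketch; has_cgroup_controller is a name made up for illustration, and the optional second argument exists only so the function can be pointed at a saved copy of the table.

```shell
#!/bin/sh
# Hypothetical helper: succeed if cgroup controller $1 appears in the first
# column of the cgroups table $2 (defaults to /proc/cgroups).
has_cgroup_controller() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' \
        "${2:-/proc/cgroups}"
}
```

For example, `has_cgroup_controller net_cls && echo yes || echo no` run on both hosts, alongside `uname -r`, should show whether the kernels disagree about net_cls.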
>>
>>Tycho
>>
>>> 
>>> 
>>> 
>>> Any ideas?
>>> 
>>> -- Jamie
>>> 
>>> 
>>> 
>>> On 05/11/2015, 08:05, "lxc-users on behalf of Jamie Brown" 
>>> <lxc-users-boun...@lists.linuxcontainers.org on behalf of 
>>> jamie.br...@mpec.co.uk> wrote:
>>> 
>>> >Thanks Tycho, installing CRIU solved the problem:
>>> >
>>> ># apt-get install criu
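
Since the original error gave no hint that CRIU was missing, a small pre-flight check before attempting a live move may save some time. This is a minimal sketch using only the POSIX `command -v` builtin; have_tool is a name made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if command $1 can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether criu is available on this host.
if have_tool criu; then
    echo "criu found: $(command -v criu)"
else
    echo "criu not found; a live move is likely to fail at the checkpoint step"
fi
```

Once it is installed, running `criu check` as root should additionally report whether the kernel provides what CRIU needs.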
>>> >
>>> >Shouldn’t this package be a dependency of LXD, or shouldn’t LXD at least 
>>> >give a meaningful warning if the package isn’t available? It seems odd to 
>>> >advertise out-of-the-box live migration in LXD, but then require another 
>>> >package to be installed to provide it.
>>> >
>>> >Is this in the documentation anywhere?
>>> >
>>> >Thanks again.
>>> >
>>> >-- Jamie
>>> >
>>> >
>>> >
>>> >
>>> >On 04/11/2015, 16:47, "lxc-users on behalf of Tycho Andersen" 
>>> ><lxc-users-boun...@lists.linuxcontainers.org on behalf of 
>>> >tycho.ander...@canonical.com> wrote:
>>> >
>>> >>On Wed, Nov 04, 2015 at 01:48:44PM +0000, Jamie Brown wrote:
>>> >>> Greetings all.
>>> >>> 
>>> >>> I’ve been using LXD in a development environment for a few weeks and so 
>>> >>> far I’m very impressed. I can see a really bright future for this 
>>> >>> technology!
>>> >>> 
>>> >>> However, today I thought I’d try out live migration, based on the 
>>> >>> following guide:
>>> >>> https://insights.ubuntu.com/2015/05/06/live-migration-in-lxd/
>>> >>> 
>>> >>> I believe I have followed the steps correctly. However, when I run the 
>>> >>> move command, I receive the following output:
>>> >>> 
>>> >>> # lxc move host1:test host2:test
>>> >>> error: Error transferring container data: checkpoint failed:
>>> >>> Problem accessing CRIU log: open /tmp/lxd_migration_899480871/dump.log: no such file or directory
>>> >>> 
>>> >>> The file referred to above doesn't exist. However, there are other 
>>> >>> lxd_migration_* directories with different numbers appended. Each time I 
>>> >>> attempt the migration a new directory is created (e.g. 
>>> >>> lxd_migration_192965652), but there is no dump.log in it.
>>> >>> 
>>> >>> Nor does the migration create a log file at the location given in the 
>>> >>> guide above:
>>> >>> /var/log/lxd/test/migration_{dump|restore}_.log
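
To see at a glance which migration attempts left a CRIU log behind, the lxd_migration_* directories mentioned above can be scanned in one go. This is a minimal sketch; report_dump_logs is a name made up for illustration, and the /tmp default simply mirrors the path in the error message.

```shell
#!/bin/sh
# Hypothetical helper: for every lxd_migration_* directory under $1
# (default /tmp), report whether it contains a dump.log.
report_dump_logs() {
    base="${1:-/tmp}"
    for dir in "$base"/lxd_migration_*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        if [ -f "$dir/dump.log" ]; then
            echo "$dir: dump.log present"
        else
            echo "$dir: dump.log missing"
        fi
    done
}
```

Calling `report_dump_logs` with no argument checks /tmp. If every directory reports a missing log, that would be consistent with the dump never having started at all.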
>>> >>> 
>>> >>> Steps I've taken:
>>> >>> 
>>> >>> - Copied all profiles from host1 to host2
>>> >>> - Added the migratable profile to the container
>>> >>> - Removed lxcfs package (on both hosts)
>>> >>> - Added the remote HTTPS hosts for both the local and remote hosts
>>> >>> 
>>> >>> Both hosts are running Ubuntu 14.04.3 LTS (x64) with LXD version 0.21.
>>> >>> 
>>> >>> The only difference I can see between my hosts and the guide is that the 
>>> >>> 'migratable' profile (which came out of the box with my LXD installation) 
>>> >>> doesn't contain the autostart entries shown in the guide above:
>>> >>> 
>>> >>> # lxc profile show migratable
>>> >>> name: migratable
>>> >>> config:
>>> >>>   raw.lxc: |-
>>> >>>     lxc.console = none
>>> >>>     lxc.cgroup.devices.deny = c 5:1 rwm
>>> >>>     lxc.seccomp =
>>> >>>   security.privileged: "true"
>>> >>> devices: {}
>>> >>> 
>>> >>> 
>>> >>> Any help would be much appreciated!
>>> >>
>>> >>Have you installed CRIU? lxc info --show-log test probably has more
>>> >>info about what failed, but my guess is that it can't find CRIU if you
>>> >>haven't installed it.
>>> >>
>>> >>Tycho
>>> >>
>>> >>> Thank you,
>>> >>> 
>>> >>> Jamie
>>> >>> 
>>> >>> _______________________________________________
>>> >>> lxc-users mailing list
>>> >>> lxc-users@lists.linuxcontainers.org
>>> >>> http://lists.linuxcontainers.org/listinfo/lxc-users