Re: [Xen-API] Snapshot causes disk errors and VM crashes

2014-11-24 Thread Thanos Makatos
> Where should I look now? I am relatively new to Xen but am going to be > admin for this system eventually so need to know these things :) Just check all system logs around that time for anything of potential interest. Also, you could try strace'ing tapdisk when taking a snapshot to see which f

Re: [Xen-API] Snapshot causes disk errors and VM crashes

2014-11-21 Thread Mark Benson
(Sorry, I will try not to. I always set mailing lists up to reply-to the list, not the poster, perhaps someone can suggest this to the admins!) Where should I look now? I am relatively new to Xen but am going to be admin for this system eventually so need to know these things :) -- Mark Benson

Re: [Xen-API] Snapshot causes disk errors and VM crashes

2014-11-21 Thread Thanos Makatos
(Please don't drop xen-api from the CC) > I dropped them all here, the large ones are trimmed to the relevant day only: > > https://www.dropbox.com/sh/4v1l141dw7fao3c/AAC8YrMONznv6Wdl0Y0YyueCa?dl=0 > > I think the relevant time frame is about 20-11-2014 at 09:20-10:30 - I think > the > snaps

Re: [Xen-API] Snapshot causes disk errors and VM crashes

2014-11-21 Thread Thanos Makatos
> I couldn't find any tap-segfault messages for the time of the incident, I Just to clarify, it's either "tap-err" or "segfault", but not "tap-segfault". Also, check /var/log/user.log. > pastebin'd the SMlog covering that time slot when the snapshot was taken > (it can be seen in the log) but I

Re: [Xen-API] Snapshot causes disk errors and VM crashes

2014-11-20 Thread Thanos Makatos
> I installed the xs-tools on the VM from the XenServer 6.2 distribution, and > most everything works just fine, however I have stumbled on an issue that's > struck me 3 times on 2 separate servers. When I take a live snapshot, > *sometimes* (not always) the guest OS pukes and starts throwing disk

[Xen-API] Snapshot causes disk errors and VM crashes

2014-11-20 Thread Mark Benson
Hi, I have a problem that's occurred a couple of times with both my test and production systems. I'd like to know if I'm doing something wrong or it could be a potential bug. I have a Debian 7 wheezy dom0 running a Debian 7 wheezy domU. I plan on deploying multiple instances but right now I'm bui