On 07.04.2015 at 21:01, Dr. David Alan Gilbert wrote:
> * Peter Lieven (p...@kamp.de) wrote:
>> On 07.04.2015 at 17:29, Dr. David Alan Gilbert wrote:
>>> * Peter Lieven (p...@kamp.de) wrote:
>>>> Hi David,
>>>>
>>>> On 07.04.2015 at 10:43, Dr. David Alan Gilbert wrote:
>>>>>>>> Any particular workload or reproducer?
>>>>>>> Workload is almost zero. I am trying to figure out if there is a way to
>>>>>>> trigger it.
>>>>>>>
>>>>>>> Possibly playing a role: Machine type is -M pc-1.2 and we set -kvmclock as
>>>>>>> CPU flag since kvmclock seemed to be quite buggy in 2.6.16...
>>>>>>>
>>>>>>> Exact cmdline is:
>>>>>>> /usr/bin/qemu-2.2.1 -enable-kvm -M pc-1.2 -nodefaults -netdev
>>>>>>> type=tap,id=guest2,script=no,downscript=no,ifname=tap2 -device
>>>>>>> e1000,netdev=guest2,mac=52:54:00:ff:00:65 -drive
>>>>>>> format=raw,file=iscsi://172.21.200.53/iqn.2001-05.com.equallogic:4-52aed6-88a7e99a4-d9e00040fdc509a3-XXX-hd0/0,if=ide,cache=writeback,aio=native
>>>>>>> -serial null -parallel null -m 1024 -smp
>>>>>>> 2,sockets=1,cores=2,threads=1 -monitor tcp:0:4003,server,nowait -vnc
>>>>>>> :3 -qmp tcp:0:3003,server,nowait -name 'XXX' -boot
>>>>>>> order=c,once=dc,menu=off -drive
>>>>>>> index=2,media=cdrom,if=ide,cache=unsafe,aio=native,readonly=on -k de
>>>>>>> -incoming tcp:0:5003 -pidfile /var/run/qemu/vm-146.pid -mem-path
>>>>>>> /hugepages -mem-prealloc -rtc base=utc -usb -usbdevice tablet
>>>>>>> -no-hpet -vga cirrus -cpu qemu64,-kvmclock
>>>>>>>
>>>>>>> Exact kernel is:
>>>>>>> 2.6.16.46-0.12-smp (I think this is SLES10 or something similar)
>>>>>>>
>>>>>>> The machine does not hang. It seems only I/O is hanging. So you can
>>>>>>> type at the console or ping the system, but can no longer log in.
>>>>>>>
>>>>>>> Thank you,
>>>>>>> Peter
>>>>>> Interesting observation: Migrating the vServer again seems to fix the
>>>>>> problem (at least in one case I could test just now).
>>>>>>
>>>>>> 2.6.8-24-smp is also affected.
>>>>> How often does it fail - you say 'sometimes' - is it a 1/10 or a 1/1000?
>>>> It's more often than 1/10, I would say.
>>> OK, that's not too bad - it's the 1/1000 that are really nasty to find.
>>> In your setup, how easy would it be for you to try:
>>>    with either 2.1 or current head?
>>>    with a newer machine-type?
>>>    without the cdrom?
>> It's all possible. I can clone the system and try everything on my test
>> systems. I hope it reproduces there.
> Great. I think the order I would go would be:
>    Try head - if it works we know we've already got the fix somewhere
>    Try 2.1 - if it works we know it's something we introduced between
>              2.1 and 2.2.1
>    Try a newer machine type - because pc-1.2 probably isn't tested much

I don't mind changing the machine type. It is pc-1.2 because we always set the
machine type the vServer was created with.

>    CDROM at the end.
>
>> Does the cdrom have the power to take down the bus?
> I just know the cdrom migration is a bit lacking and the simpler
> the test case the better.

Just for the record, there was no CD inserted during migration.

Peter