Hi all, I'm still trying to get my KVM-dependent setup to work, and any help would be very much appreciated.
TL;DR: after restoring from Arm KVM checkpoints, the simulation never advances. I'd also like to accelerate gem5 simulation with Arm KVM but generate checkpoints with AtomicSimpleCPU, so I can restore them on machines where Arm KVM is not available (x86 servers), but this does not work either.

First, a brief follow-up on the previous help (getting KVM boot to work): I had been comparing my work against the stable (and master) branches, but not against develop, which has all the modifications Giacomo mentioned. Once I synced my repo with develop, KVM boot with 8 cores worked out of the box, and it boots much faster than any of the gem5 CPU models. So I discarded my own modifications and am sticking with the develop branch to avoid introducing new errors (even though my modifications were also working).

However, I'm struggling to leverage KVM for checkpointing, because the simulation never advances when restoring from a KVM checkpoint.

When using fs.py with the --restore-with-cpu ArmV8KvmCPU --cpu-type ArmV8KvmCPU flags, the checkpoint is restored, but I see no progress in output_folder/system.terminal and gem5 never exits; the simulation appears to be stuck. (This setup doesn't actually matter for my use case, since I need to restore from a KVM checkpoint into a gem5 model, not from KVM to KVM, but I'm reporting this test in case it is useful.)

The same "stuck simulation" behavior occurs with --restore-with-cpu ArmV8KvmCPU --cpu-type AtomicSimpleCPU. In this case I also enabled --debug-flags=Exec and observed that execution gets stuck in the kernel's _raw_spin_lock_irqsave function. (By stuck I mean more than 3 hours without any new output from the debug flags.) I'm not sure what causes this.

Alternatively, I tried switching CPUs from KVM to AtomicSimpleCPU right before creating the checkpoint. Since I have successfully used AtomicSimpleCPU to boot from gem5-generated/restored checkpoints in the past, I know AtomicSimpleCPU checkpoints should work.
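Concretely, the switch-then-checkpoint sequence I'm attempting looks roughly like this (just a sketch using m5's standard switchCpus/checkpoint helpers; boot_ticks and switch_cpu_list stand in for what fs.py/Simulation.py actually set up, so take the names as illustrative):

```python
import os
import m5

# Run the boot/warm-up phase on the KVM CPUs first
# (boot_ticks is a placeholder duration).
m5.instantiate()
m5.simulate(boot_ticks)

# Switch each KVM core to its AtomicSimpleCPU counterpart, then write the
# checkpoint, so the serialized state comes from the atomic CPUs, not KVM.
m5.switchCpus(testsys, switch_cpu_list)
m5.checkpoint(os.path.join(m5.options.outdir, "cpt.%d" % m5.curTick()))
```

The intent is that after m5.switchCpus the atomic CPUs are the active ones, so the checkpoint should be restorable on hosts without Arm KVM.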
In fact, this scenario would be best for me because later on I'd like to restore my checkpoints on x86 servers, where ArmV8KvmCPU will not be available and I could never use --restore-with-cpu ArmV8KvmCPU. But restoring from this checkpoint fails with:

    fatal: fatal condition !paramInImpl(cp, name, param) occurred: Can't unserialize 'system.cpu:_pid'

My guess was that the AtomicSimpleCPU state was serialized under system.switch_cpus (not system.cpu), which is not looked up when restoring the checkpoint.

So my final attempt was to make AtomicSimpleCPU the default CPU (testsys.cpu in fs.py) and ArmV8KvmCPU the switch CPU (testsys.switch_cpus). The idea was to switch CPUs right at the start, run with KVM most of the time, and switch back to atomic just to generate the checkpoints. That way, system.cpu would be populated with AtomicSimpleCPU data, and I would be able to restore on the x86 servers later. However, gem5 segfaults when I assign "testsys.switch_cpus = switch_cpus", after creating the switch_cpus list with KVM models:

    switch_cpus = [ArmV8KvmCPU(switched_out=True, cpu_id=(i))
                   for i in range(np)]
    for i in range(np):
        switch_cpus[i].system = testsys
        switch_cpus[i].workload = testsys.cpu[i].workload
        switch_cpus[i].clk_domain = testsys.cpu[i].clk_domain
        switch_cpus[i].isa = testsys.cpu[i].isa
    testsys.switch_cpus = switch_cpus  # this line causes a gem5 segfault
    switch_cpu_list = [(testsys.cpu[i], switch_cpus[i]) for i in range(np)]

I see that KVM is used in many of the scripts in gem5-resources (https://gem5.googlesource.com/public/gem5-resources/), but they all seem to use KVM for x86. Is switching to x86 the best solution for my problem? Any suggestions on the way I'm setting things up?

Again, thank you very much.

_______________________________________________
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org