Hi Shehab,

Can you confirm a few details about the configuration you are using? Are
you using classic caches or Ruby? What is the kernel version and disk image
you are using? What is the implementation of your "multithreaded hello
world" (are you using OMP)?
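
For the Exec traces mentioned below, one rough way to compare the working and non-working runs is to count which kernel symbols each CPU spends its time in. This is only a minimal sketch, not a gem5 tool. It assumes trace lines look roughly like `1000: system.cpu0: T0 : @queued_spin_lock_slowpath+12 : ...`; the exact layout depends on the gem5 version, so the regular expression is a guess that may need adjusting. The function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed (not verified) Exec trace line shape:
#   "  1000: system.cpu0: T0 : @queued_spin_lock_slowpath+12 : ..."
# Group 1 is the CPU name, group 2 the symbol without offset.
LINE_RE = re.compile(r"^\s*\d+:\s*(\S+?):?\s+T\d+\s*:\s*@([A-Za-z_]\w*)")


def symbol_histogram(lines):
    """Count executed instructions per symbol, separately for each CPU."""
    per_cpu = defaultdict(Counter)
    for line in lines:
        match = LINE_RE.match(line)
        if match:  # lines that do not fit the assumed shape are skipped
            per_cpu[match.group(1)][match.group(2)] += 1
    return per_cpu


def top_symbols(lines, n=3):
    """Return the n most frequent symbols for each CPU."""
    return {cpu: counts.most_common(n)
            for cpu, counts in symbol_histogram(lines).items()}
```

Passing an open trace file to `top_symbols` (for example `top_symbols(open("exec_trace.txt"))`, where the file name is a placeholder) gives a per-CPU summary. If most cores in the hanging run are dominated by `queued_spin_lock_slowpath` while one core sits somewhere else, that other core is a reasonable place to start looking.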

Best,

On Fri, Sep 6, 2019 at 8:58 AM Shehab Elsayed <shehaby...@gmail.com> wrote:

> First of all, thanks for your replies, Ryan and Jason.
>
> I have already pulled the latest changes by Pouya and the problem still
> persists.
>
> As for checkpointing, I was originally doing exactly what Jason mentioned
> and ran into the same problem. I then switched to not checkpointing just
> to avoid any problems that might be caused by checkpointing (if any). My
> plan was to go back to checkpointing after proving that it works without
> it.
>
> However, the problem doesn't seem to be related to KVM, as Linux boots
> reliably every time. The problem happens after the benchmark starts
> execution, and it seems to happen only when running multiple cores
> (>=4). My latest experiments with a single core and 8 threads for the
> benchmark seem to be working fine. But once I increase the number of
> simulated cores, problems appear.
>
> Also, I posted a link to the repo I am using to run my tests in a
> previous message. I have also added 2 debug traces with the Exec flag, one
> for a working example and one for a non-working example.
>
>
> On Fri, Sep 6, 2019 at 11:28 AM Jason Lowe-Power <ja...@lowepower.com>
> wrote:
>
>> Hi Shehab,
>>
>> One quick note: There is *no way* to have deterministic behavior when
>> running with KVM. Since you are using the hardware, the underlying host OS
>> will influence the execution path of the workload.
>>
>> To try to narrow down the bug you're seeing, you can try to take a
>> checkpoint after booting with KVM. Then, the execution from the checkpoint
>> should be deterministic since it is 100% in gem5.
>>
>> BTW, I doubt you can run the KVM CPU in a VM since this would require
>> your hardware and the VM to support nested virtualization. There *is*
>> support for this in the Linux kernel, but I don't think it's been widely
>> deployed outside of specific cloud environments.
>>
>> One other note: Pouya has pushed some changes which implement some x86
>> instructions that were causing issues for him. You can try with the current
>> gem5 mainline to see if that helps.
>>
>> Cheers,
>> Jason
>>
>> On Fri, Sep 6, 2019 at 8:22 AM Shehab Elsayed <shehaby...@gmail.com>
>> wrote:
>>
>>> That's interesting. Are you using Full System as well? I don't think FS
>>> behavior is supposed to be so dependent on the host environment!
>>>
>>> On Fri, Sep 6, 2019 at 11:16 AM Gambord, Ryan <gambo...@oregonstate.edu>
>>> wrote:
>>>
>>>> I have found that gem5 behavior is sensitive to the execution
>>>> environment. I now run gem5 inside an Ubuntu VM on QEMU and have had much
>>>> more consistent results. I haven't tried running KVM gem5 inside a KVM QEMU
>>>> VM, so I'm not sure how that works, but it might be worth trying.
>>>>
>>>> Ryan
>>>>
>>>>
>>>> On Fri, Sep 6, 2019, 08:07 Shehab Elsayed <shehaby...@gmail.com> wrote:
>>>>
>>>>> I was wondering if anyone is running into the same problem or if
>>>>> anyone has any suggestions on how to proceed with debugging this problem.
>>>>>
>>>>> On Mon, Jul 29, 2019 at 4:57 PM Shehab Elsayed <shehaby...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Sorry for the spam. I just forgot to mention that the system
>>>>>> configuration I am using is mainly from
>>>>>> https://github.com/darchr/gem5/tree/jason/kvm-testing/configs/myconfigs.
>>>>>>
>>>>>>
>>>>>> Shehab Y. Elsayed, MSc.
>>>>>> PhD Student
>>>>>> The Edwards S. Rogers Sr. Dept. of Electrical and Computer Engineering
>>>>>> University of Toronto
>>>>>> E-mail: shehaby...@gmail.com
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 29, 2019 at 4:08 PM Shehab Elsayed <shehaby...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I have set up a repo with gem5 that demonstrates the problem. The
>>>>>>> repo includes the latest version of gem5 from gem5's GitHub repo with a
>>>>>>> few patches applied to get KVM working together with the kernel binary
>>>>>>> and disk image I am using. You can get the repo at
>>>>>>> https://github.com/ShehabElsayed/gem5_debug.git.
>>>>>>>
>>>>>>> These steps should reproduce the problem:
>>>>>>> 1- scons build/X86/gem5.opt
>>>>>>> 2- ./scripts/get_fs_stuff.sh
>>>>>>> 3- ./scripts/run_fs.sh 8
>>>>>>>
>>>>>>> I have also included sample m5term outputs for both a 2-thread run
>>>>>>> (m5out_2t) and an 8-thread run (m5out_8t).
>>>>>>>
>>>>>>> Any help is really appreciated.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jul 23, 2019 at 11:01 AM Shehab Elsayed <
>>>>>>> shehaby...@gmail.com> wrote:
>>>>>>>
>>>>>>>> When I enable the Exec debug flag I can see that it seems to be
>>>>>>>> stuck in a spin lock (queued_spin_lock_slowpath)
>>>>>>>>
>>>>>>>> On Fri, Jul 19, 2019 at 5:28 PM Shehab Elsayed <
>>>>>>>> shehaby...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello All,
>>>>>>>>>
>>>>>>>>> I have a gem5 X86 full system setup that starts with KVM cores
>>>>>>>>> and then switches to O3 cores once the benchmark reaches the region of
>>>>>>>>> interest. Right now I am testing with a simple multithreaded
>>>>>>>>> hello world benchmark. Sometimes the benchmark completes successfully,
>>>>>>>>> while other times gem5 just seems to hang after starting the benchmark.
>>>>>>>>> I believe it is still executing some instructions but without making
>>>>>>>>> any progress. The chance of this behavior (indeterminism) happening
>>>>>>>>> increases as the number of simulated cores or the number of threads
>>>>>>>>> created by the benchmark increases.
>>>>>>>>>
>>>>>>>>> Any ideas what might be the reason for this or how I can start
>>>>>>>>> debugging this problem?
>>>>>>>>>
>>>>>>>>> Note: I have tried the patch in
>>>>>>>>> https://gem5-review.googlesource.com/c/public/gem5/+/19568 but the
>>>>>>>>> problem persists.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>> gem5-users mailing list
>>>>> gem5-users@gem5.org
>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users



-- 
Pouya Fotouhi
PhD Candidate
Department of Electrical and Computer Engineering
University of California, Davis
