Thanks a lot Hanhwi & Tyler for your pointers. I really appreciate it.

I was mainly concerned about the QEMU emulator's cache coherency
logic, and Tyler's reply clarified it. Also, I have not yet run stock
QEMU. Given these inputs on the emulator, I think I will go with the
simulator to get my results. I will also check out the multi-threaded
version of MARSS. Thanks for letting me know.

Regards,
karthik

On Mon, Mar 30, 2015 at 11:40 AM,  <[email protected]> wrote:
> Sorry, I misread your second question --
>
> You should definitely run your algorithms through the simulator. The
> emulator does NOT model the coherency logic well.
>
> Avadh got some speed up by running the multi-threaded version of MARSS:
> http://marssandbeyond.blogspot.com/2012/01/multi-threaded-simulation-in-marss.html
>
> I'm not sure of the state of that branch, but it's worth a try if things are
> running too slowly for you.
>
> Tyler
>
>> There are some patches to qemu that have an effect even when running in
>> just plain emulation mode. MARSS leverages qemu to do some page table
>> book-keeping that I believe runs even when in pure emulation mode, for
>> example. If you're curious, you can grep for MARSS_QEMU in the qemu/
>> directory to see such changes. That being said, these changes should not
>> have that much of an effect on qemu's performance when running in
>> emulation mode... have you tried running a stock qemu (without KVM, just
>> TCG)?
>>
>> Regarding lock-contention, the research community will absolutely accept
>> your work. MARSS models the coherency logic between CPUs very accurately
>> (and it's configurable). If you want to be especially crafty, you could
>> use the DRAMSim2 plugin to model the RAMs with high accuracy as well, but
>> you're probably more concerned with the coherency simulation (which is
>> provided by the default configuration).
>>
>> Tyler
>>
>>> Hi,
>>>
>>> I am trying to use MARSS for my research on lock contention
>>> issues in parallel programs running on future many-core processors.
>>> When I compiled MARSS for 32 cores and ran my parallel programs,
>>> I found that simulation takes a lot of time. But when I just emulate
>>> (using the default QEMU available) instead of switching to simulation,
>>> my parallel programs obviously run faster and I can still reproduce
>>> the lock contention. I have a few questions from these observations
>>> for which I am looking for clarification:
>>>
>>> 1) When MARSS is running in emulation mode, is it just another QEMU,
>>> or is there any difference?
>>> 2) Since I am able to reproduce my lock contention problem using
>>> emulation (and the simulator is too slow for large core counts), I am
>>> thinking of working with the emulator to test my algorithms. Will the
>>> research community accept results obtained from an emulator? Kindly
>>> let me know.
>>>
>>> Thanks for your time,
>>> karthik
>>>
>>> _______________________________________________
>>> http://www.marss86.org
>>> Marss86-Devel mailing list
>>> [email protected]
>>> https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
>>>
