Thanks a lot, Hanhwi and Tyler, for your pointers. I really appreciate it. I was mainly concerned about whether the QEMU emulator models the cache coherency logic accurately, and Tyler's reply clarified that. I have not yet tried running on stock QEMU. Given these inputs about the emulator, I think I will go with the simulator to get my results. I will also try the multi-threaded version of MARSS. Thanks for letting me know.
Regards,
karthik

On Mon, Mar 30, 2015 at 11:40 AM, <[email protected]> wrote:
> Sorry, I misread your second question --
>
> You should definitely run your algorithms through the simulator. The
> emulator does NOT model the coherency logic well.
>
> Avadh got some speed up by running the multi-threaded version of MARSS:
> http://marssandbeyond.blogspot.com/2012/01/multi-threaded-simulation-in-marss.html
>
> I'm not sure the state of that branch, but it's worth a try if things are
> running too slowly for you.
>
> Tyler
>
>> There are some patches to qemu that have an effect even when running in
>> just plain emulation mode. MARSS leverages qemu to do some page table
>> book-keeping that I believe runs even when in pure emulation mode, for
>> example. If you're curious, you can grep for MARSS_QEMU in the qemu/
>> directory to see such changes. That being said, these changes should not
>> have that much of an effect on qemu's performance when running in
>> emulation mode... have you tried running a stock qemu (without KVM, just
>> TCG?)
>>
>> Regarding lock-contention, the research community will absolutely accept
>> your work. MARSS models the coherency logic between CPUs very accurately
>> (and it's configurable). If you want to be especially crafty, you could
>> use the DRAMSim2 plugin to model the RAMs with high accuracy as well, but
>> you're probably more concerned with the coherency simulation (which is
>> provided by the default configuration).
>>
>> Tyler
>>
>>> Hi,
>>>
>>> I am trying to use MARSS for my research work on lock contention
>>> issues on parallel programs running on future many-core processors.
>>> When I tried to compile MARSS for 32 cores and run my parallel
>>> programs, I find it to take a lot of time. But when I just emulate
>>> (using the default QEMU available) instead of switching to simulation,
>>> obviously I could run my parallel programs faster and could simulate
>>> the lock contentions. I have few questions from these observations for
>>> which I look for clarifications:
>>>
>>> 1) When the MARSS is running in emulated mode is it just another QEMU?
>>> or is there any difference?
>>> 2) Since I am able to reproduce my lock contention problem using
>>> emulation(& the simulator being too slow for large core counts) I am
>>> thinking of working with it to test my algorithms. Will the research
>>> community accept the results obtained from an emulator? Kindly let me
>>> know.
>>>
>>> Thanks for your time,
>>> karthik
>>>
>>> _______________________________________________
>>> http://www.marss86.org
>>> Marss86-Devel mailing list
>>> [email protected]
>>> https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
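
For anyone following Tyler's suggestion to grep for MARSS_QEMU in the qemu/ directory, here is a rough sketch of that search in Python. It should behave roughly like running `grep -rn MARSS_QEMU qemu/` from the top of the source tree. The function name `find_marker` and the list of file extensions are my own assumptions for illustration, not anything taken from the MARSS sources, so adjust them to whatever the tree actually contains.

```python
import os


def find_marker(root, marker="MARSS_QEMU", exts=(".c", ".h", ".cpp")):
    """Yield (path, line_number, line) for each source line containing marker.

    The extension filter is an assumption; widen it if the tree uses others.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" avoids crashing on files with odd encodings
            with open(path, "r", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if marker in line:
                        yield path, lineno, line.rstrip("\n")


if __name__ == "__main__":
    # "qemu" is the directory named in the thread, relative to the MARSS checkout
    for path, lineno, line in find_marker("qemu"):
        print(f"{path}:{lineno}: {line}")
```

Run it from the top of the MARSS checkout. If the qemu/ directory is not found it simply prints nothing.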
