On Thu, Sep 8, 2011 at 9:06 AM, DRAM Ninjas <[email protected]> wrote:
> I think the short answer is -- all the easy stuff has already been done. A
> lot of the if statements have branch predictor hints, debug output is cut
> down to a minimum when building without debug mode, etc.

That's true. But much of the new code added to Marss has not been profiled to find bottlenecks, especially for multicore simulations. In SCons I have set up compilation flags to use google-profile to profile simulations for memory usage and performance. I did use it when we had an issue with high memory usage with checkpoints, but I still need to find some time to get a profile for performance bottlenecks. If someone can pick up this task, generate the profile output, and post it somewhere, then we can try to optimize these bottlenecks.

In the optimized binary we don't disable ASSERT statements. Now that Marss is more stable, maybe in the next release I'll disable ASSERT for better performance.

> Any real speedup would come from trying to parallelize the code, but I
> doubt that will happen any time soon (if ever) -- the simulator is just too
> detailed and complex to be easily parallelizable.

:D Well, I have some very, very alpha-level code that uses pthreads to divide the cores into groups, each run on a specific thread, when you are using more than 8 cores. There are some locking issues that slow down performance, but preliminary results show that, if done right, it will help us simulate a large number of cores at a decent speed.

> So I think as a community, we just have to accept the fact that detailed
> simulation is slow.

I agree. But as a community we can develop new designs that enable us to take advantage of multicore systems, or maybe use some of these new languages like Go or D to build next-generation simulators. It's really ironic that our research is focused on multicore systems and we can't use them to improve our own lives.

- Avadh

On Wed, Sep 7, 2011 at 4:54 PM, sparsh1 mittal1 <[email protected]> wrote:
>
>> Hello
>> Does anyone have suggestion regarding speeding-up marss ? I am sure, this
>> point will help others also.
>>
>> My friends who had used M5 told me, that in M5 the simulation speed
>> reduces almost linearly with number of cores. Given this, the speed of Marss
>> with multi-cores is already impressive! Yet, further speed-ups will help.
>>
>> Some general ideas are reducing print-outs, I/O. Yet, would you like to
>> share more specific and substantial speed-up ideas? For example, my main
>> interest is in cache related work.
>>
>> I would appreciate it.
>>
>> Thanks and Regards
>> Sparsh Mittal
>>
>> _______________________________________________
>> http://www.marss86.org
>> Marss86-Devel mailing list
>> [email protected]
>> https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
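
For anyone curious what "groups of cores on specific threads" could look like, here is a minimal sketch of the general idea, not the actual Marss code. Everything in it is an assumption made for illustration: `Core`, `CycleBarrier`, `WorkerArgs`, `worker` and `run_grouped` are hypothetical names, the per-cycle work is a stub, and the per-cycle barrier is just one simple way to keep the groups in lockstep. A real simulator would also need to synchronize shared state such as caches and the interconnect, which is likely where locking costs of the kind mentioned in this thread would show up.

```cpp
#include <pthread.h>
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one simulated core; not a Marss class.
struct Core {
    uint64_t cycles_run = 0;
    void tick() { ++cycles_run; }  // stub for one simulated cycle
};

// Reusable barrier built from a mutex and a condition variable, so that
// no group starts cycle N+1 before every group has finished cycle N.
struct CycleBarrier {
    pthread_mutex_t mtx;
    pthread_cond_t cv;
    int parties;
    int waiting = 0;
    uint64_t generation = 0;

    explicit CycleBarrier(int n) : parties(n) {
        pthread_mutex_init(&mtx, nullptr);
        pthread_cond_init(&cv, nullptr);
    }
    ~CycleBarrier() {
        pthread_cond_destroy(&cv);
        pthread_mutex_destroy(&mtx);
    }
    void wait() {
        pthread_mutex_lock(&mtx);
        const uint64_t gen = generation;
        if (++waiting == parties) {
            waiting = 0;
            ++generation;  // release everyone waiting on this generation
            pthread_cond_broadcast(&cv);
        } else {
            while (gen == generation) pthread_cond_wait(&cv, &mtx);
        }
        pthread_mutex_unlock(&mtx);
    }
};

// Per-thread description of which cores [begin, end) the thread owns.
struct WorkerArgs {
    std::vector<Core>* cores;
    std::size_t begin;
    std::size_t end;
    uint64_t cycles;
    CycleBarrier* barrier;
};

static void* worker(void* p) {
    WorkerArgs* a = static_cast<WorkerArgs*>(p);
    for (uint64_t c = 0; c < a->cycles; ++c) {
        for (std::size_t i = a->begin; i < a->end; ++i) (*a->cores)[i].tick();
        a->barrier->wait();  // all groups advance one cycle together
    }
    return nullptr;
}

// Splits num_cores into groups of cores_per_thread, runs each group on its
// own pthread for the given number of cycles, and returns per-core counts.
std::vector<uint64_t> run_grouped(std::size_t num_cores,
                                  std::size_t cores_per_thread,
                                  uint64_t cycles) {
    assert(cores_per_thread > 0);
    std::vector<Core> cores(num_cores);
    const std::size_t num_threads =
        (num_cores + cores_per_thread - 1) / cores_per_thread;
    CycleBarrier barrier(static_cast<int>(num_threads));
    std::vector<WorkerArgs> args(num_threads);
    std::vector<pthread_t> threads(num_threads);

    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = t * cores_per_thread;
        const std::size_t end = std::min(begin + cores_per_thread, num_cores);
        args[t] = WorkerArgs{&cores, begin, end, cycles, &barrier};
        const int rc = pthread_create(&threads[t], nullptr, worker, &args[t]);
        assert(rc == 0);
        (void)rc;
    }
    for (std::size_t t = 0; t < num_threads; ++t) pthread_join(threads[t], nullptr);

    std::vector<uint64_t> result;
    for (const Core& c : cores) result.push_back(c.cycles_run);
    return result;
}
```

Calling `run_grouped(16, 4, 1000)` would simulate 16 stub cores on 4 threads for 1000 cycles each. The barrier on every cycle is the conservative choice; relaxing it (for example, syncing every few cycles) trades accuracy for speed and is a separate design decision.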
