Are you running the full length of the benchmark on MARSS? I would imagine
it would take an extremely long time to complete the simulation. If not,
make sure you are comparing cycle counts over the same region of the
program, for example, only the BFS portion.

Also, because Graph500 is quite large (if you're using 12 as the input
level, then the input size is around 1TB), components other than the CPU
can have a big impact on the overall performance.
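
To sanity-check the problem size, here is a rough sizing sketch. It rests on
assumptions that are not confirmed anywhere in this thread: that the -s flag
sets the Graph500 SCALE (log2 of the vertex count), that the edge factor is
the default of 16, and that each edge is stored as two 8-byte vertex IDs.
The function names are only illustrative.

```python
# Rough Graph500 edge-list sizing.
# Assumptions (not taken from this thread): SCALE is log2 of the vertex
# count, the edge factor defaults to 16, and each edge occupies 16 bytes
# (two 8-byte vertex IDs).

def edge_list_bytes(scale, edgefactor=16, bytes_per_edge=16):
    """Approximate size in bytes of the generated edge list."""
    num_vertices = 2 ** scale
    num_edges = edgefactor * num_vertices
    return num_edges * bytes_per_edge


def human(nbytes):
    """Format a byte count using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(nbytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit we know about.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


if __name__ == "__main__":
    for scale in (12, 26, 32):
        print(f"SCALE {scale}: {human(edge_list_bytes(scale))}")
```

Under these assumptions, SCALE 12 comes to about 1 MiB of edge data, while
roughly 1 TiB corresponds to SCALE 32. So it is worth confirming whether the
12 in your command is a problem-class level or a SCALE value.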

> I forget the approximate problem sizing for graph500, but how long does
> that simulation take in marss?
>
> One thing you are picking up is the noise from the shell -- that is,
> ./start_sim fires the ptlcall and then your workload starts, which causes
> the kernel to spin up, load your application into memory, and start the
> process. Then after it runs, the kernel will tear down the process and
> return control to the shell, which will invoke stop_sim, etc. If your
> workload is too small, all of this noise will outweigh your actual
> application. PTLCalls are easy to put in -- I would recommend you add them
> to your workload and try again.
>
>
> On Wed, Apr 23, 2014 at 2:14 AM, alireza nazari <[email protected]
> >wrote:
>
> > Hello,
> >
> > I put Graph500 in the Ubuntu image, compiled it, and ran it on the
> > simulator with the single-core Xeon (E5620) config file provided in
> > the repo. As far as I understand, this core config is identical to my
> > real-world Westmere-EX (E7-2860) machine except for the cache
> > capacities (let me know if I am wrong). The problem is that when I
> > compare the ooo_0_0 cycles to the total cycles counted by papiex, I
> > see a 70% to 100% difference! I expected something around 1% to 5%
> > error. Does anybody have any idea where this huge error comes from? I
> > did not add any ptlcalls to the G500 source or do anything else to
> > make it more accurate. Does that mean I am counting more than my
> > benchmark by running:
> > ./start_sim;./omp-csr -s 12;./stop_sim   ?
> > For example, unexpected OS overhead, or anything else that papiex
> > does not count but full-system simulation forces onto the simulator?
> > Any ideas are appreciated.
> >
> >
> > Thank you
> > Alireza
> >
> >
> > _______________________________________________
> > http://www.marss86.org
> > Marss86-Devel mailing list
> > [email protected]
> > https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
> >
> >