What's the deal with Histogram::add()? Either it's too slow or it's being called too much, I'd say, unless we're tracking some incredibly vital statistics there. Can you use the call graph part of the profile to find where most of the calls are coming from?
Also, can you look at the stats and verify that the number of calls to CacheMemory::lookup() is not much larger than the total number of CPU memory references?

Another thing to note is that even though PerfectSwitch::wakeup() does not account for a huge amount of time, the time per call is pretty large, so there may be opportunities for a lot of improvement there.

Thanks,
Steve

On Wed, Jan 19, 2011 at 9:20 AM, Nilay Vaish <[email protected]> wrote:

> I profiled m5 again, using the following command.
>
> ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py
> --maxtick 200000000000 -n 8 --topology Mesh --mesh-rows 2 --num-l2cache 8
> --num-dir 8
>
> Results have been copied below. CacheMemory::lookup() still consumes some
> time but is a lot less than before. PerfectSwitch can be a candidate for
> re-design. But given that PerfectSwitch is taking only 3% of the time, it
> would not yield that much gain. Maybe we need to look at these different
> functions in a more holistic fashion.
>
> --
> Nilay
>
>
>      % cumulative     self                self    total
>   time    seconds  seconds      calls   s/call   s/call  name
>   7.29      35.89    35.89  606750284     0.00     0.00  Histogram::add(long long)
>   5.84      64.62    28.73  256533483     0.00     0.00  CacheMemory::lookup(Address const&)
>   4.56      87.06    22.44  124360139     0.00     0.00  L1Cache_Controller::wakeup()
>   3.88     106.18    19.12  121283110     0.00     0.00  RubyPort::M5Port::recvTiming(Packet*)
>   3.13     121.60    15.42    6875704     0.00     0.00  PerfectSwitch::wakeup()
>   3.00     136.38    14.78   39527382     0.00     0.00  MemoryControl::executeCycle()
>   2.95     150.91    14.53   90855686     0.00     0.00  BaseSimpleCPU::preExecute()
>   2.68     164.10    13.19  147750111     0.00     0.00  MessageBuffer::enqueue(RefCountingPtr<Message>, long long)
>   2.41     175.96    11.86  180281626     0.00     0.00  RubyEventQueueNode::process()
>   2.07     186.17    10.21  302054741     0.00     0.00  EventQueue::serviceOne()
>   2.03     196.15     9.98  121176116     0.00     0.00  Sequencer::getRequestStatus(RubyRequest const&)
>
>
> On Tue, 18 Jan 2011, Nilay wrote:
>
>> Brad,
>>
>> I got the simulation working. It seems to me that you wrote Mesh.py under
>> the assumption that number of cpus = number of L1 controllers = number of
>> L2 controllers (if present) = number of directory controllers.
>>
>> The following options worked after some struggle and some help from Arka -
>>
>> ./build/ALPHA_FS_MESI_CMP_directory/m5.fast ./configs/example/ruby_fs.py
>> --maxtick 2000000000 -n 16 --topology Mesh --mesh-rows 4 --num-dirs 16
>> --num-l2caches 16
>>
>> --
>> Nilay
>>
>>
>> On Tue, January 18, 2011 10:28 am, Beckmann, Brad wrote:
>>
>>> Hi Nilay,
>>>
>>> My plan is to tackle the functional access support as soon as I check in
>>> our current group of outstanding patches. I'm hoping to at least check in
>>> the majority of them in the next couple of days. Now that you've
>>> completed the CacheMemory access changes, you may want to re-profile GEM5
>>> and make sure the next performance bottleneck is routing network messages
>>> in the Perfect Switch. In particular, you'll want to look at rather large
>>> (16+ core) systems using a standard Mesh network. If you have any
>>> questions on how to do that, Arka may be able to help you out, if not, I
>>> can certainly help you. Assuming the Perfect Switch shows up as a major
>>> bottleneck (> 10%), then I would suggest that as the next area you can
>>> work on. When looking at possible solutions, don't limit yourself to just
>>> changes within Perfect Switch itself. I suspect that redesigning how
>>> destinations are encoded and/or the interface between MessageBuffer
>>> dequeues and the PerfectSwitch wakeup, will lead to a better solution.
>>>
>>> Brad
>>>
> _______________________________________________
> m5-dev mailing list
> [email protected]
> http://m5sim.org/mailman/listinfo/m5-dev
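
The point about time per call is hard to see in the flat profile because gprof rounds the s/call columns to 0.00. A rough way to recover it is to divide self seconds by calls for each row. The sketch below does that with the numbers copied from the quoted profile; the names PROFILE, self_ns_per_call and rank_by_cost_per_call are made up for this illustration and are not part of m5 or gprof. It covers self time only (time in callees is excluded), and gprof's timings are sampled, so treat the results as estimates.

```python
# (function name, self seconds, calls) copied from the flat profile quoted above.
PROFILE = [
    ("Histogram::add", 35.89, 606750284),
    ("CacheMemory::lookup", 28.73, 256533483),
    ("L1Cache_Controller::wakeup", 22.44, 124360139),
    ("RubyPort::M5Port::recvTiming", 19.12, 121283110),
    ("PerfectSwitch::wakeup", 15.42, 6875704),
    ("MemoryControl::executeCycle", 14.78, 39527382),
    ("BaseSimpleCPU::preExecute", 14.53, 90855686),
    ("MessageBuffer::enqueue", 13.19, 147750111),
    ("RubyEventQueueNode::process", 11.86, 180281626),
    ("EventQueue::serviceOne", 10.21, 302054741),
    ("Sequencer::getRequestStatus", 9.98, 121176116),
]


def self_ns_per_call(self_seconds, calls):
    """Average self time per call in nanoseconds (hypothetical helper)."""
    return self_seconds / calls * 1e9


def rank_by_cost_per_call(rows):
    """Return (name, ns per call) pairs, most expensive call first."""
    costs = [(name, self_ns_per_call(secs, calls)) for name, secs, calls in rows]
    return sorted(costs, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, ns in rank_by_cost_per_call(PROFILE):
        print(f"{name:32s} {ns:10.1f} ns/call")
```

Ranked this way, PerfectSwitch::wakeup() comes out on top at roughly 2.2 microseconds of self time per call, while Histogram::add() is around 59 ns per call and is expensive mainly because of its call count.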
