Re: Fwd: [ofa-general] Performance evaluation of Opensm

Devesh Sharma Tue, 07 Jul 2009 06:44:00 -0700

Thanks Nicolas for your response.
On Tue, Jul 7, 2009 at 4:43 PM, Nicolas
Morey-Chaisemartin<[email protected]> wrote:
> On 07/07/2009 12:58, Devesh Sharma wrote:
>> Thanks Yevgeny, for your valuable input. This will surely help my work.
>>
>> On Tue, Jul 7, 2009 at 2:59 PM, Yevgeny
>> Kliteynik<[email protected]> wrote:
>>> Hi Devesh,
>>>
>>> It's kind of hard to talk about "performance of OpenSM".
>>> The Subnet Manager has different phases and modes of operation,
>>> and each of them is a completely separate issue:
>>>
>>> - Fabric discovery
>>> - Fabric ports/nodes configuration
>>> - Unicast routing calculation
>>> - Unicast routing configuration on fabric switches
>>> - Multicast routing calculation
>>> - Multicast routing configuration on fabric switches
>>> - SA queries processing
>>> - Memory consumption
>>> - Different routing algorithms consume different time and memory
>>> - QoS
>>> - etc, etc, etc
>>>
>>> Most of the above can be measured only on a real cluster.
>> But how can these be measured? Is there any compile-time flag
>> available in the code?
>>> Some (such as routing calculation and memory consumption) can
>>> be measured while OSM is running on top of the simulator.
>> Simulation results are far from the real situation :( I am
>> interested in results with a real fabric.
>
> Actually, they are not. The scanning of the fabric is done before OpenSM calls
> the routing engine, so the routing engine works from memory only anyway. So
> routing calculation time is exactly the same on a real fabric as on a
> simulated one.


Hmm....ok.
> However, fabric discovery time and LFT update time will differ, I agree.
>>> Some are strongly affected by the number of CPU cores that you
>>> have on the management node (e.g. SA queries processing),
>>> others are mostly affected by the CPU frequency (unicast routing).
>>> Also, various OpenSM options can affect these phases; for example,
>>> the unicast routing cache may reduce routing calculation time to 0.
>> Hmm........correct.
>>> Sorry that I'm not really answering your question :(
>>> I just want to point out the fact that there are many aspects
>>> that should be considered when talking about OpenSM performance.
>> Do we have any tool which profiles all these phases of the SM?
>> Such a tool would be helpful for researchers working on different
>> algorithms related to the SM.
>
> For internal actions, you can use valgrind --tool=callgrind.
> It provides a full analysis of any program, so you can find where the
> bottlenecks are and pretty much any perf info you may need. However, it does
> not allow you to measure times for network operations.
>
ok....I will try this.
>>> If what you're interested in is just "system-wide" numbers,
>>> then you'll probably want to know how much time it takes for
>>> OpenSM to bring up the cluster from scratch, or how much time
>>> it takes to reconfigure the fabric after some change.
>> Will it be fine if I run OpenSM with the "time" command and press
>> Ctrl-C the moment I see the SUBNET UP message? Of course, keeping
>> some of the options and configurations constant.
>>
>> # time opensm -<some options>
>> SUBNET UP Ctrl-C
>>
> It should work. The problem is you won't have much granularity to know where
> the time is consumed. Plus, Ctrl-C doesn't kill OpenSM right away. If there are
> a lot of outstanding MADs, it can take a few seconds before leaving.
>
Yevgeny has suggested a better way to do this. Will try that.
> Nicolas
>
>
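On the "time opensm" plus Ctrl-C idea discussed above: a rough sketch of how the stopwatch could be stopped when the message appears rather than when the process exits, so that the shutdown delay Nicolas mentions is not counted. This is only an illustration, not something from OpenSM itself. The helper name time_until_marker is hypothetical, and it assumes the "SUBNET UP" text shows up on the standard output of the command being run.

```python
import subprocess
import time


def time_until_marker(cmd, marker="SUBNET UP"):
    """Run cmd and return the seconds elapsed from launch until a line of
    its output contains marker. Returns None if the process exits first.

    Assumption: the marker is written to stdout or stderr of the command.
    If the child buffers its output when writing to a pipe, the marker
    may be seen later than it was produced.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into the same stream
        text=True,
        bufsize=1,                 # line-buffered on our side
    )
    elapsed = None
    try:
        for line in proc.stdout:
            if marker in line:
                # Stop the clock here, before asking the process to exit.
                elapsed = time.monotonic() - start
                break
    finally:
        # Shutdown time is deliberately not part of the measurement.
        proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        proc.stdout.close()
    return elapsed
```

It could be called as time_until_marker(["opensm"]) with whatever options are held constant between runs added to the list. It still gives one system-wide number only, with no breakdown per phase.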