Hey Alex,

On Wed, 2011-04-06 at 07:09 -0700, Alex Netes wrote:
> Hi Al, Jared,
>
> On 14:31 Wed 23 Mar , Albert Chu wrote:
> > >
> > > 1) Port Shifting
> > >
> > > This is similar to what was done with some of the LMC > 0 code.
> > > Congestion would occur due to "alignment" of routes w/ common traffic
> > > patterns. However, we found that it was also necessary for LMC=0 and
> > > only for used-ports. For example, lets say there are 4 ports (called A,
> > > B, C, D) and we are routing lids 1-9 through them. Suppose only routing
> > > through A, B, and C will reach lids 1-9.
> > >
> > > The LFT would normally be:
> > >
> > > A: 1 4 7
> > > B: 2 5 8
> > > C: 3 6 9
> > > D:
> > >
> > > The Port Shifting option would make this:
> > >
> > > A: 1 6 8
> > > B: 2 4 9
> > > C: 3 5 7
> > > D:
> > >
> > > This option by itself improved the mpiGraph average send/recv bandwidth
> > > from 420 MB/s and 508 MB/s to to 991 MB/s and 1172 MB/s.
> > >
> >
> After thinking about this a little more and reviewing Jared Carr's - Scatter
> ports patch, I think we should combine these efforts into one framework as Al
> suggested. Moreover, isn't "port_shifting" too much fabric oriented? Do
> general OpenSM users will find this useful for them?
> Moreover, how can user identify that port_shifting may improve performance for
> him.

I will admit, I'm unsure how much non-HPC users would benefit from this
option, be hurt by it, or even care. I can't speak for all users, but
here at LLNL and at most of the lab HPC sites, people play with the
options and experiment to find the routing algorithm + settings that
best support their environment. I would imagine the port_shifting
option would just be another option for people to experiment with.

I think Jared's Scatter Ports would be easy to merge into my line of
patches. Let me see if I can integrate his patch into my line.

> Is providing shift factor (more than the suggested 1) will help to make it
> suitable foo a general case?

That seems like a good idea. We could certainly support an arbitrary
shift, letting users experiment to see whether a different one works
better in their particular environment.

> > > 2) Remote Guid Sorting
> > >
> > > Most core/spine switches we've seen thus far have had line boards
> > > connected to spine boards in a consistent pattern. However, we recently
> > > got some Qlogic switches that connect from line/leaf boards to spine
> > > boards in a (to the casual observer) random pattern. I'm sure there was
> > > a good electrical/board reason for this design, but it does hurt routing
> > > b/c updn doesn't account for this. Here's an output from iblinkinfo as
> > > an example.
> > >
>
> Why this problem can't be addressed by guid_routing_order_file option?

The problem we encountered in our fabric is predominantly a
switch-to-switch routing issue with a spine switch. The
guid_routing_order_file wouldn't be able to solve this, since its input
is just end ports. To put it another way, this option directly affects
the routing decisions that are made. The guid_routing_order_file does
not; it only affects the order in which routes are chosen (which can
have consequences, but the routing algorithm itself is unchanged).
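
Coming back to port shifting for a moment: to make the A/B/C/D example
and the idea of an arbitrary shift concrete, here is a rough Python
sketch. It is only an illustration of the assignment pattern, not the
OpenSM code. The function name build_lft and the shift parameter are
made up for this example, and it assumes the shift is applied once per
full pass over the used ports.

```python
def build_lft(lids, ports, shift=0):
    """Assign lids to ports round-robin.

    After each full pass over the ports, the starting port is rotated
    by `shift` positions. shift=0 gives the plain round-robin layout.
    (Hypothetical helper for illustration; not an OpenSM function.)
    """
    table = {port: [] for port in ports}
    n = len(ports)
    for i, lid in enumerate(lids):
        rnd, pos = divmod(i, n)  # which pass we are on, position within it
        port = ports[(pos + rnd * shift) % n]
        table[port].append(lid)
    return table


lids = list(range(1, 10))
used_ports = ["A", "B", "C"]  # D cannot reach lids 1-9, so it is left out

plain = build_lft(lids, used_ports)
shifted = build_lft(lids, used_ports, shift=1)

for name, table in (("normal", plain), ("shifted", shifted)):
    print(name)
    for port in used_ports:
        print(f"  {port}: {' '.join(map(str, table[port]))}")
# normal
#   A: 1 4 7
#   B: 2 5 8
#   C: 3 6 9
# shifted
#   A: 1 6 8
#   B: 2 4 9
#   C: 3 5 7
```

With shift=1 this reproduces the shifted table from my original mail.
Other shift values just rotate the starting port by a different amount
on each pass, which is the knob users could experiment with.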

Al

>
> --Alex
-- 
Albert Chu
ch...@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory