Gus -- Can you send the output of configure and your config.log?
On Oct 16, 2014, at 4:24 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> On 10/16/2014 05:38 PM, Nathan Hjelm wrote:
>> On Thu, Oct 16, 2014 at 05:27:54PM -0400, Gus Correa wrote:
>>> Thank you, Aurelien!
>>>
>>> Aha, "vader btl", that is new to me!
>>> I thought Vader was that man dressed in black in Star Wars,
>>> Obi-Wan Kenobi's nemesis.
>>> That was a while ago, my kids were children,
>>> and Alec Guinness younger than Harrison Ford is today.
>>> Oh, how nostalgic code developers can get when it comes
>>> to naming things ...
>>>
>>> If I am using "vader", it is totally inadvertent.
>>> There was no such thing in Open MPI 1.6 and earlier.
>>>
>>> Now that you mention it, I can see lots of it in the 1.8.3
>>> ompi_info output.
>>> In addition, my stderr files show messages like this:
>>>
>>> imb.e38352:[1,5]<stddiag>:[node13:16334] mca: bml: Not using sm btl to
>>> [[59987,1],26] on node node13 because vader btl has higher exclusivity
>>> (65536 > 65535)
>>>
>>> So, you are right, "vader" is taking over and knocking off "sm" (and openib
>>> and everybody else).
>>> Darn Vader!
>>> Probably knem is going down the tubes along with sm, right?
>>
>> Depends. If there is a reason to continue supporting knem then vader
>> will be updated to support it. I don't currently see a reason to at this
>> time though (since sm continues to live for now).
>>
>
> Right now knem is not working in OMPI 1.8.3, even if I turn off vader,
> and leave only sm,self,openib.
> I just sent another email documenting that.
>
>>> I was used to sm, openib, self and tcp BTLs.
>>> I normally just do "btl = ^tcp" in the MCA parameters file,
>>> to stick to sm, openib, and self.
>>>
>>> That worked fine in 1.6.5 (and earlier), and knem worked
>>> flawlessly there.
>>> The same settings in 1.8.3 don't bring up the knem functionality.
>>> So, this seems to be yet another change in 1.8.3 that I need to learn.
>>>
>>> Can you or some other list subscriber elaborate a bit about
>>> this 'vader' btl?
>>> The Open MPI FAQ doesn't have anything about it.
>>> What is it after all?
>>> Does it play the same role as "sm", i.e., an intra-node btl?
>>> Considering the name, is "vader" good or bad?
>>> Or better: In which circumstances is "vader" good and when is it bad?
>>
>> Vader is a btl I originally wrote to support Cray's XPMEM shared memory
>> interface. It was designed to be cleaner than btl/sm and to have better
>> small message latency, bandwidth, and message rates. Because its latency
>> is so much better than sm I removed the XPMEM requirement and added CMA
>> support.
>>
>
> I presume this requires kernel 3.X, as Aurelien pointed out.
> As a matter of policy, and to keep your user base broad,
> I would suggest keeping a generous
> range of backwards compatible support built into OMPI.
> This would be sm, knem, etc, which I suppose can coexist with vader, or not?
> I can't speak for others but we run production codes in
> standard Linux distributions (CentOS 6.X, 5.X) with 2.6.Y kernels.
> I suppose other people have similar situations.
>
>>> Should I give in to the dark side of the force and keep "vader"
>>> turned on, or should I just do something like
>>> "btl = ^tcp,^vader" ?
>>
>> You can turn off vader if you want to use knem. I would run some tests
>> to see if there is much of a difference between sm/knem and vader
>> though. I don't have any systems that have knem installed so I haven't
>> been able to run these tests myself. I would primarily focus on the
>> memory usage and the bandwidth.
>>
>> -Nathan
>
> Please, see my last email.
> Turning vader off and sm on still doesn't make knem work,
> unless I made some big mistake along the way.
> I would love to use 1.8.3 in production,
> as long as sm+knem support works, hence it would be
> great if somebody points out any mistake that I may have made.
>
> Also, for large messages, IMB with 1.6.5+sm+knem gives
> me ~30% speedups w.r.t. 1.8.3+sm+(broken)-knem or w.r.t.
> 1.8.3+vader,
> although admittedly due to our 2.6 kernel, no CMA, etc,
> the environment is not favorable to vader to begin with.
> [And yet another good reason to fix/keep sm+knem in OMPI 1.8.]
>
> Thank you,
> Gus Correa
>
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/10/25516.php
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/10/25521.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
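
For readers following the exclusivity message quoted in the thread ("Not using sm btl ... because vader btl has higher exclusivity (65536 > 65535)"), here is a minimal Python sketch of the selection logic as it appears from that log line. It is a simplified model, not Open MPI's actual code: the function names are hypothetical, only the two exclusivity values are taken from the quoted log, and the list of on-node BTLs is an assumption. The filter follows the MCA convention that a single leading "^" excludes every component in the comma-separated list, so the exclusion form is written "^tcp,vader".

```python
# Simplified model of per-peer BTL selection; NOT Open MPI's implementation.
# Function names are hypothetical and exist only for this illustration.

def apply_btl_filter(available, spec):
    """Apply an MCA-style 'btl' value to a list of component names.

    A leading '^' excludes every name in the comma-separated list;
    without it, only the listed names are kept.
    """
    if spec.startswith("^"):
        excluded = set(spec[1:].split(","))
        return [name for name in available if name not in excluded]
    wanted = set(spec.split(","))
    return [name for name in available if name in wanted]


def pick_for_peer(candidates, exclusivity):
    """Keep only the candidate(s) with the highest exclusivity value."""
    if not candidates:
        return []
    best = max(exclusivity[name] for name in candidates)
    return [name for name in candidates if exclusivity[name] == best]


# Exclusivity values for sm and vader come from the log line in the thread.
EXCLUSIVITY = {"vader": 65536, "sm": 65535}

# Assumption: these are the BTLs able to reach a peer on the same node.
ON_NODE_BTLS = ["sm", "vader"]

# "btl = ^tcp" leaves vader enabled, so it outranks sm.
print(pick_for_peer(apply_btl_filter(ON_NODE_BTLS, "^tcp"), EXCLUSIVITY))

# Excluding vader as well leaves sm as the on-node choice.
print(pick_for_peer(apply_btl_filter(ON_NODE_BTLS, "^tcp,vader"), EXCLUSIVITY))
```

Under this model, excluding only tcp still selects vader for on-node peers, which matches the stderr message Gus reported; sm is chosen only once vader is excluded too. Whether knem is then actually used by sm is a separate question that the sketch does not address.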