Gus --

Can you send the output of configure and your config.log?
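
While checking, a minimal sketch of an MCA parameters file for the setup discussed below (no tcp, no vader, sm with knem) may be useful for comparison. It assumes Open MPI was built with knem support and that the per-user file lives in its usual location; the values are illustrative, not taken from your installation. Note that a single leading "^" negates the whole comma-separated list, so the caret is not repeated per component.

```
# $HOME/.openmpi/mca-params.conf (illustrative sketch; adjust as needed)

# One leading ^ excludes every component in the list (here: tcp and vader).
btl = ^tcp,vader

# Ask the sm btl to use knem; this only has an effect if Open MPI was
# configured with knem support. A positive value is meant to make a
# missing knem an error instead of a silent fallback.
btl_sm_use_knem = 1
```

Whether knem was actually detected at build time should be visible in the configure output and config.log, which is why those two files are the first thing to look at.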


On Oct 16, 2014, at 4:24 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:

> On 10/16/2014 05:38 PM, Nathan Hjelm wrote:
>> On Thu, Oct 16, 2014 at 05:27:54PM -0400, Gus Correa wrote:
>>> Thank you, Aurelien!
>>> 
>>> Aha, "vader btl", that is new to me!
>>> I thought Vader was that man dressed in black in Star Wars,
>>> Obi-Wan Kenobi's nemesis.
>>> That was a while ago, my kids were children,
>>> and Alec Guinness younger than Harrison Ford is today.
>>> Oh, how nostalgic code developers can get when it comes
>>> to naming things ...
>>> 
>>> If I am using "vader", it is totally inadvertent.
>>> There was no such thing in Open MPI 1.6 and earlier.
>>> 
>>> Now that you mention it, I can see lots of it in the 1.8.3
>>> ompi_info output.
>>> In addition, my stderr files show messages like this:
>>> 
>>> imb.e38352:[1,5]<stddiag>:[node13:16334] mca: bml: Not using sm btl to
>>> [[59987,1],26] on node node13 because vader btl has higher exclusivity
>>> (65536 > 65535)
>>> 
>>> So, you are right, "vader" is taking over and knocking off "sm" (and openib
>>> and everybody else).
>>> Darn Vader!
>>> Probably knem is going down the tubes along with sm, right?
>> 
>> Depends. If there is a reason to continue supporting knem then vader
>> will be updated to support it. I don't see a reason to at this time,
>> though (since sm continues to live for now).
>> 
> 
> Right now knem is not working in OMPI 1.8.3, even if I turn off vader,
> and leave only sm,self,openib.
> I just sent another email documenting that.
> 
>>> I was used to sm, openib, self and tcp BTLs.
>>> I normally just do "btl = ^tcp" in the MCA parameters file,
>>> to stick to sm, openib, and self.
>>> 
>>> That worked fine in 1.6.5 (and earlier), and knem worked
>>> flawlessly there.
>>> The same settings in 1.8.3 don't bring up the knem functionality.
>>> So, this seems to be yet another change in 1.8.3 that I need to learn.
>>> 
>>> Can you or some other list subscriber elaborate a bit about
>>> this 'vader' btl?
>>> The Open MPI FAQ doesn't have anything about it.
>>> What is it after all?
>>> Does it play the same role as "sm", i.e., an intra-node btl?
>>> Considering the name, is "vader" good or bad?
>>> Or better: In which circumstances is "vader" good and when is it bad?
>> 
>> Vader is a btl I originally wrote to support Cray's XPMEM shared memory
>> interface. It was designed to be cleaner than btl/sm and to have better
>> small-message latency, bandwidth, and message rates. Because its latency
>> is so much better than sm's, I removed the XPMEM requirement and added
>> CMA support.
>> 
> 
> I presume this requires kernel 3.X, as Aurelien pointed out.
> As a matter of policy, and to keep your user base broad,
> I would suggest keeping a generous
> range of backward-compatible support built into OMPI.
> This would be sm, knem, etc, which I suppose can coexist with vader, or not?
> I can't speak for others but we run production codes in
> standard Linux distributions (CentOS 6.X, 5.X) with 2.6.Y kernels.
> I suppose other people have similar situations.
> 
>>> Should I give in to the dark side of the force and keep "vader"
>>> turned on, or should I just do something like
>>> "btl = ^tcp,^vader" ?
>> 
>> You can turn off vader if you want to use knem. I would run some tests
>> to see if there is much of a difference between sm/knem and vader
>> though. I don't have any systems that have knem installed so I haven't
>> been able to run these tests myself. I would primarily focus on the
>> memory usage and the bandwidth.
>>
>> -Nathan
> 
> Please, see my last email.
> Turning vader off and sm on still doesn't make knem work,
> unless I made some big mistake along the way.
> I would love to use 1.8.3 in production,
> as long as sm+knem support works, hence it would be
> great if somebody points out any mistake that I may have made.
> 
> Also, for large messages, IMB with 1.6.5+sm+knem gives
> me ~30% speedups w.r.t. 1.8.3+sm+(broken)-knem or w.r.t. 1.8.3+vader,
> although admittedly due to our 2.6 kernel, no CMA, etc,
> the environment is not favorable to vader to begin with.
> [And yet another good reason to fix/keep sm+knem in OMPI 1.8.]
> 
> Thank you,
> Gus Correa
> 
>> 
>> 
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/10/25516.php
>> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/10/25521.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
