Hi Ralph,

  Well, this one gets chalked up to user error - the default AMIs come
without the NUMA development libraries, so Open MPI didn't get built with
memory-binding support (and in my haste, I hadn't checked).  Oops.  Things
seem to be working correctly now.
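
  For anyone who hits the same thing later, a rough sketch of the fix
(assuming a RHEL-style AMI where the package is numactl-devel; on
Debian/Ubuntu it's libnuma-dev instead):

    # Install the NUMA development headers so Open MPI's hwloc picks them up
    sudo yum install numactl-devel

    # Rebuild Open MPI so configure detects libnuma this time
    ./configure --prefix=$HOME/openmpi && make -j && make install

    # Sanity check (assuming the hwloc utilities are installed):
    # hwloc should now report memory-binding support
    hwloc-info --support | grep membind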

  Thanks again for your help,
  - Brian


On Fri, Dec 22, 2017 at 2:14 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:

> I honestly don’t know - will have to defer to Brian, who is likely out for
> at least the extended weekend. I’ll point this one to him when he returns.
>
>
> On Dec 22, 2017, at 1:08 PM, Brian Dobbins <bdobb...@gmail.com> wrote:
>
>
>   Hi Ralph,
>
>   OK, that certainly makes sense - so the next question is: what prevents
> memory from being bound local to particular cores?  Is this even possible
> in a virtualized environment like AWS HVM instances?
>
>   And does this apply only to dynamic allocations within an instance, or
> static as well?  I'm pretty unfamiliar with how the hypervisor (KVM-based,
> I believe) maps out 'real' hardware, including memory, to particular
> instances.  We've seen *some* parts of the code (the bandwidth-heavy ones)
> run ~10x faster on bare-metal hardware, though, *presumably* from memory
> locality, so it certainly has a big impact.
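>
>   (For what it's worth, a quick way to see what NUMA topology the guest
> actually exposes - assuming numactl and the hwloc tools are installed,
> which varies by AMI - is:
>
>     numactl --hardware
>     lstopo --no-io
>
> If the instance only shows a single NUMA node, there's no locality
> information for hwloc to bind memory against in the first place.)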
>
>   Thanks again, and merry Christmas!
>   - Brian
>
>
> On Fri, Dec 22, 2017 at 1:53 PM, r...@open-mpi.org <r...@open-mpi.org>
> wrote:
>
>> Actually, that message is telling you that binding to core is available,
>> but that we cannot bind memory to be local to that core.  You can verify
>> the binding pattern by adding --report-bindings to your command line.
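>>
>> For example, something like (a sketch - adjust the process count and
>> executable for your setup):
>>
>>     mpirun --bind-to core --report-bindings -np 4 ./your_app
>>
>> Each rank's binding is then printed to stderr at launch, so you can
>> confirm the processes really are pinned to distinct cores.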
>>
>>
>> On Dec 22, 2017, at 11:58 AM, Brian Dobbins <bdobb...@gmail.com> wrote:
>>
>>
>> Hi all,
>>
>>   We're testing a model on AWS using C4/C5 nodes, and some of our timers,
>> in a part of the code with no communication, show really poor performance
>> compared to native runs.  We think this is because we're not binding
>> processes to cores properly and are thus losing cache locality, and a
>> quick 'mpirun --bind-to core hostname' does suggest issues with this on AWS:
>>
>> [bdobbins@head run]$ mpirun --bind-to core hostname
>>
>> --------------------------------------------------------------------------
>> WARNING: a request was made to bind a process. While the system
>> supports binding the process itself, at least one node does NOT
>> support binding memory to the process location.
>>
>>   Node:  head
>>
>> Open MPI uses the "hwloc" library to perform process and memory
>> binding. This error message means that hwloc has indicated that
>> processor binding support is not available on this machine.
>>
>>   (It also happens on compute nodes, and with real executables.)
>>
>>   Does anyone know how to enforce binding to cores on AWS instances?  Any
>> insight would be great.
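>>
>>   (In case it helps in diagnosing: assuming the hwloc command-line
>> utilities are installed on the instance, something like
>>
>>     hwloc-info --support | grep -i bind
>>
>> should list the cpubind/membind capabilities hwloc thinks the node has.)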
>>
>>   Thanks,
>>   - Brian
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
