Re: [OMPI devel] 1.5.x plans

2010-10-30 Thread Shamis, Pavel
IMHO "B" will require a lot of attention from all developers/vendors, as well 
it maybe quite time consuming task (btw, I think it is q couple of openib btl 
changes that aren't on the list). So probably it will be good to ask all btl 
(or other modules/features) maintainers directly.

Personally I prefer option C, then A.

My 0.02c 

- Pasha

On Oct 26, 2010, at 5:07 PM, Jeff Squyres wrote:

> On the teleconf today, two important topics were discussed about the 1.5.x 
> series:
> 
> -
> 
> 1. I outlined my plan for a "small" 1.5.1 release.  It is intended to fix a 
> small number of compilation and portability issues.  Everyone seemed to think 
> that this was an ok idea.  I have done some tomfoolery in Trac to re-target a 
> bunch of tickets -- those listed in 1.5.1 are the only ones that I intend to 
> apply to 1.5.1:
> 
>https://svn.open-mpi.org/trac/ompi/report/15
> 
> (there's one critical bug that I don't know how to fix -- I'm waiting for 
> feedback from Red Hat before I can continue)
> 
> *** Does anyone have any other tickets/bugs that they want/need in a 
> short-term 1.5.1 release?
> 
> -
> 
> 2. We discussed what to do for 1.5.2.  Because 1.5[.0] took so long to 
> release, there's now a sizable divergence between the trunk and the 1.5 
> branch.  The problem is that there are a number of wide-reaching new features 
> on the trunk, some of which may (will) be difficult to bring to the v1.5 
> branch in a piecemeal fashion, including (but not limited to):
> 
> - Paffinity changes (including new hwloc component)
> - --with-libltdl changes
> - Ummunotify support
> - Solaris sysinfo component
> - Notifier improvements
> - OPAL_SOS
> - Common shared memory improvements
> - Build system improvements
> - New libevent
> - BFO PML
> - Almost all ORTE changes
> - Bunches of checkpoint restart mo'betterness (including MPI extensions)
> 
> There seem to be 3 obvious options about moving forward (all assume that we 
> do 1.5.1 as described above):
> 
>   A. End the 1.5 line (i.e., work towards transitioning it to 1.6), and then 
> re-branch the trunk to be v1.7.
>   B. Sync the trunk to the 1.5 branch en masse.  Stabilize that and call it 
> 1.5.2.
>   C. Do the same thing as A, but wait at least 6 months (i.e., give the 1.5 
> series time to mature).
> 
> Most people (including me) favored B.  Rich was a little concerned that B 
> spent too much time on maintenance/logistics when we could just be moving 
> forward, and therefore favored either A or C.
> 
> Any opinions from people who weren't there on the call today?
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] openib btl_openib_async_thread poll question

2010-12-21 Thread Shamis, Pavel
According to the man pages, only POLLIN or errors may be returned in this specific
case:

The bits returned in revents can include any of those specified in events, or
one of the values POLLERR, POLLHUP, or POLLNVAL.  (These three bits are
meaningless in the events field, and will be set in the revents field whenever
the corresponding condition is true.)

Since POLLRDNORM was not specified in the event mask, it is an "unexpected" event
that is handled as an error.
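
For illustration, a minimal sketch (plain poll(2) only; this is not the actual
btl_openib_async.c code) of a revents check that accepts POLLRDNORM alongside
POLLIN:

#define _XOPEN_SOURCE 600   /* POLLRDNORM is an XSI bit */
#include <poll.h>

/* Hypothetical helper: treat POLLIN and POLLRDNORM both as "data ready"
 * and only treat the documented error bits as failures. */
static int classify_revents(short revents)
{
    if (revents & (POLLERR | POLLHUP | POLLNVAL)) {
        return -1;          /* real error condition */
    }
    if (revents & (POLLIN | POLLRDNORM)) {
        return 1;           /* readable -- go handle the event */
    }
    return 0;               /* nothing of interest */
}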


Regards,
Pasha

On Dec 21, 2010, at 11:11 AM, Terry Dontje wrote:

After further inspection I saw that events is being set to POLLIN only.  Is 
that supposed to mask out any other bits from being set (like POLLRDNORM)?

--td
On 12/21/2010 10:35 AM, Terry Dontje wrote:
We're doing some testing with openib btl on a system with Solaris.  It looks 
like Solaris can return POLLIN|POLLRDNORM in revents from a poll call.  I 
looked at the manpages for Linux and it reads like Linux could possibly do this 
too.  However the code in btl_openib_async_thread that checks for valid revents 
is only checking for POLLIN and in the case it gets POLLIN|POLLRDNORM the btl 
ends up throwing an error.  I think erroring out on the POLLIN|POLLRDNORM case 
is a bug.

Does anyone feel otherwise and can explain to me why we should not consider 
POLLIN|POLLRDNORM as a valid condition?  I have the same question pertaining to 
POLLRDBAND too but I don't believe we've seen this set.

thanks,
--

Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com





___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


--

Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com



___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] async thread in openib BTL

2010-12-23 Thread Shamis, Pavel
The async thread is used to handle asynchronous error/notification events, like
port up/down, HCA errors, etc.
So most of the time the thread sleeps; in a healthy network you are not supposed
to see any events.
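
For reference, a minimal sketch of the kind of loop such a thread runs
(assumptions: a single struct ibv_context *ctx and simplified error handling;
this is not the actual btl_openib_async.c code):

#include <poll.h>
#include <infiniband/verbs.h>

/* Hypothetical async-event loop: sleep in poll() on the device's async fd
 * and wake up only when the HCA reports something. */
static void drain_async_events(struct ibv_context *ctx)
{
    struct pollfd pfd = { .fd = ctx->async_fd, .events = POLLIN };

    while (poll(&pfd, 1, -1) > 0) {            /* blocks until an event arrives */
        struct ibv_async_event event;
        if (ibv_get_async_event(ctx, &event)) {
            break;                             /* fd closed or fatal error */
        }
        switch (event.event_type) {
        case IBV_EVENT_PORT_ACTIVE:
        case IBV_EVENT_PORT_ERR:
        case IBV_EVENT_DEVICE_FATAL:
            /* port up/down, fatal HCA error, ... -- report or recover here */
            break;
        default:
            break;
        }
        ibv_ack_async_event(&event);           /* every event must be acked */
    }
}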

Regards,

Pasha

On Dec 23, 2010, at 12:49 AM, Eugene Loh wrote:

> I'm starting to look at the openib BTL for the first time and am 
> puzzled.  In btl_openib_async.c, it looks like an asynchronous thread is 
> started.  During MPI_Init(), the main thread sends the async thread a 
> file descriptor for each IB interface to be polled.  In MPI_Finalize(), 
> the main thread asks the async thread to shut down.  Between MPI_Init() 
> and MPI_Finalize(), I would think that the async thread would poll on 
> the IB fd's and handle events that come up.  If I stick print statements 
> into the async thread, however, I don't see any events come up on the IB 
> fd's.  So, the async thread is useless.  Yes?  It starts up and shuts 
> down, but never sees any events on the IB devices?
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] IBV_EVENT_QP_ACCESS_ERR

2011-01-03 Thread Shamis, Pavel
It looks like we are touching some QP that was already released. Before closing a QP we
make sure to complete all outstanding messages on the endpoint. Once all QPs
(and other resources) are closed, we signal the async thread to remove this HCA
from the monitoring list. To me it looks like somehow we close the QP before all
outstanding requests were completed.

Regards
---
Pavel Shamis (Pasha)

On Jan 3, 2011, at 12:44 PM, Jeff Squyres (jsquyres) wrote:

> I'd guess the same thing as George - a race condition in the shutdown of the 
> async thread...?  I haven't looked at that code in a long, long time to 
> remember how it tried to defend against the race condition. 
> 
> Sent from my PDA. No type good. 
> 
> On Jan 3, 2011, at 2:31 PM, "Eugene Loh"  wrote:
> 
>> George Bosilca wrote:
>> 
>>> Eugene,
>>> 
>>> This error indicate that somehow we're accessing the QP while the QP is in 
>>> "down" state. As the asynchronous thread is the one that see this error, I 
>>> wonder if it doesn't look for some information about a QP that has been 
>>> destroyed by the main thread (as this only occurs in MPI_Finalize).
>>> 
>>> Can you look in the syslog to see if there is any additional info related 
>>> to this issue there?
>>> 
>> Not much.  A one-liner like this:
>> 
>> Dec 27 21:49:36 burl-ct-x4150-11 hermon: [ID 492207 kern.info] hermon1: EQE 
>> local access violation
>> 
>>> On Dec 30, 2010, at 20:43, Eugene Loh  wrote:
>>> 
 I was running a bunch of np=4 test programs over two nodes.  Occasionally, 
 *one* of the codes would see an IBV_EVENT_QP_ACCESS_ERR during 
 MPI_Finalize().  I traced the code and ran another program that mimicked 
 the particular MPI calls made by that program.  This other program, too, 
 would occasionally trigger this error.  I never saw the problem with other 
 tests.  Rate of incidence could go from consecutive runs (I saw this once) 
 to 1:100s (more typically) to even less frequently -- I've had 1000s of 
 consecutive runs with no problems.  (The tests run a few seconds apiece.)  
 The traffic pattern is sends from non-zero ranks to rank 0, with root-0 
 gathers, and lots of Allgathers.  The largest messages are 1000bytes.  It 
 appears the problem is always seen on rank 3.
 
 Now, I wouldn't mind someone telling me, based on that little information, 
 what the problem is here, but I guess I don't expect that.  What I am 
 asking is what IBV_EVENT_QP_ACCESS_ERR means.  Again, it's seen during 
 MPI_Finalize.  The async thread is seeing this.  What is this error trying 
 to tell me?
 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] OMPI 1.4.3 hangs in gather

2011-01-12 Thread Shamis, Pavel
RDMACM or OOB cannot affect the performance of this benchmark, since they are
not involved in communication. So I'm not sure that the performance changes
that you see are related to the connection manager changes.
About oob - I'm not aware of any hang issues there; the code is very, very old
and we have not touched it for a long time.

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory
Email: sham...@ornl.gov





On Jan 12, 2011, at 8:45 AM, Doron Shoham wrote:

> Hi,
>
> For the first problem, I can see that when using rdmacm as openib oob
> I get much better performance results (and no hangs!).
>
> mpirun -display-map -np 64 -machinefile voltairenodes -mca btl
> sm,self,openib -mca btl_openib_connect_rdmacm_priority 100
> imb/src/IMB-MPI1 gather -npmin 64
>
>
> #bytes  #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>       0          1000         0.04         0.05         0.05
>       1          1000        19.64        19.69        19.67
>       2          1000        19.97        20.02        19.99
>       4          1000        21.86        21.96        21.89
>       8          1000        22.87        22.94        22.90
>      16          1000        24.71        24.80        24.76
>      32          1000        27.23        27.32        27.27
>      64          1000        30.96        31.06        31.01
>     128          1000        36.96        37.08        37.02
>     256          1000        42.64        42.79        42.72
>     512          1000        60.32        60.59        60.46
>    1024          1000        82.44        82.74        82.59
>    2048          1000       497.66       499.62       498.70
>    4096          1000       684.15       686.47       685.33
>    8192           519       544.07       546.68       545.85
>   16384           519       653.20       656.23       655.27
>   32768           519       704.48       707.55       706.60
>   65536           519       918.00       922.12       920.86
>  131072           320      2414.08      2422.17      2418.20
>  262144           160      4198.25      4227.58      4213.19
>  524288            80      7333.04      7503.99      7438.18
> 1048576            40     13692.60     14150.20     13948.75
> 2097152            20     30377.34     32679.15     31779.86
> 4194304            10     61416.70     71012.50     68380.04
>
> How can the oob cause the hang? Isn't it only used to bring up the connection?
> Does the oob play any part after the connections were made?
>
> Thanks,
> Doron
>
> On Tue, Jan 11, 2011 at 2:58 PM, Doron Shoham  wrote:
>>
>> Hi
>>
>> All machines on the setup are IDataPlex with Nehalem 12 cores per node, 24GB 
>>  memory.
>>
>>
>>
>> · Problem 1 – OMPI 1.4.3 hangs in gather:
>>
>>
>>
>> I’m trying to run IMB and gather operation with OMPI 1.4.3 (Vanilla).
>>
>> It happens when np >= 64 and message size exceed 4k:
>>
>> mpirun -np 64 -machinefile voltairenodes -mca btl sm,self,openib  
>> imb/src-1.4.2/IMB-MPI1 gather –npmin 64
>>
>>
>>
>> voltairenodes consists of 64 machines.
>>
>>
>>
>> #
>>
>> # Benchmarking Gather
>>
>> # #processes = 64
>>
>> #
>>
>>    #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>>         0         1000         0.02         0.02         0.02
>>         1          331        14.02        14.16        14.09
>>         2          331        12.87        13.08        12.93
>>         4          331        14.29        14.43        14.34
>>         8          331        16.03        16.20        16.11
>>        16          331        17.54        17.74        17.64
>>        32          331        20.49        20.62        20.53
>>        64          331        23.57        23.84        23.70
>>       128          331        28.02        28.35        28.18
>>       256          331        34.78        34.88        34.80
>>       512          331        46.34        46.91        46.60
>>      1024          331        63.96        64.71        64.33
>>      2048          331       460.67       465.74       463.18
>>      4096          331       637.33       643.99       640.75
>>
>>
>>
>> This the padb output:
>>
>> padb –A –x –Ormgr=mpirun –tree:
>>
>>
>>
>> =~=~=~=~=~=~=~=~=~=~=~= PuTTY log 2011.01.06 14:33:17 =~=~=~=~=~=~=~=~=~=~=~=
>>
>>
>>
>> Warning, remote process state differs across ranks
>>
>> state : ranks
>>
>> R (running) : 
>> [1,3-6,8,10-13,16-20,23-28,30-32,34-42,44-45,47-49,51-53,56-59,61-63]
>>
>> S (sleeping) : [0,2,7,9,14-15,21-22,29,33,43,46,50,54-55,60]
>>
>> Stack trace(s) for thread: 1
>>
>> -
>>
>> [0-63] (64 processes)
>>
>> -
>>
>> main() at ?:?
>>
>>  IMB_init_buffers_iter() at ?:?
>>
>>IMB_gather() at ?:?
>>
>>  PMPI_Gather() at pgather.c:175
>>
>>mca_coll_sync_gather() at coll_sync_gather.c:46
>>
>>  ompi_coll_tuned_gather_intra_dec_fixed() at 
>> coll_tuned_decision_fixe

Re: [OMPI devel] OMPI 1.4.3 hangs in gather

2011-01-13 Thread Shamis, Pavel
RDMACM creates the same QPs with the same tunings as OOB, so I don't see how
the CPC could affect performance.

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 13, 2011, at 2:15 PM, Jeff Squyres wrote:

> +1 on what Pasha said -- if using rdmacm fixes the problem, then there's 
> something else nefarious going on...
>
> You might want to check padb with your hangs to see where all the processes 
> are hung to see if anything obvious jumps out.  I'd be surprised if there's a 
> bug in the oob cpc; it's been around for a long, long time; it should be 
> pretty stable.
>
> Do we create QP's differently between oob and rdmacm, such that perhaps they 
> are "better" (maybe better routed, or using a different SL, or ...) when 
> created via rdmacm?
>
>
> On Jan 12, 2011, at 12:12 PM, Shamis, Pavel wrote:
>
>> RDMACM or OOB can not effect on performance of this benchmark, since they 
>> are not involved in communication. So I'm not sure that the performance 
>> changes that you see are related to connection manager changes.
>> About oob - I'm not aware about hangs issue there, the code is very-very 
>> old, we did not touch it for a long time.
>>
>> Regards,
>>
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>> Email: sham...@ornl.gov
>>
>>
>>
>>
>>
>> On Jan 12, 2011, at 8:45 AM, Doron Shoham wrote:
>>
>>> Hi,
>>>
>>> For the first problem, I can see that when using rdmacm as openib oob
>>> I get much better performence results (and no hangs!).
>>>
>>> mpirun -display-map -np 64 -machinefile voltairenodes -mca btl
>>> sm,self,openib -mca btl_openib_connect_rdmacm_priority 100
>>> imb/src/IMB-MPI1 gather -npmin 64
>>>
>>>
>>> #bytes  #repetitionst_min[usec] t_max[usec] t_avg[usec]
>>>
>>> 0   10000.040.050.05
>>>
>>> 1   100019.64   19.69   19.67
>>>
>>> 2   100019.97   20.02   19.99
>>>
>>> 4   100021.86   21.96   21.89
>>>
>>> 8   100022.87   22.94   22.90
>>>
>>> 16  100024.71   24.80   24.76
>>>
>>> 32  100027.23   27.32   27.27
>>>
>>> 64  100030.96   31.06   31.01
>>>
>>> 128 100036.96   37.08   37.02
>>>
>>> 256 100042.64   42.79   42.72
>>>
>>> 512 100060.32   60.59   60.46
>>>
>>> 1024100082.44   82.74   82.59
>>>
>>> 20481000497.66  499.62  498.70
>>>
>>> 40961000684.15  686.47  685.33
>>>
>>> 8192519 544.07  546.68  545.85
>>>
>>> 16384   519 653.20  656.23  655.27
>>>
>>> 32768   519 704.48  707.55  706.60
>>>
>>> 65536   519 918.00  922.12  920.86
>>>
>>> 131072  320 2414.08 2422.17 2418.20
>>>
>>> 262144  160 4198.25 4227.58 4213.19
>>>
>>> 524288  80  7333.04 7503.99 7438.18
>>>
>>> 1048576 40  13692.6014150.2013948.75
>>>
>>> 2097152 20  30377.3432679.1531779.86
>>>
>>> 4194304 10  61416.7071012.5068380.04
>>>
>>> How can the oob cause the hang? Isn't it only used to bring up the 
>>> connection?
>>> Does the oob has any part of the connections were made?
>>>
>>> Thanks,
>>> Dororn
>>>
>>> On Tue, Jan 11, 2011 at 2:58 PM, Doron Shoham  wrote:
>>>>
>>>> Hi
>>>>
>>>> All machines on the setup are IDataPlex with Nehalem 12 cores per node, 
>>>> 24GB  memory.
>>>>
>>>>
>>>>
>>>> · Problem 1 – OMPI 1.4.3 hangs in gather:
>>>>
>>>>
>>>>
>>>> I’m trying to run IMB and gather operation with OMPI 1.4.3 (Vanilla).
>>>>
>>>> It happens when np >= 64 and message size exceed 4k:
>>>>
>>>> mpirun -np 64 -machinefile voltairenod

Re: [OMPI devel] OMPI 1.4.3 hangs in gather

2011-01-16 Thread Shamis, Pavel
Well, then I would suspect the rdmacm vs. oob QP configuration. They are supposed to be
the same, but probably there is some bug there and somehow the rdmacm QP tuning
differs from oob; that is a potential cause for the performance
differences that you see.
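
One way to confirm or rule that out (a debugging sketch; it assumes you can get
hold of a struct ibv_qp * from inside the btl, e.g. via a temporary printf
patch -- this is not existing OMPI code) is to dump the attributes of a QP
created through each CPC and diff the output:

#include <stdio.h>
#include <infiniband/verbs.h>

/* Hypothetical debugging helper: print the QP attributes that typically
 * differ between connection managers, so oob- and rdmacm-created QPs can
 * be compared side by side. */
static void dump_qp_tuning(struct ibv_qp *qp)
{
    struct ibv_qp_attr attr;
    struct ibv_qp_init_attr init_attr;
    int mask = IBV_QP_PATH_MTU | IBV_QP_TIMEOUT | IBV_QP_RETRY_CNT |
               IBV_QP_RNR_RETRY | IBV_QP_MIN_RNR_TIMER |
               IBV_QP_MAX_QP_RD_ATOMIC;

    if (ibv_query_qp(qp, &attr, mask, &init_attr)) {
        perror("ibv_query_qp");
        return;
    }
    printf("qp 0x%x: mtu=%d timeout=%d retry_cnt=%d rnr_retry=%d "
           "min_rnr_timer=%d max_rd_atomic=%d sq_depth=%u\n",
           qp->qp_num, attr.path_mtu, attr.timeout, attr.retry_cnt,
           attr.rnr_retry, attr.min_rnr_timer, attr.max_rd_atomic,
           init_attr.cap.max_send_wr);
}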

Regards,
Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 16, 2011, at 4:12 AM, Doron Shoham wrote:

> Hi,
>
> The gather hangs only in the linear_sync algorithm but works with
> the basic_linear and binomial algorithms.
> The gather algorithm is chosen dynamically depending on block size and
> communicator size.
> So, in the beginning, the binomial algorithm is chosen (communicator size
> is larger than 60).
> When increasing the message size, the linear_sync algorithm is chosen
> (with small_segment_size).
> When debugging on the cluster I saw that the linear_sync function is
> called in an endless loop with a segment size of 1024.
> This explains why the hang occurs in the middle of the run.
>
> I still don't understand why RDMACM solves it or what causes
> the linear_sync hangs.
>
> Again, in 1.5 it doesn't hang (maybe the timing is different?).
> I'm still trying to understand what the differences are in those areas
> between 1.4.3 and 1.5
>
>
> BTW,
> Choosing RDMACM fixes hangs and performance issues in all collective 
> operations.
>
> Thanks,
> Doron
>
>
> On Thu, Jan 13, 2011 at 9:44 PM, Shamis, Pavel  wrote:
>> RDMACM creates the same QPs with the same tunings as OOB, so I don't see how 
>> CPC may effect on performance.
>>
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>>
>>
>>
>>
>>
>>
>> On Jan 13, 2011, at 2:15 PM, Jeff Squyres wrote:
>>
>>> +1 on what Pasha said -- if using rdmacm fixes the problem, then there's 
>>> something else nefarious going on...
>>>
>>> You might want to check padb with your hangs to see where all the processes 
>>> are hung to see if anything obvious jumps out.  I'd be surprised if there's 
>>> a bug in the oob cpc; it's been around for a long, long time; it should be 
>>> pretty stable.
>>>
>>> Do we create QP's differently between oob and rdmacm, such that perhaps 
>>> they are "better" (maybe better routed, or using a different SL, or ...) 
>>> when created via rdmacm?
>>>
>>>
>>> On Jan 12, 2011, at 12:12 PM, Shamis, Pavel wrote:
>>>
>>>> RDMACM or OOB can not effect on performance of this benchmark, since they 
>>>> are not involved in communication. So I'm not sure that the performance 
>>>> changes that you see are related to connection manager changes.
>>>> About oob - I'm not aware about hangs issue there, the code is very-very 
>>>> old, we did not touch it for a long time.
>>>>
>>>> Regards,
>>>>
>>>> Pavel (Pasha) Shamis
>>>> ---
>>>> Application Performance Tools Group
>>>> Computer Science and Math Division
>>>> Oak Ridge National Laboratory
>>>> Email: sham...@ornl.gov
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Jan 12, 2011, at 8:45 AM, Doron Shoham wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> For the first problem, I can see that when using rdmacm as openib oob
>>>>> I get much better performence results (and no hangs!).
>>>>>
>>>>> mpirun -display-map -np 64 -machinefile voltairenodes -mca btl
>>>>> sm,self,openib -mca btl_openib_connect_rdmacm_priority 100
>>>>> imb/src/IMB-MPI1 gather -npmin 64
>>>>>
>>>>>
>>>>> #bytes  #repetitionst_min[usec] t_max[usec] 
>>>>> t_avg[usec]
>>>>>
>>>>> 0   10000.040.050.05
>>>>>
>>>>> 1   100019.64   19.69   19.67
>>>>>
>>>>> 2   100019.97   20.02   19.99
>>>>>
>>>>> 4   100021.86   21.96   21.89
>>>>>
>>>>> 8   100022.87   22.94   22.90
>>>>>
>>>>> 16  100024.71   24.80   24.76
>>>>>
>>>>> 32  100027.23   27.32   27.27
>>>>>
>&

Re: [OMPI devel] OFED question

2011-01-27 Thread Shamis, Pavel
Unfortunately verbose error reports are not so friendly... anyway, I can think
of 2 issues:

1. You are trying to open too many QPs. By default IB devices support a fairly
large number of QPs and it is quite hard to push them into this corner, but if your
job is really huge it may be the case. Or, for example, if you share the compute
nodes with some other processes that create a lot of QPs. The maximum number of
supported QPs can be seen in ibv_devinfo.

2. The memory limit for registered memory is too low; as a result the driver fails
to allocate and register memory for the QP. This scenario is the most common one. It just
happened to me recently: system folks pushed some crap into limits.conf.
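
As a side note, the registered-memory limit in case 2 is usually just the
locked-memory ulimit. A quick way to check it (generic commands, nothing
OMPI-specific) is:

ulimit -l                                  # per-process locked-memory limit; you want "unlimited"
grep memlock /etc/security/limits.conf     # what the system is configured to hand out

and the usual fix (assuming unlimited locked memory is acceptable on your
cluster) is adding to /etc/security/limits.conf:

* soft memlock unlimited
* hard memlock unlimited

and logging in again (or restarting the resource manager daemons) so the new
limits take effect.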

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 27, 2011, at 5:56 PM, Barrett, Brian W wrote:

> All -
> 
> On one of our clusters, we're seeing the following on one of our 
> applications, I believe using Open MPI 1.4.3:
> 
> [xxx:27545] *** An error occurred in MPI_Scatterv
> [xxx:27545] *** on communicator MPI COMMUNICATOR 5 DUP FROM 4
> [xxx:27545] *** MPI_ERR_OTHER: known error not in list
> [xxx:27545] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> [xxx][[31806,1],0][connect/btl_openib_connect_oob.c:857:qp_create_one] error 
> creating qp errno says Resource temporarily unavailable
> --
> mpirun has exited due to process rank 0 with PID 27545 on
> node rs1891 exiting without calling "finalize". This may
> have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --
> 
> 
> The problem goes away if we modify the eager protocol msg sizes so that there 
> are only two QPs necessary instead of the default 4.  Is there a way to bump 
> up the number of QPs that can be created on a node, assuming the issue is 
> just running out of available QPs?  If not, any other thoughts on working 
> around the problem?
> 
> Thanks,
> 
> Brian
> 
> --
>  Brian W. Barrett
>  Dept. 1423: Scalable System Software
>  Sandia National Laboratories
> 
> 
> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] OFED question

2011-01-27 Thread Shamis, Pavel
Brian,
I would calculate the maximum number of QPs for an all-to-all connection pattern:
4*num_nodes*num_cores^2
and then compare it to the number reported by: ibv_devinfo -v | grep max_qp
If your theoretical maximum is close to the ibv_devinfo number, then I would suspect
the QP limitation. The driver manages some internal QPs, so you cannot actually reach the max.

For the memory limit, I do not have any good idea. If it happens in the early stages of the
app, then probably the limit is really small and I would verify it with IT.
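
A quick worked example (16 ranks per node is just an assumed number for
illustration):

4 * num_nodes * num_cores^2 = 4 * num_nodes * 16^2 = 1024 * num_nodes

so against a max_qp of ~261056 (the value Paul quotes below) the all-to-all QP
budget would be exhausted somewhere around 250 nodes, even before counting the
QPs the driver reserves for itself.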

Regards,
Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 27, 2011, at 8:09 PM, Barrett, Brian W wrote:

> Pasha -
> 
> Is there a way to tell which of the two happened or to check the number of 
> QPs available per node?  The app likely does talk to a large number of peers 
> from each process, and the nodes are fairly "fat" - it's quad socket, quad 
> core and they are running 16 MPI ranks for each node.  
> 
> Brian
> 
> On Jan 27, 2011, at 6:17 PM, Shamis, Pavel wrote:
> 
>> Unfortunately verbose error reports are not so friendly...anyway , I may 
>> think about 2 issues:
>> 
>> 1. You trying to open open too much QPs. By default ib devices support 
>> fairly large amount of QPs and it is quite hard to push it to this corner. 
>> But If your job is really huge it may be the case. Or for example, if you 
>> share the compute nodes with some other processes that create a lot of qps. 
>> The maximum amount of supported qps you may see in ibv_devinfo.
>> 
>> 2. The memory limit for registered memory is too low, as result driver fails 
>> allocate and register memory for QP. This scenario is most common. Just 
>> happened to me recently, system folks pushed some crap into limits.conf.
>> 
>> Regards,
>> 
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>> 
>> 
>> 
>> 
>> 
>> 
>> On Jan 27, 2011, at 5:56 PM, Barrett, Brian W wrote:
>> 
>>> All -
>>> 
>>> On one of our clusters, we're seeing the following on one of our 
>>> applications, I believe using Open MPI 1.4.3:
>>> 
>>> [xxx:27545] *** An error occurred in MPI_Scatterv
>>> [xxx:27545] *** on communicator MPI COMMUNICATOR 5 DUP FROM 4
>>> [xxx:27545] *** MPI_ERR_OTHER: known error not in list
>>> [xxx:27545] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>> [xxx][[31806,1],0][connect/btl_openib_connect_oob.c:857:qp_create_one] 
>>> error creating qp errno says Resource temporarily unavailable
>>> --
>>> mpirun has exited due to process rank 0 with PID 27545 on
>>> node rs1891 exiting without calling "finalize". This may
>>> have caused other processes in the application to be
>>> terminated by signals sent by mpirun (as reported here).
>>> --
>>> 
>>> 
>>> The problem goes away if we modify the eager protocol msg sizes so that 
>>> there are only two QPs necessary instead of the default 4.  Is there a way 
>>> to bump up the number of QPs that can be created on a node, assuming the 
>>> issue is just running out of available QPs?  If not, any other thoughts on 
>>> working around the problem?
>>> 
>>> Thanks,
>>> 
>>> Brian
>>> 
>>> --
>>> Brian W. Barrett
>>> Dept. 1423: Scalable System Software
>>> Sandia National Laboratories
>>> 
>>> 
>>> 
>>> 
>>> 
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
> 
> --
>  Brian W. Barrett
>  Dept. 1423: Scalable System Software
>  Sandia National Laboratories
> 
> 
> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] OFED question

2011-01-27 Thread Shamis, Pavel
Good point Paul.

I love XRC :-)

You may try to switch the default configuration to XRC:
--mca btl_openib_receive_queues 
X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32

If XRC is not supported on your platform, OMPI should report a nice error message.

BTW, on a multi-core system XRC should show better performance.
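
For example (a hypothetical command line -- adjust the application, process
count and btl list to your setup), the whole thing plugs into mpirun as:

mpirun -np 256 --mca btl openib,sm,self --mca btl_openib_receive_queues X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32 ./IMB-MPI1 gather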

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 27, 2011, at 8:19 PM, Paul H. Hargrove wrote:

> Brian,
> 
> As Pasha said:
>> The maximum amount of supported qps you may see in ibv_devinfo.
> 
> However you'll probably need "-v":
> 
> {hargrove@cvrsvc05 ~}$ ibv_devinfo | grep max_qp:
> {hargrove@cvrsvc05 ~}$ ibv_devinfo -v | grep max_qp:
> max_qp: 261056
> 
> If you really are running out of QPs due to the "fatness" of the node, 
> then you should definitely look at enabling XRC if your HCA and 
> libibverbs version supports it.  ibv_devinfo can query the HCA capability:
> 
> {hargrove@cvrsvc05 ~}$ ibv_devinfo -v | grep port_cap_flags:
> port_cap_flags: 0x02510868
> 
> and look for bit 0x00100000 (== 1<<20).
> 
> -Paul
> 
> 
> 
> On 1/27/2011 5:09 PM, Barrett, Brian W wrote:
>> Pasha -
>> 
>> Is there a way to tell which of the two happened or to check the number of 
>> QPs available per node?  The app likely does talk to a large number of peers 
>> from each process, and the nodes are fairly "fat" - it's quad socket, quad 
>> core and they are running 16 MPI ranks for each node.
>> 
>> Brian
>> 
>> On Jan 27, 2011, at 6:17 PM, Shamis, Pavel wrote:
>> 
>>> Unfortunately verbose error reports are not so friendly...anyway , I may 
>>> think about 2 issues:
>>> 
>>> 1. You trying to open open too much QPs. By default ib devices support 
>>> fairly large amount of QPs and it is quite hard to push it to this corner. 
>>> But If your job is really huge it may be the case. Or for example, if you 
>>> share the compute nodes with some other processes that create a lot of qps. 
>>> The maximum amount of supported qps you may see in ibv_devinfo.
>>> 
>>> 2. The memory limit for registered memory is too low, as result driver 
>>> fails allocate and register memory for QP. This scenario is most common. 
>>> Just happened to me recently, system folks pushed some crap into 
>>> limits.conf.
>>> 
>>> Regards,
>>> 
>>> Pavel (Pasha) Shamis
>>> ---
>>> Application Performance Tools Group
>>> Computer Science and Math Division
>>> Oak Ridge National Laboratory
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Jan 27, 2011, at 5:56 PM, Barrett, Brian W wrote:
>>> 
>>>> All -
>>>> 
>>>> On one of our clusters, we're seeing the following on one of our 
>>>> applications, I believe using Open MPI 1.4.3:
>>>> 
>>>> [xxx:27545] *** An error occurred in MPI_Scatterv
>>>> [xxx:27545] *** on communicator MPI COMMUNICATOR 5 DUP FROM 4
>>>> [xxx:27545] *** MPI_ERR_OTHER: known error not in list
>>>> [xxx:27545] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>>> [xxx][[31806,1],0][connect/btl_openib_connect_oob.c:857:qp_create_one] 
>>>> error creating qp errno says Resource temporarily unavailable
>>>> --
>>>> mpirun has exited due to process rank 0 with PID 27545 on
>>>> node rs1891 exiting without calling "finalize". This may
>>>> have caused other processes in the application to be
>>>> terminated by signals sent by mpirun (as reported here).
>>>> --
>>>> 
>>>> 
>>>> The problem goes away if we modify the eager protocol msg sizes so that 
>>>> there are only two QPs necessary instead of the default 4.  Is there a way 
>>>> to bump up the number of QPs that can be created on a node, assuming the 
>>>> issue is just running out of available QPs?  If not, any other thoughts on 
>>>> working around the problem?
>>>> 
>>>> Thanks,
>>>> 
>>>> Brian
>>>> 
>>>> --
>>>> Brian W. Barrett
>>>> Dept. 1423: Scalable System Software
>>>> Sandia National Lab

Re: [OMPI devel] OFED question

2011-01-28 Thread Shamis, Pavel
The command line actually is not so magic, but unfortunately we have never had
time to complete the btl_openib_receive_queues documentation. In the following ticket
you may find some initial documentation:
https://svn.open-mpi.org/trac/ompi/ticket/1260
It may be a good idea to define some user-friendly flag to switch to XRC or even SRQ.

Regards,
Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory

On Jan 27, 2011, at 8:38 PM, Paul H. Hargrove wrote:

> 
> RFE:  Could OMPI implement a short-hand for Pasha's following magical 
> incantation?
> 
> On 1/27/2011 5:34 PM, Shamis, Pavel wrote:
>> --mca btl_openib_receive_queues 
>> X,128,256,192,128:X,2048,256,128,32:X,12288,256,128,32:X,65536,256,128,32
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> HPC Research Department   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-14 Thread Shamis, Pavel
Hello Rolf,

CUDA support is always welcome. 
Please see my comments below.

+#if OMPI_CUDA_SUPPORT
+fl->fl_frag_block_alignment = 0;
+fl->fl_flags = 0;
+#endif

[pasha] It seems that "fl_flags" is a hack that allows you to do the second
(cuda) registration in mpool_rdma:

+#if OMPI_CUDA_SUPPORT
+if ((flags & MCA_MPOOL_FLAGS_CUDA_MEM) && 
mca_common_cuda_registered_memory) {
+mca_common_cuda_register(addr, size,
+ 
mpool->mpool_component->mpool_version.mca_component_name);
+   }
+#endif

[pasha] It is really a _hack_ way to enable multiple-device registration.
I would prefer to see a new mpool component that supports multiple-device
registration, in contrast to the single-device registration in mpool_rdma.

 fl->fl_payload_buffer_size=0;
 fl->fl_payload_buffer_alignment=0;
 fl->fl_frag_class = OBJ_CLASS(ompi_free_list_item_t);
@@ -190,7 +194,19 @@
 alloc_size = num_elements * head_size + sizeof(ompi_free_list_memory_t) +
 flist->fl_frag_alignment;

+#if OMPI_CUDA_SUPPORT
+/* Hack for TCP since there is no memory pool. */
+if (flist->fl_frag_block_alignment) {
+alloc_size = OPAL_ALIGN(alloc_size, 4096, size_t);
+if((errno = posix_memalign((void *)&alloc_ptr, 4096, alloc_size)) != 
0) {
+alloc_ptr = NULL;
+}
+} else {
+alloc_ptr = (ompi_free_list_memory_t*)malloc(alloc_size);
+}
+#else
 alloc_ptr = (ompi_free_list_memory_t*)malloc(alloc_size);
+#endif

[pasha] I would prefer not to _hack_ ompi_free_list in order to work around
BTL-related issues. Such kind of problem should be handled by the tcp btl. If you
think that there is not enough flexibility in the free list or mpool interface, we
may discuss an interface update or modification. IMHO that is much better than a
hack.

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Apr 13, 2011, at 12:47 PM, Rolf vandeVaart wrote:

> WHAT: Add support to send data directly from CUDA device memory via MPI calls.
>  
> TIMEOUT: April 25, 2011
>  
> DETAILS: When programming in a mixed MPI and CUDA environment, one cannot 
> currently send data directly from CUDA device memory.  The programmer first 
> has to move the data into host memory, and then send it.  On the receiving 
> side, it has to first be received into host memory, and then copied into CUDA 
> device memory.
>  
> This RFC adds the ability to send and receive CUDA device memory directly.
>  
> There are three basic changes being made to add the support.  First, when it 
> is detected that a buffer is CUDA device memory, the protocols that can be 
> used are restricted to the ones that first copy data into internal buffers.  
> This means that we will not be using the PUT and RGET protocols, just the 
> send and receive ones.  Secondly, rather than using memcpy to move the data 
> into and out of the host buffers, the library has to use a special CUDA copy 
> routine called cuMemcpy.  Lastly, to improve performance, the internal host 
> buffers have to also be registered with the CUDA environment (although 
> currently it is unclear how helpful that is)
>  
> By default, the code is disable and has to be configured into the library.
>   --with-cuda(=DIR)   Build cuda support, optionally adding DIR/include,
>  DIR/lib, and DIR/lib64
>   --with-cuda-libdir=DIR  Search for cuda libraries in DIR
>  
> An initial implementation can be viewed at:
> https://bitbucket.org/rolfv/ompi-trunk-cuda-3
>  
> Here is a list of the files being modified so one can see the scope of the 
> impact.
>  
> $ svn status
> M   VERSION
> M   opal/datatype/opal_convertor.h
> M   opal/datatype/opal_datatype_unpack.c
> M   opal/datatype/opal_datatype_pack.h
> M   opal/datatype/opal_convertor.c
> M   opal/datatype/opal_datatype_unpack.h
> M   configure.ac
> M   ompi/mca/btl/sm/btl_sm.c
> M   ompi/mca/btl/sm/Makefile.am
> M   ompi/mca/btl/tcp/btl_tcp_component.c
> M   ompi/mca/btl/tcp/btl_tcp.c
> M   ompi/mca/btl/tcp/Makefile.am
> M   ompi/mca/btl/openib/btl_openib_component.c
> M   ompi/mca/btl/openib/btl_openib_endpoint.c
> M   ompi/mca/btl/openib/btl_openib_mca.c
> M   ompi/mca/mpool/sm/Makefile.am
> M   ompi/mca/mpool/sm/mpool_sm_module.c
> M   ompi/mca/mpool/rdma/mpool_rdma_module.c
> M   ompi/mca/mpool/rdma/Makefile.am
> M   ompi/mca/mpool/mpool.h
> A   ompi/mca/common/cuda
> A   ompi/mca/common/cuda/configure.m4
> A   ompi/mca/common/cuda/common_cuda.c
> A   ompi/mca/common/cuda/help-mpi-common-cuda.txt
> A   ompi/mca/common/cuda/Makefile.am
> A   ompi/mca/common/cuda/common_cuda.h
> M   ompi/mca/pml/ob1/pml_ob1_component.c
> M   ompi/mca/pml/ob1/pml_ob1_sendreq.h
> M   ompi/mca/pml/ob1/pml_ob1_recvreq.h
> M   ompi/mca/pml/ob1/Makefile.am
> 

Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-14 Thread Shamis, Pavel
> 
>> By default, the code is disable and has to be configured into the library.
>>  --with-cuda(=DIR)   Build cuda support, optionally adding DIR/include,
>> DIR/lib, and DIR/lib64
>>  --with-cuda-libdir=DIR  Search for cuda libraries in DIR
> 
> My $0.02: cuda shouldn't be disabled by default (and only enabled if you 
> --with-cuda).  If configure finds all the Right cuda magic, then cuda support 
> should be enabled by default.  Just like all other optional support libraries 
> that OMPI uses.

Actually I'm not sure that it is a good idea to enable CUDA by default, since it
disables the zero-copy protocol, which is critical for good performance.

My 0.02$

Pasha.


Re: [OMPI devel] RFC: Add support to send/receive CUDA device memory directly

2011-04-14 Thread Shamis, Pavel
> 
>> Actually I'm not sure that it is good idea to enable CUDA by default, since 
>> it disables zero-copy protocol, that is critical for good performance.
> 
> That can easily be a run-time check during startup.

It could be fixed. My point was that in the existing code it's a compile-time
decision and not a run-time one.

Pasha


[OMPI devel] Open MPI + HWLOC + Static build issue

2011-07-25 Thread Shamis, Pavel
Hello,

I have been trying to compile a static version of Open MPI (trunk) with hwloc; the
latter is enabled by default in trunk.
The build platform is an AMD machine that has only the dynamic libnuma version.

Problem:
Open MPI fails to link orted, because it can't find a static version of libnuma.

Workaround:
add --without-hwloc

Real solution:
Is there a way to keep hwloc enabled when a static libnuma isn't present on the
system? If there is such a way, I would like to know how to enable it.
Otherwise, I think the configure script should handle such a scenario, meaning
disable hwloc and enable some other alternative.

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory









Re: [OMPI devel] Open MPI + HWLOC + Static build issue

2011-07-26 Thread Shamis, Pavel
Hello Brice,

On this system the libnuma dynamic lib and header files are installed without the
static lib.
Distro: a SLES 10.1-based machine


Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jul 26, 2011, at 3:54 AM, Brice Goglin wrote:

> Hello Pavel,
> Do you have libnuma headers and dynamic lib installed without static lib
> installed ? Which distro is this?
> Brice
> 
> 
> 
> Le 25/07/2011 23:56, Shamis, Pavel a écrit :
>> Hello,
>> 
>> I have been trying to compile Open MPI (trunk) static version with hwloc, 
>> the last is enabled by default in trunk.
>> The build platform is AMD machine, that has dynamic libnuma version only.
>> 
>> Problem:
>> Open MPI fails to link orted, because it can't find static version of 
>> libnuma.
>> 
>> Workaround:
>> add --without-hwloc
>> 
>> Real solution:
>> Is it a way to keep hwloc enabled when static libnuma isn't presented on the 
>> system ? If it's a such way, I would like to know how to enable it.
>> Otherwise, I think configure script should handle such scenario, it means 
>> disable hwloc and enable some other alternative.
>> 
>> Regards,
>> 
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> hxxp://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> ___
> devel mailing list
> de...@open-mpi.org
> hxxp://www.open-mpi.org/mailman/listinfo.cgi/devel
> 




Re: [OMPI devel] Open MPI + HWLOC + Static build issue

2011-08-03 Thread Shamis, Pavel
Please see my comments below.

> -Original Message-
> From: Brice Goglin [mailto:brice.gog...@inria.fr]
> Sent: Wednesday, August 03, 2011 10:29 AM
> To: Shamis, Pavel
> Cc: Open MPI Developers
> Subject: Re: [OMPI devel] Open MPI + HWLOC + Static build issue
> 
> I finally reproduced here. Based on the ornl platform script, you're
> configuring with LDFLAGS=-static and then building with make
> LDFLAGS=-all-static. Surprisingly, this works fine when building vanilla
> hwloc, but it breaks inside OMPI. The reason is that OMPI doesn't pass
> LDFLAGS=-static to hwloc's checks. Inside the vanilla hwloc, the libnuma
> related checks properly use the static libnuma:
>

Vanilla hwloc works because its static mode does not build the binaries
statically. If you try to build the hwloc utilities in static mode it fails,
just like ompi.

Regards,
Pasha



Re: [OMPI devel] Open MPI + HWLOC + Static build issue

2011-08-03 Thread Shamis, Pavel
> 
> Err.. I don't quite understand.  How exactly are you configuring?  If I do 
> this:
> 
> ./configure --prefix=/home/jsquyres/bogus --disable-mpi-f77 --disable-vt --
> disable-io-romio --disable-mpi-cxx --disable-shared --enable-static --enable-
> mpirun-prefix-by-default LDFLAGS=-static
> 
> I fail when linking opal_wrapper because of the ptmalloc shared memory
> hooks that we're looking for:
> 
>   CCLD   opal_wrapper
> ../../../opal/.libs/libopen-pal.a(memory_linux_munmap.o): In function
> `opal_memory_linux_free_ptmalloc2_munmap':
> /users/jsquyres/svn/ompi5/opal/mca/memory/linux/memory_linux_munm
> ap.c:74: undefined reference to `__munmap'
> 
> What is the goal here -- to make libmpi.a (and friends) that have no external
> dependencies?

Not only that. We also need orted and all its friends as static binaries.

Regards,
Pasha




Re: [OMPI devel] Open MPI + HWLOC + Static build issue

2011-08-03 Thread Shamis, Pavel
> 
> I get static binaries on SLES11 with
> ./configure --enable-static --disable-shared LDFLAGS=-static
> and
> make LDFLAGS=-all-static LIBS=-lpthread
> 
> $ ldd utils/lstopo
> not a dynamic executable
> $ utils/lstopo
> Machine (24GB)
> [...]
> 
> No problem with libnuma here, it was disabled during configure
> (libpthread is needed for another reason).

Then it seems that somehow we do not get it disabled in the ompi build, right?

Pasha



Re: [OMPI devel] Open MPI + HWLOC + Static build issue

2011-08-03 Thread Shamis, Pavel
Jeff,
1. We do not have libnuma.a in our setup. So if you want to reproduce the
problem, I would suggest moving yours aside to some .bk file.
2. Build Open MPI:
./configure --enable-static --disable-shared
--with-wrapper-ldflags=-static --disable-dlopen --enable-contrib-no-build=vt
and
make -j 8 orted_LDFLAGS=-all-static all

3. Open MPI compilation fails in orte. Hwloc adds -lnuma to the list of libs; as a
result, when orted links the static binary it fails to find a
static version of libnuma and exits with an error.


Regards,
Pasha.

> -Original Message-
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
> On Behalf Of Jeff Squyres
> Sent: Wednesday, August 03, 2011 10:34 AM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] Open MPI + HWLOC + Static build issue
> 
> Pasha --
> 
> I'm having trouble reproducing this.  My system (RHEL5) has libnuma.so and
> no libnuma.a, but when I configure with:
> 
> ./configure --disable-shared --enable-static ...
> 
> Everything works fine.
> 
> hwloc doesn't specifically look for libnuma.a or libnuma.so -- it just tries 
> to
> link with -lnuma.  If that works, then it rules that we have libnuma support.
> 
> Can you send more details on exactly what is failing, and how you make that
> happen?
> 
> 
> On Jul 25, 2011, at 5:56 PM, Shamis, Pavel wrote:
> 
> > Hello,
> >
> > I have been trying to compile Open MPI (trunk) static version with hwloc,
> the last is enabled by default in trunk.
> > The build platform is AMD machine, that has dynamic libnuma version only.
> >
> > Problem:
> > Open MPI fails to link orted, because it can't find static version of 
> > libnuma.
> >
> > Workaround:
> > add --without-hwloc
> >
> > Real solution:
> > Is it a way to keep hwloc enabled when static libnuma isn't presented on
> the system ? If it's a such way, I would like to know how to enable it.
> > Otherwise, I think configure script should handle such scenario, it means
> disable hwloc and enable some other alternative.
> >
> > Regards,
> >
> > Pavel (Pasha) Shamis
> > ---
> > Application Performance Tools Group
> > Computer Science and Math Division
> > Oak Ridge National Laboratory
> >
> >
> >
> >
> >
> >
> >
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > hxxp://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> hxxp://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> hxxp://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r25093

2011-08-30 Thread Shamis, Pavel
Hi all,
I'm not sure if it is relevant to this specific commit, but it is relevant to some of
the epoch changes.
I was not able to compile the latest trunk version on our Cray system; the failure
was in the ess/alps component, and to me it looks like a simple typo. I have not had a
chance to check my fix on our system, because I have been fighting with the Open
MPI VT component compilation on Cray. Please let me know if the patch is ok.

Please see the patch below:

Index: orte/mca/ess/alps/ess_alps_module.c
===
--- orte/mca/ess/alps/ess_alps_module.c (revision 25108)
+++ orte/mca/ess/alps/ess_alps_module.c (working copy)
@@ -363,8 +363,7 @@

 ORTE_PROC_MY_NAME->jobid = jobid;
 ORTE_PROC_MY_NAME->vpid = (orte_vpid_t) cnos_get_rank() + starting_vpid;
-ORTE_EPOCH_PRINT(ORTE_PROC_MY_NAME->epoch,ORTE_EPOCH_INVALID);
-ORTE_EPOCH_PRINT(ORTE_PROC_MY_NAME->epoch,orte_ess.proc_get_epoch(ORTE_PROC_MY_NAME));
+ORTE_EPOCH_SET(ORTE_PROC_MY_NAME->epoch,orte_ess.proc_get_epoch(ORTE_PROC_MY_NAME));

 OPAL_OUTPUT_VERBOSE((1, orte_ess_base_output,
                      "ess:alps set name to %s", ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));


Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 26, 2011, at 6:18 PM, Wesley Bland wrote:

> The epoch and resilient orte code is now macro'd away. To enable it, use
>
> --enable-resilient-orte
>
> which defines:
>
> ORTE_ENABLE_EPOCH
> ORTE_RESIL_ORTE
>
> --
>
> Wesley
>
> On Aug 26, 2011, at 6:16 PM, wbl...@osl.iu.edu wrote:
>
>> Author: wbland
>> Date: 2011-08-26 18:16:14 EDT (Fri, 26 Aug 2011)
>> New Revision: 25093
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/25093
>>
>> Log:
>> By popular demand the epoch code is now disabled by default.
>>
>> To enable the epochs and the resilient orte code, use the configure flag:
>>
>> --enable-resilient-orte
>>
>> This will define both:
>>
>> ORTE_ENABLE_EPOCH
>> ORTE_RESIL_ORTE
>>
>> Text files modified:
>>  trunk/ompi/mca/btl/openib/connect/btl_openib_connect_xoob.c  |12 
>>  trunk/ompi/mca/coll/sm2/coll_sm2_module.c| 3
>>  trunk/ompi/mca/crcp/bkmrk/crcp_bkmrk_pml.c   |49 
>> --
>>  trunk/ompi/mca/dpm/orte/dpm_orte.c   | 2
>>  trunk/ompi/mca/pml/bfo/pml_bfo_failover.c|10 +--
>>  trunk/ompi/mca/pml/bfo/pml_bfo_hdr.h | 6 --
>>  trunk/ompi/proc/proc.c   | 6 +-
>>  trunk/opal/config/opal_configure_options.m4  | 8 +++
>>  trunk/orte/include/orte/types.h  |24 
>> +
>>  trunk/orte/mca/db/daemon/db_daemon.c | 2
>>  trunk/orte/mca/errmgr/app/errmgr_app.c   |19 ++-
>>  trunk/orte/mca/errmgr/base/errmgr_base_fns.c |12 ++--
>>  trunk/orte/mca/errmgr/base/errmgr_base_tool.c| 6 +-
>>  trunk/orte/mca/errmgr/hnp/errmgr_hnp.c   |99 
>> +++
>>  trunk/orte/mca/errmgr/hnp/errmgr_hnp_autor.c | 6 +-
>>  trunk/orte/mca/errmgr/hnp/errmgr_hnp_crmig.c | 6 +-
>>  trunk/orte/mca/errmgr/orted/errmgr_orted.c   |71 
>> +---
>>  trunk/orte/mca/ess/alps/ess_alps_module.c| 4
>>  trunk/orte/mca/ess/base/base.h   | 4 +
>>  trunk/orte/mca/ess/base/ess_base_select.c|14 ++---
>>  trunk/orte/mca/ess/env/ess_env_module.c  | 3
>>  trunk/orte/mca/ess/ess.h | 4 +
>>  trunk/orte/mca/ess/generic/ess_generic_module.c  | 6 +-
>>  trunk/orte/mca/ess/hnp/ess_hnp_module.c  | 2
>>  trunk/orte/mca/ess/lsf/ess_lsf_module.c  | 3
>>  trunk/orte/mca/ess/singleton/ess_singleton_module.c  | 2
>>  trunk/orte/mca/ess/slave/ess_slave_module.c  | 3
>>  trunk/orte/mca/ess/slurm/ess_slurm_module.c  | 3
>>  trunk/orte/mca/ess/slurmd/ess_slurmd_module.c| 4
>>  trunk/orte/mca/ess/tm/ess_tm_module.c| 2
>>  trunk/orte/mca/filem/rsh/filem_rsh_module.c  | 6 +-
>>  trunk/orte/mca/grpcomm/base/grpcomm_base_coll.c  |21 ++-
>>  trunk/orte/mca/grpcomm/hier/grpcomm_hier_module.c| 8 +-
>>  trunk/orte/mca/iof/base/base.h   | 8 +-
>>  trunk/orte/mca/iof/base/iof_base_open.c  | 2
>>  trunk/orte/mca/iof/hnp/iof_hnp.c | 7 +-
>>  trunk/orte/mca/iof/hnp/iof_hnp_receive.c | 6 +-
>>  trunk/orte/mca/iof/orted/iof_orted

Re: [OMPI devel] Launcher in trunk is broken?

2011-10-10 Thread Shamis, Pavel
+1, I see the same issue.

> -Original Message-
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
> On Behalf Of Yevgeny Kliteynik
> Sent: Monday, October 10, 2011 10:24 AM
> To: OpenMPI Devel
> Subject: [OMPI devel] Launcher in trunk is broken?
> 
> It looks like the process launcher is broken in the OMPI trunk:
> If you run any simple test (not necessarily including MPI calls) on 4 or
> more nodes, the MPI processes won't be killed after the test finishes.
> 
> $ mpirun -host host_1,host_2,host_3,host_4 -np 4 --mca btl sm,tcp,self
> /bin/hostname
> 
> Output:
> host_1
> host_2
> host_3
> host_4
> 
> And test is hanging..
> 
> I have an older trunk (r25228), and everything is OK there.
> Not sure if it means that something was broken after that, or the problem
> existed before, but kicked in only now due to some other change.
> 
> -- YK
> ___
> devel mailing list
> de...@open-mpi.org
> hxxp://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] [EXTERNAL] Re: Rename "vader" BTL to "xpmem"

2011-11-17 Thread Shamis, Pavel
+1. If Nathan doesn't mind changing the name, then it's ok. 

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Nov 17, 2011, at 11:34 AM, Barrett, Brian W wrote:

> On 11/17/11 6:29 AM, "Ralph Castain"  wrote:
> 
>> Frankly, the only vote that counts is Nathan's - it's his btl, and we
>> have never forcibly made someone rename their component. I would suggest
>> we not set that precedent. I'm comfortable with whatever he decides to
>> call it.
> 
> I have no objection to a rename, but agree with Ralph that this seems like
> the sort of thing that should be handled by asking Nathan first (which
> Jeff may have done, who knows).  Naming arguments are always painful and
> we've largely avoided them.  I'd hate to start with a situation where we
> don't talk to the original author of the code first.
> 
> Brian
> 
> -- 
>  Brian W. Barrett
>  Dept. 1423: Scalable System Software
>  Sandia National Laboratories
> 
> 
> 
> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> hxxp://www.open-mpi.org/mailman/listinfo.cgi/devel
> 




Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r26106

2012-03-09 Thread Shamis, Pavel
>> Depending on the timing, this might go to 1.6 (1.5.5 has waited for too 
>> long, and this is not a regression).  Keep in mind that the problem has been 
>> around for *a long, long time*, which is why I approved the diag message 
>> (i.e., because a real solution is still nowhere in sight).  The real issue 
>> is that we can still run out of registered memory *and there is nothing left 
>> to deregister*.  The real solution there is that the PML should fall back to 
>> a different protocol, but I'm told that doesn't happen and will require a 
>> bunch of work to make work properly.
> 
> An mpool that is aware of local processes lru's will solve the problem in 
> most cases (all that I have seen) but yes, we need to rework the pml to 
> handle the remaining cases. There are two things that need to be changed 
> (from what I can tell):
> 
>  1) allow rget to fallback to send/put depending on the failure (I have 
> fallback on put implemented in my branch-- and in my btl).
>  2) need to devise new criteria on when we should progress the rdma_pending 
> list to avoid deadlock.
> 
> #1 is fairly simple and I haven't given much thought to #2.


But #1 will be a good start in the right direction. Agree about #2.

> 
> -Nathan
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] RFC: ob1: fallback on put/send on rget failure

2012-03-15 Thread Shamis, Pavel
Nathan,

I did not get any patch.

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Mar 15, 2012, at 5:07 PM, Nathan Hjelm wrote:

> 
> 
> What: Update ob1 to do the following:
>- fallback on send after rdma_put_retries_limit failures of prepare_dst
>- fallback on put (single non-pipelined) if the btl returns 
> OMPI_ERR_NOT_AVAILABLE on a get transaction.
> 
> When: Timeout in about one week (Mar 22)
> 
> Why: Two reasons:
>- Some btls (ugni) need to switch to put for certain transactions. It 
> makes sense to make this switch at the pml level.
>- If prepare_dst repeatedly fails for a get transaction we currently 
> deadlock. We can avoid the deadlock (in most cases) by switching to send for 
> the transaction.
> 
> Please take a look at the attached patch. Feedback and constructive criticism 
> is needed!
> 
> -Nathan Hjelm
> HPC-3, LANL




Re: [OMPI devel] RFC: ob1: fallback on put/send on rget failure

2012-03-19 Thread Shamis, Pavel
I got it. 
The patch looks ok.

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Mar 18, 2012, at 9:59 PM, Christopher Samuel wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 16/03/12 08:14, Shamis, Pavel wrote:
> 
>> I did not get any patch.
> 
> It arrived OK here, you can get it from the archive:
> 
> http://www.open-mpi.org/community/lists/devel/2012/03/10717.php
> 
> - -- 
>Christopher Samuel - Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.unimelb.edu.au/
> 
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.11 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> 
> iEYEARECAAYFAk9mkwoACgkQO2KABBYQAh/4FwCghl/yE6A7IMMON6u2/RpplhzE
> HxQAn2suJEOYOoG+povWbuqKpkhWphyU
> =6/CG
> -END PGP SIGNATURE-
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] RFC: change default for tuned alltoallv to pairwise

2012-03-22 Thread Shamis, Pavel
> 
>> What: Change coll tuned default to pairwise exchange
>> 
>> Why: The linear algorithm does not scale to any reasonable number of PEs
>> 
>> When: Timeout in 2 days (Fri)
>> 
>> Is there any reason the default should not be changed?
> 
> Nathan,
> 
> I can see why people think the linear algorithm is bad. However I think it 
> depends on the application communication pattern in the alltoallv. Do you 
> have any numbers to back your claim?

George,
In addition to the above list of dependencies, it also depends on the network
technology. The linear algorithm does not play well with IB.
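
As a data point either way, the algorithm can be forced per run without touching
the default (this assumes the usual tuned-coll MCA parameters, where 1 = basic
linear and 2 = pairwise):

mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_alltoallv_algorithm 2 -np 256 ./IMB-MPI1 alltoallv -npmin 256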

Pasha.


Re: [OMPI devel] barrier problem

2012-03-23 Thread Shamis, Pavel
Pavel,

Mvapich implements multicore-optimized collectives, which perform substantially
better than the default algorithms.
FYI, the ORNL team is working on a new high-performance collectives framework for OMPI.
The framework provides a significant boost in collectives performance.

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Mar 23, 2012, at 9:17 AM, Pavel Mezentsev wrote:

I've been comparing 1.5.4 and 1.5.5rc3 with the same parameters that's why I 
didn't use --bind-to-core. I checked and the usage of --bind-to-core improved 
the result comparing to 1.5.4:
#repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
        1000        84.96        85.08        85.02

So I guess with 1.5.5 the processes move from core to core within a node even
though I use all cores, right? Then why does 1.5.4 behave differently?

I need --bind-to-core in some cases and that's why I need 1.5.5rc3 instead of 
more stable 1.5.4. I know that I can use numactl explicitly but --bind-to-core 
is more convinient :)

2012/3/23 Ralph Castain mailto:r...@open-mpi.org>>
I don't see where you told OMPI to --bind-to-core. We don't automatically bind, 
so you have to explicitly tell us to do so.

On Mar 23, 2012, at 6:20 AM, Pavel Mezentsev wrote:

> Hello
>
> I'm doing some testing with IMB and discovered a strange thing:
>
> Since I have a system with new AMD opteron 6276 processors I'm using 1.5.5rc3 
> since it supports binding to cores.
>
> But when I run the barrier test form intel mpi benchmarks, the best I get is:
> #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>   598 15159.56 15211.05 15184.70
>  (/opt/openmpi-1.5.5rc3/intel12/bin/mpirun -x OMP_NUM_THREADS=1  -hostfile 
> hosts_all2all_2 -npernode 32 --mca btl openib,sm,self -mca 
> coll_tuned_use_dynamic_rules 1 -mca coll_tuned_barrier_algorithm 1 -np 256 
> openmpi-1.5.5rc3/intel12/IMB-MPI1 -off_cache 16,64 -msglog 1:16 -npmin 256 
> barrier)
>
> And with openmpi 1.5.4 the result is much better:
> #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>  1000   113.23   113.33   113.28
>
> (/opt/openmpi-1.5.4/intel12/bin/mpirun -x OMP_NUM_THREADS=1  -hostfile 
> hosts_all2all_2 -npernode 32 --mca btl openib,sm,self -mca 
> coll_tuned_use_dynamic_rules 1 -mca coll_tuned_barrier_algorithm 3 -np 256 
> openmpi-1.5.4/intel12/IMB-MPI1 -off_cache 16,64 -msglog 1:16 -npmin 256 
> barrier)
>
> and still I couldn't come close to the result I got with mvapich:
> #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]
>  100017.5117.5317.53
>
> (/opt/mvapich2-1.8/intel12/bin/mpiexec.hydra -env OMP_NUM_THREADS 1 -hostfile 
> hosts_all2all_2 -np 256 mvapich2-1.8/intel12/IMB-MPI1 -mem 2 -off_cache 16,64 
> -msglog 1:16 -npmin 256 barrier)
>
> I dunno if this is a bug or me doing something not the way I should. So is 
> there a way to improve my results?
>
> Best regards,
> Pavel Mezentsev
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Remove Portals support?

2012-03-27 Thread Shamis, Pavel
The ORNL-UT Kraken system probably still uses it. I would not be so eager to remove 
it.


Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Mar 23, 2012, at 9:56 AM, Barrett, Brian W wrote:

> Hi all -
> 
> This is not an RFC, but more a question for the community.  Is anyone still 
> actively using the Portals MTL/BTLs?  We're not at Sandia.  I know ORNL was 
> using it at one point.  SNL probably can't do much in the way of support 
> anymore, so if no one wants them, it might make sense to remove the Portals 
> MTL/BTL for 1.7.  Of course, I'll continue supporting all the Portals4 code 
> :).
> 
> Brian___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Developers Meeting

2012-04-03 Thread Shamis, Pavel
I would like to propose Oak Ridge as a potential location for the meeting.

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Apr 3, 2012, at 11:44 AM, Barrett, Brian W wrote:

> Hi all -
> 
> There is discussion of attempting to have a developers meeting this
> summer.  We haven't had one in a while and people thought it would be good
> to work through some of the ideas on how to implement features for 1.7.
> We don't have a location yet, but possibilities include Los Alamos and San
> Jose.  To help us get an idea of who can attend, please add your
> information to the doodle poll below.
> 
>  http://www.doodle.com/cei3ve3qyeer9bv9
> 
> Note that the missing week in July is the MPI Forum in Chicago.
> 
> 
> 
> Brian
> 
> -- 
>  Brian W. Barrett
>  Dept. 1423: Scalable System Software
>  Sandia National Laboratories
> 
> 
> 
> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] mca_btl_tcp_alloc

2012-04-04 Thread Shamis, Pavel
> In mca_btl_tcp_alloc (openmpi-trunk/ompi/mca/btl/tcp/btl_tcp.c:188) the 
> first segment is initialized to point to "frag + 1".
> I don't get it... how/when is this location allocated? Isn't it just 
> after the mca_btl_tcp_frag_t structure ends?

Alex,
The frag allocation macros take the fragments from the free lists.
The free lists are created in the function mca_btl_tcp_component_init().
As you will see there, the fragment size is mca_btl_tcp_frag_t plus some payload size.
"frag + 1" means: skip the frag structure and jump to the payload (see the small 
standalone illustration below).
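
A tiny standalone illustration of the same idiom (my own example, not OMPI code): 
allocate the header struct and its payload in one block, then address the payload as 
"header pointer + 1":

#include <stdio.h>
#include <stdlib.h>

typedef struct { size_t length; } frag_hdr_t;   /* stand-in for mca_btl_tcp_frag_t */

int main(void)
{
    size_t payload_size = 64;
    /* one allocation: the header followed immediately by the payload */
    frag_hdr_t *frag = malloc(sizeof(frag_hdr_t) + payload_size);
    if (NULL == frag) return 1;

    void *payload = frag + 1;   /* "+ 1" advances by exactly sizeof(frag_hdr_t) */
    printf("header at %p, payload at %p (offset %zu bytes)\n",
           (void *)frag, payload,
           (size_t)((char *)payload - (char *)frag));
    free(frag);
    return 0;
}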

Good luck ;-)

Pasha.

> 
> Thanks,
> Alex
> 
> mca_btl_base_descriptor_t* mca_btl_tcp_alloc(
> struct mca_btl_base_module_t* btl,
> struct mca_btl_base_endpoint_t* endpoint,
> uint8_t order,
> size_t size,
> uint32_t flags)
> {
> mca_btl_tcp_frag_t* frag = NULL;
> int rc;
> 
> if(size <= btl->btl_eager_limit) {
> MCA_BTL_TCP_FRAG_ALLOC_EAGER(frag, rc);
> } else if (size <= btl->btl_max_send_size) {
> MCA_BTL_TCP_FRAG_ALLOC_MAX(frag, rc);
> }
> if( OPAL_UNLIKELY(NULL == frag) ) {
> return NULL;
> }
> 
> frag->segments[0].seg_len = size;
> frag->segments[0].seg_addr.pval = frag+1;
> 
> frag->base.des_src = frag->segments;
> frag->base.des_src_cnt = 1;
> frag->base.des_dst = NULL;
> frag->base.des_dst_cnt = 0;
> frag->base.des_flags = flags;
> frag->base.order = MCA_BTL_NO_ORDER;
> frag->btl = (mca_btl_tcp_module_t*)btl;
> return (mca_btl_base_descriptor_t*)frag;
> }
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] How to debug segv

2012-04-25 Thread Shamis, Pavel
Alex,
+1 vote for the core file. It is a good starting point.

* If you can't (for some reason) generate the core file, you may drop a while(1) loop 
somewhere in the init code and attach gdb later (see the sketch after this list).
* If you are looking for a more user-friendly experience, you may try Allinea DDT 
(they have a 30-day trial version).
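
A minimal sketch of the while(1) trick (my own, not OMPI code) that can be dropped 
near MPI_Init: the process prints its PID and spins until you attach gdb and clear the 
flag to let it continue.

#include <stdio.h>
#include <unistd.h>

static void wait_for_debugger(void)
{
    volatile int keep_waiting = 1;
    printf("PID %d waiting for a debugger; attach with 'gdb -p %d', "
           "then 'set var keep_waiting = 0' and 'continue'\n",
           (int)getpid(), (int)getpid());
    while (keep_waiting) {
        sleep(5);   /* spin until the flag is cleared from inside the debugger */
    }
}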

Regards,
Pasha.

> Another thing to try is to load up the core file in gdb and see if that gives 
> you a valid stack trace of where exactly the segv occurred.
>
>
> On Apr 25, 2012, at 9:30 AM, Alex Margolin wrote:
>
>> On 04/25/2012 02:57 PM, Ralph Castain wrote:
>>> Strange that your code didn't generate any symbols - is that a mosix thing? 
>>> Have you tried just adding opal_output (so it goes to a special diagnostic 
>>> output channel) statements in your code to see where the segfault is 
>>> occurring?
>>>
>>> It looks like you are getting thru orte_init. You could add -mca 
>>> grpcomm_base_verbose 5 to see if you are getting in/thru the modex - if so, 
>>> then you are probably failing in add_procs.
>>>
>> I guess the symbols are a mosix thing, but it should still show some sort of 
>> segmentation fault trace, no? maybe only the assembly opcode... It seems 
>> that the SEGV is detected, rather then caught. This may also be related to 
>> mosix - I'll check it with the mosix developer.
>>
>> I added the parameter you suggested and appended the output. Modex seems to 
>> be working because I use it to exchange the IP and PID, and as you can see 
>> at the bottom these are received OK. I'll try debug printouts specifically 
>> in add_procs. Thanks for the advice!
>>
>> alex@singularity:~/huji/benchmarks/mpi/npb$ mpirun -mca grpcomm_base_verbose 
>> 5 -mca btl self,mosix -mca btl_base_verbose 100 -n 4 ft.S.4
>> [singularity:08915] mca:base:select:(grpcomm) Querying component [bad]
>> [singularity:08915] mca:base:select:(grpcomm) Query of component [bad] set 
>> priority to 10
>> [singularity:08915] mca:base:select:(grpcomm) Selected component [bad]
>> [singularity:08915] [[37778,0],0] grpcomm:base:receive start comm
>> [singularity:08915] [[37778,0],0] grpcomm:bad:xcast sent to job [37778,0] 
>> tag 1
>> [singularity:08915] [[37778,0],0] grpcomm:xcast:recv:send_relay
>> [singularity:08915] [[37778,0],0] grpcomm:base:xcast updating nidmap
>> [singularity:08915] [[37778,0],0] orte:daemon:send_relay - recipient list is 
>> empty!
>> [singularity:08916] mca:base:select:(grpcomm) Querying component [bad]
>> [singularity:08916] mca:base:select:(grpcomm) Query of component [bad] set 
>> priority to 10
>> [singularity:08916] mca:base:select:(grpcomm) Selected component [bad]
>> [singularity:08916] [[37778,1],0] grpcomm:base:receive start comm
>> [singularity:08919] mca:base:select:(grpcomm) Querying component [bad]
>> [singularity:08919] mca:base:select:(grpcomm) Query of component [bad] set 
>> priority to 10
>> [singularity:08919] mca:base:select:(grpcomm) Selected component [bad]
>> [singularity:08919] [[37778,1],2] grpcomm:base:receive start comm
>> [singularity:08917] mca:base:select:(grpcomm) Querying component [bad]
>> [singularity:08917] mca:base:select:(grpcomm) Query of component [bad] set 
>> priority to 10
>> [singularity:08917] mca:base:select:(grpcomm) Selected component [bad]
>> [singularity:08917] [[37778,1],1] grpcomm:base:receive start comm
>> [singularity:08921] mca:base:select:(grpcomm) Querying component [bad]
>> [singularity:08921] mca:base:select:(grpcomm) Query of component [bad] set 
>> priority to 10
>> [singularity:08921] mca:base:select:(grpcomm) Selected component [bad]
>> [singularity:08921] [[37778,1],3] grpcomm:base:receive start comm
>> [singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute 
>> MPI_THREAD_LEVEL data size 1
>> [singularity:08916] [[37778,1],0] grpcomm:set_proc_attr: setting attribute 
>> OMPI_ARCH data size 11
>> [singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute 
>> MPI_THREAD_LEVEL data size 1
>> [singularity:08919] [[37778,1],2] grpcomm:set_proc_attr: setting attribute 
>> OMPI_ARCH data size 11
>> [singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute 
>> MPI_THREAD_LEVEL data size 1
>> [singularity:08917] [[37778,1],1] grpcomm:set_proc_attr: setting attribute 
>> OMPI_ARCH data size 11
>> [singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting attribute 
>> MPI_THREAD_LEVEL data size 1
>> [singularity:08921] [[37778,1],3] grpcomm:set_proc_attr: setting attribute 
>> OMPI_ARCH data size 11
>> [singularity:08916] mca: base: components_open: Looking for btl components
>> [singularity:08916] mca: base: components_open: opening btl components
>> [singularity:08916] mca: base: components_open: found loaded component mosix
>> [singularity:08916] mca: base: components_open: component mosix register 
>> function successful
>> [singularity:08916] mca: base: components_open: component mosix open 
>> function successful
>> [singularity:08916] mca: base: components_open: found loaded component self

Re: [OMPI devel] Time to unify OpenFabrics configury?

2012-04-27 Thread Shamis, Pavel
It is a good idea to unify the OFED configure scripts. BUT, I would prefer to do 
this rework after the merge of the new collectives component, since we are going to 
bring totally new IB components based on the extended verbs interface, and that 
obviously adds new configure logic.

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Apr 27, 2012, at 7:48 AM, Jeff Squyres wrote:

> OpenFabrics vendors --
> 
> Now that there's a verbs-based component in orte, it really suggests that we 
> should update / reform the configure options and behavior w.r.t. 
> OpenFabrics-based components.
> 
> For example:
> 
> - is it finally time to rename --with-openib to --with-ofa?
> 
>  - should we also allow --with-openib as a deprecated synonym for the 1.7/1.8 
> series, and then kill it in 1.9?
> 
> - should we hack up ompi/config/ompi_check_openib.m4:
> 
>  1. split it up to check for smaller things (e.g., one macro to check for 
> basic OpenFabrics, another to check for the RDMACM, another to check for XRC, 
> ...etc.).  The rationale here is that oob/ud requires very little from OFA -- 
> it does not need RDMACM, XRC, ...etc.
> 
>  2. move the resulting OFA-based .m4 out to the top-level config/ directory 
> (vs. ompi/config)?
> 
> ==> Jeff's $0.02 on all of this is "yes".  :-)
> 
> 
> 
> Begin forwarded message:
> 
>> From: jsquy...@osl.iu.edu
>> Subject: [OMPI svn-full] svn:open-mpi r26350
>> Date: April 27, 2012 7:32:56 AM EDT
>> To: svn-f...@open-mpi.org
>> Reply-To: de...@open-mpi.org
>> 
>> Author: jsquyres
>> Date: 2012-04-27 07:32:56 EDT (Fri, 27 Apr 2012)
>> New Revision: 26350
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26350
>> 
>> Log:
>> Update configury in the new oob ud component: actually check to see if
>> it succeeds and run $1 or $2, accordingly.  This allows "make dist" to
>> run properly on machines that do not have OpenFabrics stuff installed
>> (e.g., the nightly tarball build machine).
>> 
>> There's still more to be done here -- it doesn't check for non-uniform
>> directories where the OpenFabrics headers/libraries might be
>> installed.  We might need to re-tool/combine
>> ompi/config/ompi_check_openib.m4 (which checks for way more than
>> oob/ud needs) and move it up to config/ompi_check_ofa.m4, or
>> something...?
>> 
>> Properties modified: 
>>  trunk/orte/mca/oob/ud/   (props changed)
>> Text files modified: 
>>  trunk/orte/mca/oob/ud/Makefile.am  | 8 ++-- 
>>
>>  trunk/orte/mca/oob/ud/configure.m4 |32 ++-- 
>>
>>  2 files changed, 36 insertions(+), 4 deletions(-)
>> 
>> Modified: trunk/orte/mca/oob/ud/Makefile.am
>> ==
>> --- trunk/orte/mca/oob/ud/Makefile.am(original)
>> +++ trunk/orte/mca/oob/ud/Makefile.am2012-04-27 07:32:56 EDT (Fri, 
>> 27 Apr 2012)
>> @@ -17,6 +17,8 @@
>> # $HEADER$
>> #
>> 
>> +AM_CPPFLAGS = $(orte_oob_ud_CPPFLAGS)
>> +
>> dist_pkgdata_DATA = help-oob-ud.txt
>> 
>> sources = \
>> @@ -49,9 +51,11 @@
>> mcacomponentdir = $(pkglibdir)
>> mcacomponent_LTLIBRARIES = $(component_install)
>> mca_oob_ud_la_SOURCES = $(sources)
>> -mca_oob_ud_la_LDFLAGS = -module -avoid-version -libverbs
>> +mca_oob_ud_la_LDFLAGS = -module -avoid-version $(orte_oob_ud_LDFLAGS)
>> +mca_oob_ud_la_LIBADD = $(orte_oob_ud_LIBS)
>> 
>> noinst_LTLIBRARIES = $(component_noinst)
>> libmca_oob_ud_la_SOURCES = $(sources)
>> -libmca_oob_ud_la_LDFLAGS = -module -avoid-version
>> +libmca_oob_ud_la_LDFLAGS = -module -avoid-version $(orte_oob_ud_LDFLAGS)
>> +libmca_oob_ud_la_LIBADD = $(orte_oob_ud_LIBS)
>> 
>> 
>> Modified: trunk/orte/mca/oob/ud/configure.m4
>> ==
>> --- trunk/orte/mca/oob/ud/configure.m4   (original)
>> +++ trunk/orte/mca/oob/ud/configure.m4   2012-04-27 07:32:56 EDT (Fri, 
>> 27 Apr 2012)
>> @@ -22,6 +22,34 @@
>> AC_DEFUN([MCA_orte_oob_ud_CONFIG],[
>>AC_CONFIG_FILES([orte/mca/oob/ud/Makefile])
>> 
>> -AC_CHECK_HEADER([infiniband/verbs.h])
>> -AC_CHECK_LIB([ibverbs], [ibv_create_qp])
>> +# JMS Still have problems with AC_ARG ENABLE not yet having been
>> +# called or CHECK_WITHDIR'ed.
>> +
>> +orte_oob_ud_check_save_CPPFLAGS=$CPPFLAGS
>> +orte_oob_ud_check_save_LDFLAGS=$LDFLAGS
>> +orte_oob_ud_check_save_LIBS=$LIBS
>> +
>> +OMPI_CHECK_PACKAGE([orte_oob_ud],
>> +   [infiniband/verbs.h],
>> +   [ibverbs],
>> +   [ibv_open_device],
>> +   [],
>> +   [$ompi_check_openib_dir],
>> +   [$ompi_check_openib_libdir],
>> +   [orte_oob_ud_check_happy=yes],
>> +   [orte_oob_ud_check_happy=no])])
>> +
>> +CPPFLAGS=$orte_oob_ud_check_save_CPPFLAGS
>> +LDFLAGS=$orte_o

Re: [OMPI devel] Time to unify OpenFabrics configury?

2012-04-27 Thread Shamis, Pavel

> On Apr 27, 2012, at 10:31 AM, Shamis, Pavel wrote:
> 
>> It is a good idea to unify the OFED configure scripts. BUT, I would prefer 
>> to do this rework after merge with the new collectives component, since we 
>> are going to bring totally new IB components based on extended verbs 
>> interface and it obviously adds new configure logic.
> 
> Did you add new stuff to ompi/config/ompi_check_openib.m4?

I think the answer is yes. We also added new configure scripts.

> 
> When do you expect to merge the new collective stuff?

Yesterday :-) I would say about 2-3 weeks.

> 
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>> 
>> 
>> 
>> 
>> 
>> 
>> On Apr 27, 2012, at 7:48 AM, Jeff Squyres wrote:
>> 
>>> OpenFabrics vendors --
>>> 
>>> Now that there's a verbs-based component in orte, it really suggests that 
>>> we should update / reform the configure options and behavior w.r.t. 
>>> OpenFabrics-based components.
>>> 
>>> For example:
>>> 
>>> - is it finally time to rename --with-openib to --with-ofa?
>>> 
>>> - should we also allow --with-openib as a deprecated synonym for the 
>>> 1.7/1.8 series, and then kill it in 1.9?
>>> 
>>> - should we hack up ompi/config/ompi_check_openib.m4:
>>> 
>>> 1. split it up to check for smaller things (e.g., one macro to check for 
>>> basic OpenFabrics, another to check for the RDMACM, another to check for 
>>> XRC, ...etc.).  The rationale here is that oob/ud requires very little from 
>>> OFA -- it does not need RDMACM, XRC, ...etc.
>>> 
>>> 2. move the resulting OFA-based .m4 out to the top-level config/ directory 
>>> (vs. ompi/config)?
>>> 
>>> ==> Jeff's $0.02 on all of this is "yes".  :-)
>>> 
>>> 
>>> 
>>> Begin forwarded message:
>>> 
>>>> From: jsquy...@osl.iu.edu
>>>> Subject: [OMPI svn-full] svn:open-mpi r26350
>>>> Date: April 27, 2012 7:32:56 AM EDT
>>>> To: svn-f...@open-mpi.org
>>>> Reply-To: de...@open-mpi.org
>>>> 
>>>> Author: jsquyres
>>>> Date: 2012-04-27 07:32:56 EDT (Fri, 27 Apr 2012)
>>>> New Revision: 26350
>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26350
>>>> 
>>>> Log:
>>>> Update configury in the new oob ud component: actually check to see if
>>>> it succeeds and run $1 or $2, accordingly.  This allows "make dist" to
>>>> run properly on machines that do not have OpenFabrics stuff installed
>>>> (e.g., the nightly tarball build machine).
>>>> 
>>>> There's still more to be done here -- it doesn't check for non-uniform
>>>> directories where the OpenFabrics headers/libraries might be
>>>> installed.  We might need to re-tool/combine
>>>> ompi/config/ompi_check_openib.m4 (which checks for way more than
>>>> oob/ud needs) and move it up to config/ompi_check_ofa.m4, or
>>>> something...?
>>>> 
>>>> Properties modified: 
>>>> trunk/orte/mca/oob/ud/   (props changed)
>>>> Text files modified: 
>>>> trunk/orte/mca/oob/ud/Makefile.am  | 8 ++--
>>>> 
>>>> trunk/orte/mca/oob/ud/configure.m4 |32 
>>>> ++--
>>>> 2 files changed, 36 insertions(+), 4 deletions(-)
>>>> 
>>>> Modified: trunk/orte/mca/oob/ud/Makefile.am
>>>> ==
>>>> --- trunk/orte/mca/oob/ud/Makefile.am  (original)
>>>> +++ trunk/orte/mca/oob/ud/Makefile.am  2012-04-27 07:32:56 EDT (Fri, 
>>>> 27 Apr 2012)
>>>> @@ -17,6 +17,8 @@
>>>> # $HEADER$
>>>> #
>>>> 
>>>> +AM_CPPFLAGS = $(orte_oob_ud_CPPFLAGS)
>>>> +
>>>> dist_pkgdata_DATA = help-oob-ud.txt
>>>> 
>>>> sources = \
>>>> @@ -49,9 +51,11 @@
>>>> mcacomponentdir = $(pkglibdir)
>>>> mcacomponent_LTLIBRARIES = $(component_install)
>>>> mca_oob_ud_la_SOURCES = $(sources)
>>>> -mca_oob_ud_la_LDFLAGS = -module -avoid-version -libverbs
>>>> +mca_oob_ud_la_LDFLAGS = -module -avoid-version $(orte_oob_ud_LDFLAGS)
>>>&

Re: [OMPI devel] Modex

2012-06-13 Thread Shamis, Pavel
> 
> We currently block on exchange of contact information for the BTL's when we 
> perform an all-to-all operation we term the "modex".

Do we have to do an all-to-all or an allgather? An allgather should be enough ...

> At the end of that operation, each process constructs a list of information 
> for all processes in the job, and each process contains the complete BTL 
> contact info for every process in its modex database. This consumes a 
> significant amount of memory, especially as we scale to ever larger 
> applications. In addition, the modex operation itself is one of the largest 
> time consumers during MPI_Init.
> 
> An alternative approach is for the BTL's to "add proc" only on "first 
> message" to or from that process - i.e., we would not construct a list of all 
> procs during MPI_Init, but only add an entry for a process with which we 
> communicate. The method would go like this:

> 
> 1. during MPI_Init, each BTL posts its contact info to the local modex
> 
> 2. the "modex" call in MPI_Init simply sends that data to the local daemon, 
> which asynchronously executes an all-to-all collective with the other daemons 
> in the job. At the end of that operation, each daemon holds a complete modex 
> database for the job. Meantime, the application process continues to run.
> 
> 3. we remove the "add_procs" call within MPI_Init, and perhaps can eliminate 
> the ORTE barrier at the end of MPI_Init. The reason we had that barrier was 
> to ensure that all procs were ready to communicate before we allowed anyone 
> to send a message. However, with this method, that may no longer be required.
> 
> 4. we modify the BTL's so they (a) can receive a message from an unknown 
> source, adding that source to their local proc list, and (b) when sending a 
> message to another process, obtain the required contact info from their local 
> daemon if they don't already have it. Thus, we will see an increased latency 
> on first message - but we will ONLY store info for processes with which we 
> actually communicate (thus reducing the memory burden) and will wireup much 
> faster than we do today.
> 

It is the right direction. As far as I can see, for changes (1-2) we don't have to make 
any changes in the MPI-level code; all the new logic will sit behind the modex. 

4 - On the first message we already do a lot of crap, so it is not a big deal. 
Even so, we have to make this change really carefully; there are potential 
pitfalls. If you want, we may discuss it offline. 
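
A rough sketch of what point 4 could look like inside a BTL (every type and helper 
name below is invented, this is not existing OMPI API -- only the control flow matters): 
on an incoming fragment from an unknown peer, the endpoint is created lazily from 
contact info fetched from the local daemon.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not existing OMPI API. */
typedef uint64_t proc_name_t;
typedef struct { char addr[64]; } contact_info_t;
typedef struct { contact_info_t info; } endpoint_t;
typedef struct peer_table peer_table_t;               /* opaque table of known endpoints */
typedef struct { peer_table_t *known_peers; } btl_module_t;

endpoint_t *peer_table_get(peer_table_t *t, proc_name_t p);
void        peer_table_put(peer_table_t *t, proc_name_t p, endpoint_t *ep);
int         daemon_modex_lookup(proc_name_t p, contact_info_t *info);   /* ask the local daemon */
endpoint_t *endpoint_create(btl_module_t *btl, const contact_info_t *info);

static endpoint_t *lookup_or_add_endpoint(btl_module_t *btl, proc_name_t src)
{
    endpoint_t *ep = peer_table_get(btl->known_peers, src);
    if (NULL == ep) {
        contact_info_t info;
        /* first contact: fetch src's BTL contact info from the local daemon */
        if (0 != daemon_modex_lookup(src, &info)) {
            return NULL;                      /* peer unknown even to the daemon */
        }
        ep = endpoint_create(btl, &info);     /* pay the setup cost once, on first message */
        peer_table_put(btl->known_peers, src, ep);
    }
    return ep;                                /* later messages take the fast path */
}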

> I'm not (yet) that familiar with the details of many of the BTLs, but my 
> initial review of them didn't see any showstoppers for this approach. If 
> people think this might work and be an interesting approach, I'd be happy to 
> help implement a prototype to quantify its behavior.

I'm interested. Let's talk.

-Pasha


Re: [OMPI devel] OpenIB compile error

2012-06-20 Thread Shamis, Pavel
I hate it ...

As far as I understand, this is not a reason to rename it. The OFED-lovin' components 
should look at $with_openib.


Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jun 20, 2012, at 4:09 PM, Jeff Squyres wrote:

> Brian and I talked about this more off list and figured out the problem.  The 
> issue is that Brian has OFED installed in a non-standard location, so he 
> specified --with-openib=/path/to/ofed.  The openib BTL therefore knows where 
> OFED (and verbs.h) is installed, but other OFED-lovin' components don't:
> 
> - OOB UD
> - BTL UD
> - hwloc hwloc142
> 
> So it seems like it's finally time to rename and universalize the 
> --with-openib switch.
> 
> Ladies and gentlemen, I present: --with-ofed.
> 
> This one option will function exactly like --with-openib today, with the 
> exception that all OFED-lovin' components can look at $with_ofed.  
> 
> For the 1.7/1.8 series, we'll accept --with-openib in lieu of --with-ofed, 
> but we'll print a warning if you do so.  We'll delete the --with-openib=dir 
> form of --with-openib in 1.9/2.0 (i.e., --with-openib will just mean "you 
> must build openib; error if you cannot build it").
> 
> Speak now if you hate this plan...
> 
> 
> 
> On Jun 20, 2012, at 1:18 PM, Barrett, Brian W wrote:
> 
>> Hi all -
>> 
>> I'm seeing the compile error with the OMPI trunk and OFED 15.3.1.  Has
>> anyone seen this before?  I have vague recollections of seeing e-mail
>> discussion on the issue, but can't find those e-mails now...
>> 
>> Thanks,
>> 
>> Brian
>> 
>> 
>> In file included from ../../../../opal/mca/hwloc/hwloc.h:87,
>>from btl_openib_component.c:69:
>> ../../../../opal/mca/hwloc/hwloc142/hwloc142.h:38:10: error: #error Tried
>> to include hwloc verbs helper file, but hwloc was compiled with no
>> OpenFabrics support
>> btl_openib_component.c: In function 'get_ib_dev_distance':
>> btl_openib_component.c:2435: error: implicit declaration of function
>> 'opal_hwloc142_hwloc_ibv_get_device_cpuset'
>> make[1]: *** [btl_openib_component.lo] Error 1
>> make[1]: Leaving directory
>> `/home/bwbarre/projects/ompi/trunk/ompi/mca/btl/openib'
>> make: *** [all-recursive] Error 1
>> 
>> 
>> 
>> -- 
>> Brian W. Barrett
>> Dept. 1423: Scalable System Software
>> Sandia National Laboratories
>> 
>> 
>> 
>> 
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread Shamis, Pavel

> On Jun 20, 2012, at 4:25 PM, Shamis, Pavel wrote:
> 
>> I hate it ...
>> 
>> As far as I understand it is not reason to rename it. The OFED-lovin 
>> components should look at $with_openib.
> 
> Ah, sorry -- I didn't think this would be controversial.

It is not controversial. "Hate" was just the only option offered on the list :)

> 
> Just curious: why do you hate it?  OpenIB is a name that hasn't existed in 
> years -- we already have to 'splain it.  
> 
> Why not use a name that is commonly recognizable, like --with-ofed, or 
> --with-of?

The OpenIB BTL is the primary reason for the existence of the OOB UD component. 
So it is very natural that, in order to enable the OpenIB BTL (with all the supporting 
components, like oob), the user uses --with-openib-*. Logically it sounds very reasonable.

OFUD - with all due respect, this component is a leftover of the DR PML, which was 
removed from the trunk. I'm not sure why we keep OFUD in the trunk.

I remember that some people expressed interest in a reincarnation of the 
OFUD BTL. Well, when the new OFUD (or whatever) is ready, we may discuss 
the naming issue again. 

Bottom line: at this stage the renaming seems totally confusing. 

Regards,
Pasha


Re: [OMPI devel] OpenIB compile error

2012-06-21 Thread Shamis, Pavel
BTW, if people want to rename the openib btl to something else and then change the 
configure scripts - I'm ok with that.
About naming - I would agree with Terry: it makes sense to name it after the network 
API used for this btl - "verbs" (it is not ibverbs).

Bottom line, I would recommend keeping the configure option naming in sync with the 
component (in this case BTL) naming.

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jun 21, 2012, at 9:30 AM, Jeff Squyres wrote:

> On Jun 21, 2012, at 9:16 AM, Jeff Squyres wrote:
> 
>> 2. Have consistent behavior between the configury of all OFED-lovin' 
>> components (currently the 4 I listed), per your description:
>>  * --with-openfabrics[=DIR] means that all OFED-lovin' components must 
>> configure successfully, or fail
>>  * --without-openfabrics means that all OFED-lovin' component must not build
> 
> 
> I'm sorry -- that's not quite correct.
> 
> hwloc will build regardless of whether you specify --with-openfabrics or not 
> (because it doesn't *need* OpenFabrics support).  
> 
> But the other 3 OpenFabrics-based components (ofud, ud, openib) must all 
> succeed if --with-openfabrics is specified, and will not be built if 
> --without-openfabrics is specified.  Because all of these components *need* 
> OpenFabrics support -- they cannot build without OF support.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-02 Thread Shamis, Pavel

> Keep in mind that this is currently not used for the openib BTL.  It is only 
> used in the upcoming OpenFabrics-based collectives component.
> 
> The iWARP-required connector-must-send-first logic is not yet included in 
> this code, as I understand it.  That must be added before it can be used with 
> the openib BTL.

It seems that IB and iWARP vendors aren't interested in supporting RDMACM or 
IBCM.

Pasha.

> 
> 
> On Jul 2, 2012, at 11:20 AM, Nathan Hjelm wrote:
> 
>> Nice! Are we moving this to 1.7 as well?
>> 
>> -Nathan
>> 
>> On Mon, Jul 02, 2012 at 11:20:12AM -0400, svn-commit-mai...@open-mpi.org 
>> wrote:
>>> Author: pasha (Pavel Shamis)
>>> Date: 2012-07-02 11:20:12 EDT (Mon, 02 Jul 2012)
>>> New Revision: 26707
>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26707
>>> 
>>> Log:
>>> 1. Adding 2 new components:
>>> ofacm - generic connection manager for IB interconnects.
>>> ofautils - IB common utilities and compatibility code
>>> 
>>> 2. Updating OpenIB configure code
>>> 
>>> - ORNL & Mellanox Teams
>>> 
>>> Added:
>>>  trunk/ompi/config/ompi_check_openfabrics.m4
>>>  trunk/ompi/mca/common/ofacm/
>>>  trunk/ompi/mca/common/ofacm/Makefile.am
>>>  trunk/ompi/mca/common/ofacm/base.h
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_base.c
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.c
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.h
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.c
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.h
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h
>>>  trunk/ompi/mca/common/ofacm/configure.m4
>>>  trunk/ompi/mca/common/ofacm/configure.params
>>>  trunk/ompi/mca/common/ofacm/connect.h
>>>  trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt
>>>  trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt
>>>  trunk/ompi/mca/common/ofautils/
>>>  trunk/ompi/mca/common/ofautils/Makefile.am
>>>  trunk/ompi/mca/common/ofautils/common_ofautils.c
>>>  trunk/ompi/mca/common/ofautils/common_ofautils.h
>>>  trunk/ompi/mca/common/ofautils/configure.m4
>>>  trunk/ompi/mca/common/ofautils/configure.params
>>> Deleted:
>>>  trunk/ompi/config/ompi_check_openib.m4
>>> Text files modified: 
>>>  trunk/ompi/config/ompi_check_openfabrics.m4|   403 
>>> +   
>>>  /dev/null  |   329 --- 
>>> 
>>>  trunk/ompi/mca/btl/ofud/configure.m4   | 2 
>>> 
>>>  trunk/ompi/mca/btl/openib/Makefile.am  | 4 
>>> 
>>>  trunk/ompi/mca/btl/openib/btl_openib_component.c   |49 -   
>>> 
>>>  trunk/ompi/mca/btl/openib/configure.m4 | 5 
>>> 
>>>  trunk/ompi/mca/common/ofacm/Makefile.am|76 +   
>>> 
>>>  trunk/ompi/mca/common/ofacm/base.h |   193 
>>> 
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_base.c|   678 
>>> 
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.c   |48 +   
>>> 
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.h   |22 
>>> 
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.c |  1672 
>>> 
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.h |20 
>>> 
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c|  1537 
>>> 
>>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h|69 +   
>>> 
>>>  trunk/ompi/mca/common/ofacm/configure.m4   |63 +   
>>> 
>>>  trunk/ompi/mca/common/ofacm/configure.params   |26 
>>> 
>>>  trunk/ompi/mca/common/ofacm/connect.h  |   541 
>>> 
>>>  trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt |41 
>>> 
>>>  trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt  |20 
>>> 
>>>  trunk/ompi/mca/common/ofautils/Makefile.am |68 +   
>>> 
>>>  trunk/ompi/mca/common/ofautils/common_ofautils.c   |89 ++  
>>> 
>>>  trunk/ompi/mca/common/ofautils/common_ofautils.h   |26 
>>> 
>>>  trunk/ompi/mca/common/ofautils/configure.m

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-02 Thread Shamis, Pavel

So is ofacm another replacement for ibcm and rdmacm?

Essentially it is an extraction of the OpenIB BTL connection manager functionality 
(minus rdmacm) from the OpenIB BTL. The idea is to make this functionality available to 
other communication components, like collectives and btls. OFACM supports the OOB and 
XRC-OOB connection managers.

- Pasha


--td

On 7/2/2012 11:20 AM, Nathan Hjelm wrote:

Nice! Are we moving this to 1.7 as well?

-Nathan

On Mon, Jul 02, 2012 at 11:20:12AM -0400, 
svn-commit-mai...@open-mpi.org wrote:


Author: pasha (Pavel Shamis)
Date: 2012-07-02 11:20:12 EDT (Mon, 02 Jul 2012)
New Revision: 26707
URL: https://svn.open-mpi.org/trac/ompi/changeset/26707

Log:
1. Adding 2 new components:
ofacm - generic connection manager for IB interconnects.
ofautils - IB common utilities and compatibility code

2. Updating OpenIB configure code

- ORNL & Mellanox Teams

Added:
   trunk/ompi/config/ompi_check_openfabrics.m4
   trunk/ompi/mca/common/ofacm/
   trunk/ompi/mca/common/ofacm/Makefile.am
   trunk/ompi/mca/common/ofacm/base.h
   trunk/ompi/mca/common/ofacm/common_ofacm_base.c
   trunk/ompi/mca/common/ofacm/common_ofacm_empty.c
   trunk/ompi/mca/common/ofacm/common_ofacm_empty.h
   trunk/ompi/mca/common/ofacm/common_ofacm_oob.c
   trunk/ompi/mca/common/ofacm/common_ofacm_oob.h
   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c
   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h
   trunk/ompi/mca/common/ofacm/configure.m4
   trunk/ompi/mca/common/ofacm/configure.params
   trunk/ompi/mca/common/ofacm/connect.h
   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt
   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt
   trunk/ompi/mca/common/ofautils/
   trunk/ompi/mca/common/ofautils/Makefile.am
   trunk/ompi/mca/common/ofautils/common_ofautils.c
   trunk/ompi/mca/common/ofautils/common_ofautils.h
   trunk/ompi/mca/common/ofautils/configure.m4
   trunk/ompi/mca/common/ofautils/configure.params
Deleted:
   trunk/ompi/config/ompi_check_openib.m4
Text files modified:
   trunk/ompi/config/ompi_check_openfabrics.m4|   403 +
   /dev/null  |   329 ---
   trunk/ompi/mca/btl/ofud/configure.m4   | 2
   trunk/ompi/mca/btl/openib/Makefile.am  | 4
   trunk/ompi/mca/btl/openib/btl_openib_component.c   |49 -
   trunk/ompi/mca/btl/openib/configure.m4 | 5
   trunk/ompi/mca/common/ofacm/Makefile.am|76 +
   trunk/ompi/mca/common/ofacm/base.h |   193 
   trunk/ompi/mca/common/ofacm/common_ofacm_base.c|   678 

   trunk/ompi/mca/common/ofacm/common_ofacm_empty.c   |48 +
   trunk/ompi/mca/common/ofacm/common_ofacm_empty.h   |22
   trunk/ompi/mca/common/ofacm/common_ofacm_oob.c |  1672 

   trunk/ompi/mca/common/ofacm/common_ofacm_oob.h |20
   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c|  1537 

   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h|69 +
   trunk/ompi/mca/common/ofacm/configure.m4   |63 +
   trunk/ompi/mca/common/ofacm/configure.params   |26
   trunk/ompi/mca/common/ofacm/connect.h  |   541 

   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt |41
   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt  |20
   trunk/ompi/mca/common/ofautils/Makefile.am |68 +
   trunk/ompi/mca/common/ofautils/common_ofautils.c   |89 ++
   trunk/ompi/mca/common/ofautils/common_ofautils.h   |26
   trunk/ompi/mca/common/ofautils/configure.m4|43 +
   trunk/ompi/mca/common/ofautils/configure.params|26
   trunk/ompi/mca/dpm/dpm.h   | 4
   26 files changed, 5674 insertions(+), 380 deletions(-)


Diff not shown due to size (240057 bytes).
To see the diff, run the following command:

svn diff -r 26706:26707 --no-diff-deleted

___
svn mailing list
s...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/svn


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com



___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-02 Thread Shamis, Pavel
Yeah, it is going to 1.7.

Do you want to move your UD connection manager code there? :)

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jul 2, 2012, at 11:20 AM, Nathan Hjelm wrote:

> Nice! Are we moving this to 1.7 as well?
> 
> -Nathan
> 
> On Mon, Jul 02, 2012 at 11:20:12AM -0400, svn-commit-mai...@open-mpi.org 
> wrote:
>> Author: pasha (Pavel Shamis)
>> Date: 2012-07-02 11:20:12 EDT (Mon, 02 Jul 2012)
>> New Revision: 26707
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26707
>> 
>> Log:
>> 1. Adding 2 new components:
>> ofacm - generic connection manager for IB interconnects.
>> ofautils - IB common utilities and compatibility code
>> 
>> 2. Updating OpenIB configure code
>> 
>> - ORNL & Mellanox Teams
>> 
>> Added:
>>   trunk/ompi/config/ompi_check_openfabrics.m4
>>   trunk/ompi/mca/common/ofacm/
>>   trunk/ompi/mca/common/ofacm/Makefile.am
>>   trunk/ompi/mca/common/ofacm/base.h
>>   trunk/ompi/mca/common/ofacm/common_ofacm_base.c
>>   trunk/ompi/mca/common/ofacm/common_ofacm_empty.c
>>   trunk/ompi/mca/common/ofacm/common_ofacm_empty.h
>>   trunk/ompi/mca/common/ofacm/common_ofacm_oob.c
>>   trunk/ompi/mca/common/ofacm/common_ofacm_oob.h
>>   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c
>>   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h
>>   trunk/ompi/mca/common/ofacm/configure.m4
>>   trunk/ompi/mca/common/ofacm/configure.params
>>   trunk/ompi/mca/common/ofacm/connect.h
>>   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt
>>   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt
>>   trunk/ompi/mca/common/ofautils/
>>   trunk/ompi/mca/common/ofautils/Makefile.am
>>   trunk/ompi/mca/common/ofautils/common_ofautils.c
>>   trunk/ompi/mca/common/ofautils/common_ofautils.h
>>   trunk/ompi/mca/common/ofautils/configure.m4
>>   trunk/ompi/mca/common/ofautils/configure.params
>> Deleted:
>>   trunk/ompi/config/ompi_check_openib.m4
>> Text files modified: 
>>   trunk/ompi/config/ompi_check_openfabrics.m4|   403 
>> +   
>>   /dev/null  |   329 --- 
>> 
>>   trunk/ompi/mca/btl/ofud/configure.m4   | 2 
>> 
>>   trunk/ompi/mca/btl/openib/Makefile.am  | 4 
>> 
>>   trunk/ompi/mca/btl/openib/btl_openib_component.c   |49 -   
>> 
>>   trunk/ompi/mca/btl/openib/configure.m4 | 5 
>> 
>>   trunk/ompi/mca/common/ofacm/Makefile.am|76 +   
>> 
>>   trunk/ompi/mca/common/ofacm/base.h |   193 
>> 
>>   trunk/ompi/mca/common/ofacm/common_ofacm_base.c|   678 
>> 
>>   trunk/ompi/mca/common/ofacm/common_ofacm_empty.c   |48 +   
>> 
>>   trunk/ompi/mca/common/ofacm/common_ofacm_empty.h   |22 
>> 
>>   trunk/ompi/mca/common/ofacm/common_ofacm_oob.c |  1672 
>> 
>>   trunk/ompi/mca/common/ofacm/common_ofacm_oob.h |20 
>> 
>>   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c|  1537 
>> 
>>   trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h|69 +   
>> 
>>   trunk/ompi/mca/common/ofacm/configure.m4   |63 +   
>> 
>>   trunk/ompi/mca/common/ofacm/configure.params   |26 
>> 
>>   trunk/ompi/mca/common/ofacm/connect.h  |   541 
>> 
>>   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt |41 
>> 
>>   trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt  |20 
>> 
>>   trunk/ompi/mca/common/ofautils/Makefile.am |68 +   
>> 
>>   trunk/ompi/mca/common/ofautils/common_ofautils.c   |89 ++  
>> 
>>   trunk/ompi/mca/common/ofautils/common_ofautils.h   |26 
>> 
>>   trunk/ompi/mca/common/ofautils/configure.m4|43 +   
>> 
>>   trunk/ompi/mca/common/ofautils/configure.params|26 
>> 
>>   trunk/ompi/mca/dpm/dpm.h   |  

Re: [OMPI devel] [OMPI svn] svn:open-mpi r26707 - in trunk/ompi: config mca/btl/ofud mca/btl/openib mca/common/ofacm mca/common/ofautils mca/dpm

2012-07-02 Thread Shamis, Pavel
Yevgeny,

The RoCEE transport relies on RDMACM as well. I believe Mellanox should be 
interested in supporting it.

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jul 2, 2012, at 5:14 PM, Jeff Squyres wrote:

> Steve --
> 
> Can you extend this new stuff to support RDMACM, including the warp-needed 
> connector-sends-first stuff?  It would be *very* nice to ditch the openib CPC 
> stuff and only have the new ofacm stuff.
> 
> I'm asking Steve because he's effectively the only iWARP vendor left around 
> (and iWARP *requires* RDMACM)...
> 
> 
> On Jul 2, 2012, at 2:05 PM, Shamis, Pavel wrote:
> 
>> 
>> So is ofacm another replacement for ibcm and rdmacm?
>> 
>> Essentially it extraction of the OpenIB BTL connection manager functionality 
>> (minus rdmacm) from the OpenIB BTL. The idea is to allow access to this 
>> functionality for other communication components, like collectives and btls. 
>> OFACM supports OOB and XRC-OOB connection managers.
>> 
>> - Pasha
>> 
>> 
>> --td
>> 
>> On 7/2/2012 11:20 AM, Nathan Hjelm wrote:
>> 
>> Nice! Are we moving this to 1.7 as well?
>> 
>> -Nathan
>> 
>> On Mon, Jul 02, 2012 at 11:20:12AM -0400, 
>> svn-commit-mai...@open-mpi.org<mailto:svn-commit-mai...@open-mpi.org> wrote:
>> 
>> 
>> Author: pasha (Pavel Shamis)
>> Date: 2012-07-02 11:20:12 EDT (Mon, 02 Jul 2012)
>> New Revision: 26707
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/26707
>> 
>> Log:
>> 1. Adding 2 new components:
>> ofacm - generic connection manager for IB interconnects.
>> ofautils - IB common utilities and compatibility code
>> 
>> 2. Updating OpenIB configure code
>> 
>> - ORNL & Mellanox Teams
>> 
>> Added:
>>  trunk/ompi/config/ompi_check_openfabrics.m4
>>  trunk/ompi/mca/common/ofacm/
>>  trunk/ompi/mca/common/ofacm/Makefile.am
>>  trunk/ompi/mca/common/ofacm/base.h
>>  trunk/ompi/mca/common/ofacm/common_ofacm_base.c
>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.c
>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.h
>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.c
>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.h
>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c
>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h
>>  trunk/ompi/mca/common/ofacm/configure.m4
>>  trunk/ompi/mca/common/ofacm/configure.params
>>  trunk/ompi/mca/common/ofacm/connect.h
>>  trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt
>>  trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-oob.txt
>>  trunk/ompi/mca/common/ofautils/
>>  trunk/ompi/mca/common/ofautils/Makefile.am
>>  trunk/ompi/mca/common/ofautils/common_ofautils.c
>>  trunk/ompi/mca/common/ofautils/common_ofautils.h
>>  trunk/ompi/mca/common/ofautils/configure.m4
>>  trunk/ompi/mca/common/ofautils/configure.params
>> Deleted:
>>  trunk/ompi/config/ompi_check_openib.m4
>> Text files modified:
>>  trunk/ompi/config/ompi_check_openfabrics.m4|   403 +
>>  /dev/null  |   329 ---
>>  trunk/ompi/mca/btl/ofud/configure.m4   | 2
>>  trunk/ompi/mca/btl/openib/Makefile.am  | 4
>>  trunk/ompi/mca/btl/openib/btl_openib_component.c   |49 -
>>  trunk/ompi/mca/btl/openib/configure.m4 | 5
>>  trunk/ompi/mca/common/ofacm/Makefile.am|76 +
>>  trunk/ompi/mca/common/ofacm/base.h |   193 
>>  trunk/ompi/mca/common/ofacm/common_ofacm_base.c|   678 
>> 
>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.c   |48 +
>>  trunk/ompi/mca/common/ofacm/common_ofacm_empty.h   |22
>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.c |  1672 
>> 
>>  trunk/ompi/mca/common/ofacm/common_ofacm_oob.h |20
>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.c|  1537 
>> 
>>  trunk/ompi/mca/common/ofacm/common_ofacm_xoob.h|69 +
>>  trunk/ompi/mca/common/ofacm/configure.m4   |63 +
>>  trunk/ompi/mca/common/ofacm/configure.params   |26
>>  trunk/ompi/mca/common/ofacm/connect.h  |   541 
>> 
>>  trunk/ompi/mca/common/ofacm/help-mpi-common-ofacm-base.txt |41
>>  trunk/ompi/mca/com

Re: [OMPI devel] openib max_cqe

2012-07-05 Thread Shamis, Pavel
> So if I do a run of -np 2 across two separate nodes then the use of the 
> max_cqe of my ib device (4194303) is ok.  Once I go beyond 1 process on the 
> node I start getting the memlocked limits message.  So how much memory does a 
> cqe take?  Is it 1k by any chance?  I ask this because the machine I am 
> running on has 4GB of memory and so I am wondering if I just don't have 
> enough backing memory and if that is so I am wondering how common of a case 
> this may be?

I mentioned on the call that for Mellanox devices (+OFA verbs) this resource is 
really cheap. Do you run a Mellanox HCA + OFA verbs?
Regards,
Pasha




Re: [OMPI devel] openib max_cqe

2012-07-05 Thread Shamis, Pavel
>> I mentioned on the call that for Mellanox devices (+OFA verbs) this resource 
>> is really cheap. Do you run mellanox hca + OFA verbs ?
> 
> (I'll reply because I know Terry is offline for the rest of the day)
> 
> Yes, he does.

I asked because Sun used to have its own verbs driver.

> 
> The heart of the question: is it incorrect to assume that we'll consume (num 
> CQE * CQE size) registered memory for each QP opened?

QP or CQ? I think you don't want to assume anything there. Verbs (user and 
kernel) do their own magic there.
I think Mellanox should address this question.
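
As a rough sanity check of the 1 KB guess (an assumption only -- the real per-CQE 
footprint is device/driver specific): 4,194,303 CQEs x 1 KB is roughly 4 GB for a single 
CQ sized to max_cqe, so even one such CQ would already exhaust a 4 GB node, whereas a 
64-byte CQE would put it at about 256 MB per CQ. That spread is exactly why the answer 
has to come from the vendor.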

Regards,
Pasha


Re: [OMPI devel] r27078 and OMPI build

2012-08-21 Thread Shamis, Pavel
Evgeny,

I don't have access to a Solaris system, but please let me know if there is a way I can 
help you.

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:

r27078 (ML collective component) broke some Solaris OMPI builds.

1)  In ompi/mca/coll/ml/coll_ml_lmngr.c
199 #ifdef HAVE_POSIX_MEMALIGN
200 if((errno = posix_memalign(&lmngr->base_addr,
201 lmngr->list_alignment,
202 lmngr->list_size * lmngr->list_block_size))
!= 0) {
203 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
204 return OMPI_ERROR;
205 }
206 #else
207 lmngr->base_addr =
208 malloc(lmngr->list_size * lmngr->list_block_size +
lmngr->list_alignment);
209 if(NULL == lmngr->base_addr) {
210 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
211 return OMPI_ERROR;
212 }
213
214 lmngr->base_addr =
(void*)OPAL_ALIGN((uintptr_t)lmngr->base_addr,
215 lmngr->list_align, uintptr_t);
216 #endif
   The "#else" code path has multiple problems -- specifically at the
statement on lines 214-215:
   - OPAL_ALIGN needs to be defined (e.g., #include "opal/align.h")
   - uintptr_t need to be defined (e.g., #include "opal_stdint.h")
   - list_align should be list_alignment
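
A minimal sketch of what the corrected #else path could look like with the three fixes 
above folded in (a suggestion only, not necessarily the change that gets committed; note 
that the strerror(errno)/errno arguments are also swapped here to match the "%s [%d]" 
format, and that overwriting base_addr with the aligned pointer still loses the value 
malloc returned):

#include "opal/align.h"       /* provides OPAL_ALIGN */
#include "opal_stdint.h"      /* provides uintptr_t */
/* ... */
#else
    lmngr->base_addr =
        malloc(lmngr->list_size * lmngr->list_block_size +
               lmngr->list_alignment);
    if (NULL == lmngr->base_addr) {
        ML_ERROR(("Failed to allocate memory: %s [%d]",
                  strerror(errno), errno));
        return OMPI_ERROR;
    }

    lmngr->base_addr =
        (void*)OPAL_ALIGN((uintptr_t)lmngr->base_addr,
                          lmngr->list_alignment, uintptr_t);   /* was list_align */
#endif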

I could fix, but need help with...

2)  http://www.open-mpi.org/mtt/index.php?do_redir=2089  Somehow,
coll_ml is getting pulled into libmpi.so.  E.g., this doesn't look right:

   % nm ompi/.libs/libmpi.so | grep mca_coll_ml
   [13161] |   2556704|   172|FUNC |LOCL |0|11
|mca_coll_ml_alloc_op_prog_single_frag_dag
   [13171] |   2555488|   344|FUNC |LOCL |0|11
|mca_coll_ml_buffer_recycling
   [13173] |   2555392|92|FUNC |LOCL |0|11 |mca_coll_ml_err
   [23992] | 0| 0|FUNC |GLOB |0|UNDEF
|mca_coll_ml_memsync_intra

The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
stuff shouldn't be in there at all in the first place.  This is on one
Solaris system, while another doesn't see the problem and builds fine.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] r27078 and OMPI build

2012-08-21 Thread Shamis, Pavel

On 8/21/2012 9:31 AM, Ralph Castain wrote:
Looks to me like you just need to add a couple of includes and correct a typo - 
yes?
Right.  This part is under control.

I hope r27100 resolves issue #1.

The library issue sounds like something isn't right in the Makefile.am - 
perhaps the syntax has a typo there as well?
I don't know.  This is the part where I could use help.  I took a quick
peek at some Makefile.am files.  I can't see what the essential
difference is between, say, coll/ml/Makefile.am and, say,
coll/sm/Makefile.am (which behaves all right).  Nor do I see why there
would be a difference in coll/ml between one system (happens to be
SPARC, though I don't know that's significant) and another.

I can't reproduce the problem on Mac and Linux systems.

On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:

r27078 (ML collective component) broke some Solaris OMPI builds.

1)  In ompi/mca/coll/ml/coll_ml_lmngr.c
   199 #ifdef HAVE_POSIX_MEMALIGN
   200 if((errno = posix_memalign(&lmngr->base_addr,
   201 lmngr->list_alignment,
   202 lmngr->list_size * lmngr->list_block_size))
!= 0) {
   203 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
   204 return OMPI_ERROR;
   205 }
   206 #else
   207 lmngr->base_addr =
   208 malloc(lmngr->list_size * lmngr->list_block_size +
lmngr->list_alignment);
   209 if(NULL == lmngr->base_addr) {
   210 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
   211 return OMPI_ERROR;
   212 }
   213
   214 lmngr->base_addr =
(void*)OPAL_ALIGN((uintptr_t)lmngr->base_addr,
   215 lmngr->list_align, uintptr_t);
   216 #endif
  The "#else" code path has multiple problems -- specifically at the
statement on lines 214-215:
  - OPAL_ALIGN needs to be defined (e.g., #include "opal/align.h")
  - uintptr_t need to be defined (e.g., #include "opal_stdint.h")
  - list_align should be list_alignment

I could fix, but need help with...

2)  http://www.open-mpi.org/mtt/index.php?do_redir=2089  Somehow,
coll_ml is getting pulled into libmpi.so.  E.g., this doesn't look right:

  % nm ompi/.libs/libmpi.so | grep mca_coll_ml
  [13161] |   2556704|   172|FUNC |LOCL |0|11
|mca_coll_ml_alloc_op_prog_single_frag_dag
  [13171] |   2555488|   344|FUNC |LOCL |0|11
|mca_coll_ml_buffer_recycling
  [13173] |   2555392|92|FUNC |LOCL |0|11 |mca_coll_ml_err
  [23992] | 0| 0|FUNC |GLOB |0|UNDEF
|mca_coll_ml_memsync_intra

The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
stuff shouldn't be in there at all in the first place.  This is on one
Solaris system, while another doesn't see the problem and builds fine.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] r27078 and OMPI build

2012-08-23 Thread Shamis, Pavel
Eugene,

Did you have a chance to make progress on issue #2? I'm wondering how we 
want to proceed from here.

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 21, 2012, at 2:19 PM, Eugene Loh wrote:

On 8/21/2012 9:31 AM, Ralph Castain wrote:
Looks to me like you just need to add a couple of includes and correct a typo - 
yes?
Right.  This part is under control.
The library issue sounds like something isn't right in the Makefile.am - 
perhaps the syntax has a typo there as well?
I don't know.  This is the part where I could use help.  I took a quick
peek at some Makefile.am files.  I can't see what the essential
difference is between, say, coll/ml/Makefile.am and, say,
coll/sm/Makefile.am (which behaves all right).  Nor do I see why there
would be a difference in coll/ml between one system (happens to be
SPARC, though I don't know that's significant) and another.
On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:

r27078 (ML collective component) broke some Solaris OMPI builds.

1)  In ompi/mca/coll/ml/coll_ml_lmngr.c
   199 #ifdef HAVE_POSIX_MEMALIGN
   200 if((errno = posix_memalign(&lmngr->base_addr,
   201 lmngr->list_alignment,
   202 lmngr->list_size * lmngr->list_block_size))
!= 0) {
   203 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
   204 return OMPI_ERROR;
   205 }
   206 #else
   207 lmngr->base_addr =
   208 malloc(lmngr->list_size * lmngr->list_block_size +
lmngr->list_alignment);
   209 if(NULL == lmngr->base_addr) {
   210 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
   211 return OMPI_ERROR;
   212 }
   213
   214 lmngr->base_addr =
(void*)OPAL_ALIGN((uintptr_t)lmngr->base_addr,
   215 lmngr->list_align, uintptr_t);
   216 #endif
  The "#else" code path has multiple problems -- specifically at the
statement on lines 214-215:
  - OPAL_ALIGN needs to be defined (e.g., #include "opal/align.h")
  - uintptr_t need to be defined (e.g., #include "opal_stdint.h")
  - list_align should be list_alignment

I could fix, but need help with...

2)  http://www.open-mpi.org/mtt/index.php?do_redir=2089  Somehow,
coll_ml is getting pulled into libmpi.so.  E.g., this doesn't look right:

  % nm ompi/.libs/libmpi.so | grep mca_coll_ml
  [13161] |   2556704|   172|FUNC |LOCL |0|11
|mca_coll_ml_alloc_op_prog_single_frag_dag
  [13171] |   2555488|   344|FUNC |LOCL |0|11
|mca_coll_ml_buffer_recycling
  [13173] |   2555392|92|FUNC |LOCL |0|11 |mca_coll_ml_err
  [23992] | 0| 0|FUNC |GLOB |0|UNDEF
|mca_coll_ml_memsync_intra

The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
stuff shouldn't be in there at all in the first place.  This is on one
Solaris system, while another doesn't see the problem and builds fine.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] r27078 and OMPI build

2012-08-23 Thread Shamis, Pavel
Evgeny,
I'm wondering if the issue is somehow related to the fact that these functions 
are inline. Can you please try the attached patch and see what happens?
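
For what it's worth, one way plain "inline" can produce exactly this kind of undefined 
symbol under strict C99 semantics (which the Oracle/Sun compilers follow more literally 
than gcc's traditional GNU89 inline) is sketched below. The names are invented and this 
is only a guess at the mechanism, not a description of the actual coll/ml code:

/* hypothetical foo.c -- not coll/ml source */
inline int ml_helper(int x)      /* C99: this alone emits no external definition */
{
    return x + 1;
}

int ml_caller(int x)
{
    return ml_helper(x);         /* if the call is not inlined, the linker needs an
                                    external "ml_helper" that no object file provides,
                                    giving an undefined-symbol error much like the
                                    mca_coll_ml_memsync_intra one above */
}

/* Typical fixes: declare it "static inline", provide an "extern inline" definition
   in exactly one translation unit, or drop "inline" altogether. */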

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 23, 2012, at 12:59 PM, Eugene Loh wrote:

On 8/23/2012 8:58 AM, Shamis, Pavel wrote:
Did you have chance to make progress on the issue #2 ? I'm wondering how we 
want to proceed from here.
First of all, thanks for putting back the fixes for issue #1.  That
build is now successful.

Issue #2?  No.  I don't know what to look at even if I had time to spend
on this.  It appears that mca/coll/ml is being pulled into libmpi.  I
tried comparing this component to others that aren't pulled in (e.g.,
mca/coll/sm) or builds on this system (happens to be SPARC/Solaris, but
I don't know what the key distinction is) versus other systems where
mca/coll/ml is not pulled in.  Nothing jumped out at me.  So, I'm stuck
(lack of ideas and lack of time).

What would make an MCA component get pulled into libmpi?  Again, many
other components are not getting pulled in and this problem appears only
on one system.
On Aug 21, 2012, at 2:19 PM, Eugene Loh wrote:

On 8/21/2012 9:31 AM, Ralph Castain wrote:
The library issue sounds like something isn't right in the Makefile.am
- perhaps the syntax has a typo there as well?

I don't know.  This is the part where I could use help.  I took a quick
peek at some Makefile.am files.  I can't see what the essential
difference is between, say, coll/ml/Makefile.am and, say,
coll/sm/Makefile.am (which behaves all right).  Nor do I see why there
would be a difference in coll/ml between one system (happens to be
SPARC, though I don't know that's significant) and another.


On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:

r27078 (ML collective component) broke some Solaris OMPI builds.

2)  http://www.open-mpi.org/mtt/index.php?do_redir=2089  Somehow,
coll_ml is getting pulled into libmpi.so.  E.g., this doesn't look right:

  % nm ompi/.libs/libmpi.so | grep mca_coll_ml
  [13161] |   2556704|   172|FUNC |LOCL |0|11
|mca_coll_ml_alloc_op_prog_single_frag_dag
  [13171] |   2555488|   344|FUNC |LOCL |0|11
|mca_coll_ml_buffer_recycling
  [13173] |   2555392|92|FUNC |LOCL |0|11 |mca_coll_ml_err
  [23992] | 0| 0|FUNC |GLOB |0|UNDEF
|mca_coll_ml_memsync_intra

The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
stuff shouldn't be in there at all in the first place.  This is on one
Solaris system, while another doesn't see the problem and builds fine.
___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel



[Attachment: ml.patch]


Re: [OMPI devel] r27078 and OMPI build

2012-08-24 Thread Shamis, Pavel
Maybe there is a chance to get direct access to this system ?

Regards,

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 23, 2012, at 6:09 PM, Eugene Loh wrote:

Thanks for the suggestion, but it didn't seem to help.  The build still
fails on the same problem.

On 8/23/2012 2:14 PM, Shamis, Pavel wrote:
Evgeny,
I'm wondering if the issue is some how related to the fact that these functions 
are inline. Can you please, try the attached patch and see what happens ?

On Aug 23, 2012, at 12:59 PM, Eugene Loh wrote:

On 8/23/2012 8:58 AM, Shamis, Pavel wrote:
Did you have a chance to make progress on issue #2? I'm wondering how we 
want to proceed from here.
First of all, thanks for putting back the fixes for issue #1.  That
build is now successful.

Issue #2?  No.  I don't know what to look at even if I had time to spend
on this.  It appears that mca/coll/ml is being pulled into libmpi.  I
tried comparing this component to others that aren't pulled in (e.g.,
mca/coll/sm) or builds on this system (happens to be SPARC/Solaris, but
I don't know what the key distinction is) versus other systems where
mca/coll/ml is not pulled in.  Nothing jumped out at me.  So, I'm stuck
(lack of ideas and lack of time).

What would make an MCA component get pulled into libmpi?  Again, many
other components are not getting pulled in and this problem appears only
on one system.
On Aug 21, 2012, at 2:19 PM, Eugene Loh wrote:

On 8/21/2012 9:31 AM, Ralph Castain wrote:
The library issue sounds like something isn't right in the Makefile.am
- perhaps the syntax has a typo there as well?

I don't know.  This is the part where I could use help.  I took a quick
peek at some Makefile.am files.  I can't see what the essential
difference is between, say, coll/ml/Makefile.am and, say,
coll/sm/Makefile.am (which behaves all right).  Nor do I see why there
would be a difference in coll/ml between one system (happens to be
SPARC, though I don't know that's significant) and another.


On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:

r27078 (ML collective component) broke some Solaris OMPI builds.

2)  http://www.open-mpi.org/mtt/index.php?do_redir=2089  Somehow,
coll_ml is getting pulled into libmpi.so.  E.g., this doesn't look right:

  % nm ompi/.libs/libmpi.so | grep mca_coll_ml
  [13161] |   2556704|   172|FUNC |LOCL |0|11
|mca_coll_ml_alloc_op_prog_single_frag_dag
  [13171] |   2555488|   344|FUNC |LOCL |0|11
|mca_coll_ml_buffer_recycling
  [13173] |   2555392|92|FUNC |LOCL |0|11 |mca_coll_ml_err
  [23992] | 0| 0|FUNC |GLOB |0|UNDEF
|mca_coll_ml_memsync_intra

The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
stuff shouldn't be in there at all in the first place.  This is on one
Solaris system, while another doesn't see the problem and builds fine.
___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] r27078 and OMPI build

2012-08-25 Thread Shamis, Pavel
d -Bdynamic"  FFLAGS="-xtarget=ultra3 
-m32 -xarch=sparcvis2
-xprefetch -xprefetch_level=2 -xvector=lib -Qoption cg -xregs=no%appl -stackvar 
-xO5"
FCFLAGS="-xtarget=ultra3 -m32 -xarch=sparcvis2 -xprefetch -xprefetch_level=2 
-xvector=lib -Qoption
cg -xregs=no%appl -stackvar -xO5"
--prefix=/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/installs/JA08/install
--mandir=${prefix}/man  --bindir=${prefix}/bin  --libdir=${prefix}/lib
--includedir=${prefix}/include   
--with-tm=/ws/ompi-tools/orte/torque/current/shared-install32
--enable-contrib-no-build=vt --with-package-string="Oracle Message Passing 
Toolkit "
--with-ident-string="@(#)RELEASE VERSION 1.9openmpi-1.5.4-r1.9a1r27092"

and the error he gets is:


make[2]: Entering directory
`/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/mpi-install/s3rI/src/openmpi-1.9a1r27092/ompi/tools/ompi_info'
  CCLD ompi_info
Undefined   first referenced
 symbol in file
mca_coll_ml_memsync_intra   ../../../ompi/.libs/libmpi.so
ld: fatal: symbol referencing errors. No output written to .libs/ompi_info
make[2]: *** [ompi_info] Error 2
make[2]: Leaving directory
`/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/mpi-install/s3rI/src/openmpi-1.9a1r27092/ompi/tools/ompi_info'
make[1]: *** [install-recursive] Error 1
make[1]: Leaving directory
`/workspace/euloh/hpc/mtt-scratch/burl-ct-t2k-3/ompi-tarball-testing/mpi-install/s3rI/src/openmpi-1.9a1r27092/ompi'
make: *** [install-recursive] Error 1


On Aug 24, 2012, at 4:30 PM, Paul Hargrove 
mailto:phhargr...@lbl.gov>> wrote:

I have access to a few different Solaris machines and can offer to build the 
trunk if somebody tells me what configure flags are desired.

-Paul

On Fri, Aug 24, 2012 at 8:54 AM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Eugene - can you confirm that this is only happening on the one Solaris system? 
In other words, is this a general issue or something specific to that one 
machine?

I'm wondering because if it is just the one machine, then it might be something 
strange about how it is setup - perhaps the version of Solaris, or it is 
configuring --enable-static, or...

Just trying to assess how general a problem this might be, and thus if this 
should be a blocker or not.

On Aug 24, 2012, at 8:00 AM, Eugene Loh 
mailto:eugene@oracle.com>> wrote:

> On 08/24/12 09:54, Shamis, Pavel wrote:
>> Maybe there is a chance to get direct access to this system ?
> No.
>
> But I'm attaching compressed log files from configure/make.
>
> ___
> devel mailing list
> de...@open-mpi.org<mailto:de...@open-mpi.org>
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Paul H. Hargrove  
phhargr...@lbl.gov<mailto:phhargr...@lbl.gov>
Future Technologies Group
Computer and Data Sciences Department Tel: 
+1-510-495-2352
Lawrence Berkeley National Laboratory Fax: 
+1-510-486-6900

___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Paul H. Hargrove  
phhargr...@lbl.gov<mailto:phhargr...@lbl.gov>
Future Technologies Group
Computer and Data Sciences Department Tel: 
+1-510-495-2352
Lawrence Berkeley National Laboratory Fax: 
+1-510-486-6900




--
Paul H. Hargrove  
phhargr...@lbl.gov<mailto:phhargr...@lbl.gov>
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900

___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] r27078 and OMPI build

2012-08-29 Thread Shamis, Pavel
The issue #2 was fixed in r27178.
Paul - Thanks for help !!!

Regards,

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:

r27078 (ML collective component) broke some Solaris OMPI builds.

1)  In ompi/mca/coll/ml/coll_ml_lmngr.c
199 #ifdef HAVE_POSIX_MEMALIGN
200 if((errno = posix_memalign(&lmngr->base_addr,
201 lmngr->list_alignment,
202 lmngr->list_size * lmngr->list_block_size))
!= 0) {
203 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
204 return OMPI_ERROR;
205 }
206 #else
207 lmngr->base_addr =
208 malloc(lmngr->list_size * lmngr->list_block_size +
lmngr->list_alignment);
209 if(NULL == lmngr->base_addr) {
210 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
211 return OMPI_ERROR;
212 }
213
214 lmngr->base_addr =
(void*)OPAL_ALIGN((uintptr_t)lmngr->base_addr,
215 lmngr->list_align, uintptr_t);
216 #endif
   The "#else" code path has multiple problems -- specifically at the
statement on lines 214-215:
   - OPAL_ALIGN needs to be defined (e.g., #include "opal/align.h")
   - uintptr_t need to be defined (e.g., #include "opal_stdint.h")
   - list_align should be list_alignment
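   With those three changes applied, the fallback path would read roughly as
follows (sketch only, not compile-tested; note the errno/strerror arguments
also look swapped in the original ML_ERROR call):

    #include "opal/align.h"      /* OPAL_ALIGN */
    #include "opal_stdint.h"     /* uintptr_t  */
    ...
    #else
        lmngr->base_addr =
            malloc(lmngr->list_size * lmngr->list_block_size +
                   lmngr->list_alignment);
        if (NULL == lmngr->base_addr) {
            ML_ERROR(("Failed to allocate memory: %s [%d]",
                      strerror(errno), errno));
            return OMPI_ERROR;
        }
        /* round the malloc'ed pointer up to the requested alignment */
        lmngr->base_addr = (void *) OPAL_ALIGN((uintptr_t) lmngr->base_addr,
                                                lmngr->list_alignment, uintptr_t);
    #endif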

I could fix, but need help with...

2)  http://www.open-mpi.org/mtt/index.php?do_redir=2089  Somehow,
coll_ml is getting pulled into libmpi.so.  E.g., this doesn't look right:

   % nm ompi/.libs/libmpi.so | grep mca_coll_ml
   [13161] |   2556704|   172|FUNC |LOCL |0|11
|mca_coll_ml_alloc_op_prog_single_frag_dag
   [13171] |   2555488|   344|FUNC |LOCL |0|11
|mca_coll_ml_buffer_recycling
   [13173] |   2555392|92|FUNC |LOCL |0|11 |mca_coll_ml_err
   [23992] | 0| 0|FUNC |GLOB |0|UNDEF
|mca_coll_ml_memsync_intra

The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
stuff shouldn't be in there at all in the first place.  This is on one
Solaris system, while another doesn't see the problem and builds fine.
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] r27078 and OMPI build

2012-08-29 Thread Shamis, Pavel
Eugene,

Can you please confirm that the issue is resolved on your setup?

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 29, 2012, at 10:14 AM, Shamis, Pavel wrote:

The issue #2 was fixed in r27178.
Paul - Thanks for help !!!

Regards,

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 21, 2012, at 11:36 AM, Eugene Loh wrote:

r27078 (ML collective component) broke some Solaris OMPI builds.

1)  In ompi/mca/coll/ml/coll_ml_lmngr.c
   199 #ifdef HAVE_POSIX_MEMALIGN
   200 if((errno = posix_memalign(&lmngr->base_addr,
   201 lmngr->list_alignment,
   202 lmngr->list_size * lmngr->list_block_size))
!= 0) {
   203 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
   204 return OMPI_ERROR;
   205 }
   206 #else
   207 lmngr->base_addr =
   208 malloc(lmngr->list_size * lmngr->list_block_size +
lmngr->list_alignment);
   209 if(NULL == lmngr->base_addr) {
   210 ML_ERROR(("Failed to allocate memory: %s [%d]", errno,
strerror(errno)));
   211 return OMPI_ERROR;
   212 }
   213
   214 lmngr->base_addr =
(void*)OPAL_ALIGN((uintptr_t)lmngr->base_addr,
   215 lmngr->list_align, uintptr_t);
   216 #endif
  The "#else" code path has multiple problems -- specifically at the
statement on lines 214-215:
  - OPAL_ALIGN needs to be defined (e.g., #include "opal/align.h")
  - uintptr_t need to be defined (e.g., #include "opal_stdint.h")
  - list_align should be list_alignment

I could fix, but need help with...

2)  http://www.open-mpi.org/mtt/index.php?do_redir=2089  Somehow,
coll_ml is getting pulled into libmpi.so.  E.g., this doesn't look right:

  % nm ompi/.libs/libmpi.so | grep mca_coll_ml
  [13161] |   2556704|   172|FUNC |LOCL |0|11
|mca_coll_ml_alloc_op_prog_single_frag_dag
  [13171] |   2555488|   344|FUNC |LOCL |0|11
|mca_coll_ml_buffer_recycling
  [13173] |   2555392|92|FUNC |LOCL |0|11 |mca_coll_ml_err
  [23992] | 0| 0|FUNC |GLOB |0|UNDEF
|mca_coll_ml_memsync_intra

The UNDEF is causing a problem, but I'm guessing all that mca_coll_ml_
stuff shouldn't be in there at all in the first place.  This is on one
Solaris system, while another doesn't see the problem and builds fine.
___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Warnings in OMPI trunk and 1.7

2012-09-12 Thread Shamis, Pavel
Ralph,

Please see our comment inline.

> common_allgather.c: In function 'comm_allgather_pml':
> common_allgather.c:45: warning: 'recv_iov[1].iov_len' may be used 
> uninitialized in this function
> common_allgather.c:45: warning: 'send_iov[1].iov_len' may be used 
> uninitialized in this function
> common_allgather.c:45: warning: 'send_iov[1].iov_base' may be used 
> uninitialized in this function
> common_allgather.c:45: warning: 'recv_iov[1].iov_base' may be used 
> uninitialized in this function

This is ours.

> 
> common_netpatterns_knomial_tree.c: In function 
> 'mca_common_netpatterns_setup_recursive_knomial_allgather_tree_node':
> common_netpatterns_knomial_tree.c:38: warning: 'reindex_myid' may be used 
> uninitialized in this function

The same here.

> 
> common_verbs_find_ports.c: In function 'ompi_common_verbs_find_ibv_ports':
> common_verbs_find_ports.c:154: warning: attempt to free a non-heap object 
> 'name'
> 

This is probably Jeff's code.
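That warning normally means free() is being applied to something that was never
malloc()ed -- typically a stack or static array.  A minimal, hypothetical
reproducer of the pattern gcc is flagging (names made up):

    char name[64];                  /* automatic (stack) storage             */
    snprintf(name, sizeof(name), "%s:%d", device_name, port_num);
    ...
    free(name);                     /* warning: attempt to free a non-heap
                                       object 'name'                         */

The fix is either to drop the free() or to allocate the buffer with
strdup()/malloc() instead.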

> 
> coll_ml_module.c: In function 'mca_coll_ml_tree_hierarchy_discovery':
> coll_ml_module.c:1953: warning: 'my_rank_in_remaining_list' may be used 
> uninitialized in this function
> 
> coll_ml_hier_algorithms_setup.c: In function 
> 'ml_coll_barrier_constant_group_data_setup':
> coll_ml_hier_algorithms_setup.c:329: warning: 'value_to_set' may be used 
> uninitialized in this function
> coll_ml_hier_algorithms_setup.c:334: warning: 'constant_group_data' may be 
> used uninitialized in this function
> coll_ml_hier_algorithms_setup.c: In function 'ml_coll_up_and_down_hier_setup':
> coll_ml_hier_algorithms_setup.c:25: warning: 'value_to_set' may be used 
> uninitialized in this function


This is ours
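For what it's worth, these all look like the usual pattern where a variable is
only assigned inside an if/else-if chain, e.g. (generic sketch, not the actual
code):

    int value_to_set;              /* no initializer                         */
    if (case_a) {
        value_to_set = 1;
    } else if (case_b) {
        value_to_set = 2;
    }                              /* no final else: gcc cannot prove the    */
    use(value_to_set);             /* variable is assigned before this use   */

Initializing at the declaration (or adding the missing else branch) silences
the warning.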

We will work to fix it.
Can you please share with us your configure parameters.

Regards,
Pasha




Re: [OMPI devel] MPI-RMA on uGNI?

2012-10-22 Thread Shamis, Pavel
Paul,

Did you look at   mca_btl_ugni_put / mca_btl_ugni_get functions  in the ugni 
btl ?

-Pasha

I am trying to resolve an odd issue I am seeing with my own uGNI-based code, 
and was hoping to use OMPI's uGNI support as an example of correct usage.  My 
particular interest is in RDMA, but as far as I can tell the uGNI btl in 
ompi-trunk doesn't have the btl_put or btl_get entry points.  So, if I 
understand correctly, that means osc/pt2pt is used for MPI-RMA support on a 
Cray XE.  Is that correct?

-Paul

--
Paul H. Hargrove  
phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] 1.7 rc4 compilation error

2012-10-26 Thread Shamis, Pavel
There is a bug in the makefile. The file exists in svn, but it is not listed in 
Makefile.am. As a result, it wasn't pulled into the tarball.
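The kind of fix needed (sketch; the exact variable name in
ompi/mca/bcol/iboffload/Makefile.am may differ) is simply to list the header
with the component's sources so that "make dist" packs it into the tarball:

    sources = \
        bcol_iboffload.h \
        bcol_iboffload_qp_info.h \
        bcol_iboffload_mca.c \
        ...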

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Oct 26, 2012, at 2:33 PM, Edgar Gabriel wrote:

we have trouble compiling the 1.7 series on a machine in Dresden.
Specifically, we receive an error message when compiling the
bcol/iboffload component (other infiniband components compile fine).

Any idea/suggestions what we might be doing wrong or what to look for?

make[2]: Entering directory
`/home/h2/gabriel/openmpi-1.7rc4/ompi/mca/bcol/iboffload'
 CC   bcol_iboffload_module.lo
 CC   bcol_iboffload_mca.lo
 CC   bcol_iboffload_endpoint.lo
 CC   bcol_iboffload_frag.lo
In file included from bcol_iboffload_frag.c:16:0:
bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
file or directory
compilation terminated.
make[2]: *** [bcol_iboffload_frag.lo] Error 1
make[2]: *** Waiting for unfinished jobs
In file included from bcol_iboffload_mca.c:18:0:
bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
file or directory
compilation terminated.
make[2]: *** [bcol_iboffload_mca.lo] Error 1
In file included from bcol_iboffload_endpoint.c:23:0:
bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
file or directory
compilation terminated.
make[2]: *** [bcol_iboffload_endpoint.lo] Error 1
In file included from bcol_iboffload_module.c:39:0:
bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
file or directory
compilation terminated.
make[2]: *** [bcol_iboffload_module.lo] Error 1
make[2]: Leaving directory
`/home/h2/gabriel/openmpi-1.7rc4/ompi/mca/bcol/iboffload'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/h2/gabriel/openmpi-1.7rc4/ompi'
make: *** [all-recursive] Error 1

Thanks
Edgar

--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Trunk warnings in collectives

2012-11-12 Thread Shamis, Pavel
We are looking at this issue...

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Nov 11, 2012, at 8:49 PM, Ralph Castain wrote:

Seeing the following warnings in the trunk:

bcol_ptpcoll_bcast.c: In function 'bcol_ptpcoll_bcast_k_nomial_known_root':
bcol_ptpcoll_bcast.c:606: warning: 'data_src' may be used uninitialized in this 
function
bcol_ptpcoll_bcast.c: In function 'bcol_ptpcoll_bcast_k_nomial_anyroot':
bcol_ptpcoll_bcast.c:129: warning: 'peer' may be used uninitialized in this 
function
coll_ml_hier_algorithms_setup.c: In function 
'ml_coll_barrier_constant_group_data_setup':
coll_ml_hier_algorithms_setup.c:329: warning: 'value_to_set' may be used 
uninitialized in this function
coll_ml_hier_algorithms_setup.c: In function 'ml_coll_up_and_down_hier_setup':
coll_ml_hier_algorithms_setup.c:25: warning: 'value_to_set' may be used 
uninitialized in this function
coll_ml_lex.c:1363: warning: 'input' defined but not used
sbgp_basesmsocket_component.c: In function 'mca_sbgp_basesmsocket_select_procs':
sbgp_basesmsocket_component.c:270: warning: 'my_local_index' may be used 
uninitialized in this function


Can someone please take a look?
Ralph


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




[OMPI devel] Is trunk broken ?

2012-11-12 Thread Shamis, Pavel
I get the following error on the trunk:

base/memchecker_base_close.c: In function 'opal_memchecker_base_close':
base/memchecker_base_close.c:28: error: implicit declaration of function 
'opal_output_close'

I may add #include "opal/util/output.h" to the file, but then it fails in other 
components.
I suspect that the output.h was removed somewhere from the top level.

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory









Re: [OMPI devel] Is trunk broken ?

2012-11-12 Thread Shamis, Pavel
Debug build works.
--with-platform=optimized is broken

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Nov 12, 2012, at 3:44 PM, Shamis, Pavel wrote:

I get the following error on the trunk:

base/memchecker_base_close.c: In function 'opal_memchecker_base_close':
base/memchecker_base_close.c:28: error: implicit declaration of function 
'opal_output_close'

I may add #include "opal/util/output.h" to the file, but then it fails in other 
components.
I suspect that the output.h was removed somewhere from the top level.

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory







___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] bcol basesmuma maintainer?

2013-01-02 Thread Shamis, Pavel
Brian,

I will take a look. Thanks for the patch !

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 2, 2013, at 4:37 PM, Barrett, Brian W wrote:

Hi all -

Who's maintaining the bcol basesmuma component?  I'd like to commit the
attached patch, which cleans up some usage of process names, but want a
second pair of eyeballs.  The orte_namelist_t type is meant for places
where the orte_process_na me_t needs to be put on a list.  In basesmuma,
it's being used like an rote_process_name_t.  While it doesn't really
matter, it means one more thing that has to be in the API between the
runtime and the MPI layer, so I'd like to clean it up.
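Roughly, the distinction is the following (the field layout here is from
memory and is an assumption, not a quote of the ORTE headers):

    /* orte_namelist_t: a process name wrapped so it can sit on an opal_list_t */
    struct orte_namelist_t {
        opal_list_item_t    super;     /* list bookkeeping                    */
        orte_process_name_t name;      /* the (jobid, vpid) pair of interest  */
    };

If the code never actually puts the item on a list, storing a plain
orte_process_name_t is enough, which is the point of the cleanup.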

Thanks,

Brian

--
 Brian W. Barrett
 Scalable System Software Group
 Sandia National Laboratories




___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] bcol basesmuma maintainer?

2013-01-03 Thread Shamis, Pavel
Brian,
The patch looks good. Please go ahead and push it.
Thanks !

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 2, 2013, at 4:37 PM, Barrett, Brian W wrote:

Hi all -

Who's maintaining the bcol basesmuma component?  I'd like to commit the
attached patch, which cleans up some usage of process names, but want a
second pair of eyeballs.  The orte_namelist_t type is meant for places
where the orte_process_name_t needs to be put on a list.  In basesmuma,
it's being used like an orte_process_name_t.  While it doesn't really
matter, it means one more thing that has to be in the API between the
runtime and the MPI layer, so I'd like to clean it up.

Thanks,

Brian

--
 Brian W. Barrett
 Scalable System Software Group
 Sandia National Laboratories




___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Compiling OpenMPI 1.7 with LLVM clang or llvm-gcc

2013-01-08 Thread Shamis, Pavel
Ken,

I have no problem to compile OMPI trunk with llvm-gcc-4.2  (os x 10.8)

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 7, 2013, at 3:49 PM, Kenneth A. Lloyd  wrote:

> Has anyone experienced any problems compiling OpenMPI 1.7 with the llvm
> compiler and C front ends?
> 
> -- Ken
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-04 Thread Shamis, Pavel
I pushed a bugfix to trunk (r28289). I don't have an access to a platform with 
XRC (MOFED) installation, so this is a "blind" bugfix. If you have a system 
with XRC, please test this revision. Hopefully this resolves the problem.
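For context: the duplicate symbol comes from the ib_address_t class being
instantiated in two places that both end up linked into libmpi.  The usual
OPAL pattern to avoid that kind of clash (sketch; the constructor/destructor
names are made up, and the actual r28289 change may take a different route) is:

    /* in the shared header: declaration only */
    OBJ_CLASS_DECLARATION(ib_address_t);

    /* in exactly one .c file: this is what emits the ib_address_t_class symbol */
    OBJ_CLASS_INSTANCE(ib_address_t, opal_object_t,
                       ib_address_constructor, ib_address_destructor);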

Regards,
- Pavel (Pasha) Shamis


On Apr 4, 2013, at 3:28 PM, Ralph Castain  wrote:

> This is being addressed - however, for now, try configuring it with  
> "--disable-openib-connectx-xrc"
> 
> On Apr 4, 2013, at 10:32 AM, Ralph Castain  wrote:
> 
>> Sadly, the IB folks never fixed this - sigh.
>> 
>> I'll fix it in the trunk and then CMR it for 1.7. Unfortunately, it requires 
>> that you have both IB and XRC to see it, which us non-IB-vendors in the 
>> devel community don't.
>> 
>> 
>> On Apr 4, 2013, at 9:44 AM, Ralph Castain  wrote:
>> 
>>> Let me try and dig into it a bit - sadly, my access to IB machines is 
>>> sorely limited at the moment.
>>> 
>>> On Apr 4, 2013, at 9:37 AM, Paul Kapinos  wrote:
>>> 
 Got the same error on all builds (4 compiler, with and without trheading 
 support, 64 and 32bit) on our systems, effectively prohibiting the 
 building of the 1.7 release.
 
 Any idea how to workaround this?
 
 Need more logs?
 
 Best
 
 
 
 On 08/06/12 19:41, Gutierrez, Samuel K wrote:
> Looks like the type is defined twice - once in 
> ompi/mca/common/ofacm/common_ofacm_xoob.h and another time in 
> ./ompi/mca/btl/openib/btl_openib_xrc.h.
> 
> Thanks,
> 
> Sam
> 
> On Aug 6, 2012, at 11:23 AM, Jeff Squyres wrote:
> 
>> I don't have XRC support in my kernels, so it wouldn't show up for me.
>> 
>> Did someone have 2 instances of the ib_address_t class?
>> 
>> 
>> On Aug 6, 2012, at 1:17 PM, Gutierrez, Samuel K wrote:
>> 
>>> Hi,
>>> 
>>> Anyone else seeing this?
>>> 
>>> Creating mpi/man/man3/OpenMPI.3 man page...
>>> CCLD   libmpi.la
>>> mca/btl/openib/.libs/libmca_btl_openib.a(btl_openib_xrc.o):(.data.rel+0x0):
>>>  multiple definition of `ib_address_t_class'
>>> mca/common/ofacm/.libs/libmca_common_ofacm_noinst.a(common_ofacm_xoob.o):(.data.rel+0x0):
>>>  first defined here
>>> collect2: ld returned 1 exit status
>>> 
>>> Thanks,
>>> 
>>> Sam
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
 
 
 -- 
 Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
 RWTH Aachen University, Center for Computing and Communication
 Seffenter Weg 23,  D 52074  Aachen (Germany)
 Tel: +49 241/80-24915
 
 ___
 devel mailing list
 de...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> 
>> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-04 Thread Shamis, Pavel
Paul,

I will prepare a tarball for you.

Thanks !

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Apr 4, 2013, at 5:01 PM, Paul Hargrove 
mailto:phhargr...@lbl.gov>> wrote:

Pasha,

I have at least one system where I can reproduce the problem, but don't have 
up-to-date autotools.
So, I can only test from a tarball.

If somebody can roll me a tarball of r28289 I can test ASAP.
Otherwise I'll try to remember to test from tonight's trunk nightly once it 
appears.

-Paul


On Thu, Apr 4, 2013 at 1:30 PM, Shamis, Pavel 
mailto:sham...@ornl.gov>> wrote:
I pushed a bugfix to trunk (r28289). I don't have an access to a platform with 
XRC (MOFED) installation, so this is a "blind" bugfix. If you have a system 
with XRC, please test this revision. Hopefully this resolves the problem.

Regards,
- Pavel (Pasha) Shamis


On Apr 4, 2013, at 3:28 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:

> This is being addressed - however, for now, try configuring it with  
> "--disable-openib-connectx-xrc"
>
> On Apr 4, 2013, at 10:32 AM, Ralph Castain 
> mailto:r...@open-mpi.org>> wrote:
>
>> Sadly, the IB folks never fixed this - sigh.
>>
>> I'll fix it in the trunk and then CMR it for 1.7. Unfortunately, it requires 
>> that you have both IB and XRC to see it, which us non-IB-vendors in the 
>> devel community don't.
>>
>>
>> On Apr 4, 2013, at 9:44 AM, Ralph Castain 
>> mailto:r...@open-mpi.org>> wrote:
>>
>>> Let me try and dig into it a bit - sadly, my access to IB machines is 
>>> sorely limited at the moment.
>>>
>>> On Apr 4, 2013, at 9:37 AM, Paul Kapinos 
>>> mailto:kapi...@rz.rwth-aachen.de>> wrote:
>>>
>>>> Got the same error on all builds (4 compiler, with and without trheading 
>>>> support, 64 and 32bit) on our systems, effectively prohibiting the 
>>>> building of the 1.7 release.
>>>>
>>>> Any idea how to workaround this?
>>>>
>>>> Need more logs?
>>>>
>>>> Best
>>>>
>>>>
>>>>
>>>> On 08/06/12 19:41, Gutierrez, Samuel K wrote:
>>>>> Looks like the type is defined twice - once in 
>>>>> ompi/mca/common/ofacm/common_ofacm_xoob.h and another time in 
>>>>> ./ompi/mca/btl/openib/btl_openib_xrc.h.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Sam
>>>>>
>>>>> On Aug 6, 2012, at 11:23 AM, Jeff Squyres wrote:
>>>>>
>>>>>> I don't have XRC support in my kernels, so it wouldn't show up for me.
>>>>>>
>>>>>> Did someone have 2 instances of the ib_address_t class?
>>>>>>
>>>>>>
>>>>>> On Aug 6, 2012, at 1:17 PM, Gutierrez, Samuel K wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> Anyone else seeing this?
>>>>>>>
>>>>>>> Creating mpi/man/man3/OpenMPI.3 man page...
>>>>>>> CCLD   libmpi.la<http://libmpi.la/>
>>>>>>> mca/btl/openib/.libs/libmca_btl_openib.a(btl_openib_xrc.o):(.data.rel+0x0):
>>>>>>>  multiple definition of `ib_address_t_class'
>>>>>>> mca/common/ofacm/.libs/libmca_common_ofacm_noinst.a(common_ofacm_xoob.o):(.data.rel+0x0):
>>>>>>>  first defined here
>>>>>>> collect2: ld returned 1 exit status
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Sam
>>>>>>> ___
>>>>>>> devel mailing list
>>>>>>> de...@open-mpi.org<mailto:de...@open-mpi.org>
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Jeff Squyres
>>>>>> jsquy...@cisco.com<mailto:jsquy...@cisco.com>
>>>>>> For corporate legal information go to: 
>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>
>>>>>>
>>>>>> ___
>>>>>> devel mailing list
>>>>>> de...@open-mpi.org<mailto:de...@open-mpi.org>
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
&

Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-04 Thread Shamis, Pavel
Paul,

Thanks for  the testing and  note ! I already contacted mlnx folks.

Regards,
Pasha

From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On Behalf 
Of Paul Hargrove
Sent: Thursday, April 04, 2013 9:16 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] Trunk: Link Failure -- multiple definition of 
ib_address_t_class

Ralph,

You are welcome.
I am already in "testing mode" today as I am preparing for my own release at 
the end of the month.
With the scripts I have it takes less than a minute to launch a test such as 
these, and later I get email when the test completes.

For anybody else that is interested in how this bug went (relatively) unnoticed:
I found that not only did I need XRC, but I also needed to configure with 
--enable-static to reproduce the problem.
I suspect that if Mellanox added --enable-static to an existing MTT 
configuration this would not have remained unfixed for so long.

-Paul

On Thu, Apr 4, 2013 at 5:52 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Thanks Paul!!

As always, much appreciated.

On Apr 4, 2013, at 4:41 PM, Paul Hargrove 
mailto:phhargr...@lbl.gov>> wrote:


Pasha,

Your fix appears to work.

My previous testing that reproduced the problem was against the 1.7 tarball.
So, for good measure I tested BOTH last night's trunk tarball and the one Ralph 
created earlier today:

openmpi-1.9r28284.tar.bz2
FAILS in the manner reported previously:
  CCLD libmpi.la<http://libmpi.la/>
mca/btl/openib/.libs/libmca_btl_openib.a(btl_openib_xrc.o):(.data.rel+0x0): 
multiple definition of `ib_address_t_class'
mca/common/ofacm/.libs/libmca_common_ofacm_noinst.a(common_ofacm_xoob.o):(.data.rel+0x0):
 first defined here
collect2: ld returned 1 exit status

openmpi-1.9r28290.tar.bz2
PASSES
$ make all
$ make install
$ make check

-Paul

On Thu, Apr 4, 2013 at 3:12 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Available on the web site now:

http://www.open-mpi.org/nightly/trunk/


On Apr 4, 2013, at 2:13 PM, "Shamis, Pavel" 
mailto:sham...@ornl.gov>> wrote:

> Paul,
>
> I will prepare a tarball for you.
>
> Thanks !
>
> Pavel (Pasha) Shamis
> ---
> Computer Science Research Group
> Computer Science and Math Division
> Oak Ridge National Laboratory
>
>
>
>
>
>
> On Apr 4, 2013, at 5:01 PM, Paul Hargrove 
> mailto:phhargr...@lbl.gov><mailto:phhargr...@lbl.gov<mailto:phhargr...@lbl.gov>>>
>  wrote:
>
> Pasha,
>
> I have at least one system where I can reproduce the problem, but don't have 
> up-to-date autotools.
> So, I can only test from a tarball.
>
> If somebody can roll me a tarball of r28289 I can test ASAP.
> Otherwise I'll try to remember to test from tonight's trunk nightly once it 
> appears.
>
> -Paul
>
>
> On Thu, Apr 4, 2013 at 1:30 PM, Shamis, Pavel 
> mailto:sham...@ornl.gov><mailto:sham...@ornl.gov<mailto:sham...@ornl.gov>>>
>  wrote:
> I pushed a bugfix to trunk (r28289). I don't have an access to a platform 
> with XRC (MOFED) installation, so this is a "blind" bugfix. If you have a 
> system with XRC, please test this revision. Hopefully this resolves the 
> problem.
>
> Regards,
> - Pavel (Pasha) Shamis
>
>
> On Apr 4, 2013, at 3:28 PM, Ralph Castain 
> mailto:r...@open-mpi.org><mailto:r...@open-mpi.org<mailto:r...@open-mpi.org>>>
>  wrote:
>
>> This is being addressed - however, for now, try configuring it with  
>> "--disable-openib-connectx-xrc"
>>
>> On Apr 4, 2013, at 10:32 AM, Ralph Castain 
>> mailto:r...@open-mpi.org><mailto:r...@open-mpi.org<mailto:r...@open-mpi.org>>>
>>  wrote:
>>
>>> Sadly, the IB folks never fixed this - sigh.
>>>
>>> I'll fix it in the trunk and then CMR it for 1.7. Unfortunately, it 
>>> requires that you have both IB and XRC to see it, which us non-IB-vendors 
>>> in the devel community don't.
>>>
>>>
>>> On Apr 4, 2013, at 9:44 AM, Ralph Castain 
>>> mailto:r...@open-mpi.org><mailto:r...@open-mpi.org<mailto:r...@open-mpi.org>>>
>>>  wrote:
>>>
>>>> Let me try and dig into it a bit - sadly, my access to IB machines is 
>>>> sorely limited at the moment.
>>>>
>>>> On Apr 4, 2013, at 9:37 AM, Paul Kapinos 
>>>> mailto:kapi...@rz.rwth-aachen.de><mailto:kapi...@rz.rwth-aachen.de<mailto:kapi...@rz.rwth-aachen.de>>>
>>>>  wrote:
>>>>
>>>>> Got the same error on all builds (4 compiler, with and without trheading 
>>>>> support, 64 and 32bit) on our systems, effectively prohibiting the 
>

Re: [OMPI devel] Trunk: Link Failure -- multiple definition of ib_address_t_class

2013-04-05 Thread Shamis, Pavel
Paul (K.),

I fixed the problem in trunk r28289. Can you please test the revision with your 
build environment.

Regards,
Pavel (Pasha) Shamis






On Apr 5, 2013, at 4:26 AM, Paul Kapinos  wrote:

> Hello,
> 
> On 04/05/13 03:16, Paul Hargrove wrote:
>> I found that not only did I need XRC, but I also needed to configure with
>> --enable-static to reproduce the problem.
>> I suspect that if Mellanox added --enable-static to an existing MTT
>> configuration this would not have remained unfixed for so long.
> 
> Well, AFAIK we do not use --enable-static in our builds and in the config-log 
> --disable-static is seen multiple times. Neverthelesse we run into the error
>> mca/btl/openib/.libs/libmca_btl_openib.a(btl_openib_xrc.o):(.data.rel+0x0):
>> multiple definition of `ib_address_t_class'
> 
> The configure line we're using is something like this:
> 
> ./configure --with-verbs --with-lsf --with-devel-headers 
> --enable-contrib-no-build=vt --enable-heterogeneous --enable-cxx-exceptions 
> --enable-orterun-prefix-by-default --disable-dlopen --disable-mca-dso 
> --with-io-romio-flags='--with-file-system=testfs+ufs+nfs+lustre' 
> --enable-mpi-ext ..
> 
> (adding paths, compiler-specific optimisation things and -m32 or -m64)
> 
> An config.log file attached FYI
> 
> 
> Best
> 
> Paul
> 
> 
> -- 
> Dipl.-Inform. Paul Kapinos   -   High Performance Computing,
> RWTH Aachen University, Center for Computing and Communication
> Seffenter Weg 23,  D 52074  Aachen (Germany)
> Tel: +49 241/80-24915
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




[OMPI devel] OMPI 1.7 - libevent warning

2013-07-01 Thread Shamis, Pavel

Open MPI version: 1.7.2 on an IB system.

Test: everybody sends to everybody - Irecv, Isend, Wait. In total, 1024 processes.

Warning:
"[warn] opal_libevent2019 each event_base at once.
[warn] opal_libevent2019_event_base_loop: reentrant invocation.  Only one 
event_base_loop can run on each event_base at once."

The problem doesn't show up with 512 ranks, only with 1024 ranks. 
My guess is that we still have a blocking free-list allocation somewhere in the
openib btl that causes a recursive call to progress.
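If that guess is right, the call chain would look something like this
(illustration only, not a verified trace):

    /* opal_progress()
     *   -> opal_libevent2019_event_base_loop()        first, legitimate entry
     *        -> openib BTL completion callback
     *             -> blocking free-list grow (list exhausted at 1024 ranks)
     *                  -> opal_progress() while waiting for resources
     *                       -> opal_libevent2019_event_base_loop()
     *                            -> "[warn] ... reentrant invocation"
     */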

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory









Re: [OMPI devel] RFC: Dead code removal

2013-07-05 Thread Shamis, Pavel
> - coll ml
This one is used.
> 
> - sbgp basemsocket
This one is used as well

-P.




Re: [OMPI devel] Annual OMPI membership review: SVN accounts

2013-07-08 Thread Shamis, Pavel
All ORNL's accounts should stay active as well.

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jul 8, 2013, at 6:32 PM, Jeff Squyres (jsquyres) 
mailto:jsquy...@cisco.com>> wrote:

According to https://svn.open-mpi.org/trac/ompi/wiki/Admistrative%20rules, it 
is time for our annual review of Open MPI SVN accounts of these SVN repos: 
hwloc, mtt, ompi-docs, ompi-tests, ompi-www, ompi.

*** Organizations must reply by COB Friday, 12 July, 2013 ***
*** No reply means: delete all of my organization's SVN accounts

Each organization must reply and specify which of their accounts can stay and 
which should go.  I cross-referenced the SVN logs from all of our SVN 
repositories to see who has not committed anything in the past year.

*** I strongly recommend deleting accounts who have not committed in the last 
year.
*** Other accounts can be deleted, too (e.g., those who have left a given 
organization).

bakeyournoodle.com (???)
==
tonyb:Tony Breeds mailto:t...@bakeyournoodle.com>> 
**NO COMMITS IN LAST YEAR**

Cisco
=
dgoodell: Dave Goodell mailto:dgood...@cisco.com>>
jsquyres: Jeff Squyres mailto:jsquy...@cisco.com>>

Indiana
==
lums: Andrew Lumsdaine mailto:l...@cs.indiana.edu>> 
**NO COMMITS IN LAST YEAR**
adkulkar: Abhishek Kulkarni mailto:adkul...@osl.iu.edu>>
afriedle: Andrew Friedley mailto:afrie...@osl.iu.edu>> 
**NO COMMITS IN LAST YEAR**
timattox: Tim Mattox mailto:timat...@open-mpi.org>> **NO 
COMMITS IN LAST YEAR**

U. Houston
=
edgar:Edgar Gabriel mailto:gabr...@cs.uh.edu>>
vvenkatesan:Vishwanath Venkatesan 
mailto:venka...@cs.uh.edu>>

Mellanox
==
alekseys: Aleksey Senin 
mailto:aleks...@dev.mellanox.co.il>>
kliteyn:  Yevgeny Kliteynik 
mailto:klit...@dev.mellanox.co.il>>
miked:Mike Dubman 
mailto:mi...@dev.mellanox.co.il>>
lennyve:  Lenny Verkhovsky 
mailto:lenny.verkhov...@gmail.com>> **NO COMMITS IN 
LAST YEAR**
yaeld:Yael Dayan mailto:yaeld.mella...@gmail.com>>
vasily:   Vasily Philipov mailto:vas...@mellanox.co.il>>
amikheev: Alex Mikheev mailto:al...@mellanox.com>>
alex: Alexander Margolin mailto:ale...@mellanox.com>>
alinas:   Alina Sklarevich mailto:ali...@mellanox.com>> 
**NO COMMITS IN LAST YEAR**
igoru:Igor Usarov mailto:ig...@mellanox.com>>
jladd:Joshua Ladd mailto:josh...@mellanox.com>>
yosefe:   Yossi mailto:yos...@mellanox.com>>
rlgraham: Rich Graham mailto:rlgra...@ornl.gov>> **NO 
COMMITS IN LAST YEAR**

Tennessee

bosilca:  George Bosilca mailto:bosi...@eecs.utk.edu>>
bouteill: Aurelien Bouteiller 
mailto:boute...@eecs.utk.edu>>
wbland:   Wesley Bland mailto:wbl...@mcs.anl.gov>> **NO 
COMMITS IN LAST YEAR**

hlrs.de
===
shiqing:  Shiqing Fan 
hpcchris: Christoph Niethammer 
rusraink: Rainer Keller  **NO COMMITS IN LAST 
YEAR**

IBM
==
jnysal:   Nysal Jan K A  **NO COMMITS IN LAST YEAR**
cyeoh:Chris Yeoh 
bbenton:  Brad Benton 

INRIA

bgoglin:  Brice Goglin 
arougier: Antoine Rougier 
sthibaul: Samuel Thibault 
mercier:  Guillaume Mercier  **NO COMMITS IN LAST YEAR**
nfurmento:Nathalie Furmento  **NO COMMITS IN LAST 
YEAR**
herault:  Thomas Herault  **NO COMMITS IN LAST YEAR**

LANL

hjelmn:   Nathan Hjelm 
samuel:   Samuel K. Gutierrez 

NVIDIA
==
rolfv:Rolf Vandevaart 

U. Wisconsin La Crosse

jjhursey: Joshua Hursey 

Intel

rhc:  Ralph Castain 

Chelsio / OGC
=
swise:Steve Wise 

Oracle
==
emallove: Ethan Mallove  **NO COMMITS IN LAST YEAR**
eugene:   Eugene Loh 
tdd:  Terry Dontje 

ORNL

manjugv:  Manjunath, Gorentla Venkata 
naughtont:Thomas Naughton 
pasha:Pavel Shamis 

Sandia
==
brbarret: Brian Barrett 
memoryhole:Kyle Wheeler  **NO COMMITS IN LAST YEAR**
ktpedre:  Kevin Pedretti  **NO COMMITS IN LAST YEAR**
mjleven:  Michael Levenhagen  **NO COMMITS IN LAST YEAR**
rbbrigh:  Ron Brightwell  **NO COMMITS IN LAST YEAR**

Dresden
=
knuepfer: Andreas Knuepfer  **NO COMMITS IN 
LAST YEAR**
bwesarg:  Bert Wesarg  **NO COMMITS IN LAST YEAR**
jurenz:   Matthias Jurenz 

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] RFC: Dead code removal

2013-07-11 Thread Shamis, Pavel
Jeff,

I reviewed the changes in the collectives code(ml,bcol,sbgp) - everything looks 
fine.
Thanks for the cleanup.

-P.





On Jul 5, 2013, at 9:56 AM, Jeff Squyres (jsquyres) 
mailto:jsquy...@cisco.com>> wrote:

They are assigned but not used.


On Jul 5, 2013, at 8:47 AM, "Shamis, Pavel" 
mailto:sham...@ornl.gov>> wrote:

- coll ml
This one is used.

- sbgp basemsocket
This one is used as well

-P.


___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/devel


--
Jeff Squyres
jsquy...@cisco.com<mailto:jsquy...@cisco.com>
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] OpenSHMEM round 2

2013-08-06 Thread Shamis, Pavel
Josh,
I get 404 error. Probably you have to unlock it.
Best,
-P


From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On Behalf 
Of Joshua Ladd
Sent: Tuesday, August 06, 2013 12:30 PM
To: Open MPI Developers (de...@open-mpi.org)
Subject: [OMPI devel] OpenSHMEM round 2

Dear OMPI Community,

Please find on Bitbucket the latest round of OSHMEM changes based on community 
feedback. Please git and test at your leisure.

https://bitbucket.org/jladd_math/mlnx-oshmem.git

Best regards,

Josh

Joshua S. Ladd, PhD
HPC Algorithms Engineer
Mellanox Technologies

Email: josh...@mellanox.com
Cell: +1 (865) 258 - 8898




Re: [OMPI devel] [EXTERNAL] OpenSHMEM round 2

2013-08-14 Thread Shamis, Pavel
Ralph,

There is OpenSHMEM test suite 
http://bongo.cs.uh.edu/site/sites/default/site_files/openshmem-test-suite-release-1.0d.tar.bz2
The test-suite exercises most of the API.

Best,
Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 14, 2013, at 5:52 PM, Joshua Ladd 
mailto:josh...@mellanox.com>> wrote:

The following simple test code will exercise the following:

start_pes()

shmalloc()

shmem_int_get()

shmem_int_put()

shmem_barrier_all()

To compile:

shmemcc test_shmem.c -o test_shmem

To launch:

shmemrun -np 2  test_shmem

or for those who prefer to launch with SLURM

srun -n 2 test_shmem

Josh


-Original Message-
From: devel [mailto:devel-boun...@open-mpi.org] On 
Behalf Of Ralph Castain
Sent: Wednesday, August 14, 2013 5:32 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] [EXTERNAL] OpenSHMEM round 2

Can you point me to a test program that would exercise it? I'd like to give it 
a try first.

I'm okay with on by default as it builds its own separate library, and with the 
RFC

On Aug 14, 2013, at 2:03 PM, "Barrett, Brian W" 
mailto:bwba...@sandia.gov>> wrote:

Josh -

In general, I don't have a strong opinion of whether OpenSHMEM is on
by default or not.  It might cause unexpected behavior for some users
(like on Crays, where one should really use Cray's SHMEM), but maybe
it's better on other platforms.

I also would have no objection to the RFC, provided the segfaults I
found get resolved.

Brian

On 8/14/13 2:08 PM, "Joshua Ladd" 
mailto:josh...@mellanox.com>> wrote:

Ralph, and Brian

Thanks a bunch for taking the time to review this. It is extremely
helpful. Let me comment of the building of OSHMEM and solicit some
feedback from you guys (along with the rest of the community.)
Originally we had planned to enable OSHMEM to build only if
'--with-oshmem' flag was passed at configure time. However,
(unbeknownst to me) this behavior was changed and now OSHMEM is built by 
default, i.e.
yes, Ralph this is the intended behavior now. I am wondering if this
is such a good idea. Do folks have a strong opinion on this one way
or the other? From my perspective I can see arguments for both sides
of the coin.

Other than cleaning up warnings and resolving the segfault that Brian
observed are we on a good course to getting this upstream? Is it
reasonable to file an RFC for three weeks out?

Josh

-Original Message-
From: devel [mailto:devel-boun...@open-mpi.org] On 
Behalf Of Barrett,
Brian W
Sent: Sunday, August 11, 2013 1:42 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] [EXTERNAL] OpenSHMEM round 2

Ralph -

I think those warnings are just because of when they last synced with
the trunk; it looks like they haven't updated in the last week, when
those (and some usnic fixes) went in.

More concerning is the --enable-picky stuff and the disabling of
SHMEM in the right places.

Brian

On 8/11/13 11:24 AM, "Ralph Castain" 
mailto:r...@open-mpi.org>> wrote:

Turning off the enable_picky, I get it to compile with the following
warnings:

pget_elements_x_f.c:70: warning: no previous prototype for
'ompi_get_elements_x_f'
pstatus_set_elements_x_f.c:70: warning: no previous prototype for
'ompi_status_set_elements_x_f'
ptype_get_extent_x_f.c:69: warning: no previous prototype for
'ompi_type_get_extent_x_f'
ptype_get_true_extent_x_f.c:69: warning: no previous prototype for
'ompi_type_get_true_extent_x_f'
ptype_size_x_f.c:69: warning: no previous prototype for
'ompi_type_size_x_f'

I also found that OpenShmem is still building by default. Is that
intended? I thought you were only going to build if --with-shmem (or
whatever option) was given.

Looks like some cleanup is required

On Aug 10, 2013, at 8:54 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:

FWIW, I couldn't get it to build - this is on a simple Xeon-based
system under CentOS 6.2:

cc1: warnings being treated as errors
spml_yoda_getreq.c: In function 'mca_spml_yoda_get_completion':
spml_yoda_getreq.c:98: error: pointer targets in passing argument 1
of 'opal_atomic_add_32' differ in signedness
../../../../opal/include/opal/sys/amd64/atomic.h:174: note:
expected 'volatile int32_t *' but argument is of type 'uint32_t *'
spml_yoda_getreq.c:98: error: signed and unsigned type in
conditional expression
cc1: warnings being treated as errors
spml_yoda_putreq.c: In function 'mca_spml_yoda_put_completion':
spml_yoda_putreq.c:81: error: pointer targets in passing argument 1
of 'opal_atomic_add_32' differ in signedness
../../../../opal/include/opal/sys/amd64/atomic.h:174: note:
expected 'volatile int32_t *' but argument is of type 'uint32_t *'
spml_yoda_putreq.c:81: error: signed and unsigned type in
conditional expression
make[2]: *** [spml_yoda_getreq.lo] Error 1
make[2]: *** Waiting for unfinished jobs
make[2]: *** [spml_yoda_putreq.lo] Error 1
cc1: warnings being treated as errors
spml_yoda

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel
When I looked at the code last time - no.
(The connection state machine is very different)

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Nov 14, 2013, at 11:51 AM, Jeff Squyres (jsquyres) 
mailto:jsquy...@cisco.com>> wrote:

Does XRC work with the UDCM CPC?


On Nov 14, 2013, at 9:35 AM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:

I think the problems in udcm were fixed by Nathan quite some time ago, but 
never moved to 1.7 as everyone was told that the connect code in openib was 
already deprecated pending merge with the new ofacm common code. Looking over 
at that area, I see only oob and xoob - so if the users of the common ofacm 
code are finding that it works, the simple answer may just be to finally 
complete the switchover.

Meantime, perhaps someone can CMR and review a copying of the udcm cpc to the 
1.7 branch?


On Nov 14, 2013, at 5:14 AM, Joshua Ladd 
mailto:josh...@mellanox.com>> wrote:

Um, no. It's supposed to work with UDCM which doesn't appear to be enabled in 
1.7.

Per Ralph's comment to me last night:

"... you cannot use the oob connection manager. It doesn't work and was 
deprecated. You must use udcm, which is why things are supposed to be set to do 
so by default. Please check the openib connect priorities and correct them if 
necessary."

However, it's never been enabled in 1.7 - don't know what "borked" means, and 
from what Devendar tells me, several UDCM commits that are in the trunk have 
not been pushed over to 1.7:

So, as of this moment, OpenIB BTL is essentially dead-in-the-water in 1.7.



[enable_connectx_xrc="$enableval"], 
[enable_connectx_xrc="yes"])
 #
 # Unconnect Datagram (UD) based connection manager
 #
#AC_ARG_ENABLE([openib-udcm],
#[AC_HELP_STRING([--enable-openib-udcm],
#[Enable datagram connection support in openib BTL 
(default: enabled)])],
#[enable_openib_udcm="$enableval"], 
[enable_openib_udcm="yes"])
 # Per discussion with Ralph and Nathan, disable UDCM for now.
 # It's borked and needs some surgery to get back on its feet.
 enable_openib_udcm=no


Josh


-Original Message-
From: devel [mailto:devel-boun...@open-mpi.org] On 
Behalf Of Jeff Squyres (jsquyres)
Sent: Thursday, November 14, 2013 6:44 AM
To: mailto:de...@open-mpi.org>>
Subject: Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r29703 - in trunk: 
contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

Does the openib *only* work with RDMACM now?

That's surprising (and bad!).

Did someone ask Mellanox about fixing the OOB and XOOB CPCs?


On Nov 13, 2013, at 11:16 PM, 
svn-commit-mai...@open-mpi.org wrote:

Author: rhc (Ralph Castain)
List-Post: devel@lists.open-mpi.org
Date: 2013-11-13 23:16:53 EST (Wed, 13 Nov 2013)
New Revision: 29703
URL: https://svn.open-mpi.org/trac/ompi/changeset/29703

Log:
Given that the oob and xoob cpc's are no longer operable and haven't been since 
the OOB update, remove them to avoid confusion

cmr:v1.7.4:reviewer=hjelmn:subject=Remove stale cpcs from openib

Deleted:
trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c
trunk/ompi/mca/btl/openib/connect/btl_openib_connect_oob.h
trunk/ompi/mca/btl/openib/connect/btl_openib_connect_xoob.c
trunk/ompi/mca/btl/openib/connect/btl_openib_connect_xoob.h
Text files modified:
trunk/contrib/platform/iu/odin/optimized.conf   | 1
trunk/contrib/platform/iu/odin/static.conf  | 1
trunk/ompi/mca/btl/openib/Makefile.am   |10
trunk/ompi/mca/btl/openib/connect/btl_openib_connect_base.c |14
/dev/null   |   975 
-
/dev/null   |18
/dev/null   |  1150 

/dev/null   |19
8 files changed, 5 insertions(+), 2183 deletions(-)

Modified: trunk/contrib/platform/iu/odin/optimized.conf
==
--- trunk/contrib/platform/iu/odin/optimized.conf   Wed Nov 13 19:34:15 2013
(r29702)
+++ trunk/contrib/platform/iu/odin/optimized.conf   2013-11-13 23:16:53 EST 
(Wed, 13 Nov 2013)  (r29703)
@@ -80,7 +80,6 @@

## Setup OpenIB
btl_openib_want_fork_support = 0
-btl_openib_cpc_include = oob
#btl_openib_receive_queues = 
P,128,256,64,32,32:S,2048,1024,128,32:S,12288,1024,128,32:S,65536,1024,128,32

## Setup TCP

Modified: trunk/contrib/platform/iu/odin/static.conf
==
--- trunk/contrib/platform/iu/odin/static.conf  Wed Nov 13 19:34:15 2013
(r29702)
+++ trunk/contrib/platform/iu/odin/static.conf  

Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel
There is some confusion in the thread. UDCM is just another CPC, like XOOB, 
OOB, and RDMACM (I think IBCM is officially dead).
XOOB and OOB don't use UDCM; they rely on ORTE out-of-band communication.

OpenIB/connect supports UDCM, XOOB, OOB, and RDMACM.
OFACM supports (at least the last time we checked) OOB and XOOB.

RDMACM was not moved to OFACM, because of iWarp's "first message" requirement 
that used to break the abstraction.
Moreover, RDMACM scalability used to be terrible; as a result, no one in the IB 
community really used it.
The situation is a bit different today, since ROCEE relies on RDMACM. It is worth 
noting that you can set up
ROCEE connections with a regular OOB with some restrictions (we did it for 
mvapich-1).

The code between ofacm and openib is similar, but NOT the same. We changed the 
API in a way that allows us
to hide XRC QP management (there is a hash table that manages QP-to-EP mapping) 
in OFACM instead of OPENIB.
This made the openib initialization code a bit cleaner. Here is my old tree with 
openib btl changes https://bitbucket.org/pasha/ofacm

I hope it helps,

Best,
Pasha

On Nov 14, 2013, at 1:17 PM, Joshua Ladd  wrote:

> Unless someone went in and "fixed" the code in common (judging by the 
> comments, fixed seems to imply porting (x)oob to use UDCM, which hasn't been 
> done at all in the context of xoob and is incompletely patched and remains 
> unusable as a replacement for oob in 1.7.4), there is no reason to believe it 
> would work any different than the cpcs under btl/openib/connect. IIRC, it's 
> the same code - copy/pasted - just moved to a common location so Cheetah 
> collectives can do their wireup. So, if oob cpc doesn't work, ofacm oob won't 
> work either and, I guess, by extension, Cheetah IBoffload won't work. Pasha, 
> correct me if you know different. 
> 
> 
> Josh
> 
> 
> -Original Message-
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Thursday, November 14, 2013 1:05 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 
> - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib 
> ompi/mca/btl/openib/connect
> 
> 
> On Nov 14, 2013, at 9:33 AM, Barrett, Brian W  wrote:
> 
>> On 11/14/13 9:51 AM, "Jeff Squyres (jsquyres)"  wrote:
>> 
>>> Does XRC work with the UDCM CPC?
>>> 
>>> 
>>> On Nov 14, 2013, at 9:35 AM, Ralph Castain  wrote:
>>> 
 I think the problems in udcm were fixed by Nathan quite some time 
 ago, but never moved to 1.7 as everyone was told that the connect 
 code in openib was already deprecated pending merge with the new 
 ofacm common code. Looking over at that area, I see only oob and 
 xoob - so if the users of the common ofacm code are finding that it 
 works, the simple answer may just be to finally complete the switchover.
 
 Meantime, perhaps someone can CMR and review a copying of the udcm 
 cpc to the 1.7 branch?
 
 
 On Nov 14, 2013, at 5:14 AM, Joshua Ladd  wrote:
 
> Um, no. It's supposed to work with UDCM which doesn't appear to be 
> enabled in 1.7.
> 
> Per Ralph's comment to me last night:
> 
> "... you cannot use the oob connection manager. It doesn't work and 
> was deprecated. You must use udcm, which is why things are supposed 
> to be set to do so by default. Please check the openib connect 
> priorities and correct them if necessary."
> 
> However, it's never been enabled in 1.7 - don't know what "borked"
> means, and from what Devendar tells me, several UDCM commits that 
> are in the trunk have not been pushed over to 1.7:
> 
> So, as of this moment, OpenIB BTL is essentially dead-in-the-water 
> in 1.7.
> 
> 
> 
>> 
>> I'm going to start by admitting that I haven't been paying attention 
>> to IB the last couple of months, so I'm out of my league a little bit 
>> here.  I remember discussions of UDCM replacing OOB both because the 
>> OOB CPC had some issues and because it would make it easier to move 
>> the BTLs to the OPAL layer (ie, below the OOB).  But I also thought 
>> that was more future work than it clearly was.  So can someone let me know:
>> 
>> 1) What the status of UDCM is (does it work reliably, does it support 
>> XRC, etc.)
> 
> Seems to be working okay on the IB systems at LANL and IU. Don't know about 
> XRC - I seem to recall the answer is "no"
> 
>> 2) What's the difference between CPCs and OFACM and what's our plans 
>> w.r.t 1.7 there?
> 
> Pasha created ofacm because some of the collective components now need to 
> forge connections. So he created the common/ofacm code to meet those needs, 
> with the intention of someday replacing the openib cpc's with the new common 
> code. However, this was stalled by the iWarp issue, and so it fell off the 
> table.
> 
> We now have two duplicate ways of doing the same thing, but with code in two 
> different places. :-(
> 
>> 3) Someon

Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel
I'm a bit outdated. What is the problem with oob / xoob?
-Pasha

On Nov 14, 2013, at 3:07 PM, "Hjelm, Nathan T"  wrote:

> I don't think so. From what I understand the iboffload component may not live 
> much longer because of
> Mellanox's fork of Cheetah. So, it might not matter.
> 
> -Nathan
> 
> Excuse the *&(#$y Outlook posting-style. OWA sucks.
> 
> From: devel [devel-boun...@open-mpi.org] on behalf of Ralph Castain 
> [r...@open-mpi.org]
> Sent: Thursday, November 14, 2013 12:58 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full]svn:open-mpi  
>   r29703  - in trunk: contrib/platform/iu/odinompi/mca/btl/openib 
> ompi/mca/btl/openib/connect
> 
> The key question, though, is: has anyone checked to see if the ofacm code 
> even works any more??
> 
> Only oob and xoob components appear to be present - so unless someone fixed 
> those since they were originally copied from openib, I doubt ofacm works.
> 
> 
> On Nov 14, 2013, at 11:08 AM, Shamis, Pavel  wrote:
> 
>> There is some confusion in the thread. UDCM is just another CPC, like XOOB, 
>> OOB, and RDMACM (I think IBCM is officially dead).
>> XOOB and OOB don't use UDCM, they rely on ORTE out-of-band communication.
>> 
>> OpenIB/connect supports UDCM,XOOB,OOB, and RDMACM
>> OFACM supports (at least last time when we checked) OOB and XOOB
>> 
>> RDMACM was not moved to OFACM, because of iWarp's "first message" 
>> requirement that used to break the abstraction.
>> Moreover, RDMACM scalability used to be terrible; as a result, no one in the IB 
>> community really used it.
>> The situation is a bit different today, since RoCE relies on RDMACM. It is 
>> worth noting that you may set up
>> RoCE connections with a regular OOB with some restrictions (we did it for 
>> mvapich-1).
>> 
>> The code between ofacm and openib is similar, but NOT the same. We changed 
>> the API in a way that allows
>> hiding XRC QP management (there is a hash table that manages QP-to-EP 
>> mapping) in OFACM instead of OPENIB.
>> This made openib initialization code a bit cleaner. Here is my old tree with 
>> openib btl changes https://bitbucket.org/pasha/ofacm
>> 
>> I hope it helps,
>> 
>> Best,
>> Pasha
>> 
>> On Nov 14, 2013, at 1:17 PM, Joshua Ladd  wrote:
>> 
>>> Unless someone went in and "fixed" the code in common (judging by the 
>>> comments, fixed seems to imply porting (x)oob to use UDCM, which hasn't 
>>> been done at all in the context of xoob and is incompletely patched and 
>>> remains unusable as a replacement for oob in 1.7.4), there is no reason to 
>>> believe it would work any different than the cpcs under btl/openib/connect. 
>>> IIRC, it's the same code - copy/pasted - just moved to a common location so 
>>> Cheetah collectives can do their wireup. So, if oob cpc doesn't work, ofacm 
>>> oob won't work either and, I guess, by extension, Cheetah IBoffload won't 
>>> work. Pasha, correct me if you know different.
>>> 
>>> 
>>> Josh
>>> 
>>> 
>>> -Original Message-
>>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain
>>> Sent: Thursday, November 14, 2013 1:05 PM
>>> To: Open MPI Developers
>>> Subject: Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi 
>>> r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib 
>>> ompi/mca/btl/openib/connect
>>> 
>>> 
>>> On Nov 14, 2013, at 9:33 AM, Barrett, Brian W  wrote:
>>> 
>>>> On 11/14/13 9:51 AM, "Jeff Squyres (jsquyres)"  wrote:
>>>> 
>>>>> Does XRC work with the UDCM CPC?
>>>>> 
>>>>> 
>>>>> On Nov 14, 2013, at 9:35 AM, Ralph Castain  wrote:
>>>>> 
>>>>>> I think the problems in udcm were fixed by Nathan quite some time
>>>>>> ago, but never moved to 1.7 as everyone was told that the connect
>>>>>> code in openib was already deprecated pending merge with the new
>>>>>> ofacm common code. Looking over at that area, I see only oob and
>>>>>> xoob - so if the users of the common ofacm code are finding that it
>>>>>> works, the simple answer may just be to finally complete the switchover.
>>>>>> 
>>>>>> Meantime, perhaps someone can CMR and review a copying of th

Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel
Comments inline.

> 
> 3. Pasha moved the openib/connect to common/ofacm but excluded the rdmacm in 
> that move.  Never changed openib to use ofacm/common.

Pasha: This is not entirely true.  I changed the openib btl ~3 years ago before my 
departure from Mellanox.  (I sent a link to the code earlier.)
We (the community) were not able to integrate the code because of the "first message" 
issue in iWarp.

> 
> Given Nathan's comments a second ago about ORNL not supporting the IB Offload 
> component, it barely makes sense to keep common/ofacm. 

Pasha: We have no intention of removing iboffload support. Obviously, if Mellanox 
stops supporting CORE-Direct technology, it makes sense to remove it.

Best,
-Pasha



Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel
> 
> 1. Ralph made the OOB asynchronous.
> 

Ralph,

I'm not familiar with the details of the change. If out-of-band communication is 
still supported, it should not be
that huge a change for XOOB/OOB.



Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel

> So far as I can tell, the issue is one of blocking. The OOB handshake is now 
> async - i.e., you post a non-blocking recv at the beginning of time, and then 
> do a non-blocking send to the other side when you want to create a 
> connection. The question is: how do you know when that connection is ready?

As you describe it, the new behavior is identical to the original one. We post a 
non-blocking (persistent) receive during initialization. Later, OMPI has a barrier 
in the flow to ensure that all processes have reached that point.
On the first send, we use a non-blocking OOB send to initialize the connection 
(QPs). The receive triggers a callback that handles the connection setup. The OOB / 
XOOB communication semantics are fully non-blocking.

We don't really block anywhere.
We use only the ompi_rte_recv_buffer_nb and ompi_rte_send_buffer_nb functions.
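
A rough sketch of that flow (hypothetical helper names and simplified argument 
lists -- the real openib/ofacm code is obviously more involved):

    /* At init time: post a persistent, non-blocking receive for connection
     * messages.  Sketch only -- not the exact OMPI signatures. */
    ompi_rte_recv_buffer_nb(ANY_PEER, CPC_TAG, /* persistent */ true,
                            connect_recv_cb, NULL);

    /* On the first send to a peer: create the local half of the QP, pack its
     * info, and push it to the peer with a non-blocking send.  Nobody waits. */
    static int start_connect(endpoint_t *ep)
    {
        opal_buffer_t *buf = OBJ_NEW(opal_buffer_t);
        pack_local_qp_info(buf, ep);          /* QP number, LID/GID, ... */
        return ompi_rte_send_buffer_nb(&ep->peer_name, buf, CPC_TAG,
                                       send_complete_cb, NULL);
    }

    /* The peer's reply arrives in connect_recv_cb(), which finishes the QP
     * setup and marks the endpoint as connected -- all inside callbacks. */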

Best,
Pasha

Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel

> The only change is that the receive callback is now occurring in the ORTE 
> event thread, and so perhaps someone needs to look at a way to pass that back 
> into the OMPI event base (which I guess is the OPAL event base)? Just 
> glancing at the code, it looks like that could be the issue - but I honestly 
> have no idea what event base someone wants to switch to, or if they want to 
> resolve it some other way. There are clearly some things happening in the 
> ofacm oob code that involve thread locking etc., but I don't know what those 
> areas are trying to do.

I see. In this mode, do you enable thread safety support in the whole library (MPI)?



Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel
Well, this is a major change in behavior.

Since openib makes communication calls from the callback,
it pretty much requires enabling thread safety at the openib BTL level.

But we may move the queue-flush operation from the callback to the main thread, so 
the progress engine will wait on a signal from the callback.
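
The usual pattern for that hand-off (sketch only, hypothetical names -- not the 
actual OMPI code):

    static opal_list_t  pending_msgs;   /* both OBJ_CONSTRUCTed at component init */
    static opal_mutex_t pending_lock;

    /* Runs in the ORTE event thread: no communication calls here,
     * just queue the message and let the main thread pick it up. */
    static void connect_recv_cb(void *msg)
    {
        OPAL_THREAD_LOCK(&pending_lock);
        opal_list_append(&pending_msgs, (opal_list_item_t *) msg);
        OPAL_THREAD_UNLOCK(&pending_lock);
    }

    /* Runs in the BTL progress loop (main thread): do the real work --
     * finish the QP setup and flush the pending-send queue. */
    static int openib_progress_pending(void)
    {
        OPAL_THREAD_LOCK(&pending_lock);
        opal_list_item_t *item = opal_list_remove_first(&pending_msgs);
        OPAL_THREAD_UNLOCK(&pending_lock);
        if (NULL != item) {
            complete_connection_and_flush(item);   /* hypothetical helper */
        }
        return 0;
    }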

How does this work for other parts of OMPI (sm, communicator)?
I guess they don't do anything in the callbacks?

Best,
Pasha

On Nov 14, 2013, at 6:35 PM, Ralph Castain  wrote:

> 
> On Nov 14, 2013, at 3:33 PM, Shamis, Pavel  wrote:
> 
>> 
>>> The only change is that the receive callback is now occurring in the ORTE 
>>> event thread, and so perhaps someone needs to look at a way to pass that 
>>> back into the OMPI event base (which I guess is the OPAL event base)? Just 
>>> glancing at the code, it looks like that could be the issue - but I 
>>> honestly have no idea what event base someone wants to switch to, or if 
>>> they want to resolve it some other way. There are clearly some things 
>>> happening in the ofacm oob code that involve thread locking etc., but I 
>>> don't know what those areas are trying to do.
>> 
>> I see. In this mode do you enable thread safety support  in all library 
>> (mpi)?
> 
> Only if the user configures to do so - ORTE doesn't require it as we use the 
> event library's thread safety and do everything inside events.
> 
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-14 Thread Shamis, Pavel
For iboffload this should not be an issue, since our connection manager is 
blocking (I have to double-check).

For openib, this should not be such a huge change. The code is pretty much 
standalone; we only have to move it to the
main thread and add a signaling mechanism.

I will take a look.

Best,
-Pasha




On Nov 14, 2013, at 7:25 PM, Ralph Castain  wrote:

> 
> On Nov 14, 2013, at 4:22 PM, Shamis, Pavel  wrote:
> 
>> Well, this is major change in a behavior.
>> 
>> Since openib calls communication calls from the callback
>> it pretty much requires to enable thread safety on openib btl level.
> 
> Ah, yes - could well be true. Or else separate the two like we do elsewhere - 
> transfer the recv callback to the openib thread and let it do the rest.
> 
>> 
>> But we may move the queue flush operation from the callback to main thread, 
>> so 
>> the progress engine will wait on a signal from callback. 
> 
> Yep - that's what we do elsewhere
> 
>> 
>> How does it work for other parts of OMPI (sm, communicator) ? 
>> I guess they don't do anything in the callbacks ? 
> 
> Correct - they immediately transfer the info to their local progress engine 
> (in whatever form).
> 
>> 
>> Best,
>> Pasha
>> 
>> On Nov 14, 2013, at 6:35 PM, Ralph Castain  wrote:
>> 
>>> 
>>> On Nov 14, 2013, at 3:33 PM, Shamis, Pavel  wrote:
>>> 
>>>> 
>>>>> The only change is that the receive callback is now occurring in the ORTE 
>>>>> event thread, and so perhaps someone needs to look at a way to pass that 
>>>>> back into the OMPI event base (which I guess is the OPAL event base)? 
>>>>> Just glancing at the code, it looks like that could be the issue - but I 
>>>>> honestly have no idea what event base someone wants to switch to, or if 
>>>>> they want to resolve it some other way. There are clearly some things 
>>>>> happening in the ofacm oob code that involve thread locking etc., but I 
>>>>> don't know what those areas are trying to do.
>>>> 
>>>> I see. In this mode do you enable thread safety support  in all library 
>>>> (mpi)?
>>> 
>>> Only if the user configures to do so - ORTE doesn't require it as we use 
>>> the event library's thread safety and do everything inside events.
>>> 
>>>> 
>>>> ___
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> 
>>> ___
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] [EXTERNAL] Re: [OMPI svn-full] svn:open-mpi r29703 - in trunk: contrib/platform/iu/odin ompi/mca/btl/openib ompi/mca/btl/openib/connect

2013-11-21 Thread Shamis, Pavel
>>> 3. Pasha moved the openib/connect to common/ofacm but excluded the rdmacm 
>>> in that move.  Never changed openib to use ofacm/common.
>> Pasha: This is not entirely true.  I changed openib btl ~3 year ago before 
>> my departure from Mellanox.  (I sent link to the code earlier).
>> We (community) were not able to integrate the code because of "first 
>> message" issue in iWarp.
> 
> Hey Pasha,
> 
> We can get rid of the "first message" code altogether.   If its easy for 
> you to move the rdmacm into ofacm and get it to compile, then I could 
> take if from there and test/fix any issues.
> 
> Whatchathink?

It definitely simplifies a lot of things (and cleans up the code)!
Let me see how much work we have to do there. I don't have a lot of cycles, but 
I can definitely guide the Mellanox team (or any other team :-) ).

Best,
-Pasha

Re: [OMPI devel] RFC: OB1 optimizations

2014-01-07 Thread Shamis, Pavel
Overall it looks good. It would be helpful to validate performance numbers for 
other interconnects as well.
-Pasha

> -Original Message-
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Nathan
> Hjelm
> Sent: Tuesday, January 07, 2014 6:45 PM
> To: Open MPI Developers List
> Subject: [OMPI devel] RFC: OB1 optimizations
> 
> What: Push some ob1 optimizations to the trunk and 1.7.5.
> 
> Why: This patch contains two optimizations:
> 
>   - Introduce a fast send path for blocking send calls. This path uses
> the btl sendi function to put the data on the wire without the need
> for setting up a send request. In the case of btl/vader this can
> also avoid allocating/initializing a new fragment. With btl/vader
> this optimization improves small message latency by 50-200ns in
> ping-pong type benchmarks. Larger messages may take a small hit in
> the range of 10-20ns.
> 
>   - Use a stack-allocated receive request for blocking receives. This
> optimization saves the extra instructions associated with accessing
> the receive request free list. I was able to get another 50-200ns
> improvement in the small-message ping-pong with this optimization. I
> see no hit for larger messages.
> 
> When: These changes touch the critical path in ob1 and are targeted for
> 1.7.5. As such I will set a moderately long timeout. Timeout set for
> next Friday (Jan 17).
> 
> Some results from osu_latency on haswell:
> 
> hjelmn@cn143 pt2pt]$ mpirun -n 2 --bind-to core -mca btl vader,self
> ./osu_latency
> # OSU MPI Latency Test v4.0.1
> # Size  Latency (us)
> 0   0.11
> 1   0.14
> 2   0.14
> 4   0.14
> 8   0.14
> 16  0.14
> 32  0.15
> 64  0.18
> 128 0.36
> 256 0.37
> 512 0.46
> 1024    0.56
> 2048    0.80
> 4096    1.12
> 8192    1.68
> 16384   2.98
> 32768   5.10
> 65536   8.12
> 131072 14.07
> 262144 25.30
> 524288 47.40
> 1048576 91.71
> 2097152   195.56
> 4194304   487.05
> 
> 
> Patch Attached.
> 
> -Nathan
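
For context, the first optimization hinges on the BTL "sendi" (send-immediate) entry 
point. In rough, simplified form the fast path looks something like the sketch below 
(hypothetical helper names and argument lists -- the attached patch is the real thing). 
The second optimization is analogous on the receive side: for blocking receives the 
request lives on the stack instead of coming from the request free list.

    /* Sketch of the blocking-send fast path, not the actual patch. */
    static int ob1_blocking_send_fast_path(peer_t *peer, const void *buf,
                                           size_t size, int tag)
    {
        /* Hand the message straight to the BTL via sendi: no send request
         * is allocated and the request free list is never touched. */
        if (NULL != peer->btl_sendi && size <= peer->btl_eager_limit) {
            if (OMPI_SUCCESS == peer->btl_sendi(peer, buf, size, tag)) {
                return OMPI_SUCCESS;           /* message is on the wire */
            }
            /* sendi could not complete immediately -- fall through */
        }
        /* Slow path: allocate and start a send request as before. */
        return ob1_send_request_start(peer, buf, size, tag);
    }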


Re: [OMPI devel] Still getting 100% trunk failure on 32 bit platform: coll ml

2014-01-30 Thread Shamis, Pavel
Let me know if you need my help.

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jan 30, 2014, at 10:27 AM, Nathan Hjelm 
mailto:hje...@lanl.gov>> wrote:

Ok. Looks like I need to fix one more. Will take a look now.

-Nathan

On Thu, Jan 30, 2014 at 01:25:44PM +, Jeff Squyres (jsquyres) wrote:
MTT shows 100% trunk failure on 32 bit platform:

   http://mtt.open-mpi.org/index.php?do_redir=2144

It's seg faulting in mca_coll_ml_comm_query().

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] Bcol/mcol violations

2014-02-07 Thread Shamis, Pavel
Can you please give the attached hot-fix a try?
It unrolls most of the spaghetti, except the iboffload component (which is 
disabled anyway).
Sorry for the mess.

Best,
Pasha

On Feb 7, 2014, at 10:52 AM, Nathan Hjelm 
mailto:hje...@lanl.gov>> wrote:

On Fri, Feb 07, 2014 at 07:46:03AM -0800, Ralph Castain wrote:
The issue in 1.7 is all the cross-integration, which means we violate our 
normal behavior when it comes to no-building and user-directed component 
selection. Jeff and I just discussed how this could be resolved using the 
PML-BTL model, but (a) that is not what we have in 1.7, and (b) it isn't clear 
to me how hard it will be to do, and when it might be ready.

However, we don't have the problem of incorrect results that we do in the 
trunk, so we do have a little more latitude.

So the situation with respect to 1.7 is pretty clear: if we can get a PML-BTL 
model in place within the next week, then we can let things continue as-is. If 
we can't, then we remove the coll/ml component and the bcol framework from 1.7, 
leaving the door open to reinstatement whenever the code is actually ready.

Should be ready today. The use of that coll/ml structure is unnecessary
at this time. I am removing it in bcol right now. In the future we will
put in a better fix but this should work for 1.7.x/1.8.x.

-Nathan
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


p4.patch
Description: p4.patch


Re: [OMPI devel] Bcol/mcol violations

2014-02-07 Thread Shamis, Pavel
Exchange is evil….
Attached.

Best,
P




p4.patch.gz
Description: p4.patch.gz


On Feb 7, 2014, at 12:41 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
Can you gzip the patch. The local exchange server has a habit of converting LF to CRLF.
-Nathan
On Fri, Feb 07, 2014 at 12:14:02PM -0500, Shamis, Pavel wrote:
Can you please give a try to the attached hot-fix.
It unrolls most of the spaghetti, except the iboffload component (which is anyway disabled).
Sorry for the mess.
Best,
Pasha
On Feb 7, 2014, at 10:52 AM, Nathan Hjelm <hje...@lanl.gov> wrote:
On Fri, Feb 07, 2014 at 07:46:03AM -0800, Ralph Castain wrote:
The issue in 1.7 is all the cross-integration, which means we violate our normal behavior when it comes to no-building and user-directed component selection. Jeff and I just discussed how this could be resolved using the PML-BTL model, but (a) that is not what we have in 1.7, and (b) it isn't clear to me how hard it will be to do, and when it might be ready.
However, we don't have the problem of incorrect results that we do in the trunk, so we do have a little more latitude.
So the situation with respect to 1.7 is pretty clear: if we can get a PML-BTL model in place within the next week, then we can let things continue as-is. If we can't, then we remove the coll/ml component and the bcol framework from 1.7, leaving the door open to reinstatement whenever the code is actually ready.
Should be ready today. The use of that coll/ml structure is unnecessary at this time. I am removing it in bcol right now. In the future we will put in a better fix but this should work for 1.7.x/1.8.x.
-Nathan
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] 1.7.5 end-of-week status report

2014-03-17 Thread Shamis, Pavel
> 
> I thought ORNL had addresed the cross-linkage as well. I am sure they
> will get a fix for that in the next couple of days.

This was an unused .h file. I fixed it.
-Pasha


Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-28 Thread Shamis, Pavel
> On Mar 27, 2014, at 11:45 PM, "Wang,Yanfei(SYS)"  
> wrote:
> 
>> 1. In the RoCE, we cannot use OOB(via tcp socket) for RDMA connection.  
> 
> More specifically, RoCE QPs can only be made using the RDMA connection 
> manager.

Technically, you may set up a RoCE connection without RDMA CM.

The version of the RoCE support that I implemented (in an alternative MPI 
implementation) did it through the regular OOB
channel. As I remember, the only difference is that you have to 
exchange the MAC instead of the GUID, plus some other small tricks. The problem
with this approach is VLAN support, which is more challenging to implement 
this way. Therefore RDMACM is sort of the "preferred" method.

-Pasha


> 
>> However, as I known, mellanox HCA supporting RoCE can make rdma and tcp/ip 
>> work simultaneously. whether some other HCAs can only work on RoCE and 
>> normal Ethernet individually,
> 
> FYI: Mellanox is the only RoCE vendor.
> 
>> so that OMPI cannot user OOB(like tcp socket) to build rdma connection 
>> except RDMA_CM?   
> 
> You're mixing two different things: having the ability to run an OS IP stack 
> over a RoCE-capable NIC is orthogonal to whether you can use some out-of-band 
> method to make RoCE RC QPs.
> 
> I think you're misunderstanding what OMPI's "oob" QP connection mechanism 
> did.  Here's what it did:
> 
> 1. MPI processes A and B (on different servers) would create half a QP
> 2. they would then extract the QP connection information from the 
> half-created QP data structures (e.g., the unique QP number) -- A would extra 
> Aa and B would extra Bb
> 3. A and B would exchange this information
> 4. A would use Bb to finish creating its QP, and B would use Aa to finish 
> creating its QP.  This is a LOCAL operation -- it's effectively just filling 
> in some data structures.
> 5. Now A and B have fully formed QPs and can use them to send/receive to each 
> other.
> 
> The fact that #3 used TCP sockets to exchange information is irrelevant -- 
> you could very well have printed out that information on a screen and 
> hand-typed the information in at the peer.
> 
> The only important aspect is that the information had to be exchanged.  It 
> doesn't matter whether you use TCP sockets or the actual RDMA CM.
> 
> *** Also keep in mind that OMPI's "oob" connection method for IB RC QPs in 
> the openib BTL has been deleted, and has been wholly replaced with the "udcm" 
> connection method (which uses UD QPs for #3, which act very much like UDP 
> datagrams).
> 
> For IB, this method of "exchange critical connection information via an 
> out-of-band method" works fine.  For RoCE, it's not possible -- there's 
> additional, kernel-level (and possibly hardware-level? I don't know/remember 
> offhand) information that cannot be extracted by userspace and exchanged via 
> an out-of-band method.  Hence, you HAVE to use the RDMA CM to make RoCE QPs.
> 
> Let me make this totally clear: the fact that you have to use the RCMA CM to 
> make RoCE RC QPs is not an OMPI choice.  It's mandated by how the RoCE 
> technology works.  IB technology allows the "workaround" of extracting the 
> necessary connection information such that we can use our "udcm" and not RDMA 
> CM.
> 
>> I think, If OOB(like tcp) can run simultaneously with ROCE, the rdma 
>> connection management would benefit from tcp socket's scalabitly , right?  
>> 
>> 2. Scalability of RDMA_CM.  
>> Previously I also have few doubts on RDMA_CM ' scalability,  when I go deep 
>> insight into source code of RDMA_CM library and corresponding kernel module, 
>> eg, the shared single QP1 for connection requestion and response, which 
>> could introduce severe lock contention if huge rdma connections exist and 
>> remote NUMA memory access at multi-core platform; also lots of shared 
>> session management data structures which could cause additional contention; 
>> However, if the connection are not frequently destroyed and rebuilt, does 
>> the scalability still have highly dependency on RDMA_CM?   
>> To get further aware of UDCM, I would like to have a deep understanding on 
>> rdma_CM's disadvantage.  
> 
> You'll have to ask Mellanox / the OpenFabrics community for insights about 
> the RDMA CM.  To OMPI, that's the lower layer and we're just a consumer of it.
> 
> Keep in mind that the CM is only used during QP connection establishment -- 
> it's not used after that.  So if it's a little less efficient, it usually 
> doesn't matter (if it's a LOT less efficient, then it does matter, of course).
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/03/14418.php



Re: [OMPI devel] [OMPI svn] svn:open-mpi r31302 - in trunk: opal/mca/base orte/tools/orterun

2014-04-03 Thread Shamis, Pavel

> mca param file treats any key=val as mca parameter only.
> In order to add parser support for something that is not mca param, will 
> require change file syntax and it will look bad, i.e.:
> 
> mca btl = sm,self,openib
> env DISPLAY = console:0
> 
> I think the current implementation is less intrusive and re-uses existing 
> infra in the most elegant way.
> The param file syntax change is too big effort to justify this feature (IMHO) 
> which can be provided with existing infra w/o breaking anything.


IMHO this is a useful option to have. If we could consolidate these two 
parameters (-x and the new one) into a
single one, it might be even more helpful.
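
(For reference, the existing way to push an environment variable to the remote ranks 
is mpirun's -x flag, e.g.:

    mpirun -x DISPLAY -x LD_PRELOAD=/path/to/lib.so -np 4 ./a.out

The file-based form discussed above would just be the persistent equivalent of that.)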

Best,
Pasha.

Re: [OMPI devel] [devel-core] OMPI MCA components - track external libs versions

2014-04-14 Thread Shamis, Pavel
+1. This is very helpful info to have.

Best,
Pavel (Pasha) Shamis

On Apr 14, 2014, at 2:57 PM, Mike Dubman 
mailto:mi...@dev.mellanox.co.il>> wrote:

sure, lets discuss it on the next telecon in 1w (Mellanox IL is OOO for 
holidays and Josh is on vacation).

I think it is very good feature from enhancing OMPI usability point of view.

See it as a programmable version of release notes, i.e.

example:

- In release notes vendors often specify that OpenMPI-SHMEM with PMI2 requires 
mxm 2.1, slurm 2.6.2+, libibverbs 2.2+, etc.
- The user/site/sysadmin can compile OpenMPI-SHMEM package with libibverbs 2.1, 
mxm 1.5 and slurm 2.6.1 which is perfectly valid and will work w/o any issues, 
but not certified by vendor because of some known issues with this mix.

- vendor can provide script (or site admin can write one based on site local 
certification) to check with help of ompi_info,oshmem_info the current setup 
version which was compiled with OMPI and get a warning and save hassle of 
running into well-known issues.

I think (+know) that many production environments and OMPI users will be happy 
to have it.




On Mon, Apr 14, 2014 at 6:07 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Perhaps this is something best discussed on the weekly telecon? I think you are 
misunderstanding what I'm saying. I'm not heavily against it, but I still don't 
see the value, and dislike making disruptive changes that span the code base 
without first ensuring there is no other viable alternative.

FWIW: Most libraries remain ABI compliant across major releases for exactly the 
reasons you cite. We don't actually support building against one library 
version and running against another for these very reasons - if users do that, 
it is at their own risk. Your change won't resolve that problem as ompi_info is 
just as likely to barf when confronted by that situation - remember, in order 
to register the component, ompi_info has to *load* it first. So any library 
incompatibility may well have already caused a problem.


On Apr 14, 2014, at 7:59 AM, Mike Dubman 
mailto:mi...@dev.mellanox.co.il>> wrote:

There is no correlation between built_with and running_with versions of 
external libraries supported by OMPI.

The next release of external library does not mean we should remove code in 
ompi for all previous supported releases for the same library.

vendor/site can certify slurm version 2.6.1 while latest is 2.6.6.
SLURM is not ABI compliant between releases, so site would like to know what is 
active version vs. certified to issue an early warning.

Why are you so against it? I don`t see any issue with printing ext lib version 
number in the MCA description, something that can improve 
sysadmin/user-experience.




On Mon, Apr 14, 2014 at 5:47 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:

On Apr 14, 2014, at 7:34 AM, Mike Dubman 
mailto:mi...@dev.mellanox.co.il>> wrote:

it is unrelated:

1. The OMPI can support and built with many different (or all) versions of 
external library (for example: libmxm or libslurm).

Not true - we do indeed check the library version in all cases where it 
matters. For example, the case you cite as your true story could easily have 
been prevented by using OMPI_CHECK_PACKAGE to verify that the libmxm had the 
required function in it

2. The OMPI utility ompi_info can expose the currently available version of 
libmxm/libslurm.

Yes - but what good does that do? Bottom line is that you shouldn't have built 
if that library version isn't supported


3. The vendor or end-user wants to certify specific version of libmxm or 
libslurm to be used in the customer environment.
4. The current way - put a note into libmxm/libslurm Relase Notes, which is not 
a guarantee that site user/sysadmin will pay attention in production 
environment.

Again, that's the whole purpose of the configure logic. You are supposed to 
check the library to ensure it is compatible, not just blindly build and then 
make the user figure it out

5. The suggestion is to use #2 to write script by user or vendor which will 
match currently available versions with supported/certified and let admin/user 
know that there is a mismatch between running and supported version.

Like I said, that's the developer's responsibility to get the configure logic 
correct - not the user's responsibility to figure it out after-the-fact.


P.S. based on the true story :)



On Mon, Apr 14, 2014 at 5:19 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:


I'm still confused - how is that helpful? How was the build allowed to complete 
if the external library version isn't supported?? You should either quietly 
not-build the affected component, or error out if the user specifically 
requested that component be built.

This sounds to me like you have a weakness in your configure logic, and are 
trying to find a bandaid. Perhaps a better solution (that wouldn't cause us to 
change every component in the code base) would be to just add appro

Re: [OMPI devel] SHMEM symmetric objects in shared libraries

2014-07-29 Thread Shamis, Pavel
> 
> On 05/10/2014 02:46 PM, Bert Wesarg wrote:
>> Hi,
>> 
>> Btw, I'm pretty confident, that this Open SHMEM implementation does not
>> recognize global or static variables in shared libraries as symmetric
>> objects. It is probably wise to note this somewhere to the users.
> 
> I've never got an reply to this query. Any comments on it?

it is not supported by the OpenSHMEM specification (v1.1, page 3, lines 34-35). 
(It has never been supported by SHMEM.)

Best,
Pasha



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [OMPI devel] SHMEM symmetric objects in shared libraries

2014-07-29 Thread Shamis, Pavel

then in your main example below do a shmem_long_fadd on my_dso_val.
It won’t work unless you’ve put smarts in the shmem library to go through
the segments of loaded shared libraries and register them with the same
mechanism used for the data segment of the executable.


In this case the "smart" part will be pretty complex.
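
To make the failing pattern concrete, a minimal sketch (assuming a hypothetical 
libfoo.so that defines the variable Howard mentions):

    /* libfoo.c -- built as a shared library, e.g. cc -shared -fPIC -o libfoo.so libfoo.c */
    long my_dso_val = 0;       /* global in the DSO -- NOT a symmetric object */

    /* main.c -- linked against libfoo.so */
    #include "shmem.h"
    extern long my_dso_val;

    int main(void)
    {
        start_pes(0);
        int master = shmem_n_pes() - 1;

        if (shmem_my_pe() != master) {
            /* Undefined behavior: my_dso_val lives in the shared library's
             * data segment, which the runtime has not registered as symmetric. */
            shmem_long_fadd(&my_dso_val, 1, master);
        }
        shmem_barrier_all();
        return 0;
    }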

Best,
Pasha


Howard


From: devel 
[mailto:devel-boun...@open-mpi.org] On 
Behalf Of Joshua Ladd
Sent: Tuesday, July 29, 2014 10:57 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] SHMEM symmetric objects in shared libraries

Are you claiming that in the following test, the static variable "val" will not 
be seen as a symmetric object?
#include "shmem.h"

int main(int argc, char **argv)
{
    long my_pe, npes, master;

    start_pes(0);
    my_pe = shmem_my_pe();
    npes = shmem_n_pes();

    master = npes - 1;

    /* only used on master */
    static long val = 0;

    if (my_pe != master) {
        shmem_long_fadd(&val, 1, master);
    }

    shmem_barrier_all();
    return 0;
}
Josh


On Tue, Jul 29, 2014 at 11:27 AM, Bert Wesarg 
mailto:bert.wes...@tu-dresden.de>> wrote:
Hi,

On 05/10/2014 02:46 PM, Bert Wesarg wrote:
Hi,

Btw, I'm pretty confident, that this Open SHMEM implementation does not
recognize global or static variables in shared libraries as symmetric
objects. It is probably wise to note this somewhere to the users.

I've never got an reply to this query. Any comments on it?

Bert

Kind regards,
Bert Wesarg

--
Dipl.-Inf. Bert Wesarg
wiss. Mitarbeiter

Technische Universität Dresden
Zentrum für Informationsdienste und Hochleistungsrechnen (ZIH)
01062 Dresden
Tel.: +49 (351) 463-42451
Fax: +49 (351) 463-37773
E-Mail: bert.wes...@tu-dresden.de


___
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2014/07/15305.php


___
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2014/07/15313.php

___
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2014/07/15314.php



Re: [OMPI devel] SHMEM symmetric objects in shared libraries

2014-07-29 Thread Shamis, Pavel
 
 Btw, I'm pretty confident, that this Open SHMEM implementation does not
 recognize global or static variables in shared libraries as symmetric
 objects. It is probably wise to note this somewhere to the users.
>>> 
>>> I've never got an reply to this query. Any comments on it?
>> 
>> it is not supported by OpenSHMEM specification (v1.1 page 3 lines 34-35).
>> (it has never been supported by shmem)
> 
> thanks Pasha, that should have crossed my eyes.

Not a problem! In the last month, at least three OpenSHMEM users stepped on exactly 
the same issue :)
- Pasha



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [OMPI devel] SHMEM symmetric objects in shared libraries

2014-07-29 Thread Shamis, Pavel
Is v1.1 posted somewhere? I don't see it up on the LBNL site.

www.openshmem.org

"Get documentation" -> "Specification"

(For some reason I cannot get a direct link.)

Pasha


Josh


On Tue, Jul 29, 2014 at 2:05 PM, Shamis, Pavel 
mailto:sham...@ornl.gov>> wrote:
>>>>
>>>> Btw, I'm pretty confident, that this Open SHMEM implementation does not
>>>> recognize global or static variables in shared libraries as symmetric
>>>> objects. It is probably wise to note this somewhere to the users.
>>>
>>> I've never got an reply to this query. Any comments on it?
>>
>> it is not supported by OpenSHMEM specification (v1.1 page 3 lines 34-35).
>> (it has never been supported by shmem)
>
> thanks Pasha, that should have crossed my eyes.

Not a problem ! In last month at least three OpenSHMEM users stepped on exactly 
the same issue :)
- Pasha


___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2014/07/15318.php

___
devel mailing list
de...@open-mpi.org<mailto:de...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2014/07/15319.php



Re: [OMPI devel] v1.5: sigsegv in case of extremely low settings in the SRQs

2010-06-23 Thread Shamis, Pavel
Good catch. The patch looks OK to me.


Regards
---
Pavel Shamis (Pasha)
sham...@ornl.gov

On Jun 18, 2010, at 11:10 AM, nadia.derbey wrote:

> Hi,
> 
> Reference is the v1.5 branch
> 
> If an SRQ has the following settings: S,,4,2,1
> 
> 1) setup_qps() sets the following:
> mca_btl_openib_component.qp_infos[qp].u.srq_qp.rd_num=4
> mca_btl_openib_component.qp_infos[qp].u.srq_qp.rd_init=rd_num/4=1
> 
> 2) create_srq() sets the following:
> openib_btl->qps[qp].u.srq_qp.rd_curr_num = 1 (rd_init value)
> openib_btl->qps[qp].u.srq_qp.rd_low_local = rd_curr_num - (rd_curr_num
>>> 2) = rd_curr_num = 1
> 
> 3) if mca_btl_openib_post_srr() is called with rd_posted=1:
> rd_posted > rd_low_local is false
> num_post=rd_curr_num-rd_posted=0
> the loop is not executed
> wr is never initialized (remains NULL)
> wr->next: address not mapped
> ==> SIGSEGV
> 
> The attached patch solves the problem by ensuring that we'll actually
> enter the loop and leave otherwise.
> Can someone have a look please: the patch solves the problem with my
> reproducer, but I'm not sure the fix covers all the situations.
> 
> Regards,
> Nadia
> <001_openib_low_rd_num.patch>
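
(For the archives: the crash boils down to dereferencing a never-initialized work 
request when there is nothing to post. Conceptually the guard looks like the sketch 
below -- simplified, hypothetical function name; the attached patch is the 
authoritative fix.)

    static int post_srq_receives(int rd_curr_num, int rd_posted, int rd_low_local)
    {
        if (rd_posted > rd_low_local) {
            return OMPI_SUCCESS;        /* enough receives already posted */
        }
        int num_post = rd_curr_num - rd_posted;
        if (num_post <= 0) {
            /* Without this check the work-request chain below is never built
             * and the later wr->next dereference hits a NULL pointer. */
            return OMPI_SUCCESS;
        }
        /* ... build a chain of num_post receive work requests and hand it to
         * ibv_post_srq_recv() ... */
        return OMPI_SUCCESS;
    }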




Re: [OMPI devel] autogen.sh improvements

2010-08-31 Thread Shamis, Pavel
Jeff,

Are the autogen changes publicly available? I would like to see the code.

Thanks.



On Aug 16, 2010, at 10:55 AM, Jeff Squyres wrote:

> I just wanted to give the community a heads up that Ralph, Brian, and I are 
> revamping autogen in a Mercurial branch.  I don't know the exact timeline to 
> completion, but it won't be *too* far in the future.  
> 
> We made some core changes, and then made some other changes that necessitated 
> minor edits to many component Makefile.am's and configure.m4's.  So the 
> overall commit will look *much* bigger than it really is.  But it's all good 
> stuff.  :-)
> 
> Here's a list of the intended high-level changes:
> 
> Improvements:
> -
> 1. "autogen.sh" is now "autogen.pl" (i.e., autogen is now in perl, not Bourne)
> --> We can put a sym link in SVN so that the old name still works, if it's 
> important to people
> 2. the project/framework/component discovery is quite a bit faster
> 3. the perl code is a LOT easier to maintain (and add features to)
> 4. autogen.pl defaults to --no-ompi if ompi/ is not present (which is good 
> for OPAL+ORTE tarballs)
> 5. ompi_mca.m4 has been cleaned up a bit, allowing autogen.pl to be a little 
> dumber than autogen.sh
> 6. vprotocol components now live in ompi/mca/vprotocol (instead of 
> ompi/mca/pml/v/mca/vprotocol)
> 7. a few more "OMPI" name cleanups (e.g., s/ompi/mca/gi and s/ompi/opal/gi 
> where relevant)
> 
> New features:
> -
> 1. configure.params won't be necessary for components that have no 
> configure.m4 and only have a single Makefile.am
> 2. configure.params won't be necessary for components that call 
> AC_CONFIG_FILES themselves in their configure.m4 file
> 3. added --enable-mca-only-build= option (opposite of 
> --enable-mca-no-build)
> 4. autogen.pl accepts --platform= argument, just like configure
> 
> Dropped features:
> -
> 1. component configure.stub files are no longer supported
> 2. component compile-time priorities are no longer supported (or necessary)
> 3. SVK is no longer supported
> 4. it is not possible to run autogen.pl from a component directory
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] openib btl - fatal errors don't abort the job

2010-09-07 Thread Shamis, Pavel

On Sep 3, 2010, at 8:14 AM, Jeff Squyres wrote:

> On Sep 1, 2010, at 4:47 PM, Steve Wise wrote:
> 
>> I was wondering what the logic is behind allowing an MPI job to continue in 
>> the presence of a fatal qp error?
> 
> It's a feature...?

The idea was that in the near future we would be able to recover from such kinds 
of errors (reopen the QP, etc.).
But the feature has never been implemented in OMPI. 
(BTW, I'm not sure that is true anymore, since Sun/Oracle pushed some code that is 
supposed to handle such cases...)

So maybe it is worth handling it like the device-fatal case - abort everything.

Pasha




Re: [OMPI devel] coll/ml without hwloc (?)

2014-08-26 Thread Shamis, Pavel
Theoretically, we could make it functional (with good performance) even without 
hwloc.
As it is today, I would suggest disabling ML if hwloc is disabled.

Best,
Pasha

> -Original Message-
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Gilles
> Gouaillardet
> Sent: Tuesday, August 26, 2014 4:38 AM
> To: Open MPI Developers
> Subject: [OMPI devel] coll/ml without hwloc (?)
> 
> Folks,
> 
> i just commited r32604 in order to fix compilation (pmix) when ompi is
> configured with --without-hwloc
> 
> now, even a trivial hello world program issues the following output
> (which is a non fatal, and could even be reported as a warning) :
> 
> [soleil][[32389,1],0][../../../../../../src/ompi-
> trunk/ompi/mca/coll/ml/coll_ml_module.c:1496:ml_discover_hierarchy]
> COLL-ML Error: (size of mca_bcol_base_components_in_use = 3) != (size of
> mca_sbgp_base_components_in_use = 2) or zero.
> [soleil][[32389,1],1][../../../../../../src/ompi-
> trunk/ompi/mca/coll/ml/coll_ml_module.c:1496:ml_discover_hierarchy]
> COLL-ML Error: (size of mca_bcol_base_components_in_use = 3) != (size of
> mca_sbgp_base_components_in_use = 2) or zero.
> 
> 
> in my understanding, coll/ml somehow relies on the topology information
> (reported by hwloc) so i am wondering whether we should simply
> *not* compile coll/ml or set its priority to zero if ompi is configured
> with --without-hwloc
> 
> any thoughts ?
> 
> Cheers,
> 
> Gilles
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-
> mpi.org/community/lists/devel/2014/08/15708.php


Re: [OMPI devel] segfault in openib component on trunk

2014-08-29 Thread Shamis, Pavel
I was under the impression that mca_btl_openib_tune_endpoint is supposed to handle the 
mismatch between tunings of different devices.
A few years ago we did some "extreme" interoperability testing and OMPI handled 
all cases really well.

I'm not sure I understand correctly what the "core" issue is.


Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Aug 29, 2014, at 4:12 AM, Gilles Gouaillardet 
mailto:gilles.gouaillar...@iferc.org>> wrote:

Ralph,


r32639 and r32642 fixes bugs that do exist in both trunk and v1.8, and they can 
be considered as independent of the issue that is discussed in this thread and 
the one you pointed.

so imho, they should land v1.8 even if they do not fix the issue we are now 
discussing here

Cheers,

Gilles


On 2014/08/29 16:42, Ralph Castain wrote:

This is the email thread which sparked the problem:

http://www.open-mpi.org/community/lists/devel/2014/07/15329.php

I actually tried to apply the original CMR and couldn't get it to work in the 
1.8 branch - just kept having problems, so I pushed it off to 1.8.3. I'm leery 
to accept either of the current CMRs for two reasons: (a) none of the preceding 
changes is in the 1.8 series yet, and (b) it doesn't sound like we still have a 
complete solution.

Anyway, I just wanted to point to the original problem that was trying to be 
addressed.


On Aug 28, 2014, at 10:01 PM, Gilles Gouaillardet 
 wrote:



Howard and Edgar,

i fixed a few bugs (r32639 and r32642)

the bug is trivial to reproduce with any mpi hello world program

mpirun -np 2 --mca btl openib,self hello_world

after setting the mca param in the $HOME/.openmpi/mca-params.conf

$ cat ~/.openmpi/mca-params.conf
btl_openib_receive_queues = S,12288,128,64,32:S,65536,128,64,3

good news is the program does not crash with a glory SIGSEGV any more
bad news is the program will (nicely) abort for an incorrect reason :

--
The Open MPI receive queue configuration for the OpenFabrics devices
on two nodes are incompatible, meaning that MPI processes on two
specific nodes were unable to communicate with each other.  This
generally happens when you are using OpenFabrics devices from
different vendors on the same network.  You should be able to use the
mca_btl_openib_receive_queues MCA parameter to set a uniform receive
queue configuration for all the devices in the MPI job, and therefore
be able to run successfully.

 Local host:   node0
 Local adapter:mlx4_0 (vendor 0x2c9, part ID 4099)
 Local queues: S,12288,128,64,32:S,65536,128,64,3

 Remote host:  node0
 Remote adapter:   (vendor 0x2c9, part ID 4099)
 Remote queues:
P,128,256,192,128:S,2048,1024,1008,64:S,12288,1024,1008,64:S,65536,1024,1008,64

the root cause is the remote host did not send its receive_queues to the
local host
(and hence the local host believes the remote hosts uses the default value)

the logic was revamped vs v1.8, that is why v1.8 does not have such issue.

i am still thinking what should be the right fix :
- one option is to send the receive queues
- an other option would be to differenciate value overrided in
mca-params.conf (should be always ok) of value overrided in the .ini
 (might want to double check local and remote values match)

Cheers,

Gilles

On 2014/08/29 7:02, Pritchard Jr., Howard wrote:


Hi Edgar,

Could you send me your conf file?  I'll try to reproduce it.

Maybe run with --mca btl_base_verbose 20 or something to
see what the code that is parsing this field in the conf file
is finding.


Howard


-Original Message-
From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Edgar Gabriel
Sent: Thursday, August 28, 2014 3:40 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] segfault in openib component on trunk

to add another piece of information that I just found, the segfault only occurs 
if I have a particular mca parameter set in my mca-params.conf file, namely

btl_openib_receive_queues = S,12288,128,64,32:S,65536,128,64,3

Has the syntax for this parameter changed, or should/can I get rid of it?

Thanks
Edgar

On 08/28/2014 04:19 PM, Edgar Gabriel wrote:


we are having recently problems running trunk with openib component
enabled on one of our clusters. The problem occurs right in the
initialization part, here is the stack right before the segfault:

---snip---
(gdb) where
#0  mca_btl_openib_tune_endpoint (openib_btl=0x762a40,
endpoint=0x7d9660) at btl_openib.c:470
#1  0x7f1062f105c4 in mca_btl_openib_add_procs (btl=0x762a40,
nprocs=2, procs=0x759be0, peers=0x762440, reachable=0x7fff22dd16f0) at
btl_openib.c:1093
#2  0x7f106316102c in mca_bml_r2_add_procs (nprocs=2,
procs=0x759be0, reachable=0x7fff22dd16f0) at bml_r2.c:201
#3  0x7f10615c0dd5 in mca_pml_ob1_add_procs (procs=0x70dc00,
nprocs=2) at pml_ob1.c:334
#4  0x7f106823ed84 in ompi_mpi_init (arg

Re: [OMPI devel] Need to know your Github ID

2014-09-10 Thread Shamis, Pavel
Jeff,
pasha -> shamisp

> -Original Message-
> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff
> Squyres (jsquyres)
> Sent: Wednesday, September 10, 2014 6:46 AM
> To: Open MPI Developers List
> Subject: [OMPI devel] Need to know your Github ID
> 
> As the next step of the planned migration to Github, I need to know:
> 
> - Your Github ID (so that you can be added to the new OMPI git repo)
> - Your SVN ID (so that I can map SVN->Github IDs, and therefore map Trac
> tickets to appropriate owners)
> 
> Here's the list of SVN IDs who have committed over the past year -- I'm
> guessing that most of these people will need Github IDs:
> 
>  adrian
>  alekseys
>  alex
>  alinas
>  amikheev
>  bbenton
>  bosilca (done)
>  bouteill
>  brbarret
>  bwesarg
>  devendar
>  dgoodell (done)
>  edgar
>  eugene
>  ggouaillardet
>  hadi
>  hjelmn
>  hpcchris
>  hppritcha
>  igoru
>  jjhursey (done)
>  jladd
>  jroman
>  jsquyres (done)
>  jurenz
>  kliteyn
>  manjugv
>  miked (done)
>  mjbhaskar
>  mpiteam (done)
>  naughtont
>  osvegis
>  pasha
>  regrant
>  rfaucett
>  rhc (done)
>  rolfv (done)
>  samuel
>  shiqing
>  swise
>  tkordenbrock
>  vasily
>  vvenkates
>  vvenkatesan
>  yaeld
>  yosefe
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-
> mpi.org/community/lists/devel/2014/09/15788.php


Re: [OMPI devel] [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-25 Thread Shamis, Pavel
As I read this thread, the issue is not related to the ML bootstrap itself, 
but rather to the naming conflict between public functions in HCOLL and ML. 

Did I get that right?

If this is the case, we can work with the Mellanox folks to resolve this conflict.
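
If it helps, the duplicate exports are easy to confirm from the installed library 
(sketch; the exact symbol list may differ):

    nm -D libhcoll.so | grep ' T ml_'

and the usual fix on the library side is to keep internal symbols local, e.g. by 
building with -fvisibility=hidden or by passing a linker version script 
(-Wl,--version-script=hcoll.map) along the lines of:

    { global: hcoll_*; local: *; };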

Best,

Pavel (Pasha) Shamis
---
Computer Science Research Group
Computer Science and Math Division
Oak Ridge National Laboratory






On Jun 25, 2015, at 10:34 AM, Jeff Squyres (jsquyres)  
wrote:

> Gilles --
> 
> Can you send a stack trace from one of these crashes?
> 
> I am *guessing* that the following is happening:
> 
> 1. coll selection begins
> 2. coll ml is queried, and disqualifies itself (but is not dlclosed yet)
> 3. coll hcol is queried, which ends up calling down into libhcol.  libhcol 
> calls a coll_ml_* symbol (which is apparently in a different .o file in the 
> library), but the linker has already resolved that coll_ml_* symbol in the 
> coll ml DSO.  So the execution transfers back up into the coll ml DSO, and 
> ... kaboom.
> 
> A simple stack trace will confirm this -- it should show execution going down 
> into libhcol and then back up into coll ml.
> 
> 
> 
> 
>> On Jun 25, 2015, at 1:03 AM, Gilles Gouaillardet  wrote:
>> 
>> Folks,
>> 
>> this is a followup on an issue reported by Daniel on the users mailing list :
>> OpenMPI is built with hcoll from Mellanox.
>> the coll ml module has default priority zero.
>> 
>> on my cluster, it works just fine
>> on Daniel's cluster, it crashes.
>> 
>> i was able to reproduce the crash by tweaking mca_base_component_path and 
>> ensure
>> the coll ml module is loaded first.
>> 
>> basically, i found two issues :
>> 1) libhcoll.so (vendor lib provided by Mellanox, i tested 
>> hpcx-v1.3.336-gcc-OFED-1.5.4.1-redhat6.2-x86_64) seems to include its own 
>> coll ml, since there are some *public* symbols that are common to this 
>> module (ml_open, ml_coll_hier_barrier_setup, ...)
>> 2) coll ml priority is zero, and even if the library is dlclose'd, it seems 
>> this is uneffective
>> (nothing changed in /proc/xxx/maps before and after dlclose)
>> 
>> 
>> there are two workarounds :
>> mpirun --mca coll ^ml
>> or
>> mpirun --mca coll ^hcoll ... (probably not what is needed though ...)
>> 
>> is it expected the library is not unloaded after dlclose ?
>> 
>> Mellanox folks,
>> can you please double check how libhcoll is built ?
>> i guess it would work if the ml_ symbols were private to the library.
>> if not, the only workaround is to mpirun --mca coll ^ml
>> otherwise, it might crash (if coll_ml is loaded before coll_hcoll, which is 
>> really system dependent)
>> 
>> Cheers,
>> 
>> Gilles
>> On 6/25/2015 10:46 AM, Gilles Gouaillardet wrote:
>>> Daniel,
>>> 
>>> thanks for the logs.
>>> 
>>> an other workaround is to
>>> mpirun --mca coll ^hcoll ...
>>> 
>>> i was able to reproduce the issue, and it surprisingly occurs only if the 
>>> coll_ml module is loaded *before* the hcoll module.
>>> /* this is not the case on my system, so i had to hack my 
>>> mca_base_component_path in order to reproduce the issue */
>>> 
>>> as far as i understand, libhcoll is a proprietary software, so i cannot dig 
>>> into it.
>>> that being said, i noticed libhcoll defines some symbols (such as 
>>> ml_coll_hier_barrier_setup) that are also defined by the coll_ml module, so 
>>> it is likely hcoll coll_ml and openmpi coll_ml are not binary compatible 
>>> hence the error.
>>> 
>>> i will dig a bit more and see if this is even supposed to happen (since 
>>> coll_ml_priority is zero, why is the module still loaded ?)
>>> 
>>> as far as i am concerned, you *have to* mpirun --mca coll ^ml or update 
>>> your user/system wide config file to blacklist the coll_ml module to ensure 
>>> this is working.
>>> 
>>> Mike and Mellanox folks, could you please comment on that ?
>>> 
>>> Cheers,
>>> 
>>> Gilles
>>> 
>>> 
>>> 
>>> On 6/24/2015 5:23 PM, Daniel Letai wrote:
 Gilles,
 
 Attached the two output logs.
 
 Thanks,
 Daniel
 
 On 06/22/2015 08:08 AM, Gilles Gouaillardet wrote:
> Daniel,
> 
> i double checked this and i cannot make any sense with these logs.
> 
> if coll_ml_priority is zero, then i do not any way how 
> ml_coll_hier_barrier_setup can be invoked.
> 
> could you please run again with --mca coll_base_verbose 100
> with and without --mca coll ^ml
> 
> Cheers,
> 
> Gilles
> 
> On 6/22/2015 12:08 AM, Gilles Gouaillardet wrote:
>> Daniel,
>> 
>> ok, thanks
>> 
>> it seems that even if priority is zero, some code gets executed
>> I will confirm this tomorrow and send you a patch to work around the 
>> issue if that if my guess is proven right
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Sunday, June 21, 2015, Daniel Letai  wrote:
>> MCA coll: parameter "coll_ml_priority" (current value: "0", data source: 
>> default, level: 9 dev/all, type: int)
>> 
>> Not sure how to read th
