Re: [OMPI devel] OMPI 1.4.3 hangs in gather

2011-01-13 Thread Nysal Jan
Try manually specifying the collective component with "-mca coll tuned". You seem to be using the "sync" collective component; are any stale MCA param files lying around? --Nysal On Tue, Jan 11, 2011 at 6:28 PM, Doron Shoham wrote: > Hi > > All machines on the setup are IDataPlex with Nehalem 12 cores p

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r24449

2011-02-23 Thread Nysal Jan
Brian, this is the bug report - https://bugzilla.redhat.com/show_bug.cgi?id=679489 --Nysal On Thu, Feb 24, 2011 at 3:45 AM, Barrett, Brian W wrote: > George - > > You're right, I misread the patch. I've run into the same issue with gcc > before, but not on x86. > > Jay, can you point us to the

Re: [OMPI devel] Error message improvement

2009-09-09 Thread Nysal Jan
__FUNCTION__ is not portable. __func__ is, but it needs a C99-compliant compiler. --Nysal On Tue, Sep 8, 2009 at 9:06 PM, Lenny Verkhovsky wrote: > fixed in r21952 > thanks. > > On Tue, Sep 8, 2009 at 5:08 PM, Arthur Huillet wrote: > >> Lenny Verkhovsky wrote: >> >>> Why not using __FUNCTION__
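As a rough illustration of the portability point above, here is a minimal sketch of an error-message helper built on the C99 predefined identifier `__func__`. The names `format_error`, `FORMAT_ERROR`, and `open_device` are hypothetical and are not taken from the Open MPI sources or from the fix in r21952.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not an Open MPI function): writes
 * "<function>: <message>" into buf and returns snprintf's result. */
static int format_error(char *buf, size_t len, const char *func, const char *msg)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s: %s", func, msg);
}

/* __func__ is defined by C99 inside every function body.
 * __FUNCTION__ is a compiler extension and may be missing elsewhere.
 * Because this is a macro, __func__ names the calling function. */
#define FORMAT_ERROR(buf, len, msg) \
    format_error((buf), (len), __func__, (msg))

/* Hypothetical caller that reports an error tagged with its own name. */
static int open_device(char *buf, size_t len)
{
    return FORMAT_ERROR(buf, len, "device not found");
}
```

Wrapping the identifier in a macro keeps call sites short and leaves a single place to change if a pre-C99 compiler has to be supported.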

Re: [OMPI devel] PML csum: checksum for RDMA transfers?

2010-01-26 Thread Nysal Jan
If I remember correctly the RDMA write based data transfers are checksummed. The checksums are sent to the receiver's side via the FIN fragment sent after the RDMA. --Nysal On Tue, Jan 26, 2010 at 9:53 AM, Jeff Squyres wrote: > On Jan 19, 2010, at 9:59 AM, Sebastian Rinke wrote: > > > I'm using

Re: [OMPI devel] How to disable paffinity

2010-07-05 Thread Nysal Jan
The wiki (https://svn.open-mpi.org/trac/ompi/wiki) has some useful information for developers. For creating a component, see https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent Regards --Nysal 2010/7/6 张晶 > Hi Ralph , > > It is really a bad news that vxworks even doesn't include rsh serv

Re: [OMPI devel] 1.5rc5: opal_path_nfs test failure on GPFS filesystem

2010-08-26 Thread Nysal Jan
Thanks for the report. Fixed in r23669 - https://svn.open-mpi.org/trac/ompi/changeset/23669 I will file a CMR to move this to the 1.5 branch. --Nysal On Wed, Aug 25, 2010 at 11:55 AM, Paul H. Hargrove wrote: > Testing 1.5rc5 on Linux/PPC64 I get a test failure in "make check" that > probably relates

[OMPI devel] btl_openib_max_btls

2006-09-22 Thread Nysal Jan
The ompi_info command shows the following description for the "btl_openib_max_btls" parameter: MCA btl: parameter "btl_openib_max_btls" (current value: "-1") Maximum number of HCA ports to use (-1 = use all available, otherwise must be >= 1) Even though I specify "mpirun --mca btl_openib_max_btls 1 .

Re: [OMPI devel] btl_openib_max_btls

2006-09-27 Thread Nysal Jan
hould not be a problem in the released v1.1 series. Can you confirm that you were using the OMPI trunk or the v1.2 branch? If you're seeing this in the v1.1 series, then we need to look at this a bit closer... On 9/22/06 1:25 PM, "Nysal Jan" wrote: > The ompi_info comm

Re: [OMPI devel] MPI between amd64 and x86

2006-11-04 Thread Nysal Jan
come from the BTL headers where the fields do not have the same alignment inside. The original question was asked by Nysal Jan on an email with the subject "SEGV in EM64T <--> PPC64 communication" on Oct. 11 2006. Unfortunately, we still have the same problem. I'm forwardin

Re: [OMPI devel] MPI between amd64 and x86

2006-11-04 Thread Nysal Jan
I have opened a ticket http://svn.open-mpi.org/trac/ompi/ticket/587 --Nysal On 11/4/06, Adrian Knoth wrote: On Sat, Nov 04, 2006 at 02:07:58PM +0530, Nysal Jan wrote: > >come from the BTL headers where the fields do not have the same > >alignment inside. The original question

Re: [OMPI devel] jnysal-openib-wireup branch

2007-06-06 Thread Nysal Jan
Hi Jeff, 1. The logic for if_exclude was not correct. I committed a fix for it. https://svn.open-mpi.org/trac/ompi/changeset/14748 Thanks 2. I'm a bit confused on a) how the new MCA params mca_num_hcas and map_num_procs_per_hca are supposed to be used and b) what their default values shou

Re: [OMPI devel] jnysal-openib-wireup branch

2007-06-07 Thread Nysal Jan
n the Very Near Future. :-) On Jun 6, 2007, at 7:02 AM, Nysal Jan wrote: > Hi Jeff, > > 1. The logic for if_exclude was not correct. I committed a fix for > it. https://svn.open-mpi.org/trac/ompi/changeset/14748 > > Thanks > > 2. I'm a bit confused on a)

Re: [OMPI devel] Problem with openib on demand connection bring up.

2007-06-13 Thread Nysal Jan
I was just bitten yesterday by a problem that I've known about for a while but had never gotten around to looking into (I could have sworn that there was an open trac ticket on this, but I can't find one anywhere). I have 2 hosts: one with 3 active ports and one with 2 active ports. If I run an

Re: [OMPI devel] Master assert failure on Linux/PPC64

2015-02-06 Thread Nysal Jan K A
It seems the ompi_free_list_init() call in libnbc_open() failed for some reason. That would explain why mca_coll_libnbc_component.active_requests is not initialized, and hence the crash in libnbc_close(). This might help, but still doesn't explain why the free list initialization failed: diff --git a/ompi/m
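The failure mode described above (an open routine that fails partway, followed by a close routine that tears down state that was never set up) can be sketched in isolation. This is not the truncated patch from the message. The `struct component` type, its `list_ready` flag, and both functions are hypothetical stand-ins used only to show one way a close path can guard against a failed open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical component state; not the libnbc component structure. */
struct component {
    int *free_list;   /* stands in for a free list of requests */
    bool list_ready;  /* set only after initialization succeeded */
};

/* Open: put every field in a known state first, so that close is safe
 * even if the initialization below fails. */
static int component_open(struct component *c, size_t n, bool simulate_failure)
{
    assert(c != NULL);
    c->free_list = NULL;
    c->list_ready = false;
    if (simulate_failure) {
        return -1; /* models the free list initialization failing */
    }
    c->free_list = calloc(n, sizeof *c->free_list);
    if (c->free_list == NULL) {
        return -1;
    }
    c->list_ready = true;
    return 0;
}

/* Close: returns 1 if it released something, 0 if there was nothing to do. */
static int component_close(struct component *c)
{
    assert(c != NULL);
    if (!c->list_ready) {
        return 0; /* open failed earlier; nothing to destruct */
    }
    free(c->free_list);
    c->free_list = NULL;
    c->list_ready = false;
    return 1;
}
```

A guard like this only keeps the close path from crashing. As the message notes, it does not explain why the initialization failed in the first place.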

Re: [OMPI devel] Master assert failure on Linux/PPC64

2015-02-09 Thread Nysal Jan K A
I opened a github issue to track this - https://github.com/open-mpi/ompi/issues/383 --Nysal On Fri, Feb 6, 2015 at 11:36 AM, Nysal Jan K A wrote: > It seems the ompi_free_list_init() in libnbc_open() failed for some > reason. That would explain why mca_coll_libnbc_component.active_reque

Re: [OMPI devel] "maybe" issue in 1.8.5rc[23]

2015-04-24 Thread Nysal Jan K A
Yeah, I remember this one. It's a bug in that specific version of the compiler. I had reported it to the compiler team a couple of years back. Quoting from the email I sent them: The "stw r0,0(r31)" probably overwrites the previous stack pointer? static inline int opal_atomic_cmpset_32(volati

Re: [OMPI devel] 1.8.5....going once...going twice...

2015-04-27 Thread Nysal Jan K A
Opened PR #260. It would be good to have that included in 1.8.5. Regards --Nysal On Fri, Apr 24, 2015 at 10:22 PM, Ralph Castain wrote: > Any last minute issues people need to report? Otherwise, this baby is > going to ship > > Paul: I will include your README suggestions as they relate to 1.8.5. >

[OMPI devel] opal_progress() and finalize

2015-10-06 Thread Nysal Jan K A
In v1.8 there is an RTE barrier in finalize. OMPI_LAZY_WAIT_FOR_COMPLETION waits for the barrier to complete. Internally opal_progress() is invoked. In the master branch we call PMIX fence instead. PMIX_WAIT_FOR_COMPLETION seems to only call usleep. How will ompi progress outstanding operations? R

Re: [OMPI devel] opal_progress() and finalize

2015-10-06 Thread Nysal Jan K A
ce to fix it right away. > > > > On Oct 6, 2015, at 11:17 AM, Nysal Jan K A wrote: > > > > In v1.8 there is a RTE barrier in finalize. > OMPI_LAZY_WAIT_FOR_COMPLETION waits for the barrier to complete. Internally > opal_progress() is invoked. In the master branch we cal

Re: [OMPI devel] PMIX deadlock

2015-11-08 Thread Nysal Jan K A
In listen_thread():

    while (pmix_server_globals.listen_thread_active) {
        FD_ZERO(&readfds);
        FD_SET(pmix_server_globals.listen_socket, &readfds);
        max = pmix_server_globals.listen_socket;

Is it possible that pmix_server_globals.listen_thread_active can be fa

Re: [OMPI devel] Trunk is broken

2016-02-17 Thread Nysal Jan K A
So this still seems to be broken: mca_btl_openib.so: undefined symbol: opal_memory_linux_malloc_set_alignment. I built with "--with-memory-manager=none". Regards --Nysal On Tue, Feb 16, 2016 at 10:19 AM, Ralph Castain wrote: > It is very easy to reproduce - configure with: > enable_mem_debug=no

Re: [OMPI devel] Trunk is broken

2016-02-18 Thread Nysal Jan K A
Probably should - looks like this may take more thought and probably >>>> will be handled in >>>> discussions next week >>>> >>>>> On Feb 17, 2016, at 11:26 AM, Howard Pritchard wrote: >>>>> >>>>> Hi Folks, >>>