We should have given more of a "heads up" here. We recognize that the trunk
may well become unstable as we can't test all the variations, and clearly
some timing issues are going to arise with this change. Our hope is that we
can iron them out quickly. If not, then we'll revert and try again.

You may also find that you need to disable coll/ml. That is a problem we've
already identified here, and Nathan should have a fix for it shortly.
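
For anyone who wants to try both workarounds together, here is a minimal
sketch of the environment settings. It assumes the usual MCA convention that
a leading ^ excludes the listed components, and it combines that with Gilles'
btl suggestion quoted below. The oshrun line is left commented out, and the
./hello_shmem path in it is only a placeholder for your own test binary.

```shell
# Exclude the coll/ml component (a leading ^ means "all components except these").
export OMPI_MCA_coll='^ml'

# Gilles' suggestion: use vader instead of sm for shared-memory transport.
export OMPI_MCA_btl='vader,self,openib'

# Then rerun the failing test. The path below is a placeholder, not the Jenkins one:
# oshrun -np 8 ./hello_shmem
```

If the hang goes away with only one of the two settings, that would help
narrow down which component is affected.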



On Wed, Jun 25, 2014 at 1:11 AM, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

> Mike,
>
> could you try again with
>
> OMPI_MCA_btl=vader,self,openib
>
> it seems the sm module causes a hang
> (which is why timeout later sends the SIGSEGV)
>
> Cheers,
>
> Gilles
>
> On 2014/06/25 14:22, Mike Dubman wrote:
> > Hi,
> > The following commit broke trunk in jenkins:
> >
> > >>> Per the OMPI developer conference, remove the last vestiges of OMPI_USE_PROGRESS_THREADS
> >
> > 22:15:09 + LD_LIBRARY_PATH=/scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/oshm_install2/lib
> > 22:15:09 + OMPI_MCA_scoll_fca_enable=1
> > 22:15:09 + OMPI_MCA_scoll_fca_np=0
> > 22:15:09 + OMPI_MCA_pml=ob1
> > 22:15:09 + OMPI_MCA_btl=sm,self,openib
> > 22:15:09 + OMPI_MCA_spml=yoda
> > 22:15:09 + OMPI_MCA_memheap_mr_interleave_factor=8
> > 22:15:09 + OMPI_MCA_memheap=ptmalloc
> > 22:15:09 + OMPI_MCA_btl_openib_if_include=mlx4_0:1
> > 22:15:09 + OMPI_MCA_rmaps_base_dist_hca=mlx4_0
> > 22:15:09 + OMPI_MCA_memheap_base_hca_name=mlx4_0
> > 22:15:09 + OMPI_MCA_rmaps_base_mapping_policy=dist:mlx4_0
> > 22:15:09 + MXM_RDMA_PORTS=mlx4_0:1
> > 22:15:09 + SHMEM_SYMMETRIC_HEAP_SIZE=1024M
> > 22:15:09 + timeout -s SIGSEGV 3m /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/oshm_install2/bin/oshrun -np 8 /scrap/jenkins/scrap/workspace/hpc-ompi-shmem/label/hpc-test-node/examples/hello_shmem
> > 22:15:09 [vegas12:08101] *** Process received signal ***
> > 22:15:09 [vegas12:08101] Signal: Segmentation fault (11)
> > 22:15:09 [vegas12:08101] Signal code: Address not mapped (1)
> > 22:15:09 [vegas12:08101] Failing at address: (nil)
> > 22:15:09 [vegas12:08101] [
> >
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2014/06/15055.php
>
