Larry,

Thanks for following up with us on this. I think your patch is cleaner than what 
we currently have in the trunk, so I went ahead and pushed it to the trunk 
(r25461). I will request a push to 1.5 and 1.4 as well.

  Regards,
    george.

On Nov 8, 2011, at 13:57 , Larry Baker wrote:

> The good news is that the issue reported in r25290 is fixed in the latest 
> Intel compilers release (2011.7.256).  The bad news is that both the 
> 2011.6.233 and 2011.7.256 releases identify themselves as V12.1.0 from the 
> command line.  (I reported this bug to Intel already.)  They can only be 
> reliably distinguished using the predefined __INTEL_COMPILER_BUILD_DATE 
> macro.  I verified that the build dates for all three compilers we have -- 
> Linux, Mac OS X, and Windows -- are the same.
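
For reference, a minimal sketch of how the build date can be read at compile time, since the version string alone does not separate the two releases. The function names (intel_build_date, is_affected_release, report_compiler) are hypothetical and used only for illustration; the only facts taken from this thread are the __INTEL_COMPILER_BUILD_DATE macro and the value 20110811 for the 2011.6.233 release.

```c
#include <stdio.h>

/* Build date quoted in this thread for the 2011.6.233 release. */
#define BUILD_DATE_2011_6_233 20110811L

/* Hypothetical helper: returns the Intel build date (YYYYMMDD),
 * or 0 when the translation unit is not compiled by an Intel compiler. */
long intel_build_date(void)
{
#if defined(__INTEL_COMPILER) && defined(__INTEL_COMPILER_BUILD_DATE)
    return (long) __INTEL_COMPILER_BUILD_DATE;
#else
    return 0L;
#endif
}

/* Hypothetical helper: true only for the one build date known to be bad. */
int is_affected_release(long build_date)
{
    return build_date == BUILD_DATE_2011_6_233;
}

/* Hypothetical helper: prints what the compiler says about itself. */
void report_compiler(FILE *out)
{
#if defined(__INTEL_COMPILER)
    fprintf(out, "__INTEL_COMPILER = %d\n", (int) __INTEL_COMPILER);
    fprintf(out, "build date       = %ld\n", intel_build_date());
    fprintf(out, "affected release = %s\n",
            is_affected_release(intel_build_date()) ? "yes" : "no");
#else
    fprintf(out, "not an Intel compiler\n");
#endif
}
```

Calling report_compiler(stdout) from a small main() built with each installed compiler should be enough to tell the releases apart, assuming the macro behaves as described above.
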
> 
> I developed a more targeted patch (attached) for OpenMPI 1.4.3 
> opal/mca/memory/ptmalloc2/malloc.c which disables vectorization for 
> _int_malloc() only if an Intel compiler with the 2011.6.233 release build 
> date is found (__INTEL_COMPILER_BUILD_DATE == 20110811).  This patch could 
> presumably make its way into all the copies of 
> opal/mca/memory/ptmalloc2/malloc.c in the various versions of OpenMPI that 
> are still being maintained.
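
The general shape of such a build-date guard could be sketched as follows. This is not the attached patch: the macro name NO_VECTOR_ON_BAD_ICC and the function clear_bins() are hypothetical, and the use of Intel's loop-level "#pragma novector" is an assumption about one way to suppress vectorization; the actual patch may use a different mechanism inside _int_malloc().

```c
#include <stddef.h>

/* Enable the workaround only for the one affected Intel build date.
 * Every other compiler (and every other Intel release) is left alone. */
#if defined(__INTEL_COMPILER) && defined(__INTEL_COMPILER_BUILD_DATE)
#  if __INTEL_COMPILER_BUILD_DATE == 20110811
#    define NO_VECTOR_ON_BAD_ICC 1
#  endif
#endif
#ifndef NO_VECTOR_ON_BAD_ICC
#  define NO_VECTOR_ON_BAD_ICC 0
#endif

/* Hypothetical stand-in for a loop that the affected compiler
 * might vectorize incorrectly. */
void clear_bins(size_t *bins, size_t n)
{
    size_t i;

#if NO_VECTOR_ON_BAD_ICC
#pragma novector   /* Intel-specific: do not vectorize the next loop */
#endif
    for (i = 0; i < n; ++i) {
        bins[i] = 0;
    }
}
```

Because the pragma sits inside the #if, other compilers never see it, so the guard adds no warnings and no behavior change outside the affected release.
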
> 
> Larry Baker
> US Geological Survey
> 650-329-5608
> ba...@usgs.gov
> 
> On 17 Oct 2011, at 8:18 PM, George Bosilca wrote:
> 
>> Larry,
>> 
>> Sorry for not updating this thread. The issue was identified and fixed by 
>> Rainer in r25290 (https://svn.open-mpi.org/trac/ompi/changeset/25290). 
>> Please read the comments and the linked thread on the Intel forum for more 
>> info.
>> 
>> I couldn't find a trace of this being fixed in the 1.4 series, so I would 
>> wait to upgrade until this issue gets resolved.
>> 
>>   Thanks,
>>     george.
>> 
>> On Oct 17, 2011, at 23:00 , Larry Baker wrote:
>> 
>>> George,
>>> 
>>> I have not had time to look over the 1.4.3 make check failure for Intel 
>>> 2011.6.233 compilers.  Have you?
>>> 
>>> I had planned to get 1.4.3 compiled on all six of our compilers using the 
>>> latest compiler releases.  I was putting off upgrading to 1.4.4 or 1.5.x 
>>> until after that to minimize the number of things that could go wrong.  Do 
>>> you recommend otherwise?
>>> 
>>> Larry Baker
>>> US Geological Survey
>>> 650-329-5608
>>> ba...@usgs.gov
>>> 
>>> On 7 Oct 2011, at 6:46 PM, George Bosilca wrote:
>>> 
>>>> The may_alias attribute was part of a forward-looking set of attribute checks, 
>>>> added at a time when few compilers supported them. This explains why they are 
>>>> not widely used in the library itself. Moreover, as they do not affect the 
>>>> compilation itself (as your test highlights, this is not the issue with the 
>>>> icc 2011.6.233 compiler), there is no urgency to remove the may_alias support.
>>>> 
>>>> I just got that particular version of the compiler installed on one of our 
>>>> machines. I'll give it a try over the weekend.
>>>> 
>>>>   george.
>>>> 
>>>> On Oct 7, 2011, at 20:21 , Larry Baker wrote:
>>>> 
>>>>> The test for the __may_alias__ attribute uses the following short code 
>>>>> snippet:
>>>>> 
>>>>>> int * p_value __attribute__ ((__may_alias__));
>>>>>> int
>>>>>> main ()
>>>>>> {
>>>>>> 
>>>>>>   ;
>>>>>>   return 0;
>>>>>> }
>>>>> 
>>>>> Indeed, for Intel 2011 compilers prior to 2011.6.233, this results in a 
>>>>> warning:
>>>>> 
>>>>>> [root@hydra openmpi-1.4.3]# module load compilers/intel/2011.5.220
>>>>>> [root@hydra openmpi-1.4.3]# icc -c may_alias_test.c 
>>>>>> may_alias_test.c(123): warning #1292: attribute "__may_alias__" ignored
>>>>>>   int * p_value __attribute__ ((__may_alias__));
>>>>>>                                 ^
>>>>>> 
>>>>>> [root@hydra openmpi-1.4.3]# module unload compilers/intel/2011.5.220
>>>>> 
>>>>>> [root@hydra openmpi-1.4.3]# module load compilers/intel/2011.6.233
>>>>>> [root@hydra openmpi-1.4.3]# icc -c may_alias_test.c 
>>>>> 
>>>>> 
>>>>> I modified ./configure to force
>>>>> 
>>>>>> ompi_cv___attribute__may_alias=0
>>>>> 
>>>>> 
>>>>> Then I compiled and tested the library.  Unfortunately, the results were 
>>>>> exactly the same:
>>>>> 
>>>>>> make  check-TESTS
>>>>>> make[3]: Entering directory 
>>>>>> `/state/partition1/root/src/openmpi-1.4.3/test/datatype'
>>>>>> /bin/sh: line 4: 26326 Segmentation fault      ${dir}$tst
>>>>>> FAIL: checksum
>>>>>> /bin/sh: line 4: 26359 Segmentation fault      ${dir}$tst
>>>>>> FAIL: position
>>>>>> ========================================================
>>>>>> 2 of 2 tests failed
>>>>>> Please report to http://www.open-mpi.org/community/help/
>>>>>> ========================================================
>>>>> 
>>>>> 
>>>>> I could not find any use of the may_alias attribute, other than in a 
>>>>> #define in opal/include/opal_config_bottom.h.  Is 
>>>>> OMPI_HAVE_ATTRIBUTE_MAY_ALIAS just cruft that can be removed?
>>>>> 
>>>>> Larry Baker
>>>>> US Geological Survey
>>>>> 650-329-5608
>>>>> ba...@usgs.gov
>>>>> 
>>>>> On 7 Oct 2011, at 11:08 AM, Larry Baker wrote:
>>>>> 
>>>>>> I ran into a problem this past week trying to upgrade our OpenMPI 1.4.3 
>>>>>> for the latest Intel 2011 compiler, 2011.6.233.
>>>>>> 
>>>>>> make check fails with Segmentation Fault errors:
>>>>>> 
>>>>>>> [root@hydra openmpi-1.4.3]# tail -20 
>>>>>>> ../openmpi-1.4.3-check-intel.6.233.log
>>>>>>> /bin/sh ../../libtool --tag=CC   --mode=link icc  -DNDEBUG -g -O3 
>>>>>>> -finline-functions -fno-strict-aliasing -restrict -pthread 
>>>>>>> -fvisibility=hidden -shared-intel -export-dynamic -shared-intel  -o 
>>>>>>> ddt_pack ddt_pack.o ../../ompi/libmpi.la -lnsl -lutil  
>>>>>>> libtool: link: icc -DNDEBUG -g -O3 -finline-functions 
>>>>>>> -fno-strict-aliasing -restrict -pthread -fvisibility=hidden 
>>>>>>> -shared-intel -shared-intel -o .libs/ddt_pack ddt_pack.o 
>>>>>>> -Wl,--export-dynamic  ../../ompi/.libs/libmpi.so 
>>>>>>> /usr/local/src/openmpi-1.4.3/orte/.libs/libopen-rte.so 
>>>>>>> /usr/local/src/openmpi-1.4.3/opal/.libs/libopen-pal.so -ldl -lnsl 
>>>>>>> -lutil -pthread -Wl,-rpath -Wl,/usr/local/lib
>>>>>>> make[3]: Leaving directory 
>>>>>>> `/state/partition1/root/src/openmpi-1.4.3/test/datatype'
>>>>>>> make  check-TESTS
>>>>>>> make[3]: Entering directory 
>>>>>>> `/state/partition1/root/src/openmpi-1.4.3/test/datatype'
>>>>>>> /bin/sh: line 4:  6322 Segmentation fault      ${dir}$tst
>>>>>>> FAIL: checksum
>>>>>>> /bin/sh: line 4:  6355 Segmentation fault      ${dir}$tst
>>>>>>> FAIL: position
>>>>>>> ========================================================
>>>>>>> 2 of 2 tests failed
>>>>>>> Please report to http://www.open-mpi.org/community/help/
>>>>>>> ========================================================
>>>>>>> make[3]: *** [check-TESTS] Error 1
>>>>>>> make[3]: Leaving directory 
>>>>>>> `/state/partition1/root/src/openmpi-1.4.3/test/datatype'
>>>>>>> make[2]: *** [check-am] Error 2
>>>>>>> make[2]: Leaving directory 
>>>>>>> `/state/partition1/root/src/openmpi-1.4.3/test/datatype'
>>>>>>> make[1]: *** [check-recursive] Error 1
>>>>>>> make[1]: Leaving directory 
>>>>>>> `/state/partition1/root/src/openmpi-1.4.3/test'
>>>>>>> make: *** [check-recursive] Error 1
>>>>>> 
>>>>>> 
>>>>>> Before trying to track down the problem, I thought I'd describe what I 
>>>>>> see here in case someone recognizes what might be happening.
>>>>>> 
>>>>>> We have been using OpenMPI 1.4.3 compiled with the Intel 2011.3.174 
>>>>>> compiler.  I've updated the Intel 2011 compilers as they have come out 
>>>>>> with new versions: 2011.4.191, 2011.5.220, and now 2011.6.233.  However, 
>>>>>> I've not recompiled OpenMPI 1.4.3 until now.
>>>>>> 
>>>>>> Since the original compilation of OpenMPI 1.4.3 with the Intel 
>>>>>> 2011.3.174 compilers, I have installed libnuma and libnuma-devel RPMs on 
>>>>>> our cluster front end.  I noticed that changed the OpenMPI 1.4.3 
>>>>>> ./configure output.  To test that this was not the cause of the problem, 
>>>>>> I recompiled OpenMPI 1.4.3 using both the CentOS/Rocks GNU compilers and 
>>>>>> the Intel 2011.3.174 compilers.  They both passed all the make check 
>>>>>> tests.
>>>>>> 
>>>>>> To find out when this problem first occurs, I systematically configured, 
>>>>>> compiled, and checked OpenMPI 1.4.3 with all four versions of the Intel 
>>>>>> 2011 compilers we have.  We use the modules package to load the compiler 
>>>>>> environment:
>>>>>> 
>>>>>>> [root@hydra openmpi-1.4.3]# env | grep /opt/intel
>>>>>>> LD_LIBRARY_PATH=/opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64:/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64
>>>>>>> PATH=/opt/intel/composer_xe_2011_sp1.6.233/bin/intel64:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/java/latest/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/eclipse:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/maui/bin:/opt/torque/bin:/opt/torque/sbin:/opt/rocks/bin:/opt/rocks/sbin:/root/bin
>>>>>> 
>>>>>> 
>>>>>> Here are the steps I use to make and test OpenMPI 1.4.3 (I use a patched 
>>>>>> version to accommodate the six compilers we have; I've submitted those 
>>>>>> patches here in the past):
>>>>>> 
>>>>>>> # cd /usr/local/src
>>>>>>> # tar -xjf openmpi-1.4.3-patched.tar.bz2
>>>>>>> # cd openmpi-1.4.3
>>>>>>> # module load compilers/intel/2011.6.233
>>>>>>> # ./configure >../openmpi-1.4.3-configure-intel.6.233.log 2>&1 
>>>>>>> --with-tm --with-openib --without-valgrind --without-udapl 
>>>>>>> --enable-contrib-no-build=vt --with-wrapper-ldflags="-shared-intel" 
>>>>>>> CC="icc" CFLAGS="-g -O3" CXX="icpc" CXXFLAGS="-g -O3" FC="ifort" 
>>>>>>> FCFLAGS="-g -O3" F77="ifort" FFLAGS="-g -O3" LDFLAGS="-shared-intel"
>>>>>>> # make >../openmpi-1.4.3-make-intel.6.233.log 2>&1
>>>>>>> # make check >../openmpi-1.4.3-check-intel.6.233.log 2>&1
>>>>>> 
>>>>>> (When I generate the OpenMPI 1.4.3 library we actually use, I also add a 
>>>>>> --prefix.  But, that complicates diffs of the stdout files for these 
>>>>>> steps, so it is not used here.  Thus, I do NOT proceed to make install 
>>>>>> any of these libraries.)
>>>>>> 
>>>>>> The three earlier versions of the Intel 2011 compilers all pass the make 
>>>>>> check tests.  When I compare the ./configure stdout files, they are all 
>>>>>> identical.  However, the ./configure stdout file for the Intel 
>>>>>> 2011.6.233 compilers has one difference:
>>>>>> 
>>>>>>> [root@hydra openmpi-1.4.3]# diff 
>>>>>>> ../openmpi-1.4.3-configure-intel.{5.220,6.233}.log
>>>>>>> 178c178
>>>>>>> < checking for __attribute__(may_alias)... no
>>>>>>> ---
>>>>>>> > checking for __attribute__(may_alias)... yes
>>>>>> 
>>>>>> That is obviously where I will start looking for the source of the 
>>>>>> problem.
>>>>>> 
>>>>>> Maybe someone reading this list knows what the purpose of that test is, 
>>>>>> whether the Intel 2011 compilers are expected to have this feature 
>>>>>> enabled, and whether the code this enables can cause this problem if the 
>>>>>> Intel 2011.6.233 compilers do not fully support whatever this test is 
>>>>>> intended to discern.
>>>>>> 
>>>>>> Larry Baker
>>>>>> US Geological Survey
>>>>>> 650-329-5608
>>>>>> ba...@usgs.gov
>>>>>> 
>>>>>> _______________________________________________
>>>>>> devel mailing list
>>>>>> de...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>>>> 
> <Intel20110811Fix.patch.txt>
