Thomas,

Can you please elaborate?
I checked the code of opal_os_dirpath_create and could not find where such
a thing can happen.

Thanks,

Gilles

On Wednesday, July 29, 2015, Thomas Jahns <ja...@dkrz.de> wrote:

> Hello,
>
> On 07/28/15 17:34, Schlottke-Lakemper, Michael wrote:
>
>> That’s what I suspected. Thank you for your confirmation.
>>
>
> You are mistaken: the allocation is 51 bytes long, i.e. valid bytes are at
> offsets 0 to 50. Since the 4-byte read starts at offset 48, the bytes at
> offsets 48, 49, 50 and 51 get read, and the last of these lies outside the
> allocation. In practice it probably does no harm at the moment, because
> virtually all allocators pad each allocation up to the next multiple of some
> power of 2. Still, it means the program is incorrect under the definition of
> any of the programming languages involved (might be C, C++ or Fortran).
>
> Regards, Thomas
>
>  On 25 Jul 2015, at 16:10, Ralph Castain <r...@open-mpi.org
>>> <mailto:r...@open-mpi.org>> wrote:
>>>
>>> Looks to me like a false positive - we do malloc some space, and do access
>>> different parts of it. However, it looks like we are inside the space at
>>> all times.
>>>
>>> I’d suppress it
>>>
>>>
>>>  On Jul 23, 2015, at 12:47 AM, Schlottke-Lakemper, Michael
>>>> <m.schlottke-lakem...@aia.rwth-aachen.de
>>>> <mailto:m.schlottke-lakem...@aia.rwth-aachen.de>> wrote:
>>>>
>>>> Hi folks,
>>>>
>>>> recently we’ve been getting a Valgrind error in PMPI_Init for our suite of
>>>> regression tests:
>>>>
>>>> ==5922== Invalid read of size 4
>>>> ==5922==    at 0x61CC5C0: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
>>>> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>>>> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>>>> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
>>>> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>>>> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
>>>> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
>>>> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
>>>> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
>>>> ==5922==  Address 0x710f670 is 48 bytes inside a block of size 51 alloc'd
>>>> ==5922==    at 0x4C29110: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==5922==    by 0x61CC572: opal_os_dirpath_create (in /aia/opt/openmpi-1.8.7/lib64/libopen-pal.so.6.2.2)
>>>> ==5922==    by 0x5F207E5: orte_session_dir (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>>>> ==5922==    by 0x5F34F04: orte_ess_base_app_setup (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>>>> ==5922==    by 0x7E96679: rte_init (in /aia/opt/openmpi-1.8.7/lib64/openmpi/mca_ess_env.so)
>>>> ==5922==    by 0x5F12A77: orte_init (in /aia/opt/openmpi-1.8.7/lib64/libopen-rte.so.7.0.6)
>>>> ==5922==    by 0x509883C: ompi_mpi_init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
>>>> ==5922==    by 0x50B843A: PMPI_Init (in /aia/opt/openmpi-1.8.7/lib64/libmpi.so.1.6.2)
>>>> ==5922==    by 0xEBA79C: ZFS::run() (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
>>>> ==5922==    by 0x4DC243: main (in /aia/r018/scratch/mic/.zfstester/.zacc_cron/zacc_cron_r9063/zfs_gnu_production)
>>>> ==5922==
>>>>
>>>> What is weird is that it seems to depend on the PBS/Torque session we’re in:
>>>> sometimes the error does not occur at all and all tests run fine (this is in
>>>> fact the only Valgrind error we’re having at the moment). Other times every
>>>> single test we’re running has this error.
>>>>
>>>> Has anyone seen this or might be able to offer an explanation? If it is a
>>>> false-positive, I’d be happy to suppress it :)
>>>>
>>>> Thanks a lot in advance
>>>>
>>>> Michael
>>>>
>>>> P.S.: This error is not covered/suppressed by the default ompi suppression
>>>> file in $PREFIX/share/openmpi.
>>>>
>>>>
>>>> --
>>>> Michael Schlottke-Lakemper
>>>>
>>>> SimLab Highly Scalable Fluids & Solids Engineering
>>>> Jülich Aachen Research Alliance (JARA-HPC)
>>>> RWTH Aachen University
>>>> Wüllnerstraße 5a
>>>> 52062 Aachen
>>>> Germany
>>>>
>>>> Phone: +49 (241) 80 95188
>>>> Fax: +49 (241) 80 92257
>>>> Mail: m.schlottke-lakem...@aia.rwth-aachen.de
>>>> <mailto:m.schlottke-lakem...@aia.rwth-aachen.de>
>>>> Web: http://www.jara.org/jara-hpc
>>>>
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post:
>>>> http://www.open-mpi.org/community/lists/users/2015/07/27303.php
>>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org <mailto:us...@open-mpi.org>
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2015/07/27328.php
>>>
>>
>>
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/07/27348.php
>>
>>
>
> --
> Thomas Jahns
> HD(CP)^2
> Abteilung Anwendungssoftware
>
> Deutsches Klimarechenzentrum GmbH
> Bundesstraße 45a • D-20146 Hamburg • Germany
>
> Phone:  +49 40 460094-151
> Fax:    +49 40 460094-270
> Email:  Thomas Jahns <ja...@dkrz.de>
> URL:    www.dkrz.de
>
> Geschäftsführer: Prof. Dr. Thomas Ludwig
> Sitz der Gesellschaft: Hamburg
> Amtsgericht Hamburg HRB 39784
>
>
