Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
On Thu, Jan 17, 2013 at 5:36 PM, Paul Hargrove  wrote:

> On Thu, Jan 17, 2013 at 4:37 PM, Paul Hargrove  wrote:
> [snip]
>
>> I just now ran tests on OpenBSD-5.2/i386 and OpenBSD-5.2/amd64, using
>> Clang-3.1.
>> Unfortunately, there is a mass of linker errors building libmpi_cxx.la (on
>> both systems).
>> I am trying again with --disable-mpi-cxx and will report my results later.
>>
> [snip]
>
> Using  --disable-mpi-cxx I still have linker problems, now from the C++
> lib(s) in VT.
> So, I've just gone ahead and tried CC=clang CXX=g++, which worked fine.
>
> Given the VT failure, I am guessing that the issue is clang++, rather than
> something in OMPI "proper".
> OR, perhaps it is because my Clang install pre-dates my upgrade from
> OpenBSD-5.1 to 5.2.
> I'll re-install Clang and post new results when I have them.
>

Re-installing clang made no difference.
This failure (clang++-3.1 on OpenBSD-5.2) doesn't bother me one iota.
However, if somebody wants to look into it, let me know and I can provide
the details (on- or off-list as determined by the requester).

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] MPI-2.2 status #2223, #3127

2013-01-17 Thread Kawashima, Takahiro
Hi,

Fujitsu is interested in completing MPI-2.2 support in Open MPI and in the
Open MPI-based Fujitsu MPI.

We've read the wiki and the tickets. These two tickets seem to be almost done
but need testing and bug fixing.

  https://svn.open-mpi.org/trac/ompi/ticket/2223
  MPI-2.2: MPI_Dist_graph_* functions missing

  https://svn.open-mpi.org/trac/ompi/ticket/3127
  MPI-2.2: Add reduction support for MPI_C_*COMPLEX and MPI::*COMPLEX

My colleagues are planning to work on these. They will write test codes
and try to fix bugs. Test codes and patches can be contributed to the
community. If they cannot fix some bugs, we will report the details. They
are planning to complete this work around March.
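
For illustration, a minimal smoke test touching both tickets might look like
the sketch below. It assumes only standard MPI-2.2 semantics for
MPI_Dist_graph_create_adjacent and for reductions on MPI_C_DOUBLE_COMPLEX;
it is a hypothetical example, not an existing test.

/* Hypothetical MPI-2.2 smoke test sketch for tickets #2223 and #3127. */
#include <complex.h>
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Ticket #2223: describe a ring as a distributed graph topology. */
    int src = (rank + size - 1) % size;
    int dst = (rank + 1) % size;
    MPI_Comm ring;
    MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                                   1, &src, MPI_UNWEIGHTED,
                                   1, &dst, MPI_UNWEIGHTED,
                                   MPI_INFO_NULL, 0, &ring);
    int indeg, outdeg, weighted;
    MPI_Dist_graph_neighbors_count(ring, &indeg, &outdeg, &weighted);
    if (indeg != 1 || outdeg != 1) {
        fprintf(stderr, "rank %d: unexpected degrees %d/%d\n",
                rank, indeg, outdeg);
    }
    MPI_Comm_free(&ring);

    /* Ticket #3127: reduction on a C complex type. */
    double complex val = rank + 1.0 * I, sum;
    MPI_Allreduce(&val, &sum, 1, MPI_C_DOUBLE_COMPLEX, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0) {
        /* Expected: sum of ranks (real part) and size (imaginary part). */
        printf("sum = %g + %gi (expected %g + %gi)\n",
               creal(sum), cimag(sum), size * (size - 1) / 2.0, (double)size);
    }

    MPI_Finalize();
    return 0;
}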

With that, two questions.

Are the latest statuses written in these tickets' comments correct?
Is there any more progress?

Where is the latest code?
Ticket #2223 says it is on Jeff's ompi-topo-fixes Bitbucket branch:
  https://bitbucket.org/jsquyres/ompi-topo-fixes
But Jeff seems to have one more branch with a similar name:
  https://bitbucket.org/jsquyres/ompi-topo-fixes-fixed
Ticket #3127 says it is on Jeff's mpi22-c-complex Bitbucket branch,
but there is no such branch now:
  https://bitbucket.org/jsquyres/mpi22-c-complex

Best regards,
Takahiro Kawashima,
MPI development team,
Fujitsu


Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
On Thu, Jan 17, 2013 at 4:37 PM, Paul Hargrove  wrote:
[snip]

> I just now ran tests on OpenBSD-5.2/i386 and OpenBSD-5.2/amd64, using
> Clang-3.1.
> Unfortunately, there is a mass of linker errors building libmpi_cxx.la (on
> both systems).
> I am trying again with --disable-mpi-cxx and will report my results later.
>
[snip]

Using  --disable-mpi-cxx I still have linker problems, now from the C++
lib(s) in VT.
So, I've just gone ahead and tried CC=clang CXX=g++, which worked fine.

Given the VT failure, I am guessing that the issue is clang++, rather than
something in OMPI "proper".
OR, perhaps it is because my Clang install pre-dates my upgrade from
OpenBSD-5.1 to 5.2.
I'll re-install Clang and post new results when I have them.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Jeff Squyres (jsquyres)
Sweet. :)

Sent from my phone. No type good.

On Jan 17, 2013, at 6:59 PM, "Paul Hargrove" <phhargr...@lbl.gov> wrote:

On Thu, Jan 17, 2013 at 2:26 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
[snip]
The BAD news is a new failure (SEGV in orted at exit) on OpenBSD-5.2/amd64, 
which I will report in a separate email once I've completed some triage.
[snip]

You can disregard the "BAD news" above.
Everything was fine with gcc, but fails with llvm-gcc.
Looking deeper (details upon request) the SEGV appears to be caused by a bug in 
llvm-gcc.

-Paul

--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel


Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
Kenneth,

I've built ompi w/ clang on other platforms before (incl. MacOSX Mountain
Lion), but not on OpenBSD.
I just now ran tests on OpenBSD-5.2/i386 and OpenBSD-5.2/amd64, using
Clang-3.1.
Unfortunately, there is a mass of linker errors building libmpi_cxx.la (on
both systems).
I am trying again with --disable-mpi-cxx and will report my results later.

Also, I had no problem with llvm-gcc on OpenBSD-5.2/i386; only on amd64 did
I see a problem.

-Paul


On Thu, Jan 17, 2013 at 4:12 PM, Kenneth A. Lloyd  wrote:

> Paul,
>
>
> Have you tried llvm with clang?
>
>
> Ken
>
>
> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
> Behalf Of Paul Hargrove
> Sent: Thursday, January 17, 2013 4:58 PM
> To: Open MPI Developers
> Subject: Re: [OMPI devel] 1.6.4rc1 has been posted
>
>
> On Thu, Jan 17, 2013 at 2:26 PM, Paul Hargrove  wrote:
> 
>
> [snip]
>
> The BAD news is a new failure (SEGV in orted at exit) on
> OpenBSD-5.2/amd64, which I will report in a separate email once I've
> completed some triage.
>
> [snip]
>
>
> You can disregard the "BAD news" above.
>
> Everything was fine with gcc, but fails with llvm-gcc.
>
> Looking deeper (details upon request) the SEGV appears to be caused by a
> bug in llvm-gcc.
>
>
> -Paul
>
>
> -- 
>
> Paul H. Hargrove  phhargr...@lbl.gov
>
> Future Technologies Group
>
> Computer and Data Sciences Department Tel: +1-510-495-2352
>
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Kenneth A. Lloyd
Paul,



Have you tried llvm with clang?



Ken



From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On
Behalf Of Paul Hargrove
Sent: Thursday, January 17, 2013 4:58 PM
To: Open MPI Developers
Subject: Re: [OMPI devel] 1.6.4rc1 has been posted



On Thu, Jan 17, 2013 at 2:26 PM, Paul Hargrove  wrote:

[snip]

The BAD news is a new failure (SEGV in orted at exit) on OpenBSD-5.2/amd64,
which I will report in a separate email once I've completed some triage.

[snip]



You can disregard the "BAD news" above.

Everything was fine with gcc, but fails with llvm-gcc.

Looking deeper (details upon request) the SEGV appears to be caused by a bug
in llvm-gcc.



-Paul



-- 

Paul H. Hargrove  phhargr...@lbl.gov

Future Technologies Group

Computer and Data Sciences Department Tel: +1-510-495-2352

Lawrence Berkeley National Laboratory Fax: +1-510-486-6900



Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
On Thu, Jan 17, 2013 at 2:26 PM, Paul Hargrove  wrote:
[snip]

> The BAD news is a new failure (SEGV in orted at exit) on
> OpenBSD-5.2/amd64, which I will report in a separate email once I've
> completed some triage.
>
[snip]

You can disregard the "BAD news" above.
Everything was fine with gcc, but fails with llvm-gcc.
Looking deeper (details upon request) the SEGV appears to be caused by a
bug in llvm-gcc.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] [patch] MPI-2.2: Ordering of attribution deletion callbacks on MPI_COMM_SELF

2013-01-17 Thread KAWASHIMA Takahiro
Jeff,

OK. I'll try implementing George's idea and then you can compare which
one is simpler.

Regards,
KAWASHIMA Takahiro

> Not that I'm aware of; that would be great.
> 
> Unlike George, however, I'm not concerned about converting to linear 
> operations for attributes.
> 
> Attributes are not used often, but when they are:
> 
> a) there aren't many of them (so a linear penalty is trivial)
> b) they're expected to be low performance
> 
> So if it makes the code simpler, I certainly don't mind linear operations.
> 
> 
> 
> On Jan 17, 2013, at 9:32 AM, KAWASHIMA Takahiro 
>  wrote:
> 
> > George,
> > 
> > Your idea makes sense.
> > Is anyone working on it? If not, I'll try.
> > 
> > Regards,
> > KAWASHIMA Takahiro
> > 
> >> Takahiro,
> >> 
> >> Thanks for the patch. I deplore the loss of the hash table in the
> >> attribute management, as the potential of turning every attribute
> >> operation into a linear-complexity one is not very appealing.
> >> 
> >> As you already took decision (C), it means that at the communicator
> >> destruction stage the hash table is not relevant anymore. Thus, I would
> >> have converted the hash table to an ordered list (ordered by the creation
> >> index, a global counter atomically updated every time an attribute is
> >> created), and proceeded to destroy the attributes in the desired order.
> >> That way, instead of having a linear operation for every operation on
> >> attributes, we only have a single linear operation per communicator (and
> >> only during the destruction stage).
> >> 
> >>  George.
> >> 
> >> On Jan 16, 2013, at 16:37 , KAWASHIMA Takahiro  
> >> wrote:
> >> 
> >>> Hi,
> >>> 
> >>> I've implemented ticket #3123 "MPI-2.2: Ordering of attribution deletion
> >>> callbacks on MPI_COMM_SELF".
> >>> 
> >>> https://svn.open-mpi.org/trac/ompi/ticket/3123
> >>> 
> >>> As this ticket says, attributes had been stored in an unordered hash.
> >>> So I've replaced opal_hash_table_t with opal_list_t and made the necessary
> >>> modifications for it. I've also fixed some multi-threaded concurrent
> >>> (get|set|delete)_attr call issues.
> >>> 
> >>> With this modification, the following behavior changes are introduced.
> >>> 
> >>>  (A) The MPI_(Comm|Type|Win)_(get|set|delete)_attr functions may be slower
> >>>      for MPI objects that have many attributes attached.
> >>>  (B) When the user-defined delete callback function is called, the
> >>>      attribute has already been removed from the list. In other words,
> >>>      if MPI_(Comm|Type|Win)_get_attr is called by the user-defined
> >>>      delete callback function for the same attribute key, it returns
> >>>      flag = false.
> >>>  (C) Even if the user-defined delete callback function returns a non-
> >>>      MPI_SUCCESS value, the attribute is not put back into the list.
> >>> 
> >>> (A) is due to a sequential list search instead of a hash lookup. See the
> >>> find_value function for its implementation.
> >>> (B) and (C) are due to an atomic deletion of the attribute, which allows
> >>> multi-threaded concurrent (get|set|delete)_attr calls in
> >>> MPI_THREAD_MULTIPLE.
> >>> See the ompi_attr_delete function for its implementation. I think this
> >>> does not matter because the MPI standard doesn't specify the behavior in
> >>> such cases.
> >>> 
> >>> The patch against the Open MPI trunk is attached. If you like it, please
> >>> take it in.
> >>> 
> >>> Though I'm an employee of a company, this is my independent and private
> >>> work, done at home. It contains no intellectual property from my company.
> >>> If needed, I'll sign the Individual Contributor License Agreement.
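
To make behaviors (B) and (C) concrete, here is a simplified, self-contained
sketch of the remove-before-callback ordering described above. It uses a plain
C linked list with hypothetical names (locking omitted); it is not OMPI's
actual opal_list_t / ompi_attr_delete code.

/* Sketch: unlink the attribute first, then run the delete callback. */
#include <stdlib.h>

typedef int (*delete_fn_t)(int keyval, void *value);

typedef struct attr {
    struct attr *next;
    int          keyval;
    void        *value;
    delete_fn_t  del;
} attr_t;

/* Delete the attribute with the given keyval from the list.  The node is
 * unlinked *before* the user callback runs, so a get_attr from inside the
 * callback no longer finds it (behavior B), and the node is freed even if
 * the callback returns an error (behavior C). */
int attr_delete(attr_t **head, int keyval)
{
    attr_t **p = head;
    while (*p != NULL && (*p)->keyval != keyval) {
        p = &(*p)->next;
    }
    if (*p == NULL) {
        return -1;                     /* key not found */
    }
    attr_t *node = *p;
    *p = node->next;                   /* unlink before invoking the callback */

    int err = 0;
    if (node->del != NULL) {
        err = node->del(node->keyval, node->value);
    }
    free(node);                        /* not put back even if err != 0 */
    return err;
}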


Re: [OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Paul Hargrove
Since I see OpenBSD twice on the list of changes, I've fired off my
automated testing on my OpenBSD platforms.
Since the "MPI datatype issues on OpenBSD" I reported against 1.7.0rc5 also
appeared on FreeBSD-6.3, I've tested that platform as well.

The good news is that the problems I've reported in the past appear to be
resolved.

The BAD news is a new failure (SEGV in orted at exit) on OpenBSD-5.2/amd64,
which I will report in a separate email once I've completed some triage.

-Paul


On Thu, Jan 17, 2013 at 12:49 PM, Jeff Squyres (jsquyres) <
jsquy...@cisco.com> wrote:

> In the usual location:
>
> http://www.open-mpi.org/software/ompi/v1.6/
>
> Here's a list of changes since 1.6.3:
>
> - Added performance improvements to the OpenIB (OpenFabrics) BTL.
> - Improved error message when process affinity fails.
> - Fixed MPI_MINLOC on man pages for MPI_REDUCE(_LOCAL).  Thanks to Jed
>   Brown for noticing the problem and supplying a fix.
> - Made malloc hooks more friendly to I/O interposers.  Thanks to the
>   bug report and suggested fix from Darshan maintainer Phil Carns.
> - Restored ability to direct launch under SLURM without PMI support.
> - Fixed MPI datatype issues on OpenBSD.
> - Major VT update to 5.14.2.
> - Support FCA v3.0+.
> - Fixed header file problems on OpenBSD.
> - Fixed issue with MPI_TYPE_CREATE_F90_REAL.
> - Fix an issue with using external libltdl installations.  Thanks to
>   opolawski for identifying the problem.
> - Fixed MPI_IN_PLACE case for MPI_ALLGATHER for FCA.
> - Allow SLURM PMI support to look in lib64 directories.  Thanks to
>   Guillaume Papaure for the patch.
> - Restore "use mpi" ABI compatibility with the rest of the 1.5/1.6
>   series (except for v1.6.3, where it was accidentally broken).
> - Fix a very old error in opal_path_access(). Thanks to Marco Atzeri
>   for chasing it down.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] 1.6.4rc1 has been posted

2013-01-17 Thread Jeff Squyres (jsquyres)
In the usual location:

http://www.open-mpi.org/software/ompi/v1.6/

Here's a list of changes since 1.6.3:

- Added performance improvements to the OpenIB (OpenFabrics) BTL.
- Improved error message when process affinity fails.
- Fixed MPI_MINLOC on man pages for MPI_REDUCE(_LOCAL).  Thanks to Jed
  Brown for noticing the problem and supplying a fix.
- Made malloc hooks more friendly to I/O interposers.  Thanks to the
  bug report and suggested fix from Darshan maintainer Phil Carns.
- Restored ability to direct launch under SLURM without PMI support.
- Fixed MPI datatype issues on OpenBSD.
- Major VT update to 5.14.2.
- Support FCA v3.0+.
- Fixed header file problems on OpenBSD.
- Fixed issue with MPI_TYPE_CREATE_F90_REAL.
- Fix an issue with using external libltdl installations.  Thanks to
  opolawski for identifying the problem.
- Fixed MPI_IN_PLACE case for MPI_ALLGATHER for FCA.
- Allow SLURM PMI support to look in lib64 directories.  Thanks to
  Guillaume Papaure for the patch.
- Restore "use mpi" ABI compatibility with the rest of the 1.5/1.6
  series (except for v1.6.3, where it was accidentally broken).
- Fix a very old error in opal_path_access(). Thanks to Marco Atzeri
  for chasing it down.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI devel] sanity check on 1.6.4 .so versions

2013-01-17 Thread Jeff Squyres (jsquyres)
Given that we've screwed this up before, could someone please sanity-check the
.so versions I'm planning on using for the 1.6.4 release?  Only 2 libraries
changed: libmpi and libopen-pal (most other changes were in VT and various 
components).  No interfaces changed.

Is this right?

## libmpi changed
# was: libmpi_so_version=1:6:0
libmpi_so_version=1:7:0
libmpi_cxx_so_version=1:1:0
libmpi_f77_so_version=1:6:0
libmpi_f90_so_version=4:0:3
libopen_rte_so_version=4:3:0
##  opal changed
# was: libopen_pal_so_version=4:3:0
libopen_pal_so_version=4:4:0
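# Sanity-check note, assuming the standard libtool current:revision:age rules:
#   - implementation changed, interfaces unchanged -> revision+1
#   - interfaces added                             -> current+1, revision=0, age+1
#   - interfaces removed/changed                   -> current+1, revision=0, age=0
# With "no interfaces changed", libmpi 1:6:0 -> 1:7:0 and libopen-pal
# 4:3:0 -> 4:4:0 both match the revision-only bump.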

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] [patch] MPI-2.2: Ordering of attribution deletion callbacks on MPI_COMM_SELF

2013-01-17 Thread Jeff Squyres (jsquyres)
Not that I'm aware of; that would be great.

Unlike George, however, I'm not concerned about converting to linear operations 
for attributes.

Attributes are not used often, but when they are:

a) there aren't many of them (so a linear penalty is trivial)
b) they're expected to be low performance

So if it makes the code simpler, I certainly don't mind linear operations.



On Jan 17, 2013, at 9:32 AM, KAWASHIMA Takahiro 
 wrote:

> George,
> 
> Your idea makes sense.
> Is anyone working on it? If not, I'll try.
> 
> Regards,
> KAWASHIMA Takahiro
> 
>> Takahiro,
>> 
>> Thanks for the patch. I deplore the loss of the hash table in the
>> attribute management, as the potential of turning every attribute
>> operation into a linear-complexity one is not very appealing.
>> 
>> As you already took decision (C), it means that at the communicator
>> destruction stage the hash table is not relevant anymore. Thus, I would
>> have converted the hash table to an ordered list (ordered by the creation
>> index, a global counter atomically updated every time an attribute is
>> created), and proceeded to destroy the attributes in the desired order.
>> That way, instead of having a linear operation for every operation on
>> attributes, we only have a single linear operation per communicator (and
>> only during the destruction stage).
>> 
>>  George.
>> 
>> On Jan 16, 2013, at 16:37 , KAWASHIMA Takahiro  
>> wrote:
>> 
>>> Hi,
>>> 
>>> I've implemented ticket #3123 "MPI-2.2: Ordering of attribution deletion
>>> callbacks on MPI_COMM_SELF".
>>> 
>>> https://svn.open-mpi.org/trac/ompi/ticket/3123
>>> 
>>> As this ticket says, attributes had been stored in an unordered hash.
>>> So I've replaced opal_hash_table_t with opal_list_t and made the necessary
>>> modifications for it. I've also fixed some multi-threaded concurrent
>>> (get|set|delete)_attr call issues.
>>> 
>>> With this modification, the following behavior changes are introduced.
>>> 
>>>  (A) The MPI_(Comm|Type|Win)_(get|set|delete)_attr functions may be slower
>>>      for MPI objects that have many attributes attached.
>>>  (B) When the user-defined delete callback function is called, the
>>>      attribute has already been removed from the list. In other words,
>>>      if MPI_(Comm|Type|Win)_get_attr is called by the user-defined
>>>      delete callback function for the same attribute key, it returns
>>>      flag = false.
>>>  (C) Even if the user-defined delete callback function returns a non-
>>>      MPI_SUCCESS value, the attribute is not put back into the list.
>>> 
>>> (A) is due to a sequential list search instead of a hash lookup. See the
>>> find_value function for its implementation.
>>> (B) and (C) are due to an atomic deletion of the attribute, which allows
>>> multi-threaded concurrent (get|set|delete)_attr calls in
>>> MPI_THREAD_MULTIPLE.
>>> See the ompi_attr_delete function for its implementation. I think this
>>> does not matter because the MPI standard doesn't specify the behavior in
>>> such cases.
>>> 
>>> The patch against the Open MPI trunk is attached. If you like it, please
>>> take it in.
>>> 
>>> Though I'm an employee of a company, this is my independent and private
>>> work, done at home. It contains no intellectual property from my company.
>>> If needed, I'll sign the Individual Contributor License Agreement.
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] [patch] MPI-2.2: Ordering of attribution deletion callbacks on MPI_COMM_SELF

2013-01-17 Thread KAWASHIMA Takahiro
George,

Your idea makes sense.
Is anyone working on it? If not, I'll try.

Regards,
KAWASHIMA Takahiro

> Takahiro,
> 
> Thanks for the patch. I deplore the loss of the hash table in the
> attribute management, as the potential of turning every attribute
> operation into a linear-complexity one is not very appealing.
> 
> As you already took decision (C), it means that at the communicator
> destruction stage the hash table is not relevant anymore. Thus, I would
> have converted the hash table to an ordered list (ordered by the creation
> index, a global counter atomically updated every time an attribute is
> created), and proceeded to destroy the attributes in the desired order.
> That way, instead of having a linear operation for every operation on
> attributes, we only have a single linear operation per communicator (and
> only during the destruction stage).
> 
>   George.
> 
> On Jan 16, 2013, at 16:37 , KAWASHIMA Takahiro  
> wrote:
> 
> > Hi,
> > 
> > I've implemented ticket #3123 "MPI-2.2: Ordering of attribution deletion
> > callbacks on MPI_COMM_SELF".
> > 
> >  https://svn.open-mpi.org/trac/ompi/ticket/3123
> > 
> > As this ticket says, attributes had been stored in an unordered hash.
> > So I've replaced opal_hash_table_t with opal_list_t and made the necessary
> > modifications for it. I've also fixed some multi-threaded concurrent
> > (get|set|delete)_attr call issues.
> > 
> > With this modification, the following behavior changes are introduced.
> > 
> >  (A) The MPI_(Comm|Type|Win)_(get|set|delete)_attr functions may be slower
> >      for MPI objects that have many attributes attached.
> >  (B) When the user-defined delete callback function is called, the
> >      attribute has already been removed from the list. In other words,
> >      if MPI_(Comm|Type|Win)_get_attr is called by the user-defined
> >      delete callback function for the same attribute key, it returns
> >      flag = false.
> >  (C) Even if the user-defined delete callback function returns a non-
> >      MPI_SUCCESS value, the attribute is not put back into the list.
> > 
> > (A) is due to a sequential list search instead of a hash lookup. See the
> > find_value function for its implementation.
> > (B) and (C) are due to an atomic deletion of the attribute, which allows
> > multi-threaded concurrent (get|set|delete)_attr calls in
> > MPI_THREAD_MULTIPLE.
> > See the ompi_attr_delete function for its implementation. I think this
> > does not matter because the MPI standard doesn't specify the behavior in
> > such cases.
> > 
> > The patch against the Open MPI trunk is attached. If you like it, please
> > take it in.
> > 
> > Though I'm an employee of a company, this is my independent and private
> > work, done at home. It contains no intellectual property from my company.
> > If needed, I'll sign the Individual Contributor License Agreement.


Re: [OMPI devel] [patch] MPI-2.2: Ordering of attribution deletion callbacks on MPI_COMM_SELF

2013-01-17 Thread George Bosilca
Takahiro,

Thanks for the patch. I deplore the loss of the hash table in the attribute
management, as the potential of turning every attribute operation into a
linear-complexity one is not very appealing.

As you already took decision (C), it means that at the communicator
destruction stage the hash table is not relevant anymore. Thus, I would have
converted the hash table to an ordered list (ordered by the creation index, a
global counter atomically updated every time an attribute is created), and
proceeded to destroy the attributes in the desired order. That way, instead of
having a linear operation for every operation on attributes, we only have a
single linear operation per communicator (and only during the destruction
stage).
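
A rough, self-contained sketch of that scheme (hypothetical names, plain C,
locking and error handling omitted; not OMPI's actual attribute code) might
look like:

/* Destroy a communicator's attributes ordered by a global creation index. */
#include <stdlib.h>

typedef struct {
    unsigned long creation_index;   /* taken from a global atomic counter */
    int           keyval;
    void         *value;
    int         (*del)(int keyval, void *value);
} attr_entry_t;

static int cmp_by_creation(const void *a, const void *b)
{
    const attr_entry_t *x = a, *y = b;
    return (x->creation_index > y->creation_index) -
           (x->creation_index < y->creation_index);
}

/* Called once at communicator destruction; this is the only linear/sorting
 * cost.  'entries' would be gathered from the per-communicator hash table. */
void destroy_attrs_in_creation_order(attr_entry_t *entries, size_t n)
{
    qsort(entries, n, sizeof(*entries), cmp_by_creation);
    /* MPI-2.2 requires MPI_COMM_SELF attributes to be deleted in the
     * reverse of the order in which they were set, so walk newest first. */
    for (size_t i = n; i-- > 0; ) {
        if (entries[i].del != NULL) {
            entries[i].del(entries[i].keyval, entries[i].value);
        }
    }
}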

  George.

On Jan 16, 2013, at 16:37 , KAWASHIMA Takahiro  
wrote:

> Hi,
> 
> I've implemented ticket #3123 "MPI-2.2: Ordering of attribution deletion
> callbacks on MPI_COMM_SELF".
> 
>  https://svn.open-mpi.org/trac/ompi/ticket/3123
> 
> As this ticket says, attributes had been stored in an unordered hash.
> So I've replaced opal_hash_table_t with opal_list_t and made the necessary
> modifications for it. I've also fixed some multi-threaded concurrent
> (get|set|delete)_attr call issues.
> 
> With this modification, the following behavior changes are introduced.
> 
>  (A) The MPI_(Comm|Type|Win)_(get|set|delete)_attr functions may be slower
>      for MPI objects that have many attributes attached.
>  (B) When the user-defined delete callback function is called, the
>      attribute has already been removed from the list. In other words,
>      if MPI_(Comm|Type|Win)_get_attr is called by the user-defined
>      delete callback function for the same attribute key, it returns
>      flag = false.
>  (C) Even if the user-defined delete callback function returns a non-
>      MPI_SUCCESS value, the attribute is not put back into the list.
> 
> (A) is due to a sequential list search instead of a hash lookup. See the
> find_value function for its implementation.
> (B) and (C) are due to an atomic deletion of the attribute, which allows
> multi-threaded concurrent (get|set|delete)_attr calls in MPI_THREAD_MULTIPLE.
> See the ompi_attr_delete function for its implementation. I think this does
> not matter because the MPI standard doesn't specify the behavior in such
> cases.
> 
> The patch against the Open MPI trunk is attached. If you like it, please
> take it in.
> 
> Though I'm an employee of a company, this is my independent and private
> work, done at home. It contains no intellectual property from my company.
> If needed, I'll sign the Individual Contributor License Agreement.
> 
> Regards,
> KAWASHIMA Takahiro
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel