[petsc-dev] Generality of VecScatter

Mark F. Adams Fri, 25 Nov 2011 10:38:49 -0500

On Nov 24, 2011, at 2:12 PM, Jed Brown wrote:

> On Thu, Nov 24, 2011 at 12:40, Mark F. Adams <mark.adams at columbia.edu> 
> wrote:
> OK, that makes sense at least abstractly.  I'm skeptical that you are going 
> to be able to do much with all this one sided stuff; it seems to be that you 
> would always have the two sided data at some point and then you might setup 
> one sided maps for repeated operations like for a MatVec that are efficient.  
> But this is kind of trivial.
> 
> Yeah, that's easy. The update for MatMult() just involves each process 
> calling one MPI_Get() for each remote process that it needs data from. What 
> to get is encoded in an MPI_Datatype so there is no user-visible packing on 
> either side. These calls are all non-blocking and you finish the update with 
> MPI_Win_fence().
> 
> You can test performance of this approach using -vecscatter_window (although 
> it's slightly different because it does user-visible packing).
> 
> Of course we can easily build a two-sided representation, but it's not 
> necessary. In the one-sided model above, the owner of a global point does not 
> know how many local parts accessed it (and nowhere inside the MPI 
> implementation or network stack is there storage proportional to the number 
> of accessors, so there isn't a scalability problem with globally coupled 
> dofs).
>  
> 
> I would suggest writing a non-trivail algorithm with your model.  I have two 
> ideas:
> 
> 1) when a coarse AMG (use aggregation MG to me concrete) grid is constructed 
> and you have a map (bipartite graph) with one edge for each fine grid point 
> to a coarse grid point (not necessarily on your processor).  I have too few 
> vertices per processor so I want to reduce the number of processors used in 
> the coarse grid.
> 
> Do you assume the availability of a partitioner at this stage,


No, lets keep it simple for now.

> or you just want to do something simple like rank-order aggregation?

Thats fine.

> In my model, the owner of a point would somehow (e.g. by calling a 
> partitioner) determine the new owner of that point.

My point is that at some point the rubber has to hit the road and you need two 
side information.  You can then set up pointers for some fast one-sided stuff, 
and that is fine but you seem to agree that this is trivial.  Its not clear to 
me that your model is not simply removing some (redundant) data from the user 
space, while it is probably in the MPI library.

> This information is communicated to the ghosters of that point (again without 
> knowing how many there are),

But you knew how many there were, and chose to forget it, and the MPI layer 
must know how many there are so it knows when its done communicating.  I'm not 
sure how you insure correctness with your model.

> any other node/connectivity data is communicated from old owner to new owner 
> (this is a 1-1 relation, thus trivial with these primitives), and finally the 
> ghosters address the new location when they need updates.
>  
>  I can assume that my fine grid was well partitioned, and to keep the process 
> local assume the fine grid was partitioned with a space filling curve, or 
> something, such that when I aggregate processors (eg, (0-3), (4-7) ... if I 
> reduced the number of active processor by 4) I still have decent data 
> locality.  Eventually I would like a local algorithm to improve load balance 
> at this point, but that is more than you might want to start with.
> 
> If you have a local graph partitioner that can tell you what to move for 
> better load balance, then I think I can move it efficiently and without 
> redundant storage or specification.

How do you load balance with a local partitioner?

Mark

>  
> 
> 2) Any AMR algorithm that you like.  Again, you might want to punt on load 
> balancing.
> 
> I think the primitives are fine for this. There is an algorithmic question of 
> how to modify boundary zones with minimal communication, but if you decide 
> how much you want to communicate when (e.g. on the chalk board), I will 
> figure out how to implement it.
> 
> I think redistribution for load balancing is represented pretty elegantly, 
> but it would be good to work out the details.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
<http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20111125/c6caee99/attachment.html>

[petsc-dev] Generality of VecScatter

Reply via email to