Re: [Paraview] Parallel Data Redistribution

burlen Wed, 16 Dec 2009 13:23:57 -0800

oops typo: The ooc reader is a vtkObject.

burlen wrote:

Hey John,
Also : for dynamic load balancing, I'd like to instruct several readers to read the same piece - since the algorithm controls (for example) the particles, the algorithm can internally communicate information about what to do amongst its processes, but it can't talk upstream to the readers and fudge them.
I am wondering if there is any way of supporting this kind of thing using the current information keys and my instinct says no.
I guess you can kind of do this with the current "request update" stuff, but thanks to the flexibility of the pipeline information key,value pairs you can also roll your own very easily.
I recently implemented dynamic load balancing in a new stream line tracer. To get the work load balanced it's crucial that each process have on-demand access to the entire data set. I accomplished it with information keys and by using a "meta-reader" in place of the traditional ParaView reader. The meta-reader does two things: it populates the new keys, and it gives PV a dummy dataset that is one cell per process such that the bounds, shape, and array names are the same as the real dataset, which is not read during the meta-reader execution.
When the stream tracer executes downstream of the meta-reader it picks the keys out of the pipeline information. The important key,value is an out-of-core (ooc) reader; the ooc reader is a vtkObject so that it can be passed through the information. Once the stream tracer has it, it can make repeated IO requests as needed while particles move through the dataset. My interface accepts a point and returns a chunk of data. The ooc reader internally handles caching and memory management. In this way you can keep all processes busy all the time when tracing stream lines.
The approach worked out well and was very simple to implement, with no modification to the executive. Also the filter has control of caching, and can free all the memory at the end of its execution, which significantly reduces the memory footprint compared to the traditional PV reader. And I need not worry if PV or some upstream filter uses MPI communications in between my IO requests. There is a little more to our scheduling algorithm which I won't discuss now, but so far for making Poincaré maps we scaled well up to 2E7 stream lines per frame and 96 processes, and we minimize the memory footprint, which is important to us.
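To make the "accepts a point and returns a chunk of data" interface concrete, here is a minimal pure-Python sketch of an out-of-core reader with an internal cache. It is not the code from the tracer described above and it does not use VTK: the class name OutOfCoreReader, the load_chunk callback (standing in for the real disk IO), the cubic chunk layout, and the least-recently-used eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class OutOfCoreReader:
    """Hypothetical ooc reader: returns the chunk of data containing a point."""

    def __init__(self, load_chunk, chunk_size, max_cached=4):
        self.load_chunk = load_chunk    # callable: chunk index -> data (stands in for disk IO)
        self.chunk_size = chunk_size    # edge length of a cubic chunk (assumed layout)
        self.max_cached = max_cached    # number of chunks kept in memory at once
        self.cache = OrderedDict()      # chunk index -> data, least recently used first
        self.loads = 0                  # count of real IO requests issued

    def chunk_index(self, point):
        # Map a point to the integer index of the chunk that contains it.
        return tuple(int(c // self.chunk_size) for c in point)

    def read(self, point):
        key = self.chunk_index(point)
        if key in self.cache:
            self.cache.move_to_end(key)         # mark as most recently used
        else:
            self.cache[key] = self.load_chunk(key)
            self.loads += 1
            if len(self.cache) > self.max_cached:
                self.cache.popitem(last=False)  # evict the least recently used chunk
        return self.cache[key]

    def release(self):
        # The filter can drop every cached chunk when its execution ends.
        self.cache.clear()


# A tracer would call read() repeatedly as particles move through the dataset.
reader = OutOfCoreReader(lambda key: {"chunk": key}, chunk_size=10.0, max_cached=2)
chunk = reader.read((1.0, 2.0, 3.0))
reader.release()
```

Because the cache lives inside the reader object, the filter that holds it decides when memory is freed, which is the property the message credits for the smaller footprint.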
Berk and Ken already basically gave you all the options you need, but I add this because it shows how flexible and powerful the pipeline information really is.
Burlen

Biddiscombe, John A. wrote:
Berk,
We had a discussion back in 2008, which resides here: http://www.cmake.org/pipermail/paraview/2008-May/008170.html
Continuing from this, my question of the other day touches on the same problem.
I'd like to manipulate the piece number read by each reader. As mentioned before, UPDATE_PIECE is not passed into RequestInformation at first (since nobody knows how many pieces there are yet!), so I can't (directly) generate information in the reader which is 'piece dependent'. And I can't be sure that someone doing streaming won't interfere with piece numbers when using the code differently.
For the particle tracer (for example), I'd like to tell the upstream pipeline to read no pieces when certain processes are empty of particles (currently they update and generate, i.e. read, data when they don't need to). I may be able to suppress the forward upstream somehow, but I don't know of an easy way for the algorithm to say "Stop" to the executive to prevent it updating if the timestep changes but the algorithm has determined that no processing is required (ForwardUpstream of Requests continues unabated). I'd like to set the UpdatePiece to -1 to tell the executive to stop operating.
Also : for dynamic load balancing, I'd like to instruct several readers to read the same piece - since the algorithm controls (for example) the particles, the algorithm can internally communicate information about what to do amongst its processes, but it can't talk upstream to the readers and fudge them.
I am wondering if there is any way of supporting this kind of thing using the current information keys and my instinct says no. It seems like the update piece and numpieces were really intended for streaming and we need two kinds of 'pieces', one for streaming, another for splitting in _parallel_, because they aren't quite the same. (Please note that I haven't actually tried changing piece requests in the algorithms yet, so I'm only guessing that it won't work properly)
<cough>
UPDATE_STREAM_PIECE
UPDATE_PARALLEL_PIECE <\cough>

Comments?

JB
I would have the reader (most parallel readers do this) generate empty
data on all processes of id >= N. Then your filter can redistribute
from those N processes to all M processes. I am pretty sure
RedistributePolyData can do this for polydata as long as you set the
weight to 1 on all processes. Ditto for D3.

-berk
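The bookkeeping behind this suggestion can be sketched in a few lines of plain Python. This is not what vtkRedistributePolyData or D3 do internally; it only illustrates which ranks hold data after the read and where the split pieces would be sent. The function names and the even split of the M destinations across the N source blocks are assumptions.

```python
def reader_has_data(rank, n_blocks):
    """Ranks 0..N-1 read one block each; ranks >= N produce empty data."""
    return rank < n_blocks


def redistribution_plan(n_blocks, m_procs):
    """Return {source rank: [destination ranks]} for fanning N blocks out to M processes.

    Assumes each source block is split into roughly M/N sub-blocks and that
    destination rank d receives its sub-block from source rank d * N // M.
    """
    if not 0 < n_blocks <= m_procs:
        raise ValueError("expected 0 < N <= M")
    plan = {src: [] for src in range(n_blocks)}
    for dest in range(m_procs):
        plan[dest * n_blocks // m_procs].append(dest)
    return plan


# Example: 3 blocks on disk, 8 processes in the job.
plan = redistribution_plan(3, 8)
for src, dests in plan.items():
    print(f"rank {src} reads one block and sends sub-blocks to ranks {dests}")
```

Every block is read from disk exactly once; the remaining traffic is explicit sends between processes, and a destination equal to the source rank simply keeps its sub-block locally.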
On Fri, Dec 11, 2009 at 4:13 PM, Biddiscombe, John A. <biddi...@cscs.ch> wrote:
Berk
It sounds like M is equal to the number of processors (pipelines) and
M >> N. Is that correct?
Yes, that's the idea. N blocks, broken (in place) into M new blocks, then
fanned out to the M processes downstream where they can be processed
separately. If it were on a single node, then each block could be a
separate 'connection' to a downstream filter, but distributed, an explicit
send is needed.
JB
-berk
On Fri, Dec 11, 2009 at 10:40 AM, Biddiscombe, John A. <biddi...@cscs.ch> wrote:
Berk
The data will be UnstructuredGrid for now. Multiblock, but actually, I
don't really care what each block is, only that I accept one block on
each of N processes, split it into more pieces, and the next filter
accepts one (or more if the numbers don't match up nicely) blocks and
processes them. The redistribution shouldn't care what data types, only
how many blocks in and out.
Looking at RedistributePolyData makes me realize my initial idea is no
good. In my mind I had a pipeline where multiblock datasets are passed
down the pipeline and simply the number of pieces is manipulated to
achieve what I wanted - but I see now that if I have M pieces downstream
mapped upstream to N pieces, what will happen is the readers will be
effectively duplicated and M/N readers will read the same pieces. I
don't want this to happen as IO will be a big problem if readers read
the same blocks M/N times.
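The M/N duplication is easy to check with a little arithmetic. The sketch below counts how often each upstream piece would be read if every one of the M downstream pieces simply requested its own upstream piece. The mapping p -> p * N // M is an assumption chosen for illustration; the point is only that M requests land on N pieces.

```python
from collections import Counter


def upstream_piece(downstream_piece, m_pieces, n_pieces):
    """Upstream piece requested by a downstream piece (assumed even mapping)."""
    return downstream_piece * n_pieces // m_pieces


def reads_per_piece(m_pieces, n_pieces):
    """Count how many downstream pieces trigger a read of each upstream piece."""
    return Counter(upstream_piece(p, m_pieces, n_pieces) for p in range(m_pieces))


# Example: 96 downstream pieces mapped onto 8 blocks on disk.
counts = reads_per_piece(96, 8)
total_reads = sum(counts.values())
print(f"{total_reads} block reads instead of {len(counts)}")
```

With an explicit send after a single read per block, the disk is touched N times rather than M times, at the cost of moving the split pieces between processes.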
I was hoping there was a way of simply instructing the pipeline to
manage the pieces, but I see now that this won't work, as there needs to
be a specific Send from each N to their M/N receivers (because the data
is physically in another process, so the pipeline can't see it). This is
very annoying as there must be a class which already does this (block
redistribution, rather than polygon level redistribution), and I would
like it to be more 'pipeline integrated' so that the user doesn't have to
explicitly send each time an algorithm needs it.
I'll go through RedistributePolyData in depth and see what I can pull
out of it - please feel free to steer me towards another possibility :)
JB
-----Original Message-----
From: Berk Geveci [mailto:berk.gev...@kitware.com]
Sent: 11 December 2009 16:09
To: Biddiscombe, John A.
Cc: paraview@paraview.org
Subject: Re: [Paraview] Parallel Data Redistribution
What is the data type? vtkRedistributePolyData and its subclasses do
this for polydata. It can do load balancing (where you can specify a
weight for each processor) as well.

-berk

On Fri, Dec 11, 2009 at 9:59 AM, Biddiscombe, John A. <biddi...@cscs.ch> wrote:
I have a filter pipeline which reads N blocks from disk; this works fine
on N processors.
I now wish to subdivide those N blocks (using a custom filter) to produce
new data which will consist of M blocks - where M >> N.
I wish to run the algorithm on M processors and have the piece
information transformed between the two filters (reader -> splitter), so
that blocks are distributed correctly. The reader will read N blocks
(leaving M-N processes unoccupied), but the filter which splits them up
needs to output a different number of pieces and have the full M
processes receiving data.
I have a reasonably good idea of how to implement this, but I'm wondering
if any filters already do something similar. I will of course take apart
the D3 filter for ideas, but I don't need to do a parallel spatial
decomposition since my blocks are already discrete - I just want to
redistribute the blocks around and, more importantly, change the numbers
of them between filters.
If anyone can suggest examples which do this already, please do.

Thanks

JB

--
John Biddiscombe,                            email:biddisco @ cscs.ch
http://www.cscs.ch/
CSCS, Swiss National Supercomputing Centre  | Tel:  +41 (91) 610.82.07
Via Cantonale, 6928 Manno, Switzerland      | Fax:  +41 (91) 610.82.82
_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html
Please keep messages on-topic and check the ParaView Wiki at:
http://paraview.org/Wiki/ParaView
Follow this link to subscribe/unsubscribe:
http://www.paraview.org/mailman/listinfo/paraview

