Many thanks for the insight, Ken. Yes, we set the channel wall's opacity to 
0.23 so that we can see how the vortical flow passes through the wall, so it 
looks like the data redistribution needed for visibility reordering of 
transparent geometry is to blame for the performance overhead. We also 
rendered three views (top, side, and front) for each time point, each with 
transparent walls, which compounds the cost. We are also planning to move the 
vorticity computation from ParaView into the solver, which should reduce the 
coprocessing tax a little.

We will see whether we can get by without rendering transparent walls. In the 
meantime, please let me know if you have any suggestions for working around 
the data redistribution required to render transparent surfaces.

Thanks again and best regards,

Hong

________________________________
From: Moreland, Kenneth [kmo...@sandia.gov]
Sent: Wednesday, November 27, 2013 12:06 PM
To: Berk Geveci; Hong Yi
Cc: paraview@paraview.org
Subject: Re: [Paraview] In-situ file/image output on Titan with 18k cores

I also wonder if you are trying to render anything transparent (volume 
rendering, opacity < 1, or transparency in the color map). If you do, then 
ParaView will redistribute the data to create a visibility reordering. 
Although the rendering itself has been scaled past 18K cores, the data 
redistribution (using the D3 filter) has not. In fact, I would not be 
surprised if it took that long even with very little data.

In short, try rendering only opaque surfaces if you have not yet tried that.
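
In a Catalyst Python script, forcing a surface opaque is a one-line change on 
its representation. A minimal sketch (the "wall" source and "view" names here 
are hypothetical placeholders for whatever your pipeline actually calls them):

    from paraview.simple import *

    # 'wall' and 'view' are hypothetical handles to the channel-wall
    # source and the render view set up earlier in the Catalyst script.
    rep = Show(wall, view)   # returns the representation proxy for this view
    rep.Opacity = 1.0        # fully opaque: no visibility sort, no D3 step

With every surface at opacity 1.0, compositing can rely on plain z-buffer 
depth tests and the redistribution step is skipped entirely.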

-Ken

From: Berk Geveci <berk.gev...@kitware.com>
Date: Wednesday, November 27, 2013 5:46 AM
To: Hong Yi <hon...@renci.org>
Cc: paraview@paraview.org
Subject: [EXTERNAL] Re: [Paraview] In-situ file/image output on Titan with 18k 
cores

Hi Hong,

> 1. It appears IceT-based image compositing for 18k cores takes such a long 
> time that it becomes impractical to output images in-situ. Specifically, in 
> our case, coprocessing for one time point that outputs a composited image 
> takes about 14 minutes, while the simulation alone for one time point takes 
> only about 7 seconds. I have also done a simulation run with in-situ 
> visualization on Titan with 64 cores on a much lower resolution mesh (a 10 
> million element mesh, as opposed to the 167 million element mesh for the 
> 18k core run), in which case coprocessing with image output for 64 cores 
> takes about 25 seconds. Question: is there any way to improve the 
> performance of image compositing for 18k cores for in-situ visualization?

This doesn't make a lot of sense. Image compositing performance is not 
strongly tied to the number of polygons; it is much more closely related to 
the number of cores and the image size. So 64 cores with small data should 
not perform so much better than 18K cores with large data. Since IceT takes 
bounding boxes into account when compositing, there may be performance gains 
when rendering less geometry, but not to the extent that you are describing.

On the other hand, I can see Mesa rendering performance being an issue. The 
18K run probably has significantly more polygons per MPI rank, especially if 
the polygons are not distributed evenly. This is definitely worth 
investigating. Do you have cycles to run a few more cases? We can instrument 
things a bit better to see what is taking this much time.
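
As a first pass, even coarse rank-local timers in the coprocessing script 
would tell us whether the time goes into the pipeline update or into 
rendering and compositing. A rough sketch, assuming the usual generated 
script layout where a "coprocessor" object is created at module scope and 
DoCoProcessing(datadescription) is the entry point:

    import time

    def DoCoProcessing(datadescription):
        t0 = time.time()
        coprocessor.UpdateProducers(datadescription)   # pipeline execution
        t1 = time.time()
        coprocessor.WriteImages(datadescription)       # render + composite
        t2 = time.time()
        # Per-rank wall-clock times; a wide spread across ranks would point
        # at load imbalance in the geometry rather than compositing itself.
        print('update: %.1f s, render+composite: %.1f s' % (t1 - t0, t2 - t1))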

> 2. I also tried avoiding image output and instead writing polydata 
> extracts with XMLPPolyDataWriter on 18k cores. In this case, in-situ 
> coprocessing takes only about 20 seconds (compared to 14 minutes with 
> image output). However, so many files are generated that they break the 
> hard limit on the maximum number of files in a directory, since the 
> parallel writer writes a vtp file from each of the 18k cores. So the 
> output data files have to be split across different directories. However, 
> I get a “cannot find file” error when I put a directory name in the 
> parameter to the coprocessor.CreateWriter() call in my Python script. I 
> initially tried “data/vorticity_%t.pvtp” as the parameter, but it fails 
> with the “cannot find file” error. I am not sure whether this is a bug or 
> whether I need to pass an absolute path rather than a path relative to 
> the current directory. Another question is whether there is a way to 
> combine the files generated by the different cores into one single file 
> during coprocessing, so that only one file is produced rather than a huge 
> number when running on a large number of cores.

We are working on ADIOS-based readers and writers that will allow writing to 
a single bp file. This should be ready sometime in January and should make 
things much better.
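
On the “cannot find file” error with “data/vorticity_%t.pvtp”: as far as I 
know the parallel writers do not create missing directories, and relative 
paths are resolved against the simulation's working directory, so the write 
fails if data/ does not already exist there. Creating the directory up front, 
near the top of the Catalyst script, should rule that out. A minimal sketch 
using only the standard library:

    import os

    # Create the output directory before any writer runs. Every rank may
    # attempt this; the "already exists" race between ranks is harmless.
    try:
        os.makedirs('data')
    except OSError:
        pass

If the error persists with the directory in place, trying an absolute path 
would confirm whether the working directory is what you think it is.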

-berk

On Tue, Nov 26, 2013 at 10:31 AM, Hong Yi <hon...@renci.org> wrote:
>
> I have done several simulation runs linked with ParaView Catalyst for 
> in-situ visualization on Titan with 18k cores, and I have the following 
> observations and questions that I am hoping to get input on from this list.
>
>
>
> 1. It appears IceT-based image compositing for 18k cores takes such a long 
> time that it becomes impractical to output images in-situ. Specifically, in 
> our case, coprocessing for one time point that outputs a composited image 
> takes about 14 minutes, while the simulation alone for one time point takes 
> only about 7 seconds. I have also done a simulation run with in-situ 
> visualization on Titan with 64 cores on a much lower resolution mesh (a 10 
> million element mesh, as opposed to the 167 million element mesh for the 
> 18k core run), in which case coprocessing with image output for 64 cores 
> takes about 25 seconds. Question: is there any way to improve the 
> performance of image compositing for 18k cores for in-situ visualization?
>
> 2. I also tried avoiding image output and instead writing polydata 
> extracts with XMLPPolyDataWriter on 18k cores. In this case, in-situ 
> coprocessing takes only about 20 seconds (compared to 14 minutes with 
> image output). However, so many files are generated that they break the 
> hard limit on the maximum number of files in a directory, since the 
> parallel writer writes a vtp file from each of the 18k cores. So the 
> output data files have to be split across different directories. However, 
> I get a “cannot find file” error when I put a directory name in the 
> parameter to the coprocessor.CreateWriter() call in my Python script. I 
> initially tried “data/vorticity_%t.pvtp” as the parameter, but it fails 
> with the “cannot find file” error. I am not sure whether this is a bug or 
> whether I need to pass an absolute path rather than a path relative to 
> the current directory. Another question is whether there is a way to 
> combine the files generated by the different cores into one single file 
> during coprocessing, so that only one file is produced rather than a huge 
> number when running on a large number of cores.
>
> Thanks for any input, suggestions, and comments!
>
>
>
> Regards,
>
> Hong
>
>
_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at 
http://www.kitware.com/opensource/opensource.html

Please keep messages on-topic and check the ParaView Wiki at: 
http://paraview.org/Wiki/ParaView

Follow this link to subscribe/unsubscribe:
http://www.paraview.org/mailman/listinfo/paraview
