Parallelizing the rendering pipeline for faster rendering/painting
operations isn't the primary goal we had in GeoWave with the distributed
rendering request; rather, we wanted to short-circuit the actual reading of
features off of disk/memory, based on "not coloring pixels twice for the
same FeatureTypeStyle rule".
Consider the case where a system has 100GB of data on disk (ignoring memory
caches for now). Reading all that data from disk at, say, 250MB/sec (SSD,
RAID, etc.) still takes almost 7 minutes, regardless of how fast the
rendering is.
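To make the arithmetic explicit, here is a quick back-of-the-envelope check. It uses only the two figures quoted above and assumes decimal units (1GB = 1000MB); the variable names are just for illustration.

```python
# Back-of-the-envelope read time for a full scan.
data_mb = 100 * 1000      # 100GB expressed in MB (decimal units assumed)
throughput_mb_s = 250     # sustained sequential read rate, MB/sec

seconds = data_mb / throughput_mb_s
minutes = seconds / 60

print(f"{seconds:.0f} s = {minutes:.1f} min")  # 400 s = 6.7 min
```

The rendering cost does not appear in this calculation at all; it is a floor set purely by I/O.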
If instead we could organize the data in such a way that a sequential
stream gave us data with geographic locality (so I "fill up" a given tile
first), and then cancel the read operation when that tile is full, we could
greatly reduce the amount of data read off of the disk. Here's a quick
graphic I put together that hopefully helps:
https://dl.dropboxusercontent.com/u/6649380/distributed2d.png
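The "fill up a tile, then cancel the read" idea can be sketched roughly as below. This is only an illustration, not GeoWave code: `read_until_full` and the in-memory generator standing in for a locality-ordered disk scan are hypothetical, and each record is assumed to already be mapped to a pixel inside the tile.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]  # (col, row) inside one tile


def read_until_full(stream: Iterable[Pixel], width: int, height: int):
    """Consume records for one tile and stop as soon as every pixel is colored."""
    colored: Set[Pixel] = set()
    records_read = 0
    for px in stream:
        records_read += 1
        colored.add(px)
        if len(colored) == width * height:
            break  # tile is full: cancel the rest of the read
    return colored, records_read


# Stand-in for a sequential scan: 1000 records that all land in one 4x4 tile.
stream = ((i % 4, (i // 4) % 4) for i in range(1000))
colored, records_read = read_until_full(stream, 4, 4)
print(records_read)  # 16
```

In this toy case only 16 of the 1000 records are consumed; the remaining ones are never pulled from the stream, which is the saving the graphic is meant to show.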
We originally identified this for point features. Working with the constraint
that 1 point = 1 pixel (which was only valid for some symbolizers, but
for dense data was a fairly safe limit), Rich implemented the render
transform method he described earlier. This did the skipping shown in the
graphic above, but then just returned the features to GeoServer to render
normally. That worked pretty well; here's an example with the full OSMGpx
data set loaded and rendering in real time:
~37 billion key-value pairs in Accumulo (I think it came out to 30-50GB on
disk, would have to check again)
https://dl.dropboxusercontent.com/u/6649380/decimation-1.png
level 0 overview
https://dl.dropboxusercontent.com/u/6649380/decimation-2.png
Rendered in real time, no caching (< 4 seconds on a small 5 node Accumulo
cluster). Without decimation we would have been limited to streaming the
30-50GB off of disk first.
zooming in
https://dl.dropboxusercontent.com/u/6649380/decimation-3.png
Shows the data detail resolving based on the pixel <-> geo mapping
Get Feature Info
https://dl.dropboxusercontent.com/u/6649380/decimation-4.png
Shows all the underlying points still available for a single pixel
(note: source data visualized in those screenshots is copyright OpenStreetMap
& contributors, CC-BY-SA)
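The 1 point = 1 pixel decimation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual render transform: `geo_to_pixel` and `decimate` are hypothetical names, the mapping is a plain linear lon/lat to pixel transform, and all points are assumed to lie inside the requested bounding box.

```python
def geo_to_pixel(lon, lat, bbox, width, height):
    """Map a lon/lat to a (col, row) pixel in a width x height image covering bbox."""
    min_lon, min_lat, max_lon, max_lat = bbox
    col = int((lon - min_lon) / (max_lon - min_lon) * width)
    row = int((max_lat - lat) / (max_lat - min_lat) * height)
    # Points exactly on the max edge are clamped into the last pixel.
    return min(col, width - 1), min(row, height - 1)


def decimate(points, bbox, width, height):
    """Yield at most one point per pixel; points on an already colored pixel are skipped."""
    seen = set()
    for lon, lat in points:
        px = geo_to_pixel(lon, lat, bbox, width, height)
        if px not in seen:
            seen.add(px)
            yield lon, lat


bbox = (0.0, 0.0, 10.0, 10.0)
points = [(1, 1), (1.5, 1.2), (9, 9), (2, 8), (8, 2), (7, 1)]
kept = list(decimate(points, bbox, 2, 2))
print(kept)  # [(1, 1), (9, 9), (2, 8), (8, 2)]
```

However many points are fed in, the number handed on for rendering is bounded by width * height, and zooming in changes the pixel <-> geo mapping so more of the underlying points come through.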
It's that data size independence we would like to extend to lines and
polygons. Spreading out the rendering task might be a side benefit, but
that's actually somewhat debatable, and might be overshadowed by the need
to render a tile for each feature type style rule (and I glossed over it,
but there probably would be multiple tiles rendered for each feature type
style rule, depending on how many servers a particular tile's data was
distributed over).
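To give a feel for that overhead, here is a rough sketch of what merging those per-rule, per-server tiles could look like. Everything in it is an assumption made for illustration: the `composite` function, the representation of a partial tile as a dict of pixel to color, and the "later rules paint over earlier ones" ordering are not taken from GeoWave or GeoServer.

```python
def composite(partials_by_rule):
    """Merge partial tiles into one tile.

    partials_by_rule: list in rule paint order; each entry is a list of
    partial tiles (one per server), each a dict of (col, row) -> color.
    """
    final = {}
    partial_count = 0
    for partials in partials_by_rule:
        rule_tile = {}
        for partial in partials:
            partial_count += 1
            for px, color in partial.items():
                # Within one rule a pixel is only colored once.
                rule_tile.setdefault(px, color)
        # Assumed ordering: later rules paint over earlier ones.
        final.update(rule_tile)
    return final, partial_count


# Two style rules, each with data spread over three servers.
rule_a = [{(0, 0): "red"}, {(0, 0): "red", (1, 0): "red"}, {}]
rule_b = [{(1, 0): "blue"}, {}, {(1, 1): "blue"}]
tile, partial_count = composite([rule_a, rule_b])
print(partial_count)  # 6
```

Even in this tiny case, one output tile requires six partial tiles (2 rules x 3 servers) to be rendered and shipped, which is the cost that could outweigh any gain from spreading out the rendering.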
Hope that helps some.
On Sat, Feb 14, 2015 at 2:42 PM, Andrea Aime <andrea.a...@geo-solutions.it>
wrote:
> (...)
>
> Long story short, a layer parallel renderer seems suitable for desktop
> usage (where you have one request at a time, and 8-16GB of memory to play
> with),
> but if you want to go server side, you either have relatively low load, or
> a truckload of memory (and once you go there, you need to change JVM, and
> probably use some non free one like Azul).
> And I have to contrast this with people that keep on asking me if a 2/4GB
> VMWare machine/Amazon cloud instance is enough to run GeoServer ;-)
>
> And you first need to make parallel painting go fast, by adding Marlin
> into the mix, but this is more a problem of being willing/authorized.
>
> Do you have any more experience/observation on the subject?
>
> Cheers
> Andrea
>
> (...)
>