[geos-devel] CoordinateArraySequence == CoordinateSequence

Paul Ramsey Tue, 23 Aug 2022 12:22:09 -0700

One of the things that tinkering with the SegmentString layer of overlay brought 
out to me was the extent to which we construct CoordinateSequence almost 
exclusively out of CoordinateArraySequence. Like, all the time. And yet, because 
we handle those CoordinateArraySequence at the API level almost exclusively as 
CoordinateSequence, we lose the ability to do some handy optimizations.


Like, if one were going to (as one does on every single CoordinateSequence that 
enters the overlay code) 
(1) test if there are repeated points and 
(2a) remove any if there are
(2b) just return the untouched CoordinateSequence if there aren't
a useful pattern would be for ::hasRepeatedPoints() to return/populate a list 
of indexes at which repeated points appear and for ::removeRepeatedPoints() to 
do bulk copies of all the points in between those indexes. This is foreclosed 
by the CoordinateSequence API: you can play this trick nicely with a 
std::vector living underneath, but the API doesn't let us see that (in fact) 
that's what we have 99.9% of the time.
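
A minimal sketch of that pattern, assuming a bare std::vector underneath. The 
Coord struct and the free functions findRepeatedPoints() and 
removeRepeatedPoints() are hypothetical stand-ins for illustration, not the 
GEOS classes or methods.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D coordinate; not the GEOS Coordinate class.
struct Coord {
    double x;
    double y;
    bool operator==(const Coord& o) const { return x == o.x && y == o.y; }
};

// Returns every index i where pts[i] repeats pts[i - 1].
// An empty result means the caller can keep the sequence untouched.
std::vector<std::size_t> findRepeatedPoints(const std::vector<Coord>& pts) {
    std::vector<std::size_t> repeated;
    for (std::size_t i = 1; i < pts.size(); ++i) {
        if (pts[i] == pts[i - 1]) repeated.push_back(i);
    }
    return repeated;
}

// Builds a cleaned copy by bulk-copying the runs between repeated indexes,
// rather than testing and appending one point at a time.
std::vector<Coord> removeRepeatedPoints(const std::vector<Coord>& pts,
                                        const std::vector<std::size_t>& repeated) {
    std::vector<Coord> out;
    out.reserve(pts.size() - repeated.size());
    std::size_t start = 0;
    for (std::size_t idx : repeated) {
        // Copy the run [start, idx), then skip the repeated point at idx.
        out.insert(out.end(), pts.begin() + start, pts.begin() + idx);
        start = idx + 1;
    }
    // Copy whatever follows the last repeated point.
    out.insert(out.end(), pts.begin() + start, pts.end());
    return out;
}
```

The index list does double duty here: its emptiness answers (1), and its 
contents drive the range copies in (2a).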

So, one obvious thing to do would be to remove the virtual methods in 
CoordinateSequence and pull the implementation up to that level, std::vector 
and all, and give up on the idea of an abstract interface that we don't 
actually use. For a handful of use cases, where data access cost is greater 
than computation cost (area, length, distance(?), some others (?)) this might 
be "bad" in some theoretical way, but note that currently we still don't 
actually have that abstract layer in place for a zero copy computation. 
Removing the virtual methods and inheritance from CoordinateSequence would 
foreclose an option that (a) we seem unlikely to ever deliver on and (b) has 
narrow performance benefits even if we did deliver on it.

Meanwhile, the flip case seems likely to have a *lot* of performance benefits 
just hanging around waiting to be harvested. Coordinate access without going 
through the inheritance structure; access to some bulk operations like the 
repeated points case.
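
To make the "no inheritance structure" idea concrete, here is a rough sketch 
of a concrete sequence class with the std::vector pulled up into it. 
FlatSequence, Coord and length() are hypothetical names for illustration only; 
this is not the current GEOS class layout.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a 2D coordinate; not the GEOS Coordinate class.
struct Coord {
    double x;
    double y;
};

// Hypothetical concrete sequence: no virtual methods, no base class.
// Accessors are plain inline functions over the vector.
class FlatSequence {
public:
    explicit FlatSequence(std::vector<Coord> pts) : pts_(std::move(pts)) {}

    std::size_t size() const { return pts_.size(); }
    const Coord& getAt(std::size_t i) const { return pts_[i]; }

    // Exposing the storage is what makes bulk operations possible.
    const std::vector<Coord>& data() const { return pts_; }

private:
    std::vector<Coord> pts_;
};

// Sums segment lengths using the non-virtual accessors.
double length(const FlatSequence& seq) {
    double total = 0.0;
    for (std::size_t i = 1; i < seq.size(); ++i) {
        const Coord& a = seq.getAt(i - 1);
        const Coord& b = seq.getAt(i);
        total += std::hypot(b.x - a.x, b.y - a.y);
    }
    return total;
}
```

Whether this is measurably faster than the virtual version would need 
benchmarking; the sketch only shows the shape of the change.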

For the "zero copy" crew, I feel like a big chunk of gains for them could be 
harvested by ensuring that point-based operations are available and don't 
require construction of a full Point() object. So things like 
PreparedGeometry->intersects(x, y). Sure, you still have to copy in your 
polygon feature and prepare it, but much of the overhead in that would still 
exist in a "zero copy" paradigm (all the internal index buildings). Meanwhile 
you'd no longer need to create a full Point() to do a point-in-poly test, and 
that would hopefully be a big win for most users.
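
As a rough illustration of a point test that takes bare x and y, here is a 
ray-crossing check against a single ring. ringContains() and Coord are 
hypothetical names, and a real prepared geometry would go through its internal 
index rather than scanning every segment.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D coordinate; not the GEOS Coordinate class.
struct Coord {
    double x;
    double y;
};

// Ray-crossing test of (x, y) against one ring, with no Point object built.
// Points exactly on the boundary are not handled specially in this sketch.
bool ringContains(const std::vector<Coord>& ring, double x, double y) {
    const std::size_t n = ring.size();
    if (n < 3) return false;

    bool inside = false;
    for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
        const Coord& a = ring[i];
        const Coord& b = ring[j];
        // Does the segment a-b straddle the horizontal line through y?
        if ((a.y > y) != (b.y > y)) {
            // X position where the segment crosses that line.
            const double xCross = a.x + (y - a.y) * (b.x - a.x) / (b.y - a.y);
            if (x < xCross) inside = !inside;
        }
    }
    return inside;
}
```

The caller hands over two doubles and gets a bool back, which is the whole 
point: no geometry factory, no allocation per query.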

Random thoughts on a sunny day,
P

_______________________________________________
geos-devel mailing list
geos-devel@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/geos-devel
