Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?

Marco Hugentobler Tue, 23 Jun 2015 07:44:33 -0700

Hi Nyall

I'm still a bit unclear as to where
you are taking these classes though. Is the intention that eventually
ALL operations will be done using QgsGeos?

In the end, yes, everything should go through QgsGeometryEngineinterface. There are two possibilities:

1. client code creates QgsGeos, calls prepare() on it and makes repeadedintersection(), etc. directly on QgsGeometryEngine. This can be donewith master branch as is.

2. client code uses QgsGeometry and calls intersection() on it. Tobenefit from prepared geometries, we would need to enhance QgsGeometryto cache QgsGeos and call prepare() the first time it is used.

Thinking about it, version 1 might be better. We cannot know if thecached QgsGeos is really needed a second time and it might use morememory if not. In the first case, the client code knows if it isintended to do repeaded operations on the same geometry.


What do you think?

Could we put a giant "NOT STABLE API AND MAY CHANGE" for
all these classes, so that we can address any unforeseen issues in
2.12?


That would be ok.

Regards,
Marco



On 23.06.2015 13:59, Nyall Dawson wrote:

On 23 June 2015 at 16:15, Marco Hugentobler
<marco.hugentob...@sourcepole.ch> wrote:

Hi Nyall

This potentially has huge impacts on the performance of common tasks such
as selecting all features which intersect a geometry.

Are you really sure the impact is huge? If you select features on the map,
you get a new one from provider each time. So also with the old geometry
class, it was necessary to convert to geos for each intersect.

That's why the subject line says "potentially" ;)

But yes, if several geos operations are done in sequence on really the same
geometry instance, there is an overhead now. However, I would assume the
geos conversion takes far less time than the geos operation itself. Did you
observe cases where the geos conversion really takes very long time?

My biggest concern was operations such as the "Select by location"
operations, where a single feature is compared against all the other
features in a second layer. In this case there would be benefit from
retaining the geos representation between each comparison. However,
I've just run some tests and there's no appreciable difference in the
time required for selecting intersecting features from large layers
using this algorithm between 2.8->2.10, so you're correct in that the
geos conversion isn't very expensive. (Unfortunately, as Martin has
pointed out, memory usage in 2.10 was up by 50%).

Lets assume the geos conversion indeed takes a long time. Couldn't we just
store the pointer to QgsGeos (or better QgsGeometryEngine class) inside
QgsGeometry? This would be straightforward to do. Furthermore, it is
possible to prepare QgsGeos. A prepared geos geometry can do _much_ faster
intersects/touches/ etc. operations if called repeatedly. We can cache a
prepared QgsGeos now with the new engine, but we couldn't do that with the
old QgsGeometry class. So for repeated predicate calculations, the new
geometry is much better, also performance wise.

Agreed - when PostGIS flipped their intersects operation to utilise
prepared geometries the difference was huge. I'm excited to see this
taken advantage of in QGIS too! I'm still a bit unclear as to where
you are taking these classes though. Is the intention that eventually
ALL operations will be done using QgsGeos? Because if not, then I
can't see anyway to actually utilise the prepared geometries as it
stands now - none of the methods in QgsGeometry utilise these.

Please don't take my reservations as attacking this work. I think it's
great stuff, and generally has been quite painless since it landed,
which is surprising given the extent of the work. I just want to make
sure 2.10 is in the best shape it can be in, and I'm doubtful that we
are release ready yet.

My biggest concern is honestly having the current QgsGeometryV2 API
locked in while there are still questions regarding how it's
implemented. Could we put a giant "NOT STABLE API AND MAY CHANGE" for
all these classes, so that we can address any unforeseen issues in
2.12?

Nyall


Regards,
Marco


On 23.06.2015 04:09, Nyall Dawson wrote:

Hi all,

Unfortunately, we've become aware of a serious performance regression caused
by the new geometry engine. Basically, the situation is that for all
geometry operations which rely on geos (think buffers, splits, spatial
relation operations such as intersects and within,... ) the geometry now
needs to be converted into a geos representation with *every* operation. In
the old geometry engine this conversion was done once, and the result stored
so that follow up operations would not need to recalculate it. This
potentially has huge impacts on the performance of common tasks such as
selecting all features which intersect a geometry.

I've had a look, and unfortunately it's not trivial to fix this. I think the
correct solution to this is to:

- make QgsGeometryEngine accept and return QgsGeometry containers, not
QgsAbstractGeometryV2
- store the generated geos representation of geometries within
QgsGeometryPrivate inside the QgsGeometry container. This way it will be
reusable between different geometry operations, and shared when QgsGeometry
objects are copied. This will also have the benefit that if a geometry is
prepared using geos then subsequent geos operations performed on that
QgsGeometry and its shared copies will be much faster.
- make QgsGeometry a friend class of QgsGeo, so that it can access
QgsGeometryPrivate to retrieve or set the geos representation of the
geometry as required

An alternative (short term) solution would be to just cache the geos
representation when geometry operations are called through the older
QgsGeometry modification/relationship operations. This would be easier, but
means that the API of QgsGeometryEngine will be stuck with the current
design, and we won't be able to properly fix this until breaking the api for
3.0.

Either way, I doubt this can be addressed within the remaining 3 days we
have until release. Should we delay to address this? Release with the
regression? Or am I missing something and there's an easier solution we
could implement? Or even possibly this additional cost of recalculating the
geos representation is trivial and can be ignored (maybe someone could test
this with a little repeated intersection script)?

Thoughts?

Nyall



_______________________________________________
Qgis-developer mailing list
Qgis-developer@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/qgis-developer



--
Dr. Marco Hugentobler
Sourcepole -  Linux & Open Source Solutions
Weberstrasse 5, CH-8004 Zürich, Switzerland
marco.hugentob...@sourcepole.ch http://www.sourcepole.ch
Technical Advisor QGIS Project Steering Committee


_______________________________________________
Qgis-developer mailing list
Qgis-developer@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/qgis-developer



--
Dr. Marco Hugentobler
Sourcepole -  Linux & Open Source Solutions
Weberstrasse 5, CH-8004 Zürich, Switzerland
marco.hugentob...@sourcepole.ch http://www.sourcepole.ch
Technical Advisor QGIS Project Steering Committee

_______________________________________________
Qgis-developer mailing list
Qgis-developer@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/qgis-developer

Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?

Reply via email to