Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi Nyall - For methods which don't have a geos implementation (eg, QgsGeometry::smooth ), should they just be implemented in the QgsGeometryEngine base class? They can stay in QgsGeometry as they don't require conversion to a different format and they would not benefit from prepare(). It means that they'd need to be aware that QGIS uses geos for these geometry operations and use that QgsGeos is the class required for geometry relation/modification operations Ideally, they work just with QgsGeometryEngine and don't realize which library is behind it. is there a good reason we couldn't instead move these geos implementations to QgsGeometryEngine itself and do away with QgsGeos? See above. QgsGeos is a subclass of QgsGeometryEngine. The only thing to be done is probably to move the method QgsGeometryEditUtils::geometryEngine() to a better place (e.g. QgsGeometry) and add python bindings for it. That way, plugin programmers just get the recommended subclass (obviously QgsGeos) and work with the base class interface. and do away with QgsGeos? Do you mean remove the python bindings for QgsGeos? Agreed, it will be good to discourage the direct use of the subclass. Regards, Marco On 24.06.2015 01:13, Nyall Dawson wrote: On 24 June 2015 at 00:44, Marco Hugentobler wrote: Hi Nyall I'm still a bit unclear as to where you are taking these classes though. Is the intention that eventually ALL operations will be done using QgsGeos? In the end, yes, everything should go through QgsGeometryEngine interface. There are two possibilities: 1. client code creates QgsGeos, calls prepare() on it and makes repeaded intersection(), etc. directly on QgsGeometryEngine. This can be done with master branch as is. This sounds like the right approach to me, but I have a couple of further questions: - For methods which don't have a geos implementation (eg, QgsGeometry::smooth ), should they just be implemented in the QgsGeometryEngine base class? - I'm a bit concerned about the usability of this approach for plugin developers. It means that they'd need to be aware that QGIS uses geos for these geometry operations and use that QgsGeos is the class required for geometry relation/modification operations. That's not obvious and could be a stumbling block for new PyQGIS developers. It also locks these plugins then into using the geos engine, and if we ever decided to switch to a different engine then the plugins would also need to be updated. Just thinking aloud here... is there a good reason we couldn't instead move these geos implementations to QgsGeometryEngine itself and do away with QgsGeos? Then, in the (unlikely?) situation that a geos implementation is inferior to a native QGIS method we can mix-and-match between geos and alternative implementations as required. 2. client code uses QgsGeometry and calls intersection() on it. To benefit from prepared geometries, we would need to enhance QgsGeometry to cache QgsGeos and call prepare() the first time it is used. Thinking about it, version 1 might be better. We cannot know if the cached QgsGeos is really needed a second time and it might use more memory if not. In the first case, the client code knows if it is intended to do repeaded operations on the same geometry. What do you think? Could we put a giant "NOT STABLE API AND MAY CHANGE" for all these classes, so that we can address any unforeseen issues in 2.12? That would be ok. Great! That's my biggest reservation - if we've got the opportunity to tweak the API for 2.12 then my fears have been unfounded :) Thanks! Nyall -- Dr. Marco Hugentobler Sourcepole - Linux & Open Source Solutions Weberstrasse 5, CH-8004 Zürich, Switzerland marco.hugentob...@sourcepole.ch http://www.sourcepole.ch Technical Advisor QGIS Project Steering Committee ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
On 24 June 2015 at 00:44, Marco Hugentobler wrote: > Hi Nyall > >> I'm still a bit unclear as to where >> you are taking these classes though. Is the intention that eventually >> ALL operations will be done using QgsGeos? > > > In the end, yes, everything should go through QgsGeometryEngine interface. > There are two possibilities: > > 1. client code creates QgsGeos, calls prepare() on it and makes repeaded > intersection(), etc. directly on QgsGeometryEngine. This can be done with > master branch as is. This sounds like the right approach to me, but I have a couple of further questions: - For methods which don't have a geos implementation (eg, QgsGeometry::smooth ), should they just be implemented in the QgsGeometryEngine base class? - I'm a bit concerned about the usability of this approach for plugin developers. It means that they'd need to be aware that QGIS uses geos for these geometry operations and use that QgsGeos is the class required for geometry relation/modification operations. That's not obvious and could be a stumbling block for new PyQGIS developers. It also locks these plugins then into using the geos engine, and if we ever decided to switch to a different engine then the plugins would also need to be updated. Just thinking aloud here... is there a good reason we couldn't instead move these geos implementations to QgsGeometryEngine itself and do away with QgsGeos? Then, in the (unlikely?) situation that a geos implementation is inferior to a native QGIS method we can mix-and-match between geos and alternative implementations as required. > > 2. client code uses QgsGeometry and calls intersection() on it. To benefit > from prepared geometries, we would need to enhance QgsGeometry to cache > QgsGeos and call prepare() the first time it is used. > > Thinking about it, version 1 might be better. We cannot know if the cached > QgsGeos is really needed a second time and it might use more memory if not. > In the first case, the client code knows if it is intended to do repeaded > operations on the same geometry. > > What do you think? > >> Could we put a giant "NOT STABLE API AND MAY CHANGE" for >> all these classes, so that we can address any unforeseen issues in >> 2.12? > > > That would be ok. Great! That's my biggest reservation - if we've got the opportunity to tweak the API for 2.12 then my fears have been unfounded :) Thanks! Nyall ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi Nyall I'm still a bit unclear as to where you are taking these classes though. Is the intention that eventually ALL operations will be done using QgsGeos? In the end, yes, everything should go through QgsGeometryEngine interface. There are two possibilities: 1. client code creates QgsGeos, calls prepare() on it and makes repeaded intersection(), etc. directly on QgsGeometryEngine. This can be done with master branch as is. 2. client code uses QgsGeometry and calls intersection() on it. To benefit from prepared geometries, we would need to enhance QgsGeometry to cache QgsGeos and call prepare() the first time it is used. Thinking about it, version 1 might be better. We cannot know if the cached QgsGeos is really needed a second time and it might use more memory if not. In the first case, the client code knows if it is intended to do repeaded operations on the same geometry. What do you think? Could we put a giant "NOT STABLE API AND MAY CHANGE" for all these classes, so that we can address any unforeseen issues in 2.12? That would be ok. Regards, Marco On 23.06.2015 13:59, Nyall Dawson wrote: On 23 June 2015 at 16:15, Marco Hugentobler wrote: Hi Nyall This potentially has huge impacts on the performance of common tasks such as selecting all features which intersect a geometry. Are you really sure the impact is huge? If you select features on the map, you get a new one from provider each time. So also with the old geometry class, it was necessary to convert to geos for each intersect. That's why the subject line says "potentially" ;) But yes, if several geos operations are done in sequence on really the same geometry instance, there is an overhead now. However, I would assume the geos conversion takes far less time than the geos operation itself. Did you observe cases where the geos conversion really takes very long time? My biggest concern was operations such as the "Select by location" operations, where a single feature is compared against all the other features in a second layer. In this case there would be benefit from retaining the geos representation between each comparison. However, I've just run some tests and there's no appreciable difference in the time required for selecting intersecting features from large layers using this algorithm between 2.8->2.10, so you're correct in that the geos conversion isn't very expensive. (Unfortunately, as Martin has pointed out, memory usage in 2.10 was up by 50%). Lets assume the geos conversion indeed takes a long time. Couldn't we just store the pointer to QgsGeos (or better QgsGeometryEngine class) inside QgsGeometry? This would be straightforward to do. Furthermore, it is possible to prepare QgsGeos. A prepared geos geometry can do _much_ faster intersects/touches/ etc. operations if called repeatedly. We can cache a prepared QgsGeos now with the new engine, but we couldn't do that with the old QgsGeometry class. So for repeated predicate calculations, the new geometry is much better, also performance wise. Agreed - when PostGIS flipped their intersects operation to utilise prepared geometries the difference was huge. I'm excited to see this taken advantage of in QGIS too! I'm still a bit unclear as to where you are taking these classes though. Is the intention that eventually ALL operations will be done using QgsGeos? Because if not, then I can't see anyway to actually utilise the prepared geometries as it stands now - none of the methods in QgsGeometry utilise these. Please don't take my reservations as attacking this work. I think it's great stuff, and generally has been quite painless since it landed, which is surprising given the extent of the work. I just want to make sure 2.10 is in the best shape it can be in, and I'm doubtful that we are release ready yet. My biggest concern is honestly having the current QgsGeometryV2 API locked in while there are still questions regarding how it's implemented. Could we put a giant "NOT STABLE API AND MAY CHANGE" for all these classes, so that we can address any unforeseen issues in 2.12? Nyall Regards, Marco On 23.06.2015 04:09, Nyall Dawson wrote: Hi all, Unfortunately, we've become aware of a serious performance regression caused by the new geometry engine. Basically, the situation is that for all geometry operations which rely on geos (think buffers, splits, spatial relation operations such as intersects and within,... ) the geometry now needs to be converted into a geos representation with *every* operation. In the old geometry engine this conversion was done once, and the result stored so that follow up operations would not need to recalculate it. This potentially has huge impacts on the performance of common tasks such as selecting all features which intersect a geometry. I've had a look, and unfortunately it's not trivial to fix this. I think the correct solution to this is to: - make QgsGeometryEngine accept and return QgsGeometry containers, not Qg
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
On 23 June 2015 at 20:07, Matthias Kuhn wrote: > So the question is, is this something that we can fix later on, or would > fixing it require an API break and therefore we need to wait for 3.0 if > we don't fix it now? > As I mentioned in an earlier reply, could we mark this component of the API as non-stable for 2.10? That would give us the possibility of later revision. IIRC it wouldn't be the first time this has been done - I think the labeling API was initially marked as experimental and non stable too... Nyall > Regards, > Matthias > > On 06/23/2015 11:37 AM, Marco Hugentobler wrote: >> Hi Martin >> >> The new geometries uses more memory for backward compatibility. In the >> mid/long term, the cached wkb will be removed, so QgsAbstractGeometry >> will be the only represention. The provider can feed wkb (e.g. >> database providers) or construct the geometry by feeding QgsPointV2s. >> >> I'm however surprised that this should have such an impact with the >> attached dataset. Because the shp has size 588MByte, you could load >> that in memory several times (and we don't load everything into memory >> in QGIS in case the dataset is even bigger). Looking on the attached >> file, I think the problem is not the memory, but rather some round >> trips in the closestVertex function. >> >> But it is far from something that would justify a release delay or >> even a geometry rollback. >> >> Regards, >> Marco >> >> On 23.06.2015 05:42, Martin Dobias wrote: >>> Hi >>> >>> Agreed with Nyall. Apart from the above mentioned problems, there are >>> newly introduced performance regressions with snapping: the cached >>> geometries in 2.10 take double the amount of memory they used in 2.8, >>> and snapping got much slower with more complex layers (compared to >>> 2.8) - e.g. with [1]. The extra memory consumption is caused by the >>> fact that providers provide geometry in WKB representation, which is >>> then converted to new representation and WKB is kept. We should use >>> just one representation and drop the usage of the other one - that >>> means stop using WKB in providers and common code paths, so the WKB >>> representation does not even get created (and cached). >>> >>> One heretic idea at the end - what others think about postponing the >>> release of the new geometry architecture to 2.12 so that there is more >>> time to address the current issues (fix performance, fix high memory >>> consumption, clean up API, write unit tests). It seems to me that some >>> of the issues would be difficult to address even if the release of >>> 2.10 is moved by another week or so. >>> >>> Regards >>> Martin >>> >>> [1] http://www.iucnredlist.org/spatial-data/MAMMALS_TERRESTRIAL.zip >>> >>> >>> On Tue, Jun 23, 2015 at 10:09 AM, Nyall Dawson >>> wrote: Hi all, Unfortunately, we've become aware of a serious performance regression caused by the new geometry engine. Basically, the situation is that for all geometry operations which rely on geos (think buffers, splits, spatial relation operations such as intersects and within,... ) the geometry now needs to be converted into a geos representation with *every* operation. In the old geometry engine this conversion was done once, and the result stored so that follow up operations would not need to recalculate it. This potentially has huge impacts on the performance of common tasks such as selecting all features which intersect a geometry. I've had a look, and unfortunately it's not trivial to fix this. I think the correct solution to this is to: - make QgsGeometryEngine accept and return QgsGeometry containers, not QgsAbstractGeometryV2 - store the generated geos representation of geometries within QgsGeometryPrivate inside the QgsGeometry container. This way it will be reusable between different geometry operations, and shared when QgsGeometry objects are copied. This will also have the benefit that if a geometry is prepared using geos then subsequent geos operations performed on that QgsGeometry and its shared copies will be much faster. - make QgsGeometry a friend class of QgsGeo, so that it can access QgsGeometryPrivate to retrieve or set the geos representation of the geometry as required An alternative (short term) solution would be to just cache the geos representation when geometry operations are called through the older QgsGeometry modification/relationship operations. This would be easier, but means that the API of QgsGeometryEngine will be stuck with the current design, and we won't be able to properly fix this until breaking the api for 3.0. Either way, I doubt this can be addressed within the remaining 3 days we have until release. Should we delay to address this? Release with the regression? Or am I missing something and there's a
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
On 23 June 2015 at 16:25, Werner Macho wrote: > Hi! > I would be in favor of releasing it as is. > > 1. This is not the _LTS_ release (I thought this is exactly why we > introduced LTS) - so people should know that there are new features > which sometimes can cause "something" > 2. releasing it with the changes means more feedback if this is really > a problem and more time with lots of testing to fix it until the next > LTS > ... I'm not in favour of treating non LTS releases as "beta" type releases though. There's been potentially 10s of thousands of dollars of investment in 2.10, and people who have sponsored features are still expecting a stable release when 2.10 comes out. Nyall > Though there should probably a release note to make people aware and > ask them to test this very well. > > kind regards > Werner > > On Tue, Jun 23, 2015 at 8:15 AM, Marco Hugentobler > wrote: >> Hi Nyall >> >>>This potentially has huge impacts on the performance of common tasks such >>> as selecting all features which intersect a geometry. >> >> Are you really sure the impact is huge? If you select features on the map, >> you get a new one from provider each time. So also with the old geometry >> class, it was necessary to convert to geos for each intersect. >> >> But yes, if several geos operations are done in sequence on really the same >> geometry instance, there is an overhead now. However, I would assume the >> geos conversion takes far less time than the geos operation itself. Did you >> observe cases where the geos conversion really takes very long time? >> >> Lets assume the geos conversion indeed takes a long time. Couldn't we just >> store the pointer to QgsGeos (or better QgsGeometryEngine class) inside >> QgsGeometry? This would be straightforward to do. Furthermore, it is >> possible to prepare QgsGeos. A prepared geos geometry can do _much_ faster >> intersects/touches/ etc. operations if called repeatedly. We can cache a >> prepared QgsGeos now with the new engine, but we couldn't do that with the >> old QgsGeometry class. So for repeated predicate calculations, the new >> geometry is much better, also performance wise. >> >> >> Regards, >> Marco >> >> >> On 23.06.2015 04:09, Nyall Dawson wrote: >> >> Hi all, >> >> Unfortunately, we've become aware of a serious performance regression caused >> by the new geometry engine. Basically, the situation is that for all >> geometry operations which rely on geos (think buffers, splits, spatial >> relation operations such as intersects and within,... ) the geometry now >> needs to be converted into a geos representation with *every* operation. In >> the old geometry engine this conversion was done once, and the result stored >> so that follow up operations would not need to recalculate it. This >> potentially has huge impacts on the performance of common tasks such as >> selecting all features which intersect a geometry. >> >> I've had a look, and unfortunately it's not trivial to fix this. I think the >> correct solution to this is to: >> >> - make QgsGeometryEngine accept and return QgsGeometry containers, not >> QgsAbstractGeometryV2 >> - store the generated geos representation of geometries within >> QgsGeometryPrivate inside the QgsGeometry container. This way it will be >> reusable between different geometry operations, and shared when QgsGeometry >> objects are copied. This will also have the benefit that if a geometry is >> prepared using geos then subsequent geos operations performed on that >> QgsGeometry and its shared copies will be much faster. >> - make QgsGeometry a friend class of QgsGeo, so that it can access >> QgsGeometryPrivate to retrieve or set the geos representation of the >> geometry as required >> >> An alternative (short term) solution would be to just cache the geos >> representation when geometry operations are called through the older >> QgsGeometry modification/relationship operations. This would be easier, but >> means that the API of QgsGeometryEngine will be stuck with the current >> design, and we won't be able to properly fix this until breaking the api for >> 3.0. >> >> Either way, I doubt this can be addressed within the remaining 3 days we >> have until release. Should we delay to address this? Release with the >> regression? Or am I missing something and there's an easier solution we >> could implement? Or even possibly this additional cost of recalculating the >> geos representation is trivial and can be ignored (maybe someone could test >> this with a little repeated intersection script)? >> >> Thoughts? >> >> Nyall >> >> >> >> ___ >> Qgis-developer mailing list >> Qgis-developer@lists.osgeo.org >> http://lists.osgeo.org/mailman/listinfo/qgis-developer >> >> >> >> -- >> Dr. Marco Hugentobler >> Sourcepole - Linux & Open Source Solutions >> Weberstrasse 5, CH-8004 Zürich, Switzerland >> marco.hugentob...@sourcepole.ch http://www.sourcepole.ch >> Technical Advisor QGIS Projec
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
On 23 June 2015 at 16:15, Marco Hugentobler wrote: > Hi Nyall > >>This potentially has huge impacts on the performance of common tasks such >> as selecting all features which intersect a geometry. > > Are you really sure the impact is huge? If you select features on the map, > you get a new one from provider each time. So also with the old geometry > class, it was necessary to convert to geos for each intersect. That's why the subject line says "potentially" ;) > > But yes, if several geos operations are done in sequence on really the same > geometry instance, there is an overhead now. However, I would assume the > geos conversion takes far less time than the geos operation itself. Did you > observe cases where the geos conversion really takes very long time? My biggest concern was operations such as the "Select by location" operations, where a single feature is compared against all the other features in a second layer. In this case there would be benefit from retaining the geos representation between each comparison. However, I've just run some tests and there's no appreciable difference in the time required for selecting intersecting features from large layers using this algorithm between 2.8->2.10, so you're correct in that the geos conversion isn't very expensive. (Unfortunately, as Martin has pointed out, memory usage in 2.10 was up by 50%). > Lets assume the geos conversion indeed takes a long time. Couldn't we just > store the pointer to QgsGeos (or better QgsGeometryEngine class) inside > QgsGeometry? This would be straightforward to do. Furthermore, it is > possible to prepare QgsGeos. A prepared geos geometry can do _much_ faster > intersects/touches/ etc. operations if called repeatedly. We can cache a > prepared QgsGeos now with the new engine, but we couldn't do that with the > old QgsGeometry class. So for repeated predicate calculations, the new > geometry is much better, also performance wise. Agreed - when PostGIS flipped their intersects operation to utilise prepared geometries the difference was huge. I'm excited to see this taken advantage of in QGIS too! I'm still a bit unclear as to where you are taking these classes though. Is the intention that eventually ALL operations will be done using QgsGeos? Because if not, then I can't see anyway to actually utilise the prepared geometries as it stands now - none of the methods in QgsGeometry utilise these. Please don't take my reservations as attacking this work. I think it's great stuff, and generally has been quite painless since it landed, which is surprising given the extent of the work. I just want to make sure 2.10 is in the best shape it can be in, and I'm doubtful that we are release ready yet. My biggest concern is honestly having the current QgsGeometryV2 API locked in while there are still questions regarding how it's implemented. Could we put a giant "NOT STABLE API AND MAY CHANGE" for all these classes, so that we can address any unforeseen issues in 2.12? Nyall > > > Regards, > Marco > > > On 23.06.2015 04:09, Nyall Dawson wrote: > > Hi all, > > Unfortunately, we've become aware of a serious performance regression caused > by the new geometry engine. Basically, the situation is that for all > geometry operations which rely on geos (think buffers, splits, spatial > relation operations such as intersects and within,... ) the geometry now > needs to be converted into a geos representation with *every* operation. In > the old geometry engine this conversion was done once, and the result stored > so that follow up operations would not need to recalculate it. This > potentially has huge impacts on the performance of common tasks such as > selecting all features which intersect a geometry. > > I've had a look, and unfortunately it's not trivial to fix this. I think the > correct solution to this is to: > > - make QgsGeometryEngine accept and return QgsGeometry containers, not > QgsAbstractGeometryV2 > - store the generated geos representation of geometries within > QgsGeometryPrivate inside the QgsGeometry container. This way it will be > reusable between different geometry operations, and shared when QgsGeometry > objects are copied. This will also have the benefit that if a geometry is > prepared using geos then subsequent geos operations performed on that > QgsGeometry and its shared copies will be much faster. > - make QgsGeometry a friend class of QgsGeo, so that it can access > QgsGeometryPrivate to retrieve or set the geos representation of the > geometry as required > > An alternative (short term) solution would be to just cache the geos > representation when geometry operations are called through the older > QgsGeometry modification/relationship operations. This would be easier, but > means that the API of QgsGeometryEngine will be stuck with the current > design, and we won't be able to properly fix this until breaking the api for > 3.0. > > Either way, I doubt this can be addressed within the remaining 3 days we
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi, I don't think reverting the geometry work is an option. It works well in general and offers really nice new possibilities with z and m values and circular arcs despite it was merged a bit late, did not receive a lot of attention in the freeze period and lacks some unit tests. But then, should we spoil CPU cycles and memory just because it's present on modern machines? The way I see it, QGIS should also be able to run on limited hardware like portable devices or older hardware. IIRC implicitly shared geometries have been one of the features which were an integral part of this from the beginning. If we don't have it now but are able to add it later on I don't mind so much but I do mind if we miss the oportunity to add it now. So the question is, is this something that we can fix later on, or would fixing it require an API break and therefore we need to wait for 3.0 if we don't fix it now? Regards, Matthias On 06/23/2015 11:37 AM, Marco Hugentobler wrote: > Hi Martin > > The new geometries uses more memory for backward compatibility. In the > mid/long term, the cached wkb will be removed, so QgsAbstractGeometry > will be the only represention. The provider can feed wkb (e.g. > database providers) or construct the geometry by feeding QgsPointV2s. > > I'm however surprised that this should have such an impact with the > attached dataset. Because the shp has size 588MByte, you could load > that in memory several times (and we don't load everything into memory > in QGIS in case the dataset is even bigger). Looking on the attached > file, I think the problem is not the memory, but rather some round > trips in the closestVertex function. > > But it is far from something that would justify a release delay or > even a geometry rollback. > > Regards, > Marco > > On 23.06.2015 05:42, Martin Dobias wrote: >> Hi >> >> Agreed with Nyall. Apart from the above mentioned problems, there are >> newly introduced performance regressions with snapping: the cached >> geometries in 2.10 take double the amount of memory they used in 2.8, >> and snapping got much slower with more complex layers (compared to >> 2.8) - e.g. with [1]. The extra memory consumption is caused by the >> fact that providers provide geometry in WKB representation, which is >> then converted to new representation and WKB is kept. We should use >> just one representation and drop the usage of the other one - that >> means stop using WKB in providers and common code paths, so the WKB >> representation does not even get created (and cached). >> >> One heretic idea at the end - what others think about postponing the >> release of the new geometry architecture to 2.12 so that there is more >> time to address the current issues (fix performance, fix high memory >> consumption, clean up API, write unit tests). It seems to me that some >> of the issues would be difficult to address even if the release of >> 2.10 is moved by another week or so. >> >> Regards >> Martin >> >> [1] http://www.iucnredlist.org/spatial-data/MAMMALS_TERRESTRIAL.zip >> >> >> On Tue, Jun 23, 2015 at 10:09 AM, Nyall Dawson >> wrote: >>> Hi all, >>> >>> Unfortunately, we've become aware of a serious performance >>> regression caused >>> by the new geometry engine. Basically, the situation is that for all >>> geometry operations which rely on geos (think buffers, splits, spatial >>> relation operations such as intersects and within,... ) the geometry >>> now >>> needs to be converted into a geos representation with *every* >>> operation. In >>> the old geometry engine this conversion was done once, and the >>> result stored >>> so that follow up operations would not need to recalculate it. This >>> potentially has huge impacts on the performance of common tasks such as >>> selecting all features which intersect a geometry. >>> >>> I've had a look, and unfortunately it's not trivial to fix this. I >>> think the >>> correct solution to this is to: >>> >>> - make QgsGeometryEngine accept and return QgsGeometry containers, not >>> QgsAbstractGeometryV2 >>> - store the generated geos representation of geometries within >>> QgsGeometryPrivate inside the QgsGeometry container. This way it >>> will be >>> reusable between different geometry operations, and shared when >>> QgsGeometry >>> objects are copied. This will also have the benefit that if a >>> geometry is >>> prepared using geos then subsequent geos operations performed on that >>> QgsGeometry and its shared copies will be much faster. >>> - make QgsGeometry a friend class of QgsGeo, so that it can access >>> QgsGeometryPrivate to retrieve or set the geos representation of the >>> geometry as required >>> >>> An alternative (short term) solution would be to just cache the geos >>> representation when geometry operations are called through the older >>> QgsGeometry modification/relationship operations. This would be >>> easier, but >>> means that the API of QgsGeometryEngine will be stuck with the current >>> design, and we won't be abl
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi Martin The new geometries uses more memory for backward compatibility. In the mid/long term, the cached wkb will be removed, so QgsAbstractGeometry will be the only represention. The provider can feed wkb (e.g. database providers) or construct the geometry by feeding QgsPointV2s. I'm however surprised that this should have such an impact with the attached dataset. Because the shp has size 588MByte, you could load that in memory several times (and we don't load everything into memory in QGIS in case the dataset is even bigger). Looking on the attached file, I think the problem is not the memory, but rather some round trips in the closestVertex function. But it is far from something that would justify a release delay or even a geometry rollback. Regards, Marco On 23.06.2015 05:42, Martin Dobias wrote: Hi Agreed with Nyall. Apart from the above mentioned problems, there are newly introduced performance regressions with snapping: the cached geometries in 2.10 take double the amount of memory they used in 2.8, and snapping got much slower with more complex layers (compared to 2.8) - e.g. with [1]. The extra memory consumption is caused by the fact that providers provide geometry in WKB representation, which is then converted to new representation and WKB is kept. We should use just one representation and drop the usage of the other one - that means stop using WKB in providers and common code paths, so the WKB representation does not even get created (and cached). One heretic idea at the end - what others think about postponing the release of the new geometry architecture to 2.12 so that there is more time to address the current issues (fix performance, fix high memory consumption, clean up API, write unit tests). It seems to me that some of the issues would be difficult to address even if the release of 2.10 is moved by another week or so. Regards Martin [1] http://www.iucnredlist.org/spatial-data/MAMMALS_TERRESTRIAL.zip On Tue, Jun 23, 2015 at 10:09 AM, Nyall Dawson wrote: Hi all, Unfortunately, we've become aware of a serious performance regression caused by the new geometry engine. Basically, the situation is that for all geometry operations which rely on geos (think buffers, splits, spatial relation operations such as intersects and within,... ) the geometry now needs to be converted into a geos representation with *every* operation. In the old geometry engine this conversion was done once, and the result stored so that follow up operations would not need to recalculate it. This potentially has huge impacts on the performance of common tasks such as selecting all features which intersect a geometry. I've had a look, and unfortunately it's not trivial to fix this. I think the correct solution to this is to: - make QgsGeometryEngine accept and return QgsGeometry containers, not QgsAbstractGeometryV2 - store the generated geos representation of geometries within QgsGeometryPrivate inside the QgsGeometry container. This way it will be reusable between different geometry operations, and shared when QgsGeometry objects are copied. This will also have the benefit that if a geometry is prepared using geos then subsequent geos operations performed on that QgsGeometry and its shared copies will be much faster. - make QgsGeometry a friend class of QgsGeo, so that it can access QgsGeometryPrivate to retrieve or set the geos representation of the geometry as required An alternative (short term) solution would be to just cache the geos representation when geometry operations are called through the older QgsGeometry modification/relationship operations. This would be easier, but means that the API of QgsGeometryEngine will be stuck with the current design, and we won't be able to properly fix this until breaking the api for 3.0. Either way, I doubt this can be addressed within the remaining 3 days we have until release. Should we delay to address this? Release with the regression? Or am I missing something and there's an easier solution we could implement? Or even possibly this additional cost of recalculating the geos representation is trivial and can be ignored (maybe someone could test this with a little repeated intersection script)? Thoughts? Nyall ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer -- Dr. Marco Hugentobler Sourcepole - Linux & Open Source Solutions Weberstrasse 5, CH-8004 Zürich, Switzerland marco.hugentob...@sourcepole.ch http://www.sourcepole.ch Technical Advisor QGIS Project Steering Committee ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi Marco, On Tue, 23. Jun 2015 at 08:15:27 +0200, Marco Hugentobler wrote: > Are you really sure the impact is huge? Well, if you have to ask and Nyall apparently is the first to notice - the impact can't be huge. I think reverting is no option - there have been numerous things that were adapted to the new engine. The indexing when snapping on a large layer is noticable however. But I don't think that's a huge issue either. Some will complain - but there are other things that don't work nicely on large layers either. It's a reminder that new stuff should go in early in the development cycle and not shortly before the feature freeze. But that's just a future note. I'd release what we have. Jürgen -- Jürgen E. Fischer norBIT GmbH Tel. +49-4931-918175-31 Dipl.-Inf. (FH) Rheinstraße 13 Fax. +49-4931-918175-50 Software Engineer D-26506 Norden http://www.norbit.de QGIS release manager (PSC) GermanyIRC: jef on FreeNode signature.asc Description: Digital signature ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi! I would be in favor of releasing it as is. 1. This is not the _LTS_ release (I thought this is exactly why we introduced LTS) - so people should know that there are new features which sometimes can cause "something" 2. releasing it with the changes means more feedback if this is really a problem and more time with lots of testing to fix it until the next LTS Though there should probably a release note to make people aware and ask them to test this very well. kind regards Werner On Tue, Jun 23, 2015 at 8:15 AM, Marco Hugentobler wrote: > Hi Nyall > >>This potentially has huge impacts on the performance of common tasks such >> as selecting all features which intersect a geometry. > > Are you really sure the impact is huge? If you select features on the map, > you get a new one from provider each time. So also with the old geometry > class, it was necessary to convert to geos for each intersect. > > But yes, if several geos operations are done in sequence on really the same > geometry instance, there is an overhead now. However, I would assume the > geos conversion takes far less time than the geos operation itself. Did you > observe cases where the geos conversion really takes very long time? > > Lets assume the geos conversion indeed takes a long time. Couldn't we just > store the pointer to QgsGeos (or better QgsGeometryEngine class) inside > QgsGeometry? This would be straightforward to do. Furthermore, it is > possible to prepare QgsGeos. A prepared geos geometry can do _much_ faster > intersects/touches/ etc. operations if called repeatedly. We can cache a > prepared QgsGeos now with the new engine, but we couldn't do that with the > old QgsGeometry class. So for repeated predicate calculations, the new > geometry is much better, also performance wise. > > > Regards, > Marco > > > On 23.06.2015 04:09, Nyall Dawson wrote: > > Hi all, > > Unfortunately, we've become aware of a serious performance regression caused > by the new geometry engine. Basically, the situation is that for all > geometry operations which rely on geos (think buffers, splits, spatial > relation operations such as intersects and within,... ) the geometry now > needs to be converted into a geos representation with *every* operation. In > the old geometry engine this conversion was done once, and the result stored > so that follow up operations would not need to recalculate it. This > potentially has huge impacts on the performance of common tasks such as > selecting all features which intersect a geometry. > > I've had a look, and unfortunately it's not trivial to fix this. I think the > correct solution to this is to: > > - make QgsGeometryEngine accept and return QgsGeometry containers, not > QgsAbstractGeometryV2 > - store the generated geos representation of geometries within > QgsGeometryPrivate inside the QgsGeometry container. This way it will be > reusable between different geometry operations, and shared when QgsGeometry > objects are copied. This will also have the benefit that if a geometry is > prepared using geos then subsequent geos operations performed on that > QgsGeometry and its shared copies will be much faster. > - make QgsGeometry a friend class of QgsGeo, so that it can access > QgsGeometryPrivate to retrieve or set the geos representation of the > geometry as required > > An alternative (short term) solution would be to just cache the geos > representation when geometry operations are called through the older > QgsGeometry modification/relationship operations. This would be easier, but > means that the API of QgsGeometryEngine will be stuck with the current > design, and we won't be able to properly fix this until breaking the api for > 3.0. > > Either way, I doubt this can be addressed within the remaining 3 days we > have until release. Should we delay to address this? Release with the > regression? Or am I missing something and there's an easier solution we > could implement? Or even possibly this additional cost of recalculating the > geos representation is trivial and can be ignored (maybe someone could test > this with a little repeated intersection script)? > > Thoughts? > > Nyall > > > > ___ > Qgis-developer mailing list > Qgis-developer@lists.osgeo.org > http://lists.osgeo.org/mailman/listinfo/qgis-developer > > > > -- > Dr. Marco Hugentobler > Sourcepole - Linux & Open Source Solutions > Weberstrasse 5, CH-8004 Zürich, Switzerland > marco.hugentob...@sourcepole.ch http://www.sourcepole.ch > Technical Advisor QGIS Project Steering Committee > > > ___ > Qgis-developer mailing list > Qgis-developer@lists.osgeo.org > http://lists.osgeo.org/mailman/listinfo/qgis-developer ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi Nyall >This potentially has huge impacts on the performance of common tasks such as selecting all features which intersect a geometry. Are you really sure the impact is huge? If you select features on the map, you get a new one from provider each time. So also with the old geometry class, it was necessary to convert to geos for each intersect. But yes, if several geos operations are done in sequence on really the same geometry instance, there is an overhead now. However, I would assume the geos conversion takes far less time than the geos operation itself. Did you observe cases where the geos conversion really takes very long time? Lets assume the geos conversion indeed takes a long time. Couldn't we just store the pointer to QgsGeos (or better QgsGeometryEngine class) inside QgsGeometry? This would be straightforward to do. Furthermore, it is possible to prepare QgsGeos. A prepared geos geometry can do _much_ faster intersects/touches/ etc. operations if called repeatedly. We can cache a prepared QgsGeos now with the new engine, but we couldn't do that with the old QgsGeometry class. So for repeated predicate calculations, the new geometry is much better, also performance wise. Regards, Marco On 23.06.2015 04:09, Nyall Dawson wrote: Hi all, Unfortunately, we've become aware of a serious performance regression caused by the new geometry engine. Basically, the situation is that for all geometry operations which rely on geos (think buffers, splits, spatial relation operations such as intersects and within,... ) the geometry now needs to be converted into a geos representation with *every* operation. In the old geometry engine this conversion was done once, and the result stored so that follow up operations would not need to recalculate it. This potentially has huge impacts on the performance of common tasks such as selecting all features which intersect a geometry. I've had a look, and unfortunately it's not trivial to fix this. I think the correct solution to this is to: - make QgsGeometryEngine accept and return QgsGeometry containers, not QgsAbstractGeometryV2 - store the generated geos representation of geometries within QgsGeometryPrivate inside the QgsGeometry container. This way it will be reusable between different geometry operations, and shared when QgsGeometry objects are copied. This will also have the benefit that if a geometry is prepared using geos then subsequent geos operations performed on that QgsGeometry and its shared copies will be much faster. - make QgsGeometry a friend class of QgsGeo, so that it can access QgsGeometryPrivate to retrieve or set the geos representation of the geometry as required An alternative (short term) solution would be to just cache the geos representation when geometry operations are called through the older QgsGeometry modification/relationship operations. This would be easier, but means that the API of QgsGeometryEngine will be stuck with the current design, and we won't be able to properly fix this until breaking the api for 3.0. Either way, I doubt this can be addressed within the remaining 3 days we have until release. Should we delay to address this? Release with the regression? Or am I missing something and there's an easier solution we could implement? Or even possibly this additional cost of recalculating the geos representation is trivial and can be ignored (maybe someone could test this with a little repeated intersection script)? Thoughts? Nyall ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer -- Dr. Marco Hugentobler Sourcepole - Linux & Open Source Solutions Weberstrasse 5, CH-8004 Zürich, Switzerland marco.hugentob...@sourcepole.ch http://www.sourcepole.ch Technical Advisor QGIS Project Steering Committee ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Out of curiosity, how long of a delay do you estimate would be needed to fix things? 4 weeks, 8 weeks? On 23 Jun 2015 09:09, "Nyall Dawson" wrote: > Hi all, > > Unfortunately, we've become aware of a serious performance regression > caused by the new geometry engine. Basically, the situation is that for all > geometry operations which rely on geos (think buffers, splits, spatial > relation operations such as intersects and within,... ) the geometry now > needs to be converted into a geos representation with *every* operation. In > the old geometry engine this conversion was done once, and the result > stored so that follow up operations would not need to recalculate it. This > potentially has huge impacts on the performance of common tasks such as > selecting all features which intersect a geometry. > > I've had a look, and unfortunately it's not trivial to fix this. I think > the correct solution to this is to: > > - make QgsGeometryEngine accept and return QgsGeometry containers, not > QgsAbstractGeometryV2 > - store the generated geos representation of geometries within > QgsGeometryPrivate inside the QgsGeometry container. This way it will be > reusable between different geometry operations, and shared when QgsGeometry > objects are copied. This will also have the benefit that if a geometry is > prepared using geos then subsequent geos operations performed on that > QgsGeometry and its shared copies will be much faster. > - make QgsGeometry a friend class of QgsGeo, so that it can access > QgsGeometryPrivate to retrieve or set the geos representation of the > geometry as required > > An alternative (short term) solution would be to just cache the geos > representation when geometry operations are called through the older > QgsGeometry modification/relationship operations. This would be easier, but > means that the API of QgsGeometryEngine will be stuck with the current > design, and we won't be able to properly fix this until breaking the api > for 3.0. > > Either way, I doubt this can be addressed within the remaining 3 days we > have until release. Should we delay to address this? Release with the > regression? Or am I missing something and there's an easier solution we > could implement? Or even possibly this additional cost of recalculating the > geos representation is trivial and can be ignored (maybe someone could test > this with a little repeated intersection script)? > > Thoughts? > > Nyall > > ___ > Qgis-developer mailing list > Qgis-developer@lists.osgeo.org > http://lists.osgeo.org/mailman/listinfo/qgis-developer > ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
+1 to full postpone for me. I don't like the idea of shipping something that is slower and snapping is slow. Those are core features of a GIS. Nathan On Tue, 23 Jun 2015 3:22 pm Nyall Dawson wrote: > On 23 June 2015 at 13:42, Martin Dobias wrote: > > > One heretic idea at the end - what others think about postponing the > > release of the new geometry architecture to 2.12 so that there is more > > time to address the current issues (fix performance, fix high memory > > consumption, clean up API, write unit tests). It seems to me that some > > of the issues would be difficult to address even if the release of > > 2.10 is moved by another week or so. > > I'd be a cautious +1 to this - but I'm also concerned about the impact > of reverting this work now. Scary stuff, but like you said, I'm not > confident we can get the new geometry work into a release-ready shape > in 3 days... (Also, who is expected to do this? AFAIK all the > sponsored bug fixing has been exhausted for 2.10). > > Nyall > ___ > Qgis-developer mailing list > Qgis-developer@lists.osgeo.org > http://lists.osgeo.org/mailman/listinfo/qgis-developer > ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
On 23 June 2015 at 13:42, Martin Dobias wrote: > One heretic idea at the end - what others think about postponing the > release of the new geometry architecture to 2.12 so that there is more > time to address the current issues (fix performance, fix high memory > consumption, clean up API, write unit tests). It seems to me that some > of the issues would be difficult to address even if the release of > 2.10 is moved by another week or so. I'd be a cautious +1 to this - but I'm also concerned about the impact of reverting this work now. Scary stuff, but like you said, I'm not confident we can get the new geometry work into a release-ready shape in 3 days... (Also, who is expected to do this? AFAIK all the sponsored bug fixing has been exhausted for 2.10). Nyall ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
Re: [Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi Agreed with Nyall. Apart from the above mentioned problems, there are newly introduced performance regressions with snapping: the cached geometries in 2.10 take double the amount of memory they used in 2.8, and snapping got much slower with more complex layers (compared to 2.8) - e.g. with [1]. The extra memory consumption is caused by the fact that providers provide geometry in WKB representation, which is then converted to new representation and WKB is kept. We should use just one representation and drop the usage of the other one - that means stop using WKB in providers and common code paths, so the WKB representation does not even get created (and cached). One heretic idea at the end - what others think about postponing the release of the new geometry architecture to 2.12 so that there is more time to address the current issues (fix performance, fix high memory consumption, clean up API, write unit tests). It seems to me that some of the issues would be difficult to address even if the release of 2.10 is moved by another week or so. Regards Martin [1] http://www.iucnredlist.org/spatial-data/MAMMALS_TERRESTRIAL.zip On Tue, Jun 23, 2015 at 10:09 AM, Nyall Dawson wrote: > Hi all, > > Unfortunately, we've become aware of a serious performance regression caused > by the new geometry engine. Basically, the situation is that for all > geometry operations which rely on geos (think buffers, splits, spatial > relation operations such as intersects and within,... ) the geometry now > needs to be converted into a geos representation with *every* operation. In > the old geometry engine this conversion was done once, and the result stored > so that follow up operations would not need to recalculate it. This > potentially has huge impacts on the performance of common tasks such as > selecting all features which intersect a geometry. > > I've had a look, and unfortunately it's not trivial to fix this. I think the > correct solution to this is to: > > - make QgsGeometryEngine accept and return QgsGeometry containers, not > QgsAbstractGeometryV2 > - store the generated geos representation of geometries within > QgsGeometryPrivate inside the QgsGeometry container. This way it will be > reusable between different geometry operations, and shared when QgsGeometry > objects are copied. This will also have the benefit that if a geometry is > prepared using geos then subsequent geos operations performed on that > QgsGeometry and its shared copies will be much faster. > - make QgsGeometry a friend class of QgsGeo, so that it can access > QgsGeometryPrivate to retrieve or set the geos representation of the > geometry as required > > An alternative (short term) solution would be to just cache the geos > representation when geometry operations are called through the older > QgsGeometry modification/relationship operations. This would be easier, but > means that the API of QgsGeometryEngine will be stuck with the current > design, and we won't be able to properly fix this until breaking the api for > 3.0. > > Either way, I doubt this can be addressed within the remaining 3 days we > have until release. Should we delay to address this? Release with the > regression? Or am I missing something and there's an easier solution we > could implement? Or even possibly this additional cost of recalculating the > geos representation is trivial and can be ignored (maybe someone could test > this with a little repeated intersection script)? > > Thoughts? > > Nyall > > > ___ > Qgis-developer mailing list > Qgis-developer@lists.osgeo.org > http://lists.osgeo.org/mailman/listinfo/qgis-developer ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer
[Qgis-developer] Potentially serious performance regression in new geometry - should 2.10 be delayed?
Hi all, Unfortunately, we've become aware of a serious performance regression caused by the new geometry engine. Basically, the situation is that for all geometry operations which rely on geos (think buffers, splits, spatial relation operations such as intersects and within,... ) the geometry now needs to be converted into a geos representation with *every* operation. In the old geometry engine this conversion was done once, and the result stored so that follow up operations would not need to recalculate it. This potentially has huge impacts on the performance of common tasks such as selecting all features which intersect a geometry. I've had a look, and unfortunately it's not trivial to fix this. I think the correct solution to this is to: - make QgsGeometryEngine accept and return QgsGeometry containers, not QgsAbstractGeometryV2 - store the generated geos representation of geometries within QgsGeometryPrivate inside the QgsGeometry container. This way it will be reusable between different geometry operations, and shared when QgsGeometry objects are copied. This will also have the benefit that if a geometry is prepared using geos then subsequent geos operations performed on that QgsGeometry and its shared copies will be much faster. - make QgsGeometry a friend class of QgsGeo, so that it can access QgsGeometryPrivate to retrieve or set the geos representation of the geometry as required An alternative (short term) solution would be to just cache the geos representation when geometry operations are called through the older QgsGeometry modification/relationship operations. This would be easier, but means that the API of QgsGeometryEngine will be stuck with the current design, and we won't be able to properly fix this until breaking the api for 3.0. Either way, I doubt this can be addressed within the remaining 3 days we have until release. Should we delay to address this? Release with the regression? Or am I missing something and there's an easier solution we could implement? Or even possibly this additional cost of recalculating the geos representation is trivial and can be ignored (maybe someone could test this with a little repeated intersection script)? Thoughts? Nyall ___ Qgis-developer mailing list Qgis-developer@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/qgis-developer