RE: [mapserver-users] ONE PASS QUERY (RFC 52) - FEATURE OR BUG?
I think that would probably eliminate the WFS issues we saw in 5.6 where OGC filters didn't map cleanly to a core query as we'd essentially be adding a new and more flexible core query. I'm not so sure that it would address a common use case for old behavior. For example, it seems that folks like to start with a core query against a number of layers (e.g. find me all restaurants, bars, and coffee shops within a certain bbox) and then they (through subsequent UI interactions) augment that original set of results by adding or removing features in a number of layers based on new queries, often a point query. I think that would be very hard to manage through a single filter. It's easy to do, however, via a list of feature IDs.

I still think adding a filterObj and a msLayerWhichShapesFiltered() is a good idea and want to see this in 6.0. In addition, I was thinking that:

1) we could use shapeindex to hold a global feature ID (OID, row id, etc...), and tileindex (for non-tiled shapefile/raster layers) to hold a result set specific row id

2) driver specific version of msLayerNextShape(...) would set shapeindex and optionally tileindex (for example, the shapefile driver would only set the former, postgis would set both)

3) driver specific versions of msLayerGetShape(...) would be charged with making the decision on doing either a random access query (select ... where oid=x) or leveraging the existing result set based on the passed tileIndex (msLayerResultsGetShape(...) goes away)

4) resurrect the old query file writer (in addition to the new one). That code wrote the shapeindex and tileindex but we'd only cache the tileindex for tiled shapefile/raster layers

5) resurrect the old query file reader and it would load a query in a state that wouldn't have the tileindexes so old, slow processing would result for RDBMS layers

Knowing when the result set processing (e.g. via WFS, templates and query maps) should set things up for old vs. new is a bit tricky.
Right now there's a flag set in the result cache that I believe only the GML writer respects. It shows that it's possible to support both worlds though. That flag could trigger old/new file IO and result cache processing. I need to resolve this with the parallel discussion Tamas and Frank were having.

My 2 cents anyway...

Steve

From: mapserver-users-boun...@lists.osgeo.org [mapserver-users-boun...@lists.osgeo.org] On Behalf Of Paul Ramsey [pram...@cleverelephant.ca]
Sent: Monday, March 22, 2010 11:25 PM
To: Lime, Steve D (DNR)
Cc: mapserver-users@lists.osgeo.org; BrainDrain
Subject: Re: [mapserver-users] ONE PASS QUERY (RFC 52) - FEATURE OR BUG?

Is it better to back out to the old behaviour, or would defining a filter object that allows complex query logic meet the need in a more direct way? (I.e., is running multiple queries a feature or a workaround for an even older limitation?)

P.

On Mon, Mar 22, 2010 at 9:21 PM, Lime, Steve D (DNR) steve.l...@state.mn.us wrote:

I think we're in need of an RFC 52a. Clearly the compound query handling the old approach afforded is of value to a group of users, and that wasn't accounted for in the initial RFC. The workaround Assefa had to do with WFS and a certain subset of OGC filters at the sprint is evidence that the approach was even used in the core code (I wasn't aware of that at the time). We (in 5.6.3) developed a workaround that retains two sets of indexes, one suitable for random access and one for a specific result set (e.g. cursor). It uses the already present tileindex property of a shapeObj to store the latter. I think we can have the best of both here by storing the two indexes, and potentially we can revert to a single getShape() function in MapScript and revive the old queryfile format as an option as well. It just needs to be planned for, now that the full impacts are better understood.
Steve

From: mapserver-users-boun...@lists.osgeo.org [mapserver-users-boun...@lists.osgeo.org] On Behalf Of Frank Warmerdam [warmer...@pobox.com]
Sent: Monday, March 22, 2010 10:16 PM
To: Tamas Szekeres
Cc: mapserver-users@lists.osgeo.org; BrainDrain
Subject: Re: [mapserver-users] ONE PASS QUERY (RFC 52) - FEATURE OR BUG?

Tamas Szekeres wrote:
> In my understanding with the original approach the driver should:
> 1. Retain the result set of the queries at the layer (ie. in the layerinfo structure) until the layer is open and no subsequent whichShapes is called to 'invalidate' the query.

Tamas,

Your point here is that the query result should live until invalidated by another whichShapes, right? I would agree with that, but draw on a layer does do a whichShapes, right? So a draw is expected to invalidate a query, right?

> 2. Provide such index in shapeObj which would allow to retrieve in a subsequent resultsGetShape within the result set.

ok

> 3. Retain the random access
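Points 1 to 3 of Steve's proposal at the top of this message (a global feature ID in shapeindex, a result-set row in tileindex, and a single getShape that chooses between them) can be modelled roughly as below. This is only a toy sketch in Python under stated assumptions: `ToyLayer`, `Shape`, `which_shapes`, `next_shapes`, `get_shape` and the `random_reads` counter are hypothetical names invented for illustration, not MapServer or MapScript calls, and the stale-row check is one possible design rather than anything specified in the thread.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Shape:
    shapeindex: int                   # global feature ID (OID, row id, ...)
    tileindex: Optional[int] = None   # row within the retained result set, if any
    name: str = ""


class ToyLayer:
    """Hypothetical stand-in for a driver; NOT the MapServer API."""

    def __init__(self, rows: Dict[int, str], keeps_result_set: bool):
        self.rows = rows                        # oid -> attribute value
        self.keeps_result_set = keeps_result_set
        self.result_set: List[int] = []         # oids of the last query, in cursor order
        self.random_reads = 0                   # counts "select ... where oid = x" lookups

    def which_shapes(self, oids: List[int]) -> None:
        """Run a query; a new query replaces the previous result set."""
        self.result_set = [oid for oid in oids if oid in self.rows]

    def next_shapes(self) -> List[Shape]:
        """Point 2: always set shapeindex; set tileindex only if results are retained."""
        return [
            Shape(oid, row if self.keeps_result_set else None, self.rows[oid])
            for row, oid in enumerate(self.result_set)
        ]

    def get_shape(self, shapeindex: int, tileindex: Optional[int] = None) -> Shape:
        """Point 3: one entry point that picks result-set or random access."""
        if (
            tileindex is not None
            and tileindex < len(self.result_set)
            and self.result_set[tileindex] == shapeindex
        ):
            # Fast path: the row is still present in the retained result set.
            return Shape(shapeindex, tileindex, self.rows[shapeindex])
        # Fallback: random access by global ID (the old, slower behaviour).
        self.random_reads += 1
        return Shape(shapeindex, None, self.rows[shapeindex])
```

In this model a caller that kept both indexes from an earlier query still gets the right feature after the result set has been replaced; it just pays for a random read, which is roughly the "old, slow processing" described in point 5.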
Re: [mapserver-users] ONE PASS QUERY (RFC 52) - FEATURE OR BUG?
2010/3/23 Frank Warmerdam warmer...@pobox.com

>> 4. Preserve the behaviour of keeping separate set of results for separate layer instances. In this regard a query on one layer should not invalidate the results for a different layer instance of the same driver.
>
> This seems to be a lot to expect. We go to significant effort with the connection pooling to allow reuse of a connection for different layers, and in effect in many drivers this connection also carries a bunch of context with it. Certainly in the case of OGR an OGRLayer retains a concept of current query result, but it can be invalidated by lots of operations other than ResetReading() and GetNextFeature(). I would imagine this is true to a greater or lesser degree of other drivers that pool connections.

Frank,

It may be driver dependent, but I tend to think we should open a new connection/session/dataset, whatever, for those queries whose results are retained at the driver, corresponding to a particular result set stored at layer level. This is not needed for queries whose results are not retained (like drawing the layers / background), and the connection pool approach could continue to be used in those cases. That's why I suggested an additional parameter in whichShapes to define the purpose of the query.

>> 5. Creating a clone of a layer should provide to use a separate query (by keeping the results intact on the original layer). This would be essential for msDrawQueryLayer to work when drawing the background before the highlighted features.
>
> This is also quite impractical for some implementations - certainly for OGR.

I'm quite unsure how msDrawQueryLayer would ever work with OGR then. In this case, MS_HILITE would require drawing the entire layer first and then the highlighted shapes from the result set. Drawing the entire layer (regardless of whether it's happening on a clone) would reset the spatial filter on the driver with the same connection, and a subsequent resultsGetShape would fail to retrieve the same features.

> In retrospect, I'm not all that confident that we really considered the impact of RFC 52 on use cases such as those you raise. I certainly didn't understand these impacts. What is less clear to me is where to go from here. RFC 52 was put in place because the old approach was giving terrible performance in some cases.

This is really a good question ;-) RFC 52 is out in the stable branch and prevents a number of users from upgrading (we don't know the exact number though). Leaving this version as it stands would draw more people into using this version of the API, while we foresee some API change shortly. I think our best chance would be to provide the original version at the drivers in parallel with the current one; that means both getShape and layerGetShape should work properly. This should probably be controlled by a layer processing option at driver level. It would be reasonable to switch the default to the 2 pass approach in the stable branch, while users could still override this in their mapfiles.

> But if we put the expectations you list into place there is no way it can be made fast on OGR short of maintaining distinct OGRDataset instances for each query in addition to the one used to draw the layer. This could cause various performance and resource problems.

While I don't exactly see the performance impacts, it may require a bit more memory for sure. However, since we intend to reuse the results of a query, it would definitely imply storing the corresponding reference to the OGRDataset for the interval during which a subsequent access to the query may happen. I'm also hesitant to think that the 1pass option is better for all OGR data sources in all cases. Having a couple of test scripts to see the performance difference of the same query with these 2 methods would be helpful.

Best regards,

Tamas

_______________________________________________
mapserver-users mailing list
mapserver-users@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/mapserver-users
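The disagreement in this message, whether a background draw on a shared connection clobbers a retained query and whether a dedicated session avoids that, can be illustrated with a small model. The sketch below is an assumption-laden toy in Python: `Session`, `which_shapes` and `results_get_shape` are hypothetical names, a "session" here is just an object holding one current result set (loosely like a cursor), and nothing in it reflects how OGR or any real driver is implemented.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]


class Session:
    """Hypothetical driver session holding ONE current result set, like a cursor."""

    def __init__(self, features: Dict[int, Tuple[float, float]]):
        self.features = features          # oid -> (x, y)
        self.current: List[int] = []      # oids selected by the last spatial filter

    def which_shapes(self, bbox: BBox) -> None:
        """Apply a spatial filter; this replaces whatever was selected before."""
        minx, miny, maxx, maxy = bbox
        self.current = [
            oid for oid, (x, y) in sorted(self.features.items())
            if minx <= x <= maxx and miny <= y <= maxy
        ]

    def results_get_shape(self, row: int) -> int:
        """Fetch by position in the current result set (one-pass style lookup)."""
        return self.current[row]


features = {1: (0.0, 0.0), 2: (5.0, 5.0), 3: (9.0, 9.0)}
full_extent = (0.0, 0.0, 10.0, 10.0)
query_box = (4.0, 4.0, 10.0, 10.0)

# Shared (pooled) session: the background draw replaces the query's result set.
shared = Session(features)
shared.which_shapes(query_box)            # query selects oids 2 and 3
shared.which_shapes(full_extent)          # draw the whole layer as background
stale = shared.results_get_shape(0)       # row 0 now refers to oid 1, not oid 2

# Dedicated session for the retained query; a separate one is used for drawing.
query_session = Session(features)
draw_session = Session(features)
query_session.which_shapes(query_box)
draw_session.which_shapes(full_extent)
kept = query_session.results_get_shape(0)  # still oid 2
```

The second half is the trade-off Frank describes: the retained query survives, at the cost of keeping an extra session (and whatever memory and handles it holds) alive for as long as the results may be read again.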
Re: [mapserver-users] ONE PASS QUERY (RFC 52) - FEATURE OR BUG?
Hi,

I second these concerns, absolutely. With regard to RFC 52, there have been a couple of breaking changes in MapServer 5.6 which prevent me from upgrading to this version in my existing projects. I've already tried to ring the bell on this topic with a couple of posts (see below), but it seems the use case described here (keeping long-term mapObj references) is not widely used and falls outside the general area of interest:

http://n2.nabble.com/Ready-for-5-6-2-td4743344.html#a4746772
http://n2.nabble.com/OGR-single-pass-query-issues-was-Ready-for-5-6-2-td4753764.html

Prompted by the issues you raise below, I've studied RFC 52 (http://mapserver.org/development/rfc/ms-rfc-52.html) again to see the objectives, and it seems the current implementation at the drivers is getting out of sync with it. In my understanding, with the original approach the driver should:

1. Retain the result set of the queries at the layer (i.e. in the layerinfo structure) for as long as the layer is open and no subsequent whichShapes is called to 'invalidate' the query.
2. Provide an index in shapeObj which allows the shape to be retrieved from within the result set by a subsequent resultsGetShape.
3. Retain the random access behaviour (getShape) for backward compatibility, in parallel with resultsGetShape.

Since the RFC doesn't explicitly state the opposite, the drivers should also:

4. Preserve the behaviour of keeping separate sets of results for separate layer instances. In this regard, a query on one layer should not invalidate the results for a different layer instance of the same driver.
5. Creating a clone of a layer should allow a separate query to be used (keeping the results intact on the original layer). This would be essential for msDrawQueryLayer to work when drawing the background before the highlighted features.
6. Using a drawQuery should not invalidate the results of a previous query.
7. Drawing the map should not invalidate the results of a previous query.
Further notes:

To provide correct implementations at the drivers, it seems we should give the driver a bit more information so it can distinguish the purpose of a query. For example, whichShapes should take an additional parameter (querymode) with the following predefined values:

MS_QUERY_SEQUENTIAL: The returned shapes will be retrieved by nextShape (no subsequent (results)GetShape will happen). This would be used by the normal drawing operations.

MS_QUERY_RANDOM: The results would be retrieved by nextShape. The driver would provide OIDs as feature indexes so that the features can be retrieved with the original (2 pass) behaviour (getShape). No features are retained at the driver for further access.

MS_QUERY_PRESERVE: The results would be retrieved by nextShape and resultsGetShape. The drivers should store the result set at the layer for further retrieval (1 pass).

The features retrieved by MS_QUERY_PRESERVE should be kept in a separate location, either in layerinfo or in a separate structure (a queryinfo, for example). The latter would isolate the results from the layer's opened state. MS_QUERY_SEQUENTIAL or MS_QUERY_RANDOM should not invalidate the results in queryinfo.

We should also provide the user with an option to select the default, MS_QUERY_RANDOM or MS_QUERY_PRESERVE, at layer level. Probably a layer processing option would be sufficient.

Best regards,

Tamas

2010/3/21 BrainDrain paulborod...@gmail.com

Please read carefully.

'old style' (two pass) query

advantages:
- in mapscript (c#) a layer's query methods are CUMULATIVE (relative to other layers' queries). Query result (success/failure) has no effect on other layers when I call map.savequery.
- In this case (query file contains just oid's) I CAN create COMPLEX queries by applying different parameters and mixing query types
- Query result is an oid (or row index in a shapefile) - it's cool for creating advanced attr. postqueries

and disadvantages (insignificant):
- query file has a binary (closed) format? no output to string (only to file)
- query file is sensitive to layer indexes (map file cleanups/refinements/some normalization can cause query file incompatibility)
- if server data changed, refreshing the map image by an old url to cgi with the queryfile parameter doesn't perform a requery (need custom http handler/module)

AND ONE PASS QUERY (RFC 52)

advantages:
- open query file format
- speedup (no random access)

disadvantages (huge):
- a layer's query methods are NOT CUMULATIVE (relative to other layers' queries). I CAN'T create COMPLEX queries (by using different attribute/spatial queries (metadata driven) for different layers)!
- query result - some (shape)indexes, and when I'm querying many layers in sequence I NEED TO PRESERVE A QUERY FILE FOR EVERY LAYER (on success result) (!) and then on demand for feature attributes (for some layer) (delayed request - it's a normal behavior, e.g. I'm requesting only shape names for some layer which has results - to build results
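The querymode parameter proposed in the further notes of this message could look something like the following. This is a hedged sketch in Python, not driver code: the enum values reuse the MS_QUERY_* names from the proposal, but `QueryMode`, `ToyDriver`, its methods and the `queryinfo` attribute are all hypothetical, and the only rule modelled is that a SEQUENTIAL or RANDOM query leaves a preserved result set alone.

```python
from enum import Enum
from typing import Dict, List, Optional


class QueryMode(Enum):
    SEQUENTIAL = "MS_QUERY_SEQUENTIAL"  # drawing: nextShape only
    RANDOM = "MS_QUERY_RANDOM"          # two pass: oids returned, nothing retained
    PRESERVE = "MS_QUERY_PRESERVE"      # one pass: result set retained for later lookup


class ToyDriver:
    """Hypothetical driver; 'queryinfo' is kept apart from the layer's open state."""

    def __init__(self, rows: Dict[int, str]):
        self.rows = rows                              # oid -> attribute value
        self.queryinfo: Optional[List[int]] = None    # retained result set, if any

    def which_shapes(self, oids: List[int], mode: QueryMode) -> List[int]:
        """Select features; only a PRESERVE query may replace the retained results."""
        selected = [oid for oid in oids if oid in self.rows]
        if mode is QueryMode.PRESERVE:
            self.queryinfo = list(selected)
        return selected

    def get_shape(self, oid: int) -> str:
        """Random access by feature ID (the two-pass path)."""
        return self.rows[oid]

    def results_get_shape(self, row: int) -> str:
        """Lookup by position in the retained result set (the one-pass path)."""
        if self.queryinfo is None:
            raise LookupError("no preserved query on this layer")
        return self.rows[self.queryinfo[row]]
```

With this split, a map draw issued as SEQUENTIAL between a PRESERVE query and a later `results_get_shape` call does not disturb the preserved rows, which is the behaviour items 6 and 7 of the list ask for.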
Re: [mapserver-users] ONE PASS QUERY (RFC 52) - FEATURE OR BUG?
Tamas Szekeres wrote:
> In my understanding with the original approach the driver should:
> 1. Retain the result set of the queries at the layer (ie. in the layerinfo structure) until the layer is open and no subsequent whichShapes is called to 'invalidate' the query.

Tamas,

Your point here is that the query result should live until invalidated by another whichShapes, right? I would agree with that, but draw on a layer does do a whichShapes, right? So a draw is expected to invalidate a query, right?

> 2. Provide such index in shapeObj which would allow to retrieve in a subsequent resultsGetShape within the result set.

ok

> 3. Retain the random access behaviour (getShape) for backward compatibility in parallel to resultsGetShape.

ok

> Since the RFC doesn't contain explicit note about the opposite, the drivers should also:
> 4. Preserve the behaviour of keeping separate set of results for separate layer instances. In this regard a query on one layer should not invalidate the results for a different layer instance of the same driver.

This seems to be a lot to expect. We go to significant effort with the connection pooling to allow reuse of a connection for different layers, and in effect in many drivers this connection also carries a bunch of context with it. Certainly in the case of OGR an OGRLayer retains a concept of current query result, but it can be invalidated by lots of operations other than ResetReading() and GetNextFeature(). I would imagine this is true to a greater or lesser degree of other drivers that pool connections.

> 5. Creating a clone of a layer should provide to use a separate query (by keeping the results intact on the original layer). This would be essential for msDrawQueryLayer to work when drawing the background before the highlighted features.

This is also quite impractical for some implementations - certainly for OGR.

> 6. Using a drawQuery should not invalidate the results of a previous query.

I don't know much about drawQuery, but it does seem plausible to ask that drawQuery should not invalidate the query it is drawing.

> 7. Drawing the map should not invalidate the results of a previous query.

But drawing maps uses the feature access machinery like whichShapes, doesn't it? How can we expect map drawing not to invalidate a query?

In retrospect, I'm not all that confident that we really considered the impact of RFC 52 on use cases such as those you raise. I certainly didn't understand these impacts. What is less clear to me is where to go from here. RFC 52 was put in place because the old approach was giving terrible performance in some cases. But if we put the expectations you list into place, there is no way it can be made fast on OGR short of maintaining distinct OGRDataset instances for each query in addition to the one used to draw the layer. This could cause various performance and resource problems.

Best regards,
--
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent