Some weeks ago, I asked about the UniqueVisitor on the user mailing list
(see [1]) for building queries similar to:

SELECT DISTINCT(name) FROM osmadmlv2
ORDER BY name asc OFFSET 2 LIMIT 4

The conclusion is that UniqueVisitor can't currently produce this result.

It was suggested that the optimized implementation on JDBCDataStore could
be modified to take sorting into account, while a TreeSet could be used for
the general UniqueVisitor implementation.

However, I think there is a deeper, conceptual problem with this approach,
and I am thinking about a cleaner way to solve it. The real problem is
that a visitor is supposed to visit *only* the elements of a feature
collection, but the optimized JDBCDataStore implementation runs a new
query, potentially bringing new elements into the result. This is necessary
because ORDER BY and DISTINCT have to be calculated first, and only then are
offset and limit applied (otherwise the result is totally different). So this
probably does not really fit a visitor and should be designed in a
different way.

To make it clearer, consider the following records:
["sun", "bean", "blue", "pea", "red", "pea", "red", "blue", "green",
"clock"].

The original query will produce:
["clock", "green", "pea", "red"]

Implementing the UniqueVisitor using a TreeSet, together with the proper
Query (SortBy, startIndex = 2, maxFeatures = 4), would produce:
["blue", "clock", "green", "pea"] (sorted and paged collection) ->
["blue", "clock", "green", "pea"] (visited collection)
which is a totally different result, because paging is applied before the
duplicates are removed.
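
The difference between the two orderings can be reproduced with a minimal
sketch in plain Java. This is not GeoTools code: the class and method names
below are hypothetical, and a list of strings stands in for the "name"
attribute of the features. It only illustrates that de-duplicating before
paging and paging before de-duplicating return different lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of GeoTools.
class DistinctPagingSketch {

    // DISTINCT and ORDER BY first, then OFFSET/LIMIT (what the SQL query does).
    static List<String> distinctThenPage(List<String> names, int offset, int limit) {
        return new TreeSet<>(names).stream()   // TreeSet sorts and removes duplicates
                .skip(offset)
                .limit(limit)
                .collect(Collectors.toList());
    }

    // ORDER BY and OFFSET/LIMIT first, then de-duplicate only the returned page
    // (what a TreeSet-based visitor over a sorted, paged collection would see).
    static List<String> pageThenDistinct(List<String> names, int offset, int limit) {
        List<String> page = names.stream()
                .sorted()
                .skip(offset)
                .limit(limit)
                .collect(Collectors.toList());
        return new ArrayList<>(new TreeSet<>(page));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("sun", "bean", "blue", "pea", "red",
                "pea", "red", "blue", "green", "clock");
        System.out.println(distinctThenPage(names, 2, 4)); // [clock, green, pea, red]
        System.out.println(pageThenDistinct(names, 2, 4)); // differs from the line above
    }
}
```

The second method can never return values that fall outside the page it was
given, which is why the optimized JDBCDataStore path has to issue its own
query instead of visiting the already paged collection.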

So maybe this should be implemented in a different way (a Hint was
suggested in the past, see [2]). Note that I am not yet volunteering
to implement it, but I might do so depending on the expected amount of
work.

Does this sound reasonable to you?

Best regards,

César Martínez



[1] http://osgeo-org.1560.x6.nabble.com/UniqueVisitor-and-sorted-and-paged-queries-td5135576.html
[2] http://osgeo-org.1560.x6.nabble.com/distinct-query-hint-td5049204.html

-- 
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
   César Martínez Izquierdo
   GIS developer
   -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
   Blog: http://geotechnotes.wordpress.com/
   ETC-SIA: http://sia.eionet.europa.eu/
   Universitat Autònoma de Barcelona (SPAIN)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
_______________________________________________
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel