On Thu, Apr 2, 2015 at 11:51 PM, Andrea Aime <andrea.a...@geo-solutions.it>
wrote:

> On Thu, Apr 2, 2015 at 7:47 PM, Martin Davis <mtncl...@gmail.com> wrote:
>
>> I've now identified the problem causing the slow WFS performance.  As
>> Andrea suspected, it is in the ArcSDEQuery.calculateResultCount() method,
>> called before the WFS query queries the data.  The issue is that (on our
>> Oracle SDE instance at least) the SeQuery.calculateTableStatistics() call
>> is extremely slow (likely it is doing a full table scan rather than using
>> the spatial index).
>>
>> This shows up very obviously in our case, since we have a layer with 11M
>> features in it. But it's impacting performance for every WFS query (even
>> scans of small tables seem to be much slower than the actual data
>> retrieval).  (This actually makes me wonder if the SDE API is pulling the
>> data over the wire to count it!).
>>
>> Here are the actual stats from a test mockup:
>>
>> Testing layer MTA_SPATIAL.MTA_MINERAL_PLACER_GRID_POLY
>> Fetching table stats...
>> Row count = 1560  ----  955.04 s
>> Querying data...
>> Row count = 1560  ----  2.052 s
>>
>
> Wow... gross :-)
> At that point you can indeed fetch all the data, maybe telling SDE that
> you only want one small column (to reduce data transfer)
>

Nope, have to include the geometry column, in order to be able to run the
spatial query (the SDE API requires the geometry column to be specified).

>
>
>>
>> We're going to open a ticket with ESRI about this, but I don't have much
>> optimism they'll do anything for us (given that the SDE API is sunsetting).
>>
>> So what are the options on the GeoServer side?  It might be always faster
>> to simply run the query twice, once to count and once for the data.  In
>> fact, there is already code in the calculateResultCount to do this, for
>> Oracle versioned layers. Perhaps this should be extended for *all* Oracle
>> layers? (Note that I still think there may be a bug in this code to do with
>> the order of API calls, but that can be fixed at the same time).
>>
>
> Could be but.. I'm honestly blown away that this would be the way to go...
> if it is, maybe we should add some flag to control it?
> Or is it always going to be slower to try to get the stats?
>

If the API is indeed doing a client-side evaluation of the spatial filter,
then the stats are always going to be no faster than the query (and a LOT
slower for a highly selective query).  I tested against three tables with
very different datasets (with 10s, 1000s and Ms of records), and the stats
call was always slower than the actual query.
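
A comparison like this can be reproduced with a small timing harness. The following is only a minimal sketch: the class name `CountTiming` and the two placeholder suppliers are hypothetical, and the real SDE stats call and data query would have to be plugged in where indicated.

```java
import java.util.function.LongSupplier;

public class CountTiming {
    /** One timed run: the row count returned and the elapsed wall-clock seconds. */
    public static final class Timed {
        public final long rowCount;
        public final double seconds;

        Timed(long rowCount, double seconds) {
            this.rowCount = rowCount;
            this.seconds = seconds;
        }
    }

    /** Runs the given counting operation once and measures how long it takes. */
    public static Timed time(LongSupplier counter) {
        long start = System.nanoTime();
        long rows = counter.getAsLong();
        double seconds = (System.nanoTime() - start) / 1e9;
        return new Timed(rows, seconds);
    }

    public static void main(String[] args) {
        // Placeholders (assumptions): replace with the real table-stats call
        // and with a loop that fetches and counts the rows of the data query.
        LongSupplier viaStats = () -> 1560L;
        LongSupplier viaQuery = () -> 1560L;

        Timed stats = time(viaStats);
        System.out.printf("Fetching table stats...%nRow count = %d  ----  %.3f s%n",
                stats.rowCount, stats.seconds);

        Timed query = time(viaQuery);
        System.out.printf("Querying data...%nRow count = %d  ----  %.3f s%n",
                query.rowCount, query.seconds);
    }
}
```

Both runs should report the same row count; only the elapsed times are expected to differ.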

If this IS the case, then it doesn't seem like there's much point to a
flag, since every query with a spatial filter should use the "query twice"
strategy for best performance?

I don't know if this is the case on non-Oracle databases, so it might be
worth limiting this code path to only Oracle (for now - easy to extend to
other DBs if their drivers are found to have the same behaviour).
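
A rough sketch of the selection logic described above, not the actual GeoTools code. All names here (`ResultCountSketch`, `resultCount`, `countByFetching`, the `isOracle` and `hasSpatialFilter` flags) are hypothetical, and the query and stats calls are passed in as plain suppliers rather than real SDE API calls.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class ResultCountSketch {
    /** The "count" pass of the query-twice strategy: drain the rows and count them. */
    public static long countByFetching(Iterator<?> rows) {
        long n = 0;
        while (rows.hasNext()) {
            rows.next();
            n++;
        }
        return n;
    }

    /**
     * Chooses how to compute the result count. On Oracle with a spatial filter
     * the query itself is run and counted; otherwise the stats call is used.
     */
    public static long resultCount(boolean isOracle, boolean hasSpatialFilter,
                                   Supplier<Iterator<?>> runQuery,
                                   LongSupplier tableStats) {
        if (isOracle && hasSpatialFilter) {
            return countByFetching(runQuery.get());
        }
        return tableStats.getAsLong();
    }

    public static void main(String[] args) {
        // Stand-in data: three fake rows for the query, a fixed value for the stats call.
        long count = resultCount(true, true,
                () -> Arrays.asList("row1", "row2", "row3").iterator(),
                () -> 3L);
        System.out.println("Row count = " + count);
    }
}
```

Keeping the condition in one place would make it easy to extend the check to other databases later if their drivers show the same behaviour.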

Also, we were originally testing using the 9.3 JARs, against a 10.2 SDE.
Today I have verified that the 10.1 JARs have the same issue.


>
> Cheers
> Andrea
_______________________________________________
Geoserver-users mailing list
Geoserver-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-users
