Re: [Geoserver-devel] Some follow up to GSIP-155 with a larger data dir

2017-02-17 Thread Andrea Aime
On Sat, Feb 11, 2017 at 3:37 PM, Andrea Aime wrote:
>
> A few numbers on this beast with the default catalog facade:
>
>    - Cold startup time: I've seen values between 230 seconds and 290
>    seconds (not sure why such a high variability)
>    - Hot startup time: 59 seconds
>
Aaand down to 45 seconds using this pull request:

https://github.com/geoserver/geoserver/pull/2115

Can anyone review? :-)

Cheers
Andrea

-- 
==
GeoServer Professional Services from the experts! Visit
http://goo.gl/it488V for more information.
==

Ing. Andrea Aime
@geowolf
Technical Lead

GeoSolutions S.A.S.
Via di Montramito 3/A
55054  Massarosa (LU)
phone: +39 0584 962313
fax: +39 0584 1660272
mob: +39  339 8844549

http://www.geo-solutions.it
http://twitter.com/geosolutions_it

*NOTICE PURSUANT TO D.Lgs. 196/2003*


The information in this message and/or attachments, is intended solely for
the attention and use of the named addressee(s) and may be confidential or
proprietary in nature or covered by the provisions of privacy act
(Legislative Decree June, 30 2003, no.196 - Italy's New Data Protection
Code). Any use not in accord with its purpose, any disclosure, reproduction,
copying, distribution, or either dissemination, either whole or partial, is
strictly forbidden except previous formal approval of the named
addressee(s). If you are not the intended recipient, please contact
immediately the sender by telephone, fax or e-mail and delete the
information in this message that has been received in error. The sender
does not give any warranty or accept liability as the content, accuracy or
completeness of sent messages and accepts no responsibility  for changes
made after they were sent or for other risks which arise as a result of
e-mail transmission, viruses, etc.

Geoserver-devel mailing list
Geoserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geoserver-devel


[Geoserver-devel] Some follow up to GSIP-155 with a larger data dir

2017-02-11 Thread Andrea Aime
Hi,
after GSIP-155 landed I've also merged
https://osgeo-org.atlassian.net/browse/GEOS-7954
which skips the datastore connection checks if the capabilities generation
is set up to skip misconfigured layers.

Then I worked on some little tweaks here and there, updated the catalog
bulk load tool to optionally perform deep copies of workspaces, and created
a new configuration that has a lot of stores and layers of most kinds.

In particular, I took the release data dir, merged everything in it into a
single workspace, and then cloned said workspace 1000 times.
This resulted in a data directory with:

   - 1001 workspaces
   - 11000 stores, a mix of shapefiles, postgis, directory of shapefiles,
   single tiff, arcgrid, mosaics
   - 42000 layers and 42000 associated tile layers

A few numbers on this beast with the default catalog facade:

   - Cold startup time: I've seen values between 230 seconds and 290
   seconds (not sure why such a high variability)
   - Hot startup time: 59 seconds
   - Load testing a states layer stored in postgis (same as GSIP-155
   benchmark), 35ms, that is, 1ms more than the GSIP-155 test with 10k layers

I've also loaded it into JDBCConfig (conversion took 37 mins), here are
the numbers for it:

   - Cold startup time: 290 seconds
   - Hot startup time: 120 seconds
   - Load testing a states layer stored in postgis (same as GSIP-155
   benchmark), 117ms (same as the timings in GSIP-155)

The JDBCConfig startup time is almost fully spent querying PostgreSQL like
crazy for the internal layers associated to GWC tile layers, and finding
which layers should not be cached in the in-memory GWC cache (this last one
makes it do a full scan of all layers).

This marks the end of these investigations/optimizations for the time
being: I see no more obvious ways to make the default catalog facade faster,
besides trying to parallelize loading the catalog itself (which is hard
enough that I won't try to do it in the short term).

Cheers
Andrea

PS: someone offline told me this work is killing JDBCConfig... I don't
quite agree.
It's just showing that JDBCConfig is doing too much work to translate layer
names into internal ids, and that some caches from name to id are needed to
make it competitive when serving OGC requests (along with some clustering
support to drop the name -> id cache when names change).
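
To make the name -> id cache idea concrete, here is a minimal sketch in
Python (chosen only for brevity; a real version would be Java inside
JDBCConfig). Everything in it is hypothetical: NameToIdCache, id_for,
on_rename and the lookup callable are illustration names, not GeoServer or
JDBCConfig APIs, and the lookup stands in for whatever database query
currently resolves a layer name.

```python
import threading
from typing import Callable, Dict


class NameToIdCache:
    """Caches layer-name -> internal-id lookups in front of a slow backend."""

    def __init__(self, lookup: Callable[[str], str]) -> None:
        self._lookup = lookup            # hypothetical slow lookup, e.g. a SQL query
        self._ids: Dict[str, str] = {}
        self._lock = threading.Lock()    # OGC requests arrive concurrently
        self.misses = 0                  # backend hits, kept only for illustration

    def id_for(self, name: str) -> str:
        with self._lock:
            cached = self._ids.get(name)
            if cached is not None:
                return cached
            # Cache miss: ask the backend once and remember the answer.
            self.misses += 1
            layer_id = self._lookup(name)
            self._ids[name] = layer_id
            return layer_id

    def on_rename(self, old_name: str) -> None:
        """Drop one entry; in a cluster every node would need to receive this."""
        with self._lock:
            self._ids.pop(old_name, None)

    def clear(self) -> None:
        """Drop everything, e.g. after a configuration reload."""
        with self._lock:
            self._ids.clear()
```

The point of on_rename is the clustering part: the cache is only safe if
every node hears about name changes, so the invalidation event would have
to be broadcast, which this sketch does not attempt.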

JDBCConfig likely remains quite a bit faster in terms of startup time for
configurations that have few cached layers, but if the GWC configuration
could also be moved inside the database then it would likely start up in
20-30 seconds no matter how many layers are in the catalog, while the
startup time of the default catalog facade depends linearly on the number of
layers in the catalog.
All it needs is extra effort to improve it.
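
As a rough back-of-the-envelope check of that linear relation, the sketch
below uses only the hot startup figure from this mail (59 seconds for 42000
layers) and assumes startup cost is purely proportional to the layer count,
with no fixed overhead. That assumption is mine and a single data point
cannot confirm it, so treat the output as an order of magnitude, not a
measurement. The function names are made up for the example.

```python
HOT_STARTUP_S = 59.0   # default catalog facade, hot startup, from this mail
LAYERS = 42_000        # layers in the test data directory


def per_layer_ms(startup_s: float, layers: int) -> float:
    """Average startup cost per layer in milliseconds (zero fixed cost assumed)."""
    return startup_s / layers * 1000.0


def layers_for_budget(budget_s: float) -> int:
    """Layers the default facade could load within budget_s under the same assumption."""
    return int(budget_s / (HOT_STARTUP_S / LAYERS))


if __name__ == "__main__":
    print(f"{per_layer_ms(HOT_STARTUP_S, LAYERS):.2f} ms per layer")
    for budget in (20.0, 30.0):
        print(f"{budget:.0f} s budget: about {layers_for_budget(budget)} layers")
```

Under that assumption, the 20-30 second startup guessed above for a
database-backed GWC configuration would correspond to a default catalog
facade holding somewhere between roughly 14000 and 21000 layers.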
