Hi Ben,
thanks for the quick response. Please read my comments inline.
On Mon, Sep 12, 2016 at 3:35 AM, Ben Caradoc-Davies <b...@transient.nz>
wrote:
> However, I do not claim to understand it all, and I would like to hear
> Niels' and Rini's opinion as they have much deeper understanding of joining.
>
I am looking forward to hearing what they have to say as well :-)
>
> I can confirm that, merged on master, a full build passes for GeoTools,
> GeoServer, and GeoServer app-schema online tests against PostGIS. That is,
> these changes do not appear to break anything at all, not even the
> reference data set tests, and you have provided a flag to turn off the new
> functionality just in case it breaks something.
>
I also ran GeoServer app-schema online tests against Oracle (which caused
me a considerable amount of pain, BTW) and they pass (well, I had to
comment out 3D tests, as they require an additional library to run).
> - Your GeoTools changes add many new tab characters and mix tabs and
> spaces; there are a few places where you add tabs where tabs are already
> used, and I would prefer if you expanded these for the entire enclosing
> method
> - TODO block in AppSchemaDataAccess and unused temporary variable
> "expressions" can be removed
>
Yes, I'll clear these up in the PR, sorry for the little mess.
> - Flag app-schema.encodeNestedFilters is enabled by default, needs
> documentation on the joining page
>
Will do.
> - Unnecessary copyright date updates in ComplexFilterSplitter,
> JoiningJDBCFeatureSource, FilterToSQL
>
Uh-oh, I guess I need to catch up with the recent policy change on
copyright date updates: I remember a long discussion about it, but I don't
remember the conclusion :-) I'll look it up.
> - FeatureChainedAttributeVisitor has commented line: // chainedAttribute
> that can be removed
> - Commented "throw new UnsupportedOperationException" in
> NestedAttributeExpression can be removed
> - JoiningFieldEncoder made public, requires javadocs
> - JoiningJDBCFeatureSource commented source code lines should be removed,
> especially the large code block in createNestedFilter
> - The commented code blocks in the GeoServer tests that mention the EXISTS
> keyword *are* OK as they really are documentation :-)
>
Yep, as above, I'll clear these up in the PR, thanks for pointing them out.
I should be able to submit the PRs in a few days.
Thanks!
On 12/09/16 02:56, Stefano Costa wrote:
> Hi all,
> I'm writing to solicit feedback from the community about an improvement to
> App-Schema's joining implementation that we have been working on lately.
>
> In short, we would like to introduce support for the SQL encoding of
> filters on nested attributes, which, in the current implementation, end up
> in the post-filter and are evaluated in memory, after all the features have
> been loaded.
> For those who might be unfamiliar with it, the inner workings of the
> current joining implementation are clearly illustrated by Niels Charlier in
> this presentation:
> https://www.seegrid.csiro.au/wiki/pub/Infosrvices/GeoserverAppSchemaJoining/joining_presentation.pdf
>
> The aforementioned presentation mentions the fact that filtering on nested
> attributes is "currently done with post-filtering", and suggests a possible
> strategy for translating them to SQL.
>
> What we have implemented so far is a slight variation of the approach
> suggested by Niels.
>
> Let's take GeoSciML's MappedFeature (container) and GeologicUnit (nested)
> features as an example.
> The following filter on the gml:description attribute of the nested
> GeologicUnit type:
>
> <ogc:PropertyIsLike>
>   <ogc:PropertyName>gsml:specification/gsml:GeologicUnit/gml:description</ogc:PropertyName>
>   <ogc:Literal>*sedimentary*</ogc:Literal>
> </ogc:PropertyIsLike>
>
> would not be encoded in SQL by the current joining implementation (i.e. it is
> regarded as a post-filter), so neither query carries a WHERE clause:
>
> SELECT ...
> FROM "appschematest"."MAPPEDFEATURE"
> ORDER BY "appschematest"."MAPPEDFEATURE"."ID" ASC,
> "appschematest"."MAPPEDFEATURE"."PKEY"
>
> SELECT ...
> FROM "appschematest"."GEOLOGICUNIT"
> INNER JOIN "appschematest"."MAPPEDFEATUREPROPERTYFILE" ON (
> "MAPPEDFEATUREPROPERTYFILE"."GEOLOGIC_UNIT_ID" = "GEOLOGICUNIT"."GML_ID")
> ORDER BY "appschematest"."MAPPEDFEATUREPROPERTYFILE"."ID" ASC,
> "appschematest"."GEOLOGICUNIT"."GML_ID" ASC,
> "appschematest"."GEOLOGICUNIT"."PKEY"
>
>
> but would be translated to the following SQL queries in the new
> implementation:
>
> SELECT ...
> FROM "appschematest"."MAPPEDFEATURE"
> INNER JOIN (
> SELECT DISTINCT "ID" FROM "appschematest"."MAPPEDFEATURE"
> *WHERE EXISTS (*
> * SELECT "chain_link_1"."PKEY" FROM "appschematest"."GEOLOGICUNIT"
> "chain_link_1" *
> * WHERE UPPER("chain_link_1"."TEXTDESCRIPTION") LIKE '%SEDIMENTARY%' AND
> "appschematest"."MAPPEDFEATURE"."GEOLOGIC_UNIT_ID" =
> "chain_link_1"."GML_ID")*
> ) "temp_alias_used_for_filter" ON ( "MAPPEDFEATURE"."ID" =
> "temp_alias_used_for_filter"."ID" )
> ORDER BY "appschematest"."MAPPEDFEATURE"."ID" ASC,
> "appschematest"."MAPPEDFEATURE"."PKEY"
>
> SELECT ...
> FROM "appschematest"."GEOLOGICUNIT"
> INNER JOIN "appschematest"."MAPPEDFEATUREPROPERTYFILE" ON (
> "MAPPEDFEATUREPROPERTYFILE"."GEOLOGIC_UNIT_ID" = "GEOLOGICUNIT"."GML_ID")
> INNER JOIN (
> SELECT DISTINCT "ID" FROM "appschematest"."MAPPEDFEATUREPROPERTYFILE"
> *WHERE EXISTS (*
> * SELECT "chain_link_1"."PKEY" FROM "appschematest"."GEOLOGICUNIT"
> "chain_link_1" *
> * WHERE UPPER("chain_link_1"."TEXTDESCRIPTION") LIKE '%SEDIMENTARY%' AND
> "appschematest"."MAPPEDFEATUREPROPERTYFILE"."GEOLOGIC_UNIT_ID" =
> "chain_link_1"."GML_ID")*
>
> ) "temp_alias_used_for_filter" ON ( "MAPPEDFEATUREPROPERTYFILE"."ID" =
> "temp_alias_used_for_filter"."ID" )
> ORDER BY "appschematest"."MAPPEDFEATUREPROPERTYFILE"."ID" ASC,
> "appschematest"."GEOLOGICUNIT"."GML_ID" ASC,
> "appschematest"."GEOLOGICUNIT"."PKEY"
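
To make the shape of the correlated subquery easier to see, here is a minimal Java sketch that assembles the WHERE EXISTS fragment of the container query above from its parts. It is an illustration only: the class and method names are hypothetical, the "PKEY" column and "chain_link_1" alias are simply copied from the example, and this is not how JoiningJDBCFeatureSource or FilterToSQL build their SQL. Real code would rely on the SQL dialect for quoting and escaping rather than on string concatenation.

```java
import java.util.Locale;

/** Hypothetical illustration only; not part of GeoTools. */
final class NestedExistsSketch {

    /** Wraps an identifier in double quotes, as in the queries above. */
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    /**
     * Builds a WHERE EXISTS (...) fragment that correlates the outer table
     * with the nested one and applies a case-insensitive LIKE on the nested column.
     */
    static String existsClause(String schema, String outerTable, String outerKey,
            String nestedTable, String nestedKey, String nestedColumn, String likePattern) {
        String alias = quote("chain_link_1");
        // Upper-case the pattern to match UPPER(column); double any single quotes.
        String literal = "'" + likePattern.toUpperCase(Locale.ROOT).replace("'", "''") + "'";
        return "WHERE EXISTS (SELECT " + alias + "." + quote("PKEY")
                + " FROM " + quote(schema) + "." + quote(nestedTable) + " " + alias
                + " WHERE UPPER(" + alias + "." + quote(nestedColumn) + ") LIKE " + literal
                + " AND " + quote(schema) + "." + quote(outerTable) + "." + quote(outerKey)
                + " = " + alias + "." + quote(nestedKey) + ")";
    }
}
```

Calling existsClause("appschematest", "MAPPEDFEATURE", "GEOLOGIC_UNIT_ID", "GEOLOGICUNIT", "GML_ID", "TEXTDESCRIPTION", "%sedimentary%") reproduces the highlighted part of the first query, written on a single line.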
>
>
> We have seen significant performance improvements, especially when the
> total number of features in the DB is high (several thousand) but only a
> few of them satisfy the filter.
>
> At present, we don't aim for a comprehensive implementation of nested
> filter encoding; as a consequence, our implementation has the following
> limitations:
> 1. only binary comparison, PropertyIsLike and PropertyIsNull filters are
> translated to SQL
> 2. polymorphic mappings are not supported
> 3. filters involving multiple nested attributes are not supported
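
The three limitations above can be read as a single eligibility check. The following Java sketch is only a restatement of that list under stated assumptions: the enum, the class and the method are hypothetical names (nothing here is GeoTools API), and it assumes that a nested filter failing the check simply stays in the in-memory post-filter.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical restatement of the listed limitations; not GeoTools code. */
final class NestedFilterEligibility {

    /** Coarse filter categories, named here for illustration only. */
    enum FilterKind { BINARY_COMPARISON, LIKE, IS_NULL, OTHER }

    // Limitation 1: only these kinds of filters are translated to SQL.
    private static final Set<FilterKind> ENCODABLE =
            EnumSet.of(FilterKind.BINARY_COMPARISON, FilterKind.LIKE, FilterKind.IS_NULL);

    /**
     * Returns true when a nested filter could be encoded in SQL: it is of a
     * supported kind, touches exactly one nested attribute (limitation 3) and
     * does not go through a polymorphic mapping (limitation 2).
     */
    static boolean canEncodeInSql(FilterKind kind, int nestedAttributeCount,
            boolean polymorphicMapping) {
        return ENCODABLE.contains(kind)
                && nestedAttributeCount == 1
                && !polymorphicMapping;
    }
}
```

For the PropertyIsLike example earlier in this message, canEncodeInSql(FilterKind.LIKE, 1, false) is true, while a filter that involves two nested attributes is rejected.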
>
> In case anybody wants to take a look before a formal PR, the code can be
> found in this branch of my personal geotools fork:
> https://github.com/ridethepenguin/geotools/tree/my_nested_filters_join
>
> As this enhancement only applies when joining is enabled, unit tests have
> been added in the gs-app-schema-test module of geoserver:
> https://github.com/ridethepenguin/geoserver/tree/my_nested_filters_join_tests
>
> Last but not least, I would like to point out a couple of fixes (I hope)
> I've introduced as I went along. The first is in UnmappingFilterVisitor:
> https://github.com/geotools/geotools/compare/master...ridethepenguin:my_nested_filters_join#diff-de21f72fdbda25bc69ea142fad909c8eR821
>
> I've introduced it to make encoding of joining for simple content possible,
> but it seems to me to be a more general fix to avoid having nulls in the
> encoded filter.
>
> The second is in XPathUtil:
> https://github.com/geotools/geotools/compare/master...ridethepenguin:my_nested_filters_join#diff-11339740d8ff9ccff758df9400eecfc5R97
>
> It seems to me the current implementation of startsWith is broken, as it
> considers the gml:name[2] and gml:name[3] paths to be the same, while they
> may be mapped to different tables (e.g. see
> SimpleAttributeFeatureChainWfsTest).
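
To illustrate the startsWith concern: a prefix check that compares step names without their index would treat gml:name[2] and gml:name[3] as the same path. Below is a minimal, self-contained Java sketch of an index-aware check. It works on plain strings and is not the XPathUtil API; the class and method names are hypothetical, and steps are compared exactly as written, so how an unindexed step should relate to [1] is deliberately left open.

```java
import java.util.List;

/** Hypothetical helper on plain string steps; not the GeoTools XPathUtil API. */
final class IndexedStepPrefix {

    /**
     * Returns true if path begins with prefix, comparing each step
     * verbatim so that the [n] index is part of the comparison.
     */
    static boolean startsWith(List<String> path, List<String> prefix) {
        if (prefix.size() > path.size()) {
            return false;
        }
        for (int i = 0; i < prefix.size(); i++) {
            // "gml:name[2]" and "gml:name[3]" differ here because of the index.
            if (!path.get(i).equals(prefix.get(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With this check, a path ending in gml:name[3] no longer "starts with" one ending in gml:name[2], which is the distinction needed when the two are mapped to different tables.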
>
> Any feedback is highly appreciated :-)
>
> Thanks!
>
> _______________________________________________
> GeoTools-Devel mailing list
> GeoTools-Devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/geotools-devel
>
>
--
Ben Caradoc-Davies <b...@transient.nz>
Director
Transient Software Limited <http://transient.nz/>
New Zealand
--
Best regards,
Stefano Costa
Dott. Stefano Costa
Senior Software Engineer
GeoSolutions S.A.S.
Via di Montramito 3/A
55054 Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
http://www.geo-solutions.it
http://twitter.com/geosolutions_it