Hi Marco,
Thanks for your quick reaction,
In this mail I’ll concentrate on the Filtering issue, as it is the most
blocking for us. The schemaless plugin is interesting and I’ll check it out.
Fixing the filtering bug causes the dates to partially work. So I’ll see what
happens after the core bug is fixed.
I’ve investigated the code wednesday evening, and this is what I came up with:
A lot of the unit tests simply don’t run when no database is attached. In
fact, I see a lot of tests without @Test annotation, so maybe they’re from the
Junit 3 era and never work anymore. If you have an easy way to run them, e.g.
on a geoserver build machine, I’d be interested
Because of this, I tried to create junit tests that does not require mongodb.
To make this possible, I had to make 2 dirty changes in the src/main/java
code:
1. In MongoDataStore, declare the method isMongoVersionLessThan2_6
protected instead of private
2. Add a class MongoSchemaMemoryStore, comparable to MongoSchemaFileStore,
backed by a ConcurrentHashMap. The MongoDB schemaless plugin might have made
this obsolete.
I’d like your idea about both dirty changes.
At this point, the problematic cases become testable. I created some infra and
a checkSplitter method that checks if the splitter creates the correct output:
public class FilterSplitterTest {
private MongoFeatureSource source;
@Before
public void setup() {
MongoDataStore ds =
new MongoDataStore("mongodb://dummy/dum", "mem:") {
@Override
protected boolean isMongoVersionLessThan2_6(MongoClientURI
dataStoreClientURI) {
return false;
}
};
ContentEntry entry = new ContentEntry(ds, new NameImpl("dummyEntry"));
this.source = new MongoFeatureSource(entry, null, null);
}
@Test
public void testProperty() throws Exception {
checkSplitter("a=1", "a=1", "INCLUDE");
}
@Test
public void testAndMongoOnly() throws Exception {
checkSplitter("a=1 AND b=1", "a=1 AND b=1", "INCLUDE");
}
@Test
public void testAndBoth() throws Exception {
//FIXME: CQL.toFilter(before) has wrong date format
checkSplitter("a=1 AND b BEFORE 1980-09-03T00:00:00", "a=1", "c before
1980-09-03T00:00:00");
}
@Test
public void testJSonSelect() throws Exception {
//FIXME this makes no sense, but nothing I try makes more sense
//checkSplitter("a=jsonSelect('a')", "a=1 AND b=1", "INCLUDE");
//FIXME need a working example of a function.
//CQL.toFilter("strConcat(NAME, 'suffix')"); does not work even if it
javadoc claims it should
}
/**
* @param beforeSplit The input to the splitter
* @param toMongo The part of the filter given to mongodb
* @param toPostprocess The part of the filter done by postprocessing
* @throws CQLException
*/
private void checkSplitter(String beforeSplit, String toMongo, String
toPostprocess)
throws CQLException {
//FIXME getCountInternal ignores the toPostprocess, is this correct?
Filter[] split = source.splitFilter(CQL.toFilter(beforeSplit));
Assert.assertEquals(CQL.toFilter(toMongo), split[0]);
Assert.assertEquals(CQL.toFilter(toPostprocess), split[1]);
}
}
These tests demonstrate some interesting point:
* testProperty/ testAndMongoOnly trigger the bug as they should
* testAndBoth does not trigger the bug. As it happens, the AND filter can
paper over part of the impact if the stacks are not balanced.
* Bonus bug? testAndBoth demonstrates that BeforeImpl.toString() uses the
default java date format and not ISO.
* Bonus bug? testJSonSelect: As the JSonSelect patch is the root cause of
all these troubles, I try to write a test that finds out if this feature is not
damaged by my bugfix. Unfortunately, I found out that I can’t. Every usage I
try crashes with a ClassCastException while parsing. Even the strConCat demo
mentioned in the CQL Javadoc crashes. I saw these same crashes on our dev
geoserver instance. But maybe it’s just me not knowing what CQL is supposed to
do.
If you can comment on the 2 bonus bugs and the supposed way to use the
JSonSelect function, I would be grateful.
All of this is preparation for the real bugfix. I agree with your assessment
of the bug, but with some caveats:
* There is a class MongoFilterSplitter that duplicates the defective
filter, with slightly different but also faulty behaviour. I propose to delete
it.
* MongoFeatureSource.splitFilter is the core of the bug. Both
BinaryComparisonOperator and PropertyIsLike seem impacted, and as you say,
should defer to their parent implementation.
* But: What if e.g. there is a JSonSelectFunction but no Literal. The
behaviour from super.visit…() needs to be verified for this function, it might
not be as intended. But I need clarification on the 2nd bonus bug to write
some unit tests that demonstrate correct behaviour before
If I suppose optimistically that you agree with all the above and fill in the
2nd bonus bug, I can provide a PR. Do you want 1 PR containing everything, or
would you prefer me to split it up in multiple PRs for ease of code review?
Hans
From: Marco Volpini <[email protected]>
Sent: vrijdag 8 juli 2022 9:14
To: Hans Yperman <[email protected]>
Cc: [email protected]
Subject: Re: [Geotools-devel] geotools mongodb bugs in filtering
Dear Hans,
all improvements and bug fixes are welcome on both GeoTools and GeoServer. You
can either provide the bug fix/improvement by yourself by opening a pull
request on the GeoServer or Geotools repo, either by having someone else doing
it for you (see the commercial support page<https://geoserver.org/support/>).
In terms of help that can be provided to fix the issues that you have found, I
can give you some code pointers (see below) and review the pull requests once
they are opened.
Please see my replies below to your points:
· MongoDB is schemaless, but it seems geoserver needs 2 schema
definitions: 1 defining the attributes for layers, a 2nd specific for mongodb
cached on disk and invisible if you don't know about its existence. If an
attribute is added to mongodb, we'll need to delete/adapt the second, and
restart the geoserver.
This is due the fact that GeoTools and GeoServer are schema driven based on the
gml standard that always needs a schema upfront in order to properly handle the
data. Moreover the gt mongodb data store manages SimpleFeatures. Simple Feature
specs doesn't allow nested objects and arrays to be encoded. The addition
internal JSON schema is indeed also used to flattenize any nested structure to
make it SimpleFeature compatible. However if you are interested in serving
feature only as WFS GeoJSON and as WMS you might think to try the MongoDB
schemaless
plugin<https://docs.geoserver.org/latest/en/user/community/schemaless-features/schemaless-mongo/index.html>
that instead will serve mongodb document as features as they are in the db,
without flattening properties and without the need to provide a schema upfront.
Mind that GML output will not be available in this case.
· The mongodb password needs to be hardcoded in the URL. This means
geoserver can't hide/encrypt it. It is readable for everyone who can see the
datastore.
Yes currently the mongodb pwd can be only provided in the connection string
input field in plain text, although GeoServer allows to externalize the
connection string using environment
properties<https://docs.geoserver.org/master/en/user/datadirectory/configtemplate.html>.
Feel free to improve the mongodb connection string configuration of the
MongoDB DataStore. Here some code pointers that you might find usefull:
MongoDataStoreFactory<https://github.com/geotools/geotools/blob/1fed12f88ab0c9d4ef92aa9cf73af8f94d5c1291/modules/plugin/mongodb/src/main/java/org/geotools/data/mongodb/MongoDataStoreFactory.java#L30>
and
MongoDataStore<https://github.com/geotools/geotools/blob/1e036c73a4e420beeeb86c381c797a898a68c8a7/modules/plugin/mongodb/src/main/java/org/geotools/data/mongodb/MongoDataStore.java#L74>.
· I can't get Filtering to work. e.g. I add cql_filter=datatype%3D'C'
which adds a filter on an attribute datatype='C'. Depending on the attribute,
I get the whole mongodb collection or nothing. I debugged it to a function
splitFilter that basically just throws my filter away. Issue GEOT-5911 has the
same conclusions as I have and a partial fix, but it seems dead.
This indeed is a bug. I've reproduced it with the mongo db data store (while on
the mongodb schemaless plugin the same filter works fine). I gave just a quick
look so I'm not 100% sure but to fix it I believe that at the end of this visit
method in the PostPreFilter
Splitter<https://github.com/geotools/geotools/blob/1e036c73a4e420beeeb86c381c797a898a68c8a7/modules/plugin/mongodb/src/main/java/org/geotools/data/mongodb/MongoFeatureSource.java#L340>
in an else statement a call to the superclass visit(BinaryComparisonOperator
filter) should be added.
· I have the impression filtering on dates/times is either not
implemented (throw new UnsupportedOperationException() ) or done by geoserver
instead of mongo (costing me the mongodb indexes). Can't test this because of
the issue above so I might be wrong here.
Currently you can filter by dates/times only if the date time value is stored
as a string in ISO format on MongoDB. In a cql filter then you can specify a
time filter using >,<,>=,<= operators like in the following example:
cql_filter= mydateproperty >'2021-03-15T09:54:59.000Z'. MongoDB Date type is
not currently supported as it is not supported the translation of cql temporal
operators to mongodb filters.
Regards,
Marco Volpini
==
GeoServer Professional Services from the experts!
Visit http://bit.ly/gs-services-us for more information.
==
Marco Volpini
Software Engineer
GeoSolutions Group
phone: +39 0584 962313
fax: +39 0584 1660272
https://www.geosolutionsgroup.com/
http://twitter.com/geosolutions_it
-------------------------------------------------------
Con riferimento alla normativa sul trattamento dei dati personali (Reg. UE
2016/679 - Regolamento generale sulla protezione dei dati “GDPR”), si precisa
che ogni circostanza inerente alla presente email (il suo contenuto, gli
eventuali allegati, etc.) è un dato la cui conoscenza è riservata al/i solo/i
destinatario/i indicati dallo scrivente. Se il messaggio Le è giunto per
errore, è tenuta/o a cancellarlo, ogni altra operazione è illecita. Le sarei
comunque grato se potesse darmene notizia.
This email is intended only for the person or entity to which it is addressed
and may contain information that is privileged, confidential or otherwise
protected from disclosure. We remind that - as provided by European Regulation
2016/679 “GDPR” - copying, dissemination or use of this e-mail or the
information herein by anyone other than the intended recipient is prohibited.
If you have received this email by mistake, please notify us immediately by
telephone or e-mail.
On Thu, Jul 7, 2022 at 5:04 PM Hans Yperman
<[email protected]<mailto:[email protected]>> wrote:
Hello everybody, and especially Marco Volpini ,
We are testing the geoserver/geotools/mongodb combination. We have a number of
filtering defects that block us from continuing our tests.
There are a few issues open for this, e.g., GEOT-5911 GEOT-6570 GEOT-6348
GEOT-6810 . We debugged the underlying cause, and we think we have enough
knowledge to fix this bug.
We want to know your position as maintainer:
* Do we agree that there is a problem related to filtering worth fixing.
* Can you provide necessary help(e.g., guidance) to fix this, if resources
available.
* Are you willing to accept a pull request , after aligning with your coding
standards?
Unfortunately, we already sent 2 messages to the geoserver and geotools user
list and did not receive any reaction on either.
See:
https://sourceforge.net/p/geoserver/mailman/geoserver-users/thread/DB9PR05MB898397B5B85AAF69FA48213A9CB59%40DB9PR05MB8983.eurprd05.prod.outlook.com/#msg37671813
https://sourceforge.net/p/geotools/mailman/geotools-gt2-users/thread/DB9PR05MB89834EE49A41E3744010B51D9CBA9%40DB9PR05MB8983.eurprd05.prod.outlook.com/#msg37674928
Kind regards,
Hans
_______________________________________________
GeoTools-Devel mailing list
[email protected]<mailto:[email protected]>
https://lists.sourceforge.net/lists/listinfo/geotools-devel
_______________________________________________
GeoTools-Devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-devel