TL;DR: Cayenne 4.2 adds DISTINCT to column queries. This affects Agrest
algorithms, but the behavior is nevertheless correct and should stay around.
----
When testing M1, I noticed a difference in ordering of the second-level results
in Agrest vs 4.0/4.1. Those are generated in Agrest via a SelectQuery with
columns roughly looking like this:
List<Property<?>> properties = new ArrayList<>();
properties.add(Property.createSelf(E3.class));
Expression exp = ExpressionFactory.dbPathExp("e2.id");
properties.add(Property.create(exp, Integer.class));
SelectQuery query = new SelectQuery(E3.class);
query.setColumns(properties);
Here "e2" is a to-one relationship. The difference in generated SQL between 4.0
and 4.2 is the DISTINCT keyword added in the latter:
4.0:
SELECT t0.name, t0.e2_id, t0.id_, t1.id_
FROM utest.e3 t0 JOIN utest.e2 t1 ON (t0.e2_id = t1.id_)
4.2:
SELECT DISTINCT t0.name, t0.e2_id, t0.id_, t1.id_
FROM utest.e3 t0 JOIN utest.e2 t1 ON t0.e2_id = t1.id_
Both produce the correct result, but since there's no explicit ordering, the
actual order of the objects returned by Derby is different. I am less worried
about the ordering (this was just an indicator to me that something has
changed), but DISTINCT has a performance impact, and now it seems it will
affect the main execution path of Agrest.
I suppose Cayenne's behavior in 4.2 is correct, as with column queries there are
no simple rules for when there may be duplicate result rows. Our case doesn't
require DISTINCT only because of the special combination (an entity plus a
related to-one id). And we need to fix this on the Agrest end (that join is
redundant in the to-one case).
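To illustrate the duplicate-rows point, here is a minimal in-memory sketch in plain Java. It is not Cayenne or Agrest code: the class name DistinctSketch, the sample rows, and the helper methods are all made up for illustration. It mimics the join from the query above (entity columns plus the to-one id) and, for contrast, a join in the opposite (to-many) direction that selects only the parent id.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical in-memory stand-in for the e3/e2 tables; not Cayenne or Agrest code.
final class DistinctSketch {

    // Made-up e3 rows as (id_, name, e2_id); e2_id is the FK behind the to-one "e2".
    static final List<List<Object>> E3_ROWS = List.of(
            List.<Object>of(1, "a", 10),
            List.<Object>of(2, "b", 10),
            List.<Object>of(3, "c", 20));

    // Made-up e2 primary keys.
    static final List<Integer> E2_IDS = List.of(10, 20);

    // Mimics: SELECT t0.name, t0.e2_id, t0.id_, t1.id_ FROM e3 t0 JOIN e2 t1 ON t0.e2_id = t1.id_
    static List<List<Object>> entityPlusToOneId() {
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> e3 : E3_ROWS) {
            for (Integer e2Id : E2_IDS) {
                if (e3.get(2).equals(e2Id)) {
                    out.add(List.<Object>of(e3.get(1), e3.get(2), e3.get(0), e2Id));
                }
            }
        }
        return out;
    }

    // Mimics: SELECT t1.id_ FROM e2 t1 JOIN e3 t0 ON t0.e2_id = t1.id_ (to-many direction)
    static List<List<Object>> parentIdOverToMany() {
        List<List<Object>> out = new ArrayList<>();
        for (Integer e2Id : E2_IDS) {
            for (List<Object> e3 : E3_ROWS) {
                if (e3.get(2).equals(e2Id)) {
                    out.add(List.<Object>of(e2Id));
                }
            }
        }
        return out;
    }

    // DISTINCT equivalent: drop duplicate rows, keeping first-seen order.
    static <T> List<T> distinct(List<T> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        List<List<Object>> toOne = entityPlusToOneId();
        List<List<Object>> toMany = parentIdOverToMany();
        System.out.println("to-one:  " + toOne.size() + " rows, " + distinct(toOne).size() + " distinct");
        System.out.println("to-many: " + toMany.size() + " rows, " + distinct(toMany).size() + " distinct");
    }
}
```

With this sample data, the to-one case has 3 rows before and after de-duplication: every row carries the e3 primary key, and a to-one join matches at most one e2 row per e3 row. The to-many case drops from 3 rows to 2. A query builder that can't see this distinction has little choice but to add DISTINCT.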
Still figured I'd mention...
Andrus