Optimised groupBy implementation with row handler support
---------------------------------------------------------
Key: IBATIS-503
URL: https://issues.apache.org/jira/browse/IBATIS-503
Project: iBatis for Java
Issue Type: Improvement
Components: SQL Maps
Affects Versions: 2.3.0
Reporter: William Shields
I previously raised a bug against groupBy usage with row handlers. Basically
if you had data like this:
USER_NUMBER,ACCOUNT_NUMBER,SECURITY_CODE,UNITS
1,123,GOOG,100
1,123,MSFT,200
1,123,ORCL,300
1,456,AAPL,150
2,789,GOOG,120
and an abbreviated result map of:
<resultMap id="holding" resultClass="com.myclass.Holding">
<result proeprty="securityCode" column="SECURITY_CODE"/>
<result property="units" column="UNITS"/>
</resultMap>
<resultMap id="account" resultClass="com.myclass.Account" groupBy="userNumber,
accountNumber">
<result property="userNumber" column="USER_NUMBER"/>
<result property="accountNumber" column="ACCOUNT_NUMBER"/>
<result property="holdings" resultMap="holding"/>
</resultMap>
And yes this could be nested 3 levels (or more). In the original bug report if
you used queryWithRowHandler() this was the result:
1,123,GOOG,100
--> Fire for (1,123) Note: at this point holdings is of size 1 with only GOOG
in there.
1,123,MSFT,200
1,123,ORCL,300
1,456,AAPL,150
--> Fire for (1,456)
2,789,GOOG,120
--> Fire for (2,789)
Now eventually it was decided this wasn't a bug because ibatis groupBy does
implicit ordering (in fact I believe it stores the entire result set in a map
using an identity key or something similar based on the groupBy expression).
The reason for this is of course so the query still returns the same result no
matter the ordering of the result set. This has several problems:
1. The entire result set must be loaded before you can safely return any
results (which goes against why you're using row handlers in the first place);
2. The row handler behaviour is misleading and arguably wrong; and
3. It is probably more efficient to sort the result with SQL (assuming its
appropriately indexed, which it should be anyway). This makes the internal
sorting unnecessary.
My first proposal is to make the use of queryWithRowHandler() for a groupBy
query to throw an error.
My second proposal is to have some kind of optimised groupBy alternative that
does no internal sorting. Perhaps this could take the form of another
attribute (eg groupBySort="false"). The groupBy expression basically forms an
identity key. For each unique groupBy tuple there will be one row in the
result set. The optimised groupBy will basically "break" (in the DB/reporting
sense) on a change in the identity expression. Those of you who have done
reporting languages like Natural, Jasper/Crystal Reports and the like will be
familiar with the concept of break conditions.
The optimised/unsorted gorupBy alternative will allow queryWithRowHandler() and
it will behave "correctly". By this i Mean that if, as in the above dataset,
the row handler for the first account (3 holdings) will be called after the
third holding is added to the Account object.
I can (and have) coded a row handler that basically does this but its tedious
and it has the disadvantage that the sort is still happening and the entire
data set is getting loaded into memory. IMHO ibatis needs a better solution.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.