Ok, I guess I was living in the past - OJB used to do these as two 
separate queries (get PKs, then retrieve matched objects).  Didn't realize 
that it was now working as you describe (good news).

I have rewritten my code to use a subquery and thus get the best of both 
worlds - I can apply distinct to just the pk in the subquery, and still 
get all of the matching rows with a single SQL call.

Thanks for the clarification, Armin.

-steve

Steve Clark
ECOS Development Group
[EMAIL PROTECTED]
(970)226-9291




Armin Waibel <[EMAIL PROTECTED]> 
06/26/2006 05:22 PM
Please respond to
"OJB Users List" <ojb-user@db.apache.org>


To
OJB Users List <ojb-user@db.apache.org>
cc

Subject
Re: Why do queries for Identity retrieve all columns?






[EMAIL PROTECTED] wrote:
> Armin,
> 
> I'm using 1.0.4.
> 
> I am not doing a QueryByIdentity.  I am doing a QueryByCriteria.  So 
when 
> I do:
> 
>         QueryByCriteria query = QueryFactory.newQuery(Article.class, 
> criteria, true);
>         broker.getCollectionByQuery(query);
> 
> my expectation would be that the SQL generated would be:
> 
>         -- Which Identities match my criteria?
>         select distinct Artikel_Nr from Artikel where ...
> 
>         -- for cache misses:
>         select Artikel_Nr, Artikelname, Lieferanten_Nr, ... from Artikel 

> where Artikel_Nr = ?
>

This isn't the query/cache-strategy of OJB, because in this case the 
cache should match most objects from the database (this would require a 
really big cache), else we will get many single select statements for 
the missed objects in cache.


> But instead, here's what happens:
> 
>         -- Which Identities match my criteria?
>         select distinct Artikel_Nr, Artikelname, Lieferanten_Nr, ... 
from 
> Artikel where ...
> 
>         -- for cache misses:
>         select Artikel_Nr, Artikelname, Lieferanten_Nr, ... from Artikel 

> where Artikel_Nr = ?
>

This isn't the strategy of OJB. First OJB execute the distinct query, 
then OJB loop through the result set read the Identity of the row and 
check for cache match. If true the object will be read from cache, else 
the whole object will be materialized from result set row.


> When all of the irrelevant columns are included in the first query, that 

> query slows down dramatically.  So what I'm wondering is why the first 
> query includes all of these columns that it doesn't need, rather than 
just 
> retrieving the PKs?
>

See above.

...
 > millions of rows in my result set; afterwards, I have about 700.  If I
 > query just for the PK, the query takes a few seconds to return the 700
 > PK
...

You could do that in two steps:
First use a ReportQuery to retrieve all PK's (e.g. with 
PB.getReportQueryIteratorByQuery(q) or getCollectionByQuery), then use 
PB.getObjectByIdentity(...) to materialize the object.
In this case the worst case scenario would execute 701 select statements 
to materialize the objects (if you use the two-level cache and none-lazy 
references, OJB will need additional queries to lookup the referenced 
objects).
http://db.apache.org/ojb/docu/guides/query.html#Report+Queries
http://db.apache.org/ojb/docu/tutorials/pb-tutorial.html#Find+object+by+primary+key


regards,
Armin

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to