Hi,
I ran some tests using 3.0b with SQLTemplate in combination with
prefetching and found a possible new problem.
It seems that when the query itself runs in, say, 1 minute, it takes about
another 2 minutes before Cayenne has constructed the prefetched objects.
My query produces 2.5 million records. The query will take about 30
minutes. Construction of the objects will then take an extra hour.
This is not really workable.
Hans
Hans Pikkemaat wrote:
Hi,
What I can see when I use paging in combination with SQLTemplate is this:
Cayenne first runs the main SQLTemplate query, whose result is stored in memory.
When I get the first page it determines the key values of the main
query, which it then uses in a new query that returns the main table
plus the detail table data.
This produces the main table object through which the detail table
is accessible.
The problem here is that only the key of the main table is used. The
SQLTemplate query was manually constructed and does a query on the main
table with a left join to the detail table, so it produces a duplicate
key value wherever a main table record has 2 related detail table records.
This doesn't have to be a problem; the query actually does return the
number of records used as page size. But internally in Cayenne something
weird happens. Somehow the duplicate records are removed
and the IncrementalFaultList.checkPageResultConsistency method throws
an exception for this.
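
A minimal plain-Java sketch of the mismatch described above. It does not use Cayenne at all: `DuplicateKeyPageDemo`, `resolveByMainKey` and `checkPage` are hypothetical names, and the check is only assumed to compare the number of page entries with the number of resolved objects, which is what the exception suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the situation in the message; not Cayenne code.
class DuplicateKeyPageDemo {

    // Main-table keys as a LEFT JOIN returns them: one row per detail record.
    static List<Integer> joinedMainKeys() {
        List<Integer> keys = new ArrayList<>();
        keys.add(1); // main 1, first detail record
        keys.add(1); // main 1, second detail record
        keys.add(2); // main 2, one detail record
        keys.add(3); // main 3, no detail record
        return keys;
    }

    // Resolving a page by main key alone yields one object per distinct key.
    static Set<Integer> resolveByMainKey(List<Integer> pageKeys) {
        return new LinkedHashSet<>(pageKeys);
    }

    // Assumed, simplified check: one resolved object per page entry.
    static void checkPage(List<Integer> pageKeys, Set<Integer> resolved) {
        if (resolved.size() != pageKeys.size()) {
            throw new IllegalStateException("expected " + pageKeys.size()
                    + " objects, resolved " + resolved.size());
        }
    }

    public static void main(String[] args) {
        List<Integer> page = joinedMainKeys().subList(0, 3); // page size 3
        try {
            checkPage(page, resolveByMainKey(page));
        } catch (IllegalStateException e) {
            System.out.println("Page check failed: " + e.getMessage());
        }
    }
}
```

With a page of three joined rows holding the keys 1, 1 and 2, only two distinct objects can be resolved, so the simplified check fails.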
Because the main query returns the main object but also the detail
object I find it strange
that the query generated for the page only uses the main table key. I
would expect that
it also would use the key of the detail table.
An example. Say I have a main table key 1 and related detail records
with keys 1, 2 and 3.
Say I run the SQLTemplate which returns main key 1 but only keys 1 and 2
for the detail table.
The page query will now run for all detail records and return all of
them, including the record I did not request.
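
The example above can be played through in plain Java. This is only an illustration under assumptions: `PagePrefetchDemo` and its methods are hypothetical, the detail table is an in-memory array, and the hand-written restriction is modelled as "detail key up to a maximum".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the example; not Cayenne code.
class PagePrefetchDemo {

    // Detail rows as {mainKey, detailKey}: main 1 has details 1, 2 and 3.
    static final int[][] DETAIL_ROWS = { {1, 1}, {1, 2}, {1, 3} };

    // What the hand-written query asks for: a restricted set of detail rows.
    static List<Integer> templateDetails(int mainKey, int maxDetailKey) {
        List<Integer> out = new ArrayList<>();
        for (int[] row : DETAIL_ROWS) {
            if (row[0] == mainKey && row[1] <= maxDetailKey) {
                out.add(row[1]);
            }
        }
        return out;
    }

    // What a page query keyed only on the main table returns: every detail row.
    static List<Integer> pageDetails(int mainKey) {
        List<Integer> out = new ArrayList<>();
        for (int[] row : DETAIL_ROWS) {
            if (row[0] == mainKey) {
                out.add(row[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("Requested:  " + templateDetails(1, 2));
        System.out.println("Page query: " + pageDetails(1));
    }
}
```

The restricted query yields detail keys 1 and 2, while the lookup by main key alone yields 1, 2 and 3.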
From this I'm concluding that if an SQLTemplate is used it is not
useful (read: faulty) to include the detail table in this query. When
paging is used all the detail tables are automatically queried.
If I write the main SQLTemplate query such that it only returns the main
object, then the exception does not occur.
My conclusion is then that if you want to use paging with SQLTemplate
the main query should only return the main table. Prefetching will then
return ALL related table records.
tx
HPI
Andrus Adamchik wrote:
Yeah, still need to check that one.
On Nov 12, 2009, at 10:43 AM, Hans Pikkemaat wrote:
Hi,
Yes, the paginated query would indeed be the only way for me to go
forward.
The problem however is that I get the exception I posted earlier.
tx
Hans
Andrus Adamchik wrote:
For paginated queries we contemplated a strategy of a list with
constant size of fully resolved objects. I.e. when a page is
swapped in, some other (LRU?) page is swapped out. We decided
against it, as in a general case it is hard to consistently
predict which page should be swapped out.
However it should be rather easy to write such a list for a
specific case with a known access order (e.g. a standard iteration
order). In fact I would vote to even include such an implementation
in Cayenne going forward.
More specifically, you can extend IncrementalFaultList [1],
overriding 'resolveInterval' to swap out previously read pages,
turning them back into ids. And the good part is that you can use
your extension directly without any need to modify the rest of
Cayenne.
Andrus
[1]
http://cayenne.apache.org/doc/api/org/apache/cayenne/access/IncrementalFaultList.html
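
A rough sketch of the swap-out idea for a plain front-to-back iteration, written as standalone Java rather than as an IncrementalFaultList subclass. Everything here is an assumption for illustration: `WindowedPageList` is a hypothetical class, ids are plain integers, and the `loader` function stands in for whatever resolves a page of ids into objects (it is assumed to return one object per id, in the same order). In a real extension the same logic would presumably live in the overridden 'resolveInterval'.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical list that keeps only one resolved page in memory at a time.
class WindowedPageList<T> {
    private final List<Integer> ids;
    private final int pageSize;
    private final Function<List<Integer>, List<T>> loader;
    private List<T> currentPage = new ArrayList<>();
    private int currentPageIndex = -1;

    WindowedPageList(List<Integer> ids, int pageSize,
                     Function<List<Integer>, List<T>> loader) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.ids = new ArrayList<>(ids);
        this.pageSize = pageSize;
        this.loader = loader;
    }

    int size() {
        return ids.size();
    }

    T get(int index) {
        if (index < 0 || index >= ids.size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int page = index / pageSize;
        if (page != currentPageIndex) {
            int from = page * pageSize;
            int to = Math.min(from + pageSize, ids.size());
            // Replacing the reference swaps out the previous page: only its
            // ids remain, so its objects become eligible for collection.
            currentPage = loader.apply(ids.subList(from, to));
            currentPageIndex = page;
        }
        return currentPage.get(index % pageSize);
    }

    // Number of fully resolved objects currently held by this list.
    int resolvedCount() {
        return currentPage.size();
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            ids.add(i);
        }
        WindowedPageList<String> list = new WindowedPageList<String>(ids, 4, page -> {
            List<String> out = new ArrayList<>();
            for (Integer id : page) {
                out.add("object-" + id);
            }
            return out;
        });
        for (int i = 0; i < list.size(); i++) {
            System.out.println(list.get(i) + " (resolved: " + list.resolvedCount() + ")");
        }
    }
}
```

The trade-off is the one mentioned in the message: this only works well when the access order is known, since jumping back to an earlier index reloads that page.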
On Nov 12, 2009, at 10:07 AM, Hans Pikkemaat wrote:
Hi,
So this means that if I use a generic query, the query results are
always stored completely in the object store (or the query cache if I
configure it).
Objects are returned in a list, so as long as I have a reference to
this list (because I'm traversing it) these objects are not garbage
collected.
If I use the query cache the full query results are cached. This
means that I can only tell it to remove the whole query.
Effectively this means I'm unable to run a big query and process
the results as a stream.
So I cannot process the first results and then somehow make them
available for garbage collection.
The only option I have would be the iterated query, but this is
only useful for queries on 1 table without any relations, because it
is not possible to use prefetching, nor is it possible to manually
construct relations between objects.
My conclusion here is that Cayenne is simply not suitable for
doing large batch-wise query processing because of the memory
implications.
tx
HPI
Andrus Adamchik wrote:
As mentioned in the docs, individual objects and query lists are
cached independently. Of course query lists contain a subset of cached
object store objects inside the lists. An object won't get gc'd if it
is also stored in the query list.
Now list cache expiration is controlled via query cache factory. By
default this is an LRU map, so as long as the map has enough space to
hold lists (its capacity == # of lists, not # of objects), the objects
won't get gc'd.
You can explicitly remove entries from the cache via QueryCache remove
and removeGroup methods. Or you can use a different QueryCacheFactory
that implements some custom expiration/cleanup mechanism.
Andrus
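
To illustrate the "capacity == # of lists, not # of objects" point, here is a small analogy built on java.util.LinkedHashMap. It is not Cayenne's QueryCache; `ListLruCache`, its string keys and `cachedObjectCount` are hypothetical names chosen for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical LRU cache of result lists; capacity counts lists, not objects.
class ListLruCache<V> extends LinkedHashMap<String, List<V>> {
    private static final long serialVersionUID = 1L;
    private final int maxLists;

    ListLruCache(int maxLists) {
        super(16, 0.75f, true); // true = access order, which gives LRU behaviour
        this.maxLists = maxLists;
    }

    // Called by LinkedHashMap after each put; evicts the least recently used list.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, List<V>> eldest) {
        return size() > maxLists;
    }

    // Total number of objects kept strongly reachable through cached lists.
    int cachedObjectCount() {
        int total = 0;
        for (List<V> list : values()) {
            total += list.size();
        }
        return total;
    }
}
```

A cache with room for 2 lists will hold on to one list of a million objects just as readily as a list of one object; a list only leaves when a third list is added or when its entry is removed explicitly (here with `remove(key)`).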
On Nov 11, 2009, at 3:43 PM, Hans Pikkemaat wrote:
Hi,
I use the latest version of Cayenne, 3.0b, and am experimenting with
the object caching features.
The documentation states that committed objects are purged from the
cache because it uses weak references.
(http://cayenne.apache.org/doc/individual-object-caching.html)
If I however run a query using SQLTemplate which caches the objects
into the dataContext local cache (objectstore),
the objects don't seem to be purged at all. If I simply run the
query and dump the contents using an iterator on the resulting
List, then the number of registered objects in the objectstore stays
the same (dataContext.getObjectStore().registeredObjectsCount()).
Even if I manually run System.gc() I don't see any changes (I know
this can be normal as gc() doesn't guarantee anything).
What am I doing wrong? Under which circumstances will Cayenne purge
the cache?
tx
Hans
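
A plain-Java analogy for the behaviour asked about, assuming the object store holds committed objects only through weak references as the linked documentation states. `WeakRegistryDemo` and its methods are hypothetical and no Cayenne classes are involved. As long as the result list is strongly reachable, the weakly registered objects cannot be collected, whatever System.gc() does.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry that, like a weak-reference cache, does not keep
// its objects alive on its own.
class WeakRegistryDemo {

    static List<WeakReference<Object>> register(List<Object> objects) {
        List<WeakReference<Object>> registry = new ArrayList<>();
        for (Object o : objects) {
            registry.add(new WeakReference<>(o));
        }
        return registry;
    }

    // Counts registered objects that have not been cleared by the collector.
    static int liveCount(List<WeakReference<Object>> registry) {
        int live = 0;
        for (WeakReference<Object> ref : registry) {
            if (ref.get() != null) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Object> results = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            results.add(new Object());
        }
        List<WeakReference<Object>> registry = register(results);

        System.gc();
        // The result list still references every object, so none are cleared.
        System.out.println("while list is held: " + liveCount(registry)
                + " of " + results.size());

        results = null; // drop the only strong references
        System.gc();    // a request only; the count may or may not drop
        System.out.println("after dropping list: " + liveCount(registry));
    }
}
```

The second count is deliberately left open, since a gc() call guarantees nothing; only the first observation (nothing is purged while the list is held) is certain.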