Hi,

Of course I don't want to load the whole thing into memory.
I want to run the query and use an iterator to go through the results.
With paging, the JDBC driver can fetch the rows in chunks, which
prevents the whole result set from being loaded into memory.

This is the same principle I was trying to achieve with Cayenne, but
clearly without success.

So I'm going to fall back to a Cayenne iterated query, or even plain JDBC.
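
For the plain JDBC route, this is roughly what I have in mind: a minimal
sketch that streams the rows using a fetch size (the connection URL and
table/column names are made up, and some drivers need extra settings,
e.g. PostgreSQL only uses a server-side cursor when autocommit is off):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class StreamingJdbcExample {

    public static void main(String[] args) throws Exception {
        // Hypothetical connection URL and table/column names.
        Connection con = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "user", "secret");
        try {
            // PostgreSQL, for example, only streams through a cursor
            // when autocommit is off.
            con.setAutoCommit(false);

            PreparedStatement ps = con.prepareStatement(
                    "SELECT id, name FROM main_record ORDER BY id");
            // Ask the driver to pull rows in chunks of 1000 instead of
            // buffering the entire result set in memory.
            ps.setFetchSize(1000);

            ResultSet rs = ps.executeQuery();
            try {
                while (rs.next()) {
                    long id = rs.getLong("id");
                    String name = rs.getString("name");
                    // process one row at a time; rows already processed
                    // become eligible for garbage collection
                }
            } finally {
                rs.close();
                ps.close();
            }
        } finally {
            con.close();
        }
    }
}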

tx

Hans


Michael Gentry wrote:
I'm not exactly sure what you are trying to accomplish, but could you
use plain SQL to do the job (run it from an SQL prompt)?  That's the
approach I normally take when I have to do updates to large amounts of
data.  Especially for a one-off task or something ill-suited to Java
code.  Even if you were using raw JDBC (no ORM) and tried to pull back
2.5 million records it would be difficult.  I don't know the size of
the data record you are using, but if it is even 1k (not an
unreasonable size) it would require 2.5 GB of RAM just to hold the
records.

mrg


On Fri, Nov 13, 2009 at 10:20 AM, Hans Pikkemaat
<[email protected]> wrote:
Hi,

That was the initial approach I tried. The problem with this is that I
cannot manually create relations between objects constructed from data
rows. This means that when I access the detail table through the
relation, it executes a query to get the details from the database.

If I have 100 main records it runs 100 queries to get all the details,
which doesn't perform well. I need to run a single query that does a
left join and gets all the data in one go.
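
Something like this might express that single left-join query in
Cayenne, returning plain data rows instead of registered objects (the
entity, table and column names are made up, and the exact SQLTemplate
API may differ between Cayenne versions):

import org.apache.cayenne.query.SQLTemplate;

public class LeftJoinQueryFactory {

    public SQLTemplate createQuery() {
        // Hypothetical entity/table/column names: one left join pulls
        // the main records and their details in a single round trip
        // instead of 1 + N separate queries.
        SQLTemplate query = new SQLTemplate("MainRecord",
                "SELECT m.id, m.name, d.id AS detail_id, d.value "
                        + "FROM main_record m "
                        + "LEFT JOIN detail_record d ON d.main_id = m.id "
                        + "ORDER BY m.id");

        // Ask for raw data rows so no DataObjects are created or cached.
        query.setFetchingDataRows(true);
        return query;
    }
}

The same query object could then be handed to performIteratedQuery
instead of performQuery, so the rows are streamed rather than collected
into one big list.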

But I totally agree with you that the ORM is too much overhead here. I
don't need caching or anything like that; actually I'm trying to
prevent it from caching the records at all. I'm now working on a
solution that uses the iterated query to return data rows, from which I
construct the objects and the relationships between them myself.
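
Roughly along these lines; a sketch of that manual assembly, assuming
each data row from the left join carries the main columns plus the
(possibly null) detail columns (all class, column and key names are
made up):

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RowAssembler {

    // Plain holder classes; nothing is registered with the ORM, so
    // nothing gets cached.
    public static class MainRecord {
        final long id;
        final String name;
        final List<DetailRecord> details = new ArrayList<DetailRecord>();

        MainRecord(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static class DetailRecord {
        final long id;
        final String value;

        DetailRecord(long id, String value) {
            this.id = id;
            this.value = value;
        }
    }

    private final Map<Long, MainRecord> mainById =
            new LinkedHashMap<Long, MainRecord>();

    // Called once per data row coming out of the iterated left-join query.
    public void addRow(Map<String, Object> row) {
        Long mainId = ((Number) row.get("id")).longValue();

        MainRecord main = mainById.get(mainId);
        if (main == null) {
            main = new MainRecord(mainId, (String) row.get("name"));
            mainById.put(mainId, main);
        }

        // The left join produces null detail columns for main records
        // that have no details.
        Object detailId = row.get("detail_id");
        if (detailId != null) {
            main.details.add(new DetailRecord(
                    ((Number) detailId).longValue(),
                    (String) row.get("value")));
        }
    }
}

If the rows are ordered by the main record's id, each finished
MainRecord can be processed and removed from the map as soon as the
next id shows up, which keeps memory bounded even for millions of rows.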

tx

Hans


Michael Gentry wrote:
Not just Cayenne, Hans.  No ORM efficiently handles the scale you are
talking about.  What you are doing might be workable with 50k records,
but not 2.5m.  Either find a way to break your query down into smaller
chunks to process, or explore what Andrus suggested with
ResultIterator:

http://cayenne.apache.org/doc/iterating-through-data-rows.html

If you can loop over one record at a time and process it (thereby
letting the garbage collector clean out the ones you have processed)
then your memory usage should be somewhat stable and manageable, even
if the initial query time takes a while.
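
A minimal sketch of that loop, based on the page above (the exact
ResultIterator method names and checked exceptions vary a bit between
Cayenne versions, and "MyEntity" is a made-up mapped entity name):

import java.util.Map;

import org.apache.cayenne.access.DataContext;
import org.apache.cayenne.access.ResultIterator;
import org.apache.cayenne.query.SelectQuery;

public class IteratedQueryExample {

    public void processAll(DataContext context) throws Exception {
        // "MyEntity" is a hypothetical mapped entity name.
        SelectQuery query = new SelectQuery("MyEntity");

        ResultIterator it = context.performIteratedQuery(query);
        try {
            while (it.hasNextRow()) {
                // One data row (a Map of column values) at a time; rows
                // already processed can be garbage collected, so memory
                // stays roughly constant even for millions of records.
                Map row = (Map) it.nextDataRow();
                // ... process the row ...
            }
        } finally {
            // The iterator holds an open JDBC connection until closed.
            it.close();
        }
    }
}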

mrg


On Fri, Nov 13, 2009 at 7:09 AM, Hans Pikkemaat
<[email protected]> wrote:

Anyway, my conclusion is indeed: don't use Cayenne for large query
processing.




