On 23/07/2007, at 11:29 PM, Jean-Paul Le Fèvre wrote:
Hi, I'm trying to import a pretty big amount of data into my database. The input is an XML-formatted file. It describes more than 10 million objects, each having tens of attributes. The application parses the input file, creates the Cayenne objects and commits the changes if requested.
We've just written something quite similar. We tested it with something like 500,000 objects across several tables.
As you can imagine, I'm facing difficulties trying to avoid out-of-memory errors. Unfortunately, at this point, I'm still unable to load my big input file.
As a starting point I hope you are using a SAX parser for your XML.
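For illustration, here is a minimal sketch of what streaming the input with SAX could look like, using only the JDK's built-in parser. The element name "object" and the idea that each record carries its values as XML attributes are assumptions made for the example, since the real input format isn't shown in this thread. The point is only that each record is handed on as soon as it is read, so the whole file never sits in memory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Streams records one at a time instead of building a DOM tree. */
class StreamingImport {

    // Hypothetical element name; the actual file layout is not known here.
    static final String RECORD_ELEMENT = "object";

    /**
     * Parses the stream and passes each record's attributes to the sink.
     * Returns the number of records seen.
     */
    static int parse(InputStream in, Consumer<Map<String, String>> sink) {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (!RECORD_ELEMENT.equals(qName)) {
                    return; // ignore wrapper and unrelated elements
                }
                // Copy the attributes: the Attributes object is reused by the parser.
                Map<String, String> record = new LinkedHashMap<>();
                for (int i = 0; i < attrs.getLength(); i++) {
                    record.put(attrs.getQName(i), attrs.getValue(i));
                }
                sink.accept(record);
                count[0]++;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("XML import failed", e);
        }
        return count[0];
    }
}
```

The sink is where the application would create and commit its Cayenne objects; as long as it doesn't hold on to the maps it receives, memory use on the parsing side stays flat regardless of file size.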
To figure out what is happening, I'm monitoring the application's behavior with jconsole. My tactic is the following: every 10000 objects (this number is a parameter) I call rollbackChanges() or commitChanges().
We committed each object individually (or sometimes a logical group of objects). That way we could have very fine control and validation errors only caused the loss of that particular record (or small group). I don't know that it helps to commit in batches like this.
When I run the program in rollback mode, it turns out that the memory used oscillates between a min and a max value, as expected: after each rollback the garbage collector feels free to clean up the memory. But in commit mode the amount of memory keeps on increasing and the application eventually fails.
Probably because the context continues to fill up with the objects you are committing. They aren't discarded after the commit. Try creating a new context (and discarding the old one so the GC can clean it up) after every couple of thousand records.
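A rough sketch of that rotation, written without any Cayenne dependency so it stands alone: the type parameter C stands in for the context class, the factory stands in for however you create a fresh context, and the committer stands in for a call to commitChanges() on it. The class name, method names and the threshold parameter are all made up for the example, not part of Cayenne.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Commits after every record and swaps in a fresh context every
 * recordsPerContext records, so the old context and the objects it
 * holds become eligible for garbage collection.
 * C is a stand-in for the ORM context type (hypothetical helper, not a Cayenne class).
 */
class RotatingContext<C> {
    private final Supplier<C> factory;    // creates a new, empty context
    private final Consumer<C> committer;  // e.g. ctx -> ctx.commitChanges()
    private final int recordsPerContext;
    private C current;
    private int sinceRotation;

    RotatingContext(Supplier<C> factory, Consumer<C> committer, int recordsPerContext) {
        if (recordsPerContext < 1) {
            throw new IllegalArgumentException("recordsPerContext must be at least 1");
        }
        this.factory = factory;
        this.committer = committer;
        this.recordsPerContext = recordsPerContext;
        this.current = factory.get();
    }

    /** The context that new objects should be created in right now. */
    C context() {
        return current;
    }

    /** Commits the pending record; after enough records, replaces the context. */
    void recordDone() {
        committer.accept(current);
        sinceRotation++;
        if (sinceRotation >= recordsPerContext) {
            current = factory.get(); // drop our reference to the old context
            sinceRotation = 0;
        }
    }
}
```

The import loop would then call context() to get the place to create each object and recordDone() once the record is complete. This only helps if nothing else keeps a reference to the old context or to objects registered in it, so always fetch the context through context() rather than caching it in a local variable across records.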
In our situation we were able to get away with 256 MB of RAM on the client (we are running this as ROP) and 512 MB of RAM on the server (most of which appears to be used by Derby).
Ari

-------------------------->
Aristedes Maniatis
phone +61 2 9660 9700
PGP fingerprint 08 57 20 4B 80 69 59 E2 A9 BF 2D 48 C2 20 0C C8
