Kristian Waagan <[EMAIL PROTECTED]> writes:

> 
> Ture Munter wrote:
> > I'm using embedded derby in a Java SE application, however I use Derby
> > as a network server while building the database. In that connection I
> > ran into problems.
> > I have a database with 65000 rows that I wanted to add to a Derby
> > database (total size before insert around 10000 records, 30MB). Each row
> > consist of some strings and some floats (6 fields in total) and the
> > operation requires two inserts and one select. During insert I run Derby
> > as a Derby network server, the problem is that no matter what I do I get
> > OutOfMemory exceptions (Derby ran out of heap space) all the time.
> > Setting the max memory use of the JVM to 1GB allowed me to add around
> > 17000 rows, but not anymore than that.
> 
> Hello Ture,

Hello Kristian,

Thanks for the quick response, I'm very impressed :-)


> 
> Is this a out-of-the-box configuration, or have you configured Derby by 
> specifying properties?
> If you have tuned it, have you set the Derby page cache size? 
> (derby.system.pageCacheSize).
> The page size?
> Is there only one connection inserting data?

I use Derby in the out-of-the-box configuration, it worked fine and performance
was good until this problem showed up. I'm going to use the database in
read-only mode and in the embedded configuration. I forgot to tell that I'm
using Derby 10.3.1.4 and JavaSE JDK 1.5.0_12.


> Sounds to me as if you have a program generating the data. Is this 
> something you could share with the community?
> That would be great, then people could run it and see if they see the 
> same behavior. Also, if there's a bug it will be easier to find it if we 
> have a running repro.

All the data are read in from a text-file, parsed and inserted into the
database. As the data are published scientific data I can share my source code
and the data. Everything is available on 

www.fysik.dtu.dk/~munter/derby/

There are a few comments in the source-code that should explain what happens
including the SQL to create the needed tables. Everything should be pretty
self-explanatory as it only reads the text-file line-by-line and inserts them
into the database. 

In order to get it to work, I am constantly closing the connection and
reconnecting to the network server, this is done every 300 rows. This happens
around line 215. 
If this with closing/opening the connection is disabled and Derby runs as
network server (and the database is empty), it runs out of memory after 6956
rows (on my machine with 2GB of memory). If closing/opening the connection is
enabled it first runs out of memory after 16198 rows (when restarting with the
existing database w 6956 rows).

Maybe it would work better/without problems if I used the methods to get the
last auto-generated key - but it shouldn't be necessary to be so careful in
order not to "break" anythings :-) 


> If that is not possible, can you run with Java SE 6 and tell it to
> dump the heap on OOME?
> Another possibility is to use jmap to get a histogram of the 
> heap-content (add the option to only get live objects as well), and 
> this would be easy to post.


I haven't tried any of those things yet, maybe I can find time for that
tomorrow, but for now have fun with the program I uploaded :-)


> > The problem was "solved" by running Derby embedded in the program that
> > inserted data into the database and only commit every 50th row. When
> > running Derby embedded I didn't run out of memory. After adding all
> > 65000 records I can still run Derby as network server and do queries,
> > but there are problems when trying to insert data. There are no problems
> > when running Derby embedded. Have anybody experienced the same
> > problem/features?
> > 
> > I also experience a decrease in performance after having inserted many
> > rows, but the slow-down is not as bad as described in the thread
> > "exponential increase in insert time".
> 
> Just out of curiosity, are you using Statement.executeUpdate, prepared 
> statements or batching when inserting the data?


A little executeUpdate and some PreparedStatements and first calling commit
after 50 rows have been inserted.

Thanks,
Ture Munter




Reply via email to