Re: inserts slowing down after 2.5m rows

2009-02-27 Thread publicayers


 The application is running on a client machine. I'm not sure how to 
tell if there's a different disk available that I could log to.


If checkpointing is causing this delay, how do I manage that? Can I turn 
checkpointing off? I already have durability set to test; I'm not 
concerned about recovering from a crashed db.


Brian
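
For reference, Derby does not appear to offer an outright off switch for 
checkpoints, but how often they happen and how large the log files grow are 
tunable system properties. A minimal derby.properties sketch with 
illustrative values:

    # Bytes of transaction log written between checkpoints; raising this makes
    # checkpoints less frequent (value here is illustrative, ~50 MB).
    derby.storage.checkpointInterval=52428800
    # Bytes of log written before Derby switches to a new log file (~16 MB here).
    derby.storage.logSwitchInterval=16777216
    # Already in use in this thread: do not sync the log to disk.
    derby.system.durability=test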

On Fri, Feb 27, 2009 at  9:34 AM, Peter Ondruška wrote:


Could be checkpoint. BTW, to speed up bulk load you may want to use
large log files located separately from the data disks.
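
The separate-disk placement Peter mentions maps to the logDevice connection 
attribute, which Derby honors only when the database is created. A minimal 
sketch in Java; the paths, database name, and class name are illustrative:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    public class CreateWithSeparateLog {
        public static void main(String[] args) throws SQLException {
            // With older driver jars you may need
            // Class.forName("org.apache.derby.jdbc.EmbeddedDriver") first.
            // logDevice only takes effect at creation time; afterwards the
            // transaction log stays on the device it was created on.
            Connection conn = DriverManager.getConnection(
                    "jdbc:derby:C:/data/loaddb;create=true;logDevice=D:/derbylog");
            conn.close();
        }
    }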

2009/2/27, Brian Peterson <dianeay...@verizon.net>:
I have a big table that gets a lot of inserts. Rows are inserted 10k at a 
time with a table function. At around 2.5 million rows, inserts slow down 
from 2-7s to around 15-20s. The table's dat file is around 800-900M.

I have durability set to test, table-level locks, a primary key index and 
another 2-column index on the table. Page size is at the max and page cache 
set to 4500 pages. The table gets compressed (inplace) every 500,000 rows. 
I'm using Derby 10.4 with JDK 1.6.0_07, running on Windows XP. I've ruled 
out anything from the rest of the application, including GC (memory usage 
follows a consistent pattern during the whole load). It is a local file 
system. The database has a fixed number of tables (so there's a fixed number 
of dat files in the database directory the whole time). The logs are getting 
cleaned up, so there's only a few dat files in the log directory as well.
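
The periodic in-place compress presumably corresponds to Derby's 
SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE system procedure; a sketch of that 
call, with made-up schema and table names:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.SQLException;

    public class CompressStep {
        // Purge deleted rows, defragment, and truncate unused pages at the end
        // of the file. "APP" and "LOADED_DATA" are illustrative names.
        static void compressInPlace(Connection conn) throws SQLException {
            CallableStatement cs = conn.prepareCall(
                "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE('APP', 'LOADED_DATA', 1, 1, 1)");
            try {
                cs.execute();
            } finally {
                cs.close();
            }
        }
    }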




Any ideas what might be causing the big slowdown after so many loads?



Brian






Re: inserts slowing down after 2.5m rows

2009-02-27 Thread publicayers


 I've increased the log size and the checkpoint interval, but it 
doesn't seem to help.


It looks like the inserts begin to slow down dramatically once the table 
reaches its initial allocation of pages. Things just fly along until it 
gets to about 1100 pages (I've allocated an initial 1000 pages; pages 
are 32K).


Any suggestions on how to keep the inserts moving quickly at this point?

Brian
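
If the slowdown really does begin where the pre-allocated extent ends, the 
relevant knobs are the page-allocation properties, which apply only to tables 
created after they are set. A derby.properties sketch with illustrative 
values; 1000 appears to be the ceiling for initialPages:

    # Pages pre-allocated when a table is created (1000 appears to be the maximum).
    derby.storage.initialPages=1000
    # Percentage of each page kept free for later updates; for an append-only
    # load table, 0 packs more rows into each 32K page.
    derby.storage.pageReservedSpace=0
    # Number of pages held in the page cache (already 4500 in this thread).
    derby.storage.pageCacheSize=4500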

On Fri, Feb 27, 2009 at  3:41 PM, publicay...@verizon.net wrote:

 The application is running on a client machine. I'm not sure how to 
tell if there's a different disk available that I could log to.


If checkpointing is causing this delay, how do I manage that? Can I turn 
checkpointing off? I already have durability set to test; I'm not 
concerned about recovering from a crashed db.


Brian

On Fri, Feb 27, 2009 at  9:34 AM, Peter Ondruška wrote:


Could be checkpoint. BTW, to speed up bulk load you may want to use
large log files located separately from the data disks.

2009/2/27, Brian Peterson <dianeay...@verizon.net>:
I have a big table that gets a lot of inserts. Rows are inserted 10k 
at a time with a table function. At around 2.5 million rows, inserts 
slow down from 2-7s to around 15-20s. The table's dat file is around 
800-900M.



I have durability set to test, table-level locks, a primary key 
index and another 2-column index on the table. Page size is at the max 
and page cache set to 4500 pages. The table gets compressed (inplace) 
every 500,000 rows. I'm using Derby 10.4 with JDK 1.6.0_07, running on 
Windows XP. I've ruled out anything from the rest of the application, 
including GC (memory usage follows a consistent pattern during the 
whole load). It is a local file system. The database has a fixed 
number of tables (so there's a fixed number of dat files in the 
database directory the whole time). The logs are getting cleaned up, 
so there's only a few dat files in the log directory as well.



Any ideas what might be causing the big slowdown after so many loads?


Brian





Re: procedure to upgrade or create an embedded database

2008-12-08 Thread publicayers
How do I tell if the database needs to be upgraded? Or is there no 
impact if I specify upgrade but one is not necessary?


Brian


On Mon, Dec 8, 2008 at  8:43 AM, Rick Hillegas wrote:


Brian Peterson wrote:
I'm trying to figure out how to determine what connection parameters I 
should use for an embedded database when there might not be a database in 
place (so create=true), or there might be an older version already in 
place (so upgrade=true), or there might already be a database of the right 
version in place (so either create=true or nothing extra in the URL).

I read that upgrade and create cannot both be specified in the connection 
URL. If I'm putting out a release of my application that uses the latest 
version of Derby (10.4) while a previous version used 10.2.2, what are the 
recommended steps for making the connection to the Derby database if one is 
already present? (Note that I have to handle this programmatically as part 
of the application startup.)

Do I first try a URL with create and see if there's an error, and if so, 
follow up with a call with upgrade? Or do I have the procedure always use 
upgrade and follow up with a URL with create if it fails to make a 
connection?


Hi Brian,

Your first approach sounds good to me:

1) Bring up the database, specifying create=true

2) If you get the warning that the database already exists, then bring 
down the database and reboot it in upgrade mode


Regards,
-Rick

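A sketch of that create-then-upgrade sequence, assuming an embedded URL of 
jdbc:derby:mydb. The SQLStates checked here, 01J01 for "connection made to 
existing database instead" and 08006 for a clean single-database shutdown, 
are the usual Derby codes but are worth verifying against your driver version:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.SQLWarning;

    public class OpenOrUpgrade {
        // Database name is illustrative.
        private static final String DB = "jdbc:derby:mydb";

        public static Connection open() throws SQLException {
            Connection conn = DriverManager.getConnection(DB + ";create=true");
            // Warning 01J01 means the database already existed and create was
            // ignored; in that case bring it down and reboot in upgrade mode.
            for (SQLWarning w = conn.getWarnings(); w != null; w = w.getNextWarning()) {
                if ("01J01".equals(w.getSQLState())) {
                    conn.close();
                    shutdown();
                    return DriverManager.getConnection(DB + ";upgrade=true");
                }
            }
            return conn;
        }

        private static void shutdown() {
            try {
                DriverManager.getConnection(DB + ";shutdown=true");
            } catch (SQLException e) {
                // SQLState 08006 signals a successful single-database shutdown.
            }
        }
    }

Booting with upgrade=true against a database that is already at the current 
version should simply connect without further changes.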






faster inserts in big tables

2008-11-25 Thread publicayers
I have tens of thousands of rows to add to a table, possibly hundreds of 
thousands, and I'm wondering if there's anything else I can do to speed 
this up. The table could end up having a couple of million rows.


This is what I've done so far:
* Using a PreparedStatement that gets reused with each insert.
* Set locking level to TABLE for that table.
* Turned off autocommit.
* Set the connection isolation level to READ_COMMITTED.

In addition to that, I'm also setting these system parameters, though not 
necessarily to improve insert performance:
* derby.system.durability=test
* derby.storage.pageSize=32768

The table has one odd feature: The last column is a VARCHAR(32672) FOR 
BIT DATA. I've tried setting the length to something smaller, but it 
didn't really seem to matter.


The primary key is an auto generated int with another 2-column index on 
two BIGINT columns. Something I found interesting is that the inserts 
seem to go 2x faster if I have the 2-column index in place than if I 
have just the primary-key index.


I'm running
   Derby 10.2.2
   JRE 1.6.0_07
   Windows XP SP2

Is there anything else I can do to speed up row inserts?

Thanks,
Brian
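
Taken together, the settings above map onto a loader loop roughly like this 
sketch; the table name, column list, batch size, and Row holder are 
illustrative, not from the thread:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.Statement;
    import java.util.List;

    public class BulkLoader {

        // Minimal row holder for the sketch.
        public static class Row {
            long id1;
            long id2;
            byte[] payload;
        }

        // Insert one chunk (e.g. 10k rows) as a single table-locked transaction.
        public static void loadChunk(Connection conn, List<Row> chunk) throws SQLException {
            conn.setAutoCommit(false);
            conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);

            Statement lock = conn.createStatement();
            // One exclusive table lock for the whole chunk; released at commit.
            lock.execute("LOCK TABLE LOADED_DATA IN EXCLUSIVE MODE");
            lock.close();

            PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO LOADED_DATA (ID1, ID2, PAYLOAD) VALUES (?, ?, ?)");
            try {
                for (Row r : chunk) {
                    ps.setLong(1, r.id1);
                    ps.setLong(2, r.id2);
                    ps.setBytes(3, r.payload); // the VARCHAR ... FOR BIT DATA column
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit();
            } catch (SQLException e) {
                conn.rollback();
                throw e;
            } finally {
                ps.close();
            }
        }
    }

The auto-generated primary key column is deliberately left out of the INSERT 
list so Derby generates it.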