Hasn't been a lot of response to this thread.  I have a 23gb database
holding only 500mb of data, all created with just inserts (no deletes).  For
our app, this is a serious problem. 

Someone suggested the problem is caused by multi-threaded inserts, but the
tables which exhibit the problem were only inserted into by a single thread,
each.  

Any suggestions? 

Is there a way to tell, before compacting, how much space would be saved by
compacting a table?  With this information, at least I would be able to
periodically compact just those tables which merit being compacted, as a
workaround to the real problem.

Thanks,
Jim

> -----Original Message-----
> From: Jim Newsham [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 21, 2008 11:21 AM
> To: 'Derby Discussion'
> Subject: RE: excessive disk space allocation
> 
> 
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > Sent: Monday, October 20, 2008 9:27 PM
> > To: Derby Discussion
> > Subject: Re: excessive disk space allocation
> >
> > Jim Newsham <[EMAIL PROTECTED]> writes:
> >
> > > Hi,
> > >
> > > I'm doing some benchmarking of our application which stores data in
> > derby.
> > > The parts of the application which I am exercising only perform
> inserts,
> > not
> > > deletes.  The results suggest that derby disk space allocation is
> > excessive,
> > > particularly because compressing the tables reduces the size of the
> > database *
> > > substantially*.  For example, here are the results of several
> databases,
> > both
> > > before and after compression.
> > >
> > > Application running time.  original -> compressed
> > >
> > > 0.5 days.  178.2mb -> 63.1mb
> > >
> > > 1 day.  559.3mb -> 82.8mb
> > >
> > > 2 days.  1,879.1mb -> 120.8mb
> > >
> > > 4 days.  5,154.4mb -> 190.5mb
> > >
> > > 8 days. 11,443.7mb -> 291.6mb
> > >
> > > 16 days.  23,706.7mb -> 519.3mb
> > >
> > > Plotting the data, I observe that both uncompressed and compressed
> sizes
> > > appear to grow linearly, but the growth factor (slope of the linear
> > equation)
> > > is 53 times as large for the uncompressed database.  Needless to say.
> > this is
> > > huge.
> > >
> > > I expected that with only inserts and no deletes, there should be
> little
> > or no
> > > wasted space (and no need for table compression).  Is this assumption
> > > incorrect?
> >
> > Hi Jim,
> >
> > You may have come across a known issue with multi-threaded inserts to
> > the same table:
> >
> > http://thread.gmane.org/gmane.comp.apache.db.derby.devel/36430
> > https://issues.apache.org/jira/browse/DERBY-2337
> > https://issues.apache.org/jira/browse/DERBY-2338
> 
> Thanks for those links.  I used the diagnostic dump program from the
> mentioned discussion thread to see how much the individual tables in my
> database are compacting.
> 
> The "multi-threaded inserts to the same table" theory doesn't quite jive
> here.  In my case, I have multiple threads inserting into the database,
> but
> most of the data goes into tables which are only inserted into by a single
> thread for the duration of the application.
> 
> There are only two tables inserted into by more than one thread, and the
> data they contain is relatively small (a few percent).  For a test
> database
> I'm looking at right now, these two tables compress to 50% and 90% of
> original size, respectively... not much at all.
> 
> By contrast, I am seeing most of the other tables (which aren't inserted
> into by more than one thread) compress to between 0.5% and 3.8% of
> original
> size.  For example, I see one table go from 783 pages to 4 pages.
> 
> Jim
> 
> 
> 



Reply via email to