There hasn't been much response to this thread. I have a 23 GB database holding only 500 MB of data, all created with just inserts (no deletes). For our app, this is a serious problem.
Someone suggested the problem is caused by multi-threaded inserts, but the tables which exhibit the problem were each inserted into by only a single thread.

Any suggestions? Is there a way to tell, before compacting, how much space would be saved by compacting a table? With this information, I would at least be able to periodically compact just those tables which merit compacting, as a workaround to the real problem.

Thanks,
Jim

> -----Original Message-----
> From: Jim Newsham [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, October 21, 2008 11:21 AM
> To: 'Derby Discussion'
> Subject: RE: excessive disk space allocation
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> > Sent: Monday, October 20, 2008 9:27 PM
> > To: Derby Discussion
> > Subject: Re: excessive disk space allocation
> >
> > Jim Newsham <[EMAIL PROTECTED]> writes:
> >
> > > Hi,
> > >
> > > I'm doing some benchmarking of our application which stores data
> > > in Derby. The parts of the application which I am exercising only
> > > perform inserts, not deletes. The results suggest that Derby disk
> > > space allocation is excessive, particularly because compressing
> > > the tables reduces the size of the database *substantially*. For
> > > example, here are the results of several databases, both before
> > > and after compression.
> > >
> > > Application running time      Original      Compressed
> > > 0.5 days                      178.2 MB         63.1 MB
> > > 1 day                         559.3 MB         82.8 MB
> > > 2 days                      1,879.1 MB        120.8 MB
> > > 4 days                      5,154.4 MB        190.5 MB
> > > 8 days                     11,443.7 MB        291.6 MB
> > > 16 days                    23,706.7 MB        519.3 MB
> > >
> > > Plotting the data, I observe that both uncompressed and compressed
> > > sizes appear to grow linearly, but the growth factor (slope of the
> > > linear equation) is 53 times as large for the uncompressed
> > > database. Needless to say, this is huge.
> > >
> > > I expected that with only inserts and no deletes, there should be
> > > little or no wasted space (and no need for table compression). Is
> > > this assumption incorrect?
> >
> > Hi Jim,
> >
> > You may have come across a known issue with multi-threaded inserts
> > to the same table:
> >
> > http://thread.gmane.org/gmane.comp.apache.db.derby.devel/36430
> > https://issues.apache.org/jira/browse/DERBY-2337
> > https://issues.apache.org/jira/browse/DERBY-2338
>
> Thanks for those links. I used the diagnostic dump program from the
> mentioned discussion thread to see how much the individual tables in
> my database are compacting.
>
> The "multi-threaded inserts to the same table" theory doesn't quite
> jibe here. In my case, I have multiple threads inserting into the
> database, but most of the data goes into tables which are only
> inserted into by a single thread for the duration of the application.
>
> There are only two tables inserted into by more than one thread, and
> the data they contain is relatively small (a few percent). For a test
> database I'm looking at right now, these two tables compress to 50%
> and 90% of original size, respectively... not much at all.
>
> By contrast, I am seeing most of the other tables (which aren't
> inserted into by more than one thread) compress to between 0.5% and
> 3.8% of original size. For example, I see one table go from 783 pages
> to 4 pages.
>
> Jim
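
For anyone who wants to check the growth-factor figure quoted above, here is a minimal Python sketch that fits an ordinary least-squares line to each column of the benchmark table and compares the two slopes. The sizes are copied from the quoted table; the helper name `slope` is only for illustration, and a plain least-squares fit is an assumption about how the original comparison was done.

```python
# Sizes in MB, copied from the benchmark table quoted above.
days = [0.5, 1, 2, 4, 8, 16]
original_mb = [178.2, 559.3, 1879.1, 5154.4, 11443.7, 23706.7]
compressed_mb = [63.1, 82.8, 120.8, 190.5, 291.6, 519.3]


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


original_growth = slope(days, original_mb)      # MB per day before compression
compressed_growth = slope(days, compressed_mb)  # MB per day after compression
ratio = original_growth / compressed_growth

print(f"original:   {original_growth:.1f} MB/day")
print(f"compressed: {compressed_growth:.1f} MB/day")
print(f"ratio:      {ratio:.1f}")
```

With these numbers the ratio should come out close to 53, which is consistent with the figure quoted above.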