To comment on the following update, log in, then open the issue: http://www.openoffice.org/issues/show_bug.cgi?id=99961 Issue #|99961 Summary|Modifying table design causes the db to grow (old data | might be kept in file) Component|Database access Version|OOo 3.0.1 Platform|Unknown URL| OS/Version|Windows 2000 Status|UNCONFIRMED Status whiteboard| Keywords| Resolution| Issue type|DEFECT Priority|P3 Subcomponent|none Assigned to|dbaneedsconfirm Reported by|wimbaetens
------- Additional comments from wimbaet...@openoffice.org Fri Mar 6 10:44:59 +0000 2009 ------- DB type=HSQLDB When repeatedly modifying the table design (e.g. during a normalisation), I have the feeling that a lot of (old) garbage data is still kept in the db-file. My suspicion is confirmed when I create a totally new DB and copy table by table, query by query to the new, empty db-file. The difference can be quite big: the new copy can be 2MB, where the original db-file reaches sometimes 10x as much (20MB!), which makes it a real pain to use that db. Is the old data somehow staying behind in the db-file or am I doing something really stupid? I think that the following example may reproduce the phenomenon. I started out with non-normalised tables, which I started to normalise, for example (tblNames): ID;Name;Type "0";"name1";"some very long description containing a lot of text duplication" "1";"name2";"some very long description containing a lot of text duplication" "2";"name3";"some very long description containing a lot of text duplication" "3";"name4";"another long description" where ID=integer (primary key, autovalue) Name=VARCHAR 10 Type=VARCHAR 255 For these four records, I have maximum 4x (integer+Varchar10+Varchar255) bytes = 4x (10+10+255) = 1100 The example used here contains only a few records. In reality, I had a table with hundreds of records and only about three distinct values for 'Type'. Created a second table (tblDescription): ID;TXT "0";"some very long description containing a lot of text duplication" "1";"another long description" tblNames gets a new column: ID;Name;Type;D_ID where tblNames.D_ID refers to tblDescription.ID Eventually, column tblNames.Type is deleted. At this point, I have the following estimate: tblNames: 4x (integer+Varchar10+integer) = 4x (10+10+10) = 120 tblDescription: 2x (integer+Varchar255) = 2x (10+255) = 530 TOTAL: max 650 bytes In practice, the db-file keeps growing, as if the original tblNames.type column is still kept somewhere. It's true that this is just a small example, but as stated above, a db can grow from 2 to 20MB using similar operations. To get the data from the beginning, I use a query or table view that dereferences tblNames.D_ID to tblDescription.TXT. If necessary, I can come up with a larger example db-file (however, it will cost me quite some time to repeat the process) --------------------------------------------------------------------- Please do not reply to this automatically generated notification from Issue Tracker. Please log onto the website and enter your comments. http://qa.openoffice.org/issue_handling/project_issues.html#notification --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@dba.openoffice.org For additional commands, e-mail: issues-h...@dba.openoffice.org