[ http://issues.apache.org/jira/browse/DERBY-2168?page=comments#action_12459314 ]

Dyre Tjeldvoll commented on DERBY-2168:
---------------------------------------

A big project indeed. I assume that this change will alter the disk format, and bring various upgrade issues with it? Perhaps also a major version bump?

> Create new row format for derby to optimize access to columns within a row
> --------------------------------------------------------------------------
>
>                 Key: DERBY-2168
>                 URL: http://issues.apache.org/jira/browse/DERBY-2168
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 10.3.0.0
>            Reporter: Mike Matrigali
>            Priority: Minor
>
> The current (and only) low-level row format for Derby was chosen at the
> beginning of the project to be the most flexible, so it treats every
> column as variable length. The simple row format is just a sequence of
> columns, with each column having a header indicating how long it
> is. So there is no way to determine where the Nth column is in the row
> without first traversing the N-1 columns before it.
>
> Queries that might benefit from a different row format include:
> 1) non-covered queries which don't require all columns of data
> 2) non-index scans which disqualify a number of rows based on a subset of
>    columns that don't happen to be the first N columns of the row.
>
> A pretty standard row format would have some sort of table at the beginning
> which would allow one to jump to a given offset of the row without
> going through all the other columns. Building up this table would likely
> increase the insert cost slightly, and would increase the disk space required
> to store rows.
>
> Another standard kind of row format would optimize the storage of
> fixed-length fields. Currently the store does not know anything about
> fixed-length fields, as each datatype controls its own storage. New interfaces
> could be added either at create time or maybe in the datatypes themselves
> to export the knowledge that datatypes are fixed length.
>
> This is a big project.
> Note that a lot of performance work in StoredPage has
> made it "know" about the current record and field formats, as it was
> a big performance hit to make class calls for every field traversal. This
> means that adding a new record and/or field format is not as isolated as
> one might hope. Also, we are likely to need to support both the old and new
> formats. To anyone considering this work, I would suggest a very rough
> prototype with performance measurement first, to make sure you are getting the
> expected performance before doing a lot of work.
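
To make the two layouts in the description concrete, here is a minimal sketch in Java. It is an illustration only and does not model Derby's actual on-disk encoding or the StoredPage code: the class name RowFormatSketch, its method names, and the use of 4-byte integers for both the per-column length headers and the offset table are all assumptions made for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not Derby code: contrasts a length-prefixed row
// with a row that carries an offset table at the front.
public class RowFormatSketch {

    // Assumed "sequential" layout: [length:int][bytes] repeated per column.
    static byte[] encodeSequential(byte[][] columns) {
        int size = 0;
        for (byte[] c : columns) size += 4 + c.length;
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] c : columns) {
            buf.putInt(c.length);
            buf.put(c);
        }
        return buf.array();
    }

    // Reaching column n means skipping over columns 0..n-1,
    // reading one length header at a time.
    static byte[] readSequential(byte[] row, int n) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        for (int i = 0; i < n; i++) {
            int len = buf.getInt();
            buf.position(buf.position() + len);
        }
        byte[] out = new byte[buf.getInt()];
        buf.get(out);
        return out;
    }

    // Assumed "offset table" layout:
    // [count:int][endOffset:int per column][column bytes back to back].
    static byte[] encodeWithOffsets(byte[][] columns) {
        int dataSize = 0;
        for (byte[] c : columns) dataSize += c.length;
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * columns.length + dataSize);
        buf.putInt(columns.length);
        int end = 0;
        for (byte[] c : columns) {
            end += c.length;
            buf.putInt(end); // end offset of this column within the data area
        }
        for (byte[] c : columns) buf.put(c);
        return buf.array();
    }

    // Column n is located with a fixed number of table reads,
    // independent of how many columns precede it.
    static byte[] readWithOffsets(byte[] row, int n) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        int count = buf.getInt(0);
        int dataStart = 4 + 4 * count;
        int start = (n == 0) ? 0 : buf.getInt(4 + 4 * (n - 1));
        int end = buf.getInt(4 + 4 * n);
        byte[] out = new byte[end - start];
        buf.position(dataStart + start);
        buf.get(out);
        return out;
    }
}
```

In this sketch readSequential walks n headers before it can return column n, while readWithOffsets does the same amount of work for any n. The encoder for the offset layout has to compute every offset before writing the data, which is the kind of extra insert cost the description mentions. Whether the difference is measurable inside the real store is exactly what a rough prototype would need to show.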
