On Fri, 2007-09-07 at 13:52 -0700, Avery Payne wrote: > So I've been seeing/hearing all of the hoopla over vertical databases > (column stores), and how they'll not only slice bread but also make > toast, etc. I've done some quick searches for past articles on > "C-Store", "Vertica", "Column Store", and "Vertical Database", and have > seen little discussion on this.
I looked at doing this a while back, for similar reasons. ISTM we would be able to do this fairly well if we implemented Index-only columns. i.e. columns that don't exist in the heap, only in an index. Taken to the extreme, all columns could be removed from the heap and placed in an index(es). Only the visibility information would remain on the heap. Syntax for this would be an ALTER TABLE SET STORAGE command, with a new type of storage definition that will only be accepted if an index already has been defined on the table which includes the specified column. Doing this per column would be a big win over vertical databases since AFAIK they *have* to do this to *every* column, even if it is not beneficial to do so. Every existing index plan works immediately. The main annoyance is retrieving a column value that doesn't exist on the heap. That would require a new kind of plan that involves preparing the index(es) by sorting them on htid and then doing a left merge join with the main heap. By now, most people will be screaming at their monitors "what an idiot, thats gonna REALLY suck". True, but this is the same thing that column-oriented databases have to do also, so it would be taking the best that vertical databases have to offer and accepting the consequences. There are some other plan possibilities also, apart from this basic value retrieval, but they would require further index API changes; I'm not certain of those as being primary use cases, however. Vertical databases honestly do have their uses and there are many kinds of marketing query that have complex where clauses yet only simple select clauses. There are a number of industry-specific database products that have utilised this technique to good effect for a number of years. So ISTM the main changes would be executor changes to allow retrieving column values of index-only columns when they are required, and to modify the insert/update tuple code somewhat. I thought maybe we can call it COAST, Column-oriented attribute storage technique, :-) -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster