I would like to announce an implementation of an In-Memory Columnar Store extension for PostgreSQL.
A vertical representation of the data is stored in PostgreSQL shared memory.
Various basic and more sophisticated analytic operators are provided for working with time series.

      GitHub repository: https://github.com/knizhnik/imcs/
      Documentation: http://www.garret.ru/imcs/user_guide.html
      Sources: http://www.garret.ru/imcs-1.02.tar.gz

A columnar store manager stores tables as sections of columns rather than as rows. Most traditional DBMSes store data in rows ("horizontally"): all attributes of a record are stored together. This approach makes it possible to load a whole record with one read operation, which usually gives better performance for OLTP queries (which access or update single records). OLAP queries, however, mostly operate on individual columns, for example calculating the sum or average of a column. In this case a vertical data representation, where the data for each column is stored independently, is more efficient. There are several DBMSes on the market that are based on the vertical model: Vertica, SciDB,... Most mainstream commercial databases also provide OLAP extensions based on vertical storage: BLU Acceleration for DB2, Oracle Database In-Memory Option, Microsoft SQL Server column store...
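To make the difference between the two layouts concrete, here is a minimal sketch in plain Python. It is only an illustration and does not use the extension's API; the table contents and the names `rows`, `columns`, `avg_price_rows` and `avg_price_columns` are made up for this example.

```python
from array import array

# Row ("horizontal") layout: every record keeps all of its attributes together.
# The quotes below are invented sample data.
rows = [
    ("IBM", 20140101, 185.5, 1000),
    ("IBM", 20140102, 186.25, 1200),
    ("IBM", 20140103, 184.0, 900),
    ("IBM", 20140106, 187.25, 1500),
]

# Column ("vertical") layout: one contiguous typed array per attribute.
columns = {
    "symbol": [r[0] for r in rows],
    "day": array("q", (r[1] for r in rows)),
    "price": array("d", (r[2] for r in rows)),
    "volume": array("q", (r[3] for r in rows)),
}


def avg_price_rows(rows):
    """Average price over the row layout: every whole record is visited."""
    total = 0.0
    for _symbol, _day, price, _volume in rows:
        total += price
    return total / len(rows)


def avg_price_columns(columns):
    """Average price over the column layout: only the price array is read."""
    prices = columns["price"]
    return sum(prices) / len(prices)


print(avg_price_rows(rows), avg_price_columns(columns))
```

Both functions return the same value, but the second one touches only the "price" array and never reads the symbol, day, or volume data.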

A columnar store, or vertical representation of data, achieves better performance than the classical horizontal representation due to three factors:

* Reducing the size of fetched data: only the columns involved in a query are accessed.
* Vector operations. Applying an operator to a set of values (a tile) makes it possible to minimize interpretation cost. SIMD instructions of modern processors also accelerate the execution of vector operations.
* Compression of data. Compression can certainly also be applied to whole records, but compressing each column independently can give much better results without significant extra CPU overhead. For example, even a simple compression algorithm such as RLE (run-length encoding) not only reduces the space used, but also reduces the number of operations performed.
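The point about RLE reducing the number of operations can be sketched as follows. This is an illustrative Python example under simple assumptions, not the extension's actual storage format; the helper names `rle_encode` and `rle_sum` and the sample column are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_sum(runs):
    """Sum the column with one multiply-add per run instead of one add per value."""
    return sum(value * count for value, count in runs)


# Invented sample column with long runs of repeated values.
column = [5, 5, 5, 5, 7, 7, 3, 3, 3, 3, 3, 3]
runs = rle_encode(column)

print(runs)           # three runs instead of twelve values
print(rle_sum(runs))  # same result as sum(column)
```

Here the twelve stored values shrink to three runs, and the sum is computed with three multiply-add steps rather than twelve additions. The benefit depends on the data: a column with few repeated adjacent values gains little from RLE.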



--
Sent via pgsql-announce mailing list (pgsql-announce@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-announce
