[
https://issues.apache.org/jira/browse/PHOENIX-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
James Taylor updated PHOENIX-4550:
----------------------------------
Summary: Declare maximum columns to ensure storage is dense when table has
many views (was: Ensure storage is dense when base table has many views)
> Declare maximum columns to ensure storage is dense when table has many views
> ----------------------------------------------------------------------------
>
> Key: PHOENIX-4550
> URL: https://issues.apache.org/jira/browse/PHOENIX-4550
> Project: Phoenix
> Issue Type: Improvement
> Reporter: James Taylor
> Priority: Major
>
> By declaring the max number of columns on a base table, we can optimize the
> storage for SINGLE_CELL_ARRAY_WITH_OFFSETS by not storing null values for the
> columns preceding the initial column of a view. This will make a huge
> difference in storage when you have a base table with many views. For example:
> {code}
> -- Declare that the base table will have no more than 10 columns
> CREATE IMMUTABLE TABLE base (k1 VARCHAR, prefix CHAR(3) v1 DATE,
> CONSTRAINT pk PRIMARY KEY (k1, prefix))
> MULTI_TENANT = true,
> MAX_COLUMNS = 10;
> CREATE VIEW v1(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR)
> AS SELECT * FROM base WHERE prefix = 'A00';
> CREATE VIEW v2(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR);
> AS SELECT * FROM base WHERE prefix = 'A10';
> ...
> {code}
> As the number of views grow, the difference between the base table column
> encoding (column #1) and the starting column number of the view (since the
> starting offset is determined by an incrementing value on the base table)
> will increase. This bloats the storage as we need to store null values for
> column encodings between the base table column and the starting column of the
> view.
> Instead, we'll pass through the MAX_COLUMNS value for queries and anything
> column encoding less than this we know it'll be at the start. Anything
> greater and we'll start the search from <column encoding> - <minimum view
> column encoding>.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)