[ https://issues.apache.org/jira/browse/PHOENIX-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169076#comment-14169076 ]
Hudson commented on PHOENIX-1333:
---------------------------------
FAILURE: Integrated in Phoenix | 3.0 | Hadoop1 #251 (See
[https://builds.apache.org/job/Phoenix-3.0-hadoop1/251/])
Phoenix-1333 Store statistics guideposts as VARBINARY (Ramkrishna S
(ramkrishna: rev 75484fb32ae73954e63f1364cb6652760fefe579)
* phoenix-core/src/main/java/org/apache/phoenix/schema/PTableImpl.java
* phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixDatabaseMetaData.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsWriter.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/PTableStatsImpl.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/PTableStats.java
* phoenix-core/src/main/java/org/apache/phoenix/query/QueryConstants.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsUtil.java
* phoenix-core/src/it/java/org/apache/phoenix/end2end/QueryIT.java
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/StatisticsCollector.java
* phoenix-core/src/main/java/org/apache/phoenix/iterate/ParallelIterators.java
PHOENIX-1333 - Store statistics guideposts as VARBINARY (Add missing file
(ramkrishna: rev 4c8798d57653e33c1fe1b68acd8a3a7569e79080)
* phoenix-core/src/main/java/org/apache/phoenix/schema/stats/GuidePostsInfo.java
> Store statistics guideposts as VARBINARY
> ----------------------------------------
>
> Key: PHOENIX-1333
> URL: https://issues.apache.org/jira/browse/PHOENIX-1333
> Project: Phoenix
> Issue Type: Sub-task
> Reporter: James Taylor
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 4.2, 3.2
>
> Attachments: PHOENIX-1333_2.patch, Phoenix-1333.patch,
> Phoenix-1333_1.patch
>
>
> Storing the guideposts as a VARBINARY ARRAY is potentially a problem, as
> pointed out by PHOENIX-1329: we'd hit that issue when collecting stats for a
> table with a trailing VARBINARY row key column whose value contains embedded
> null bytes. Because of this, we're better off storing guideposts as a single
> VARBINARY and serializing/deserializing in the following manner:
> <byte length as vint><bytes><byte length as vint><bytes>...
> We should also store the total number of guideposts as a separate KeyValue
> column. So the schema of SYSTEM.STATS would now look like this instead:
> {code}
> public static final String CREATE_STATS_TABLE_METADATA =
> "CREATE TABLE " + SYSTEM_CATALOG_SCHEMA + ".\"" +
> SYSTEM_STATS_TABLE + "\"(\n" +
> // PK columns
> PHYSICAL_NAME + " VARCHAR NOT NULL," +
> COLUMN_FAMILY + " VARCHAR," +
> REGION_NAME + " VARCHAR," +
> GUIDE_POSTS + " VARBINARY," +
> GUIDE_POSTS_COUNT + " SMALLINT," +
> MIN_KEY + " VARBINARY," +
> MAX_KEY + " VARBINARY," +
> LAST_STATS_UPDATE_TIME+ " DATE, "+
> "CONSTRAINT " + SYSTEM_TABLE_PK_NAME + " PRIMARY KEY ("
> + PHYSICAL_NAME + ","
> + COLUMN_FAMILY + ","+ REGION_NAME+"))\n" +
> // TODO: should we support versioned stats?
> // Install split policy to prevent a physical table's stats from
> // being split across regions.
> HTableDescriptor.SPLIT_POLICY + "='" +
> MetaDataSplitPolicy.class.getName() + "'\n";
> {code}
> Then the serialization code in StatisticsTable.addStats() would need to
> change to populate the GUIDE_POSTS_COUNT and serialize the GUIDE_POSTS in the
> new format.
> The deserialization code is isolated to StatisticsUtil.readStatistics(). It
> would need to read the GUIDE_POSTS_COUNT first for estimated sizing, and then
> deserialize the GUIDE_POSTS in the new format.
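
For illustration, the length-prefixed layout described above
(<byte length as vint><bytes>...) can be sketched as a round trip in Java.
This is a minimal sketch, not the code from the patch: the class
GuidePostCodec and its methods are hypothetical names, and the varint used
here is a plain base-128 encoding standing in for whatever vint format
Phoenix actually uses (an assumption, not confirmed by this message). The
count argument mirrors GUIDE_POSTS_COUNT and is used only to pre-size the
result list.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
final class GuidePostCodec {

    // Writes a non-negative int as a base-128 varint (7 data bits per byte,
    // high bit set on every byte except the last). Assumed encoding.
    static void writeVarInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Concatenates all guideposts as <length as varint><bytes> pairs.
    static byte[] serialize(List<byte[]> guidePosts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] guidePost : guidePosts) {
            writeVarInt(out, guidePost.length);
            out.write(guidePost, 0, guidePost.length);
        }
        return out.toByteArray();
    }

    // Reads <length><bytes> pairs until the input is exhausted.
    // expectedCount is only a sizing hint; malformed input is not handled.
    static List<byte[]> deserialize(byte[] data, int expectedCount) {
        List<byte[]> guidePosts = new ArrayList<>(expectedCount);
        int pos = 0;
        while (pos < data.length) {
            int length = 0;
            int shift = 0;
            int b;
            do {
                b = data[pos++] & 0xFF;
                length |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            guidePosts.add(Arrays.copyOfRange(data, pos, pos + length));
            pos += length;
        }
        return guidePosts;
    }
}
```

Because each entry carries its own length, a guidepost such as {1, 0, 2}
with an embedded null byte survives the round trip, which a
separator-based encoding cannot guarantee.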