Eric Yang created CHUKWA-700:
--------------------------------
Summary: Revisit Chukwa metrics schema design for HBase
Key: CHUKWA-700
URL: https://issues.apache.org/jira/browse/CHUKWA-700
Project: Chukwa
Issue Type: Bug
Components: Data Collection
Affects Versions: 0.6.0
Environment: MacOSX, Java
Reporter: Eric Yang
The current Chukwa HBase schema looks like this:
{code}
<columnFamily>
<timestamp>-<primaryKey> <cell>...
{code}
Row keys that start with a monotonically increasing timestamp cannot be
distributed evenly across region servers without periodic special handling.
It is time to revise the schema. The proposed schema looks like this:
{code}
<cf>
<hhddmmyyyy>-<primaryId> <cell>...
{code}
The timestamp is stored with each cell. The row key splits data by hour, so a
full hour of metrics is stored on the same row. The primary key is replaced with
a hash id of the primary key. Metrics are aggregated through these tables:
chukwaMetrics -> chukwaMetricsMonthly -> chukwaMetricsYearly
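As a rough illustration of the proposed row key, here is a minimal Python sketch. It is not Chukwa code. The function name row_key, the use of UTC, and the choice of a truncated MD5 digest as the hash id are all assumptions made for the example; the issue does not specify a hash function or a time zone.

```python
import hashlib
from datetime import datetime, timezone


def row_key(epoch_ms: int, primary_key: str) -> str:
    """Build a '<hhddmmyyyy>-<primaryId>' row key for one metric sample.

    Hypothetical helper: the name, UTC, and the hash choice are assumptions.
    """
    ts = datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc)
    # Hour bucket in the order given by the proposal: hour, day, month, year.
    hour_bucket = ts.strftime("%H%d%m%Y")
    # Assumption: first 8 hex characters of MD5 stand in for the "hash id".
    primary_id = hashlib.md5(primary_key.encode("utf-8")).hexdigest()[:8]
    return f"{hour_bucket}-{primary_id}"


if __name__ == "__main__":
    # Two samples from the same hour and source map to the same row;
    # the exact timestamp would be kept with the cell, not in the key.
    print(row_key(0, "host1"))
    print(row_key(3_599_999, "host1"))
    # The next hour starts a new row.
    print(row_key(3_600_000, "host1"))
```

Samples from the same source within one hour share a row key, so each sample's exact timestamp has to be carried by the cell, as the description states.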