Eric Yang created CHUKWA-700:
--------------------------------

             Summary: Revisit Chukwa metrics schema design for HBase
                 Key: CHUKWA-700
                 URL: https://issues.apache.org/jira/browse/CHUKWA-700
             Project: Chukwa
          Issue Type: Bug
          Components: Data Collection
    Affects Versions: 0.6.0
         Environment: MacOSX, Java
            Reporter: Eric Yang


The current Chukwa HBase schema looks like this:

{code}
                                                <columnFamily>
<timestamp>-<primaryKey>   <cell>...
{code}

A monotonically increasing timestamp at the start of the row key cannot 
distribute writes evenly across region servers without periodic special handling.

It is time to revise the schema.  The proposed schema looks like this:

{code}
                                                <cf>
<hhddmmyyyy>-<primaryId>  <cell>...
{code}

The timestamp is stored with each cell, the row key splits data by hour, and a 
full hour of metrics is stored on the same row.  The primary key is replaced with 
a hash id of the primary key.  Metrics are aggregated through these tables:

chukwaMetrics -> chukwaMetricsMonthly -> chukwaMetricsYearly
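
A minimal sketch of how a row key in the proposed <hhddmmyyyy>-<primaryId> 
layout could be built.  The class and method names are hypothetical, and the use 
of MD5 as the hash and UTC as the time zone are assumptions; the proposal does 
not specify either.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Chukwa.
final class RowKeySketch {

    // Hour bucket in the order given by the proposal: hour, day, month, year.
    // UTC is an assumption.
    private static final DateTimeFormatter HOUR_BUCKET =
            DateTimeFormatter.ofPattern("HHddMMyyyy").withZone(ZoneOffset.UTC);

    // Hex-encoded MD5 of the primary key; the hash function is an assumption.
    static String primaryId(String primaryKey) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(primaryKey.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // All timestamps within the same hour map to the same row key, so a full
    // hour of metrics for one primary key lands on one row.
    static String rowKey(long timestampMillis, String primaryKey) {
        String hour = HOUR_BUCKET.format(Instant.ofEpochMilli(timestampMillis));
        return hour + "-" + primaryId(primaryKey);
    }

    public static void main(String[] args) {
        System.out.println(rowKey(System.currentTimeMillis(), "host1/cpu"));
    }
}
```

The exact timestamp would still be written with each cell; only the hour bucket 
goes into the row key.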



--
This message was sent by Atlassian JIRA
(v6.1#6144)