[jira] [Updated] (GORA-413) Support creation of dynamic columns within Gora datastore mapping designs

Kevin Ratnasekera (JIRA) Thu, 02 May 2019 01:55:48 -0700


     [ 
https://issues.apache.org/jira/browse/GORA-413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Kevin Ratnasekera updated GORA-413:
-----------------------------------
    Fix Version/s:     (was: 0.9)
                   1.0

> Support creation of dynamic columns within Gora datastore mapping designs
> -------------------------------------------------------------------------
>
>                 Key: GORA-413
>                 URL: https://issues.apache.org/jira/browse/GORA-413
>             Project: Apache Gora
>          Issue Type: New Feature
>          Components: gora-hbase
>    Affects Versions: 0.6
>            Reporter: Lewis John McGibbney
>            Priority: Major
>             Fix For: 1.0
>
>
> The conversation on [dynamically generating HBase 
> columns|http://www.mail-archive.com/dev%40gora.apache.org/msg05754.html] has 
> shown that new functionality needs to be added in order to achieve this.
> The main driver for this issue is that Chukwa logs need to create many 
> columns dynamically over time, with the number of columns depending directly 
> on the number of data chunks we get. Each data chunk has a [Sequence ID], and 
> this Sequence ID should be the column name.
> The table design will look like this:
> {code}
> Row Key: [Invert Date]:[Data Type]:[Primary Key]
> Column Family: log
> Column Name: [Sequence ID]
> Timestamp: [log entry timestamp]
> Example:
> Row Key: 2132013102:TT:host1.example.com
> Column Family: log
> Column Name: 1230
> Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
> Timestamp: 1358942490
> {code}
> The inverted date allows the table to be partitioned more easily by hour, 
> day of the month, or month.
> Using consecutive sequence IDs as column names allows fast retrieval in 
> a linear scan. This format is typically good for quickly retrieving an hour's 
> worth of logs for a node. Hence, if we batch-scan the table in a rolling 
> window via a MapReduce job at every hour interval, the workload is spread 
> evenly across multiple MapReduce tasks.
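
The table design described above can be sketched in a few lines of plain Python. This is only an illustration of the layout, not Gora or HBase code: the `Chunk` type, its field names, and the helpers `row_key`, `to_cell`, and `build_rows` are all hypothetical. The issue does not specify how the inverted date is encoded, so the sketch passes it through unchanged.

```python
from collections import defaultdict
from typing import NamedTuple


class Chunk(NamedTuple):
    """One data chunk. Field names are illustrative, not Chukwa's or Gora's."""
    inverted_date: str  # encoding is not specified in the issue; passed through as-is
    data_type: str
    primary_key: str
    sequence_id: int
    value: str
    timestamp: int      # log entry timestamp


COLUMN_FAMILY = "log"


def row_key(chunk):
    """Build the row key as [Invert Date]:[Data Type]:[Primary Key]."""
    return ":".join((chunk.inverted_date, chunk.data_type, chunk.primary_key))


def to_cell(chunk):
    """Map a chunk to (row key, column family, column name, value, timestamp).

    The column name is the chunk's sequence ID, so it is not known in advance
    and cannot be listed in a static mapping.
    """
    return (row_key(chunk), COLUMN_FAMILY, str(chunk.sequence_id),
            chunk.value, chunk.timestamp)


def build_rows(chunks):
    """Group cells by row key; each new sequence ID adds a new column to its row."""
    rows = defaultdict(dict)
    for chunk in chunks:
        key, family, column, value, timestamp = to_cell(chunk)
        rows[key][(family, column)] = (value, timestamp)
    return dict(rows)
```

With the example values from the issue, a chunk with sequence ID 1230 for host1.example.com lands in row `2132013102:TT:host1.example.com` under column `log:1230`, and a later chunk for the same row adds another column rather than another row. This is why the mapping would need to accept column names that are only known at write time.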



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
