Hi Lewis,

In my use cases I always need a mix of static and dynamic columns. In my first week I tried to map a Map onto a column family that also held static columns. It didn't work because Gora was not prepared for that (and indeed it needs further thought).

What I do is separate the static columns into one column family (or several) from the dynamic stuff (which goes in a map). One Map is mapped to one column family, in which each column:value pair is a key=>value entry in the map. I have several maps depending on my needs, but it can be just one big one with key=column.

What I don't fully understand is the timestamp you talk about, since we don't handle HBase timestamps. Do you specifically need it? I'm not quite sure if I'm answering your question :S

Something important to ask is how many columns you will store in the column family. Since we removed the StateManager, when you modify a map it deletes the column family and sends all the data again to be written (https://github.com/apache/gora/blob/master/gora-hbase/src/main/java/org/apache/gora/hbase/store/HBaseStore.java#L289), so adding/removing just one column can be a real killer when persisting several huge maps. What volume and write pattern are we talking about?

Best,

Alfonso Nishikawa

2015-02-24 17:55 GMT+01:00 Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>:

> Hi Folks,
> I am currently supercharging persistence in Apache Chukwa [0] with Gora,
> progress can be tracked in Jira [1].
> The issue I run in to, is that the required HBase schema looks as follows
>
> Row Key: [Invert Date]:[Data Type]:[Primary Key]
> Column Family: log
> Column Name: [Sequence ID]
> Timestamp: [log entry timestamp]
>
> Example:
>
> Row Key: 2132013102:TT:host1.example.com
> Column Family: log
> Column Name: 1230
> Cell Value: 2013-01-23 12:01:30 INFO This is a log entry.
> Timestamp: 1358942490
>
> The issue here is therefore that there will be dynamically generated
> columns, and the column names needs to be the field 'sequenceID', which is
> coming from the data bean itself.
>
> I *think* that this causes a conflict between our current mapping workflow
> where you 1) create data model in JSON, 2) create mapping file/datastore
> schema, 3) compile JSON... and so forth.
> The data is then mapped into the PREDEFINED datastore specific schema.
>
> The proposed change in workflow would involve 1) create data model in JSON,
> 2) create mapping file/datastore schema, 3) compile JSON... and so forth.
> The data is then mapped into the PREDEFINED datastore specific schema AND
> ALSO DYNAMIC FIELDS CAN BE GENERATED ON THE FLY.
>
> Has anyone else required dynamic columns for any datastore?
>
> I think that this is very handy and I would like to see what you guys
> think.
>
> Thanks
>
> [0] http://chukwa.apache.org
> [1] https://issues.apache.org/jira/browse/CHUKWA-734
>
> --
> *Lewis*
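
To make the rewrite cost mentioned in the reply more concrete, here is a minimal Java sketch. It is a toy in-memory model only, not Gora or HBase code: the class `MapColumnFamilySketch`, its methods, and the `meta` family name are hypothetical, and `cellsWritten` is just a stand-in for the number of cells sent to the store. It assumes the behaviour described in the reply, where static columns live in their own column family, one Map is mapped to one column family, and modifying the map deletes the family and writes every entry again. The per-entry write is shown purely for comparison.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one row: column family -> (column name -> value). Hypothetical, not Gora's API. */
class MapColumnFamilySketch {

    /** The row's contents, grouped by column family. */
    final Map<String, Map<String, String>> row = new TreeMap<>();

    /** Number of cells "sent to the store" so far; a rough stand-in for write cost. */
    int cellsWritten = 0;

    /** Static field: one fixed column in a family reserved for static columns. */
    void putStatic(String family, String column, String value) {
        row.computeIfAbsent(family, f -> new TreeMap<>()).put(column, value);
        cellsWritten++;
    }

    /** Map field, as described in the reply: drop the whole family, then write every entry again. */
    void putMapFullRewrite(String family, Map<String, String> map) {
        row.remove(family);                      // delete the column family
        row.put(family, new TreeMap<>(map));     // each map key becomes a column name
        cellsWritten += map.size();              // every entry is written again
    }

    /** Hypothetical alternative for comparison: write only the entry that changed. */
    void putMapEntry(String family, String column, String value) {
        row.computeIfAbsent(family, f -> new TreeMap<>()).put(column, value);
        cellsWritten++;
    }

    public static void main(String[] args) {
        // 1000 dynamic columns keyed by sequence ID, as in the Chukwa "log" family.
        Map<String, String> log = new TreeMap<>();
        for (int seq = 0; seq < 1000; seq++) {
            log.put(Integer.toString(seq), "log entry " + seq);
        }

        MapColumnFamilySketch rewrite = new MapColumnFamilySketch();
        rewrite.putStatic("meta", "host", "host1.example.com");
        rewrite.putMapFullRewrite("log", log);
        log.put("1230", "2013-01-23 12:01:30 INFO This is a log entry.");
        rewrite.putMapFullRewrite("log", log);   // one new column, whole family rewritten

        MapColumnFamilySketch incremental = new MapColumnFamilySketch();
        incremental.putStatic("meta", "host", "host1.example.com");
        incremental.putMapFullRewrite("log", new TreeMap<>(log.headMap("1230")));
        incremental.putMapEntry("log", "1230", "2013-01-23 12:01:30 INFO This is a log entry.");

        System.out.println("full rewrite cells written: " + rewrite.cellsWritten);
        System.out.println("incremental cells written:  " + incremental.cellsWritten);
    }
}
```

In this model, adding a single column to a 1000-entry map roughly doubles the cells written under the full-rewrite strategy, which is why the number of columns per family and the write pattern matter.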