Hello All, This is more like a HBase question but as I am planning to use Phoenix as a access layer, I hope phoenix user will help me.
I would like to create time series data to get on-the-fly analytics. This use case is for adTech. Report - what is houly,daily,weekly impression counts at country level for a given advertisement ID (ADID). I am doing hourly aggregation and loading into a Phoenix table. Primary Key - *ADID | COUNTRY | HOUR ID* --------------------------------------------------------------------------------- *ADID | COUNTRY | HOUR ID* | CF.IMP | --------------------------------------------------------------------------------- 1 | US | 2015062301 | 3000 | 1 | US | 2015062302 | 3421 | 1 | UK | 2015062302 | 1212 | --------------------------------------------------------------------------------- Is it a good schema design or shall I create alternate schema as below Primary Key - *ADID | COUNTRY | DAY ID* ---------------------------------------------------------------------------------------------------- *ADID | COUNTRY | DAY ID* | CF.IMP*01* | CF.IMP*02* | ---------------------------------------------------------------------------------------------------- 1 | US | 20150623 | 3000 | 3421 | 1 | UK | 20150623 | NULL | 1212 | ---------------------------------------------------------------------------------------------------- Here, I have taken hour part from hour ID and created 24 columns. I gone through many time-series NoSQL blog posts, most the author suggest to go with wider rows as above. This will reduce the scan, but I don't see much difference in both Data Models in term of latency for scanning. Can anybody please suggest good approach for my use case? Cheers, Pari