I'm planning on writing a time series of user action events including user profile, attributes and product purchase transactions to answer these questions/queries:
- What are the events leading up to the user's conversion, i.e., purchase?
- What are the different attributes that changed over a given time period?
- What is the LTV of a given user?
- Retrieve the list of attributes set/enabled for a given user at some point in time.

As a newbie to HBase, I wanted to confirm that a tall table design, i.e., with row key <userid>_<timestamp>, is _not_ the right design, for these reasons:

* Scanning for the latest state of a user seems to be an expensive operation, since not all the columns will be available in the latest event for the user.
* Constructing a row key always requires the timestamp to be appended if I'm not using regex filtering.
* Fetching the user's state at some point in time t1 involves fetching all the "<userid>*" rows and looking up the row with timestamp <= t1.

Are these valid concerns? Thanks!
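
P.S. To make the first and third concerns concrete, here is a minimal Python sketch of the access pattern I have in mind. It is not HBase client code: a sorted in-memory list stands in for the table, and the zero-padded key layout, the sample data, and the names row_key and state_at are my own assumptions for illustration only.

```python
from bisect import bisect_left, bisect_right


def row_key(user_id: str, ts: int) -> str:
    # Assumed layout: <userid>_<timestamp>, zero-padded so that
    # lexicographic order of keys matches numeric order of timestamps.
    return f"{user_id}_{ts:013d}"


# Stand-in for a table sorted by row key. Each event row carries only
# the columns that changed in that event (hypothetical sample data).
table = sorted([
    (row_key("u1", 1000), {"plan": "free", "country": "US"}),
    (row_key("u1", 2000), {"newsletter": "on"}),
    (row_key("u1", 3000), {"plan": "pro"}),
    (row_key("u2", 1500), {"plan": "free"}),
])
keys = [key for key, _ in table]


def state_at(user_id: str, t1: int) -> dict:
    """Rebuild a user's state at time t1 by merging all earlier events."""
    # Range covers every row of this user with timestamp <= t1.
    start = bisect_left(keys, row_key(user_id, 0))
    stop = bisect_right(keys, row_key(user_id, t1))
    state = {}
    for _, columns in table[start:stop]:
        state.update(columns)  # later events overwrite earlier values
    return state
```

For example, state_at("u1", 2500) has to read both the t=1000 and t=2000 rows to produce a complete picture, because the most recent row alone only contains the newsletter column. That per-user read-and-merge is the cost I am worried about.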
