Hi,

I'm currently designing a backend service that would store user profile
information for different applications. Most of the properties in a user
profile would be unknown to the service and specified by the applications
using the service, so the properties would need to be added dynamically.

I was planning to use CQL3 and a dynamic column family defined something
like this:

CREATE TABLE user (
  id UUID,
  propertyset_key TEXT,
  propertyset_val TEXT,
  PRIMARY KEY (id, propertyset_key)
);

There would be N (assuming < 50) property sets associated with a user.
The property set values would be complex object graphs represented as JSON.

As far as I know, this would lead the storage engine to store rows similar to this:

8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d
- basic_info:propertyset_val = { firstName:"john", lastName:"smith", ...}
- contact_info:propertyset_val = { address: {streetAddr:"1 infinite loop",
postalCode: ""}, ... }
- meal_prefs:propertyset_val = { ... }
- ...
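
To make the access pattern concrete, I would expect to write and read
property sets roughly like this. This is only a sketch; the property set
names and JSON values are made-up examples:

```cql
-- Write (or overwrite) one property set for a user
INSERT INTO user (id, propertyset_key, propertyset_val)
VALUES (8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d, 'basic_info',
        '{"firstName":"john","lastName":"smith"}');

-- Read all property sets of a user
SELECT propertyset_key, propertyset_val
FROM user
WHERE id = 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d;

-- Read a single property set of a user
SELECT propertyset_val
FROM user
WHERE id = 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d
  AND propertyset_key = 'basic_info';
```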

Any comments on this design?

Another option would be to use the Cassandra map type for storing property
sets like this:

CREATE TABLE user (
  id UUID,
  property_sets MAP<text, text>,
  PRIMARY KEY (id)
);

Based on the documentation, I understood that each map element would
internally be stored as a separate column, so are these two user table
definitions equivalent from the storage engine's perspective?
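
For comparison, this is how I assume the map version would be used. Again,
this is only a sketch with made-up property set names and values:

```cql
-- Create a user with an initial set of property sets
INSERT INTO user (id, property_sets)
VALUES (8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d,
        {'basic_info': '{"firstName":"john","lastName":"smith"}',
         'meal_prefs': '{"vegetarian":true}'});

-- Add or replace a single property set without touching the others
UPDATE user
SET property_sets['contact_info'] = '{"address":{"postalCode":""}}'
WHERE id = 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d;

-- Remove a single property set
DELETE property_sets['meal_prefs']
FROM user
WHERE id = 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d;
```

As far as I understand, reading goes through the whole map (SELECT
property_sets FROM user WHERE id = ...), whereas the first definition lets
me select a single property set by its key.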

I'm using Astyanax, which seems to support Cassandra collections.

With the second definition, it should be possible to later migrate a
dynamic property (e.g. job_title) to a static property, so that I could
execute CQL queries like this:

SELECT * FROM user WHERE job_title = 'developer';

but is it possible to accomplish that somehow with the first definition?
Or should I create separate column families for static and dynamic
properties instead?
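
To be concrete about the migration I have in mind for the second
definition, I assume it would involve roughly the following steps. The
index name is hypothetical, and I assume the application would have to
copy the existing values out of the map itself, user by user:

```cql
-- Add the new static column and a secondary index on it,
-- so that it can be used in a WHERE clause
ALTER TABLE user ADD job_title text;
CREATE INDEX user_job_title_idx ON user (job_title);

-- For each user: copy the value to the new column
-- and remove the old map entry
UPDATE user
SET job_title = 'developer'
WHERE id = 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d;

DELETE property_sets['job_title']
FROM user
WHERE id = 8b2c0b60-977a-11e2-99c2-c8bcc8dc5d1d;
```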


marko
