Has anyone worked on a DMP/CDP that stores user and customer profiles in
Kudu? Each user would have a set of base IDs (their identity graph),
statistics derived from their attributes, and tables holding those
attributes grouped by category.

Please let me know what you think of the design below.

I was thinking of creating a base profile table to store the IDs and
statistics along with unchanging or rarely changing attributes, such as
name, that do not need to be tracked. Next, I would create tables for
groups of attributes by category, such as user information, behaviors,
geolocation, devices, etc. These attribute tables would have one column per
attribute and would track changes by being insert-only: each new version is
a new row with a timestamp column recording when it was entered.
Essentially, I would follow the type 2 slowly changing dimension approach
from data warehousing. For attributes that expire, we would range partition
by time so that we can drop partitions holding expired data. For attributes
where we only need the latest value, we would add an active column so the
current row is easy to flag and query once older versions are inactivated.
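
To make the idea concrete, here is a minimal in-memory sketch of one
attribute table in plain Python. It does not use Kudu or any Kudu client
API; the names AttributeRow, AttributeTable, record, latest, history, and
drop_before are all hypothetical, and drop_before only imitates what
dropping an expired time range partition would do. It also assumes rows
arrive in timestamp order.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class AttributeRow:
    """One version of a user's attributes in a category (hypothetical schema)."""
    user_id: str
    entered_at: datetime          # timestamp column: when the row was entered
    attributes: Dict[str, str]    # one entry per attribute column
    active: bool = True           # flags the current version


class AttributeTable:
    """In-memory stand-in for one attribute-category table, e.g. geolocation."""

    def __init__(self) -> None:
        self.rows: List[AttributeRow] = []

    def record(self, user_id: str, entered_at: datetime,
               attributes: Dict[str, str]) -> None:
        # Inactivate the user's older versions, then append the new one.
        # Assumes calls arrive in timestamp order for a given user.
        for row in self.rows:
            if row.user_id == user_id and row.active:
                row.active = False
        self.rows.append(AttributeRow(user_id, entered_at, dict(attributes)))

    def latest(self, user_id: str) -> Optional[AttributeRow]:
        # The active flag makes the current version a simple filter.
        for row in self.rows:
            if row.user_id == user_id and row.active:
                return row
        return None

    def history(self, user_id: str) -> List[AttributeRow]:
        # Every version ever recorded for the user, oldest first.
        matching = [r for r in self.rows if r.user_id == user_id]
        return sorted(matching, key=lambda r: r.entered_at)

    def drop_before(self, cutoff: datetime) -> int:
        # Imitates dropping an expired time range partition:
        # every row entered before the cutoff goes away at once.
        kept = [r for r in self.rows if r.entered_at >= cutoff]
        dropped = len(self.rows) - len(kept)
        self.rows = kept
        return dropped


if __name__ == "__main__":
    geo = AttributeTable()
    geo.record("u1", datetime(2020, 1, 1), {"city": "Boston"})
    geo.record("u1", datetime(2020, 6, 1), {"city": "Denver"})
    print(geo.latest("u1").attributes)
    print(len(geo.history("u1")))
```

In a real table the inactivation step would be an update to existing rows,
so the active-flag tables would not be strictly insert-only; only the
history-tracking tables would be.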

Any comments or advice would be truly appreciated.

Cheers,
Ben
