Try something like this in GBQ
[image: image.png]
-- data insertion
INSERT INTO `test.michboy` (user_id, feature_name, feature_value,
as_of_date, effective_from, effective_to)
SELECT
user_id,
'last_purchase_product',
last_purchase_product,
CURRENT_DATE() as_of_date,
CURRENT_DATE() effective_from,
DATE('2099-12-31') effective_to
FROM
your_daily_user_table
WHERE
date = CURRENT_DATE();
-- data update
UPDATE `test.michboy`
SET
feature_value = 'new_product',
effective_to = CURRENT_DATE() - 1
WHERE
user_id = 1
AND feature_name = 'last_purchase_product'
AND effective_to = DATE('2099-12-31');
-- Example of data retrieval
SELECT
feature_value
FROM
`test.michboy`
WHERE
user_id = 1
AND feature_name = 'last_purchase_product'
AND as_of_date = '2024-01-01'
AND '2024-01-01' BETWEEN effective_from AND effective_to;
use as_of_date and the effective_from and effective_to range to retrieve
the correct feature value for a given date.
HTH
Mich Talebzadeh,
Architect | Data Science | Financial Crime | Forensic Analysis | GDPR
view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
On Thu, 23 Jan 2025 at 19:42, Gunjan Kumar <[email protected]> wrote:
>
> What is the best way to implement a feature store in bigquery so that I
> can get features as of any date in past one year for a given user.
>
> Currently we have designed our bigquery table partitioned by daily date
> and we update the partition with the snaphopt of all users and it's
> features for that day. But this kind of data model has lot of duplicate
> data which is not required.
>
> How we can design this table in such a way that I don't loose the ability
> to get feature as of any date, but also avoid lot of duplicates.
>
> I want to have below functionality.
>
> Select last_purchase_product where user_id = 1 and as_of_date =
> '2024-01-01'
>
>