Has anyone tried using Postgres as a LRU cache for data in cloud blob
storage? I have a lot of data in Cassandra and want to move that to
Backblaze (like S3 but cheaper and less available). Cassandra is working
well but I have to add new nodes more frequently that I would like. The
older data is rarely needed but newer and recently accessed records need to
be local. Everything is on leased servers at Hetzner.

I was thinking that I could use a table partitioned by created date (one
partition per day) and drop the oldest partition(s) when it starts getting
full. This wouldn’t be LRU but would probably work for my application.
create table activity_stream (    activity_id text primary key,    created
timestamp,    last_read timestamp,    data bytea  -- 1k to approx 200k,
mostly on the lower side)-- Partition would use created to split by day
I could update last_read whenever a record is read. If I use a fill factor
of less than 100 and only update last_read would I avoid dead tuple
problems with HOT tuple optimisation?Then before dropping the oldest
partition I can update created = last_read on recently read records to move
them to other partitions making it LRU. This will be a small percentage of
the data.
Any suggestions / other approaches? Thanks.

Reply via email to