Hi all!

For development purposes, we dump the production database to our local 
machines. That has been fine so far because the DB is small, but the company 
is growing and we want to reduce risk. To that end, we'd like to anonymize 
the data before it leaves the database server.

One solution we thought of is to run anonymizing UPDATE statements before 
pg_dump, within the same transaction, something like this:

BEGIN;
UPDATE users
   SET email = 'dev+' || id || '@example.com',
       password_hash = '/* hash of "password" */',
       ...;
-- launch pg_dump as usual, ensuring a ROLLBACK at the end
-- pg_dump must run with the *same* connection, obviously

-- if not already done by pg_dump
ROLLBACK;
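
The pattern above (anonymize inside a transaction, export through the same connection, then roll back) can be sketched in a self-contained way. This is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for PostgreSQL, the export is a plain SELECT rather than pg_dump (which is a separate client that opens its own connection), and the names anonymized_export and PLACEHOLDER_HASH as well as the three-column users table are hypothetical.

```python
import sqlite3

# Hypothetical placeholder; a real setup would use an actual hash of "password".
PLACEHOLDER_HASH = "<hash of 'password'>"


def anonymized_export(conn):
    """Return anonymized user rows without persisting the anonymization.

    The UPDATE and the SELECT run on the same connection inside one
    transaction, and the transaction is always rolled back afterwards.
    """
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE users SET email = 'dev+' || id || '@example.com', "
            "password_hash = ?",
            (PLACEHOLDER_HASH,),
        )
        # The export step: it sees the uncommitted, anonymized rows because
        # it uses the same connection as the UPDATE.
        rows = conn.execute(
            "SELECT id, email, password_hash FROM users ORDER BY id"
        ).fetchall()
    finally:
        # Undo the anonymization so the stored data is left untouched.
        conn.execute("ROLLBACK")
    return rows


if __name__ == "__main__":
    # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password_hash TEXT)"
    )
    conn.execute("INSERT INTO users VALUES (1, 'alice@corp.test', 'realhash1')")
    print(anonymized_export(conn))
    print(conn.execute("SELECT email FROM users").fetchall())
```

With PostgreSQL, the same shape would need the export to happen in the session that ran the UPDATE, which is the constraint noted in the comments above.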

Is there a ready-made solution for this? Our DB is hosted on Heroku, and we 
don't have full control over how we dump.

I searched for "postgresql anonymize data dump before download"[1] and 
variations, but I didn't see anything highly relevant.

Thanks!
François

PS: Cross-posted to http://dba.stackexchange.com/q/168023/3935

  [1]: https://duckduckgo.com/?q=postgresql+anonymize+data+dump+before+download



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)