Re: [postgis-users] Basic duplicating database question

Kevin Neufeld Thu, 21 Oct 2010 09:10:35 -0700

On 10/21/2010 7:24 AM, lplateandy wrote:

The doc says "It also takes an exclusive lock on the specific index being
processed, which will block reads that attempt to use that index."


Does that mean that i'm not really any better off as the spatial index is
really the critical means controlling the reading of data into my GIS etc?

You could try to CREATE INDEX CONCURRENTLY, which won't block reads orwrites. There are some caveats, but that's dependent on your use cases.

http://www.postgresql.org/docs/9.0/static/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY

Rather than trying to reindex or cleaning up an existing index, youcould try to create a second index (perhaps even concurrently) with adifferent name on your table. Once complete you could DROP the old index.

What would be your scenario for a full vacuum?

Rather than performing a VACUUM FULL or CLUSTER (both will undesirablylock any production table, likely for several days on large tables), myapproach would be to create a duplicate table with new indexes andsilently switch the tables behind the scenes.

1. Create a duplicate table. If you place the new table in a tempschema, you can use the exact same DDL as the oldtable.

CREATE SCHEMA tmp;
SET search_path TO tmp, public;
CREATE TABLE mytable (...);

2. Copy over the data
INSERT INTO mytable SELECT * FROM public.mytable;

3. Recreate your indexes. This could even be done in parallel if youhave more than one index to create. I also temporarily bump up thememory parameters to create large indexes more quickly. Again, theexact same DDL can be used as on your original table since this tableexists in a different schema. The DDL can be simply copied out ofPgAdminIII.

-- temporarily set available memory to 1GB (if you have it)
SET maintenance_work_mem TO 1048576;
CREATE INDEX ... ON mytable ...;
...

4. In a single short locking transaction, perform the swap
CREATE SCHEMA tmp2;
BEGIN;
LOCK public.mytable;  -- will lock until end of transaction.
ALTER TABLE public.mytable SET SCHEMA TO tmp2;
ALTER TABLE tmp.mytable SET SCHEMA TO public;
-- Perhaps insert into the new table any insert statements
-- that that may have occurred since step 2 (perhaps based on a timestamp)
INSERT INTO public.mytable a
   SELECT * FROM tmp2.mytable b
   WHERE b.ts > (SELECT ts FROM public.mytable ORDER BY ts DESC LIMIT 1);
-- Perhaps deal with updates as well.
...
COMMIT;

I know, it looks complicated, but in a production system, this will befast with minimal downtime (only during the final locking swap).


Cheers,
Kevin
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

Re: [postgis-users] Basic duplicating database question

Reply via email to