Added to TODO: o Allow COPY FROM to create index entries in bulk
http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php --------------------------------------------------------------------------- ITAGAKI Takahiro wrote: > This is a proposal of fast data loading using batch update of indexes for 8.4. > It is a part of pg_bulkload (http://pgbulkload.projects.postgresql.org/) and > I'd like to integrate it in order to cooperate with other parts of postgres. > > The basic concept is spooling new coming data, and merge the spool and > the existing indexes into a new index at the end of data loading. It is > 5-10 times faster than index insertion per-row, that is the way in 8.3. > > > One of the problem is locking; Index building in bulkload is similar to > REINDEX rather than INSERT, so we need ACCESS EXCLUSIVE LOCK during it. > Bulkloading is not a upper compatible method, so I'm thinking about > adding a new "WITH LOCK" option for COPY command. > > COPY tbl FROM 'datafile' WITH LOCK; > > If the LOCK option is specified, the behavior of COPY will be changed > as follows: > > 1. Lock the target table in ACCESS EXCLUSIVE mode instead of ROW EXCLUSIVE. > 2. Prepare spooler (BTSpool) for each indexes. > 3. For each new row, put index entries into the spools (_bt_spool) > instead of index_insert. > 4. At the end of COPY, merge the spool and the existing indexes into a new > index file. The relfilenode of the index is changed like REINDEX. > > However, there might be better interfaces for bulk index creation. > For example, if we want to use it with pgloader, we might need > "bulkload mode" for indexes. pgloader commits every 10000 rows, > so the index spooler must keep alive until end of the session > over transactions. (or end of the transaction over sub-transactions) > > I'm working toward the simple "COPY WITH LOCK" approach for now, > but if there are other better ideas, I want to use them. > Advices and suggestions welcome. > > Regards, > --- > ITAGAKI Takahiro > NTT Open Source Software Center > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to [EMAIL PROTECTED] so that your > message can get through to the mailing list cleanly -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://mail.postgresql.org/mj/mj_wwwusr?domain=postgresql.org&extra=pgsql-hackers