Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-13 Thread Sameer Thakur
On Fri, Apr 12, 2013 at 9:52 PM, Andres Freund and...@2ndquadrant.comwrote:

 On 2013-04-12 12:14:24 -0400, Tom Lane wrote:
  Andrew Dunstan and...@dunslane.net writes:
   On 04/12/2013 10:15 AM, Tom Lane wrote:
   There's 0 chance of making that work, because the two databases
 wouldn't
   have the same notions of committed XIDs.
 
   Yeah. Trying to think way outside the box, could we invent some sort of
   fixup mechanism that could be applied to adopted files?
 
  Well, it wouldn't be that hard to replace XIDs with FrozenXID or
  InvalidXID as appropriate, if you had access to the source database's
  clog while you did the copying.  It just wouldn't be very fast.

 I think if one goes over the heap and hint bits everything (so the item
 pointers don't have to be immediately rewritten), freeze everything and
 such it should be doable at about disk speed unless you have a really
 fast disk subsystem.
 But it still is fairly complicated and I doubt its really necessary.

  I suppose it would still be faster than a COPY transfer, but I'm not
  sure it'd be enough faster to justify the work and the additional
  portability hits you'd be taking.

 Using binary copy might already give quite a speedup, Sameer, did you
 try that?
 No we have not so far, was soliciting feedback first from the hackers and
 possibly implement as a contrib module. Also i did misread the earlier post
 on the subject.



 Also, do you really need parts of a cluster or would a base backup of
 the whole cluster do the trick?
 We were looking at parts of cluster as an faster alternative to pg_dump
 and restore

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Tom Lane
Sameer Thakur samthaku...@gmail.com writes:
 The proposed tool tries to make migration faster for tables and indices
 only by copying their binary data files.

There's 0 chance of making that work, because the two databases wouldn't
have the same notions of committed XIDs.  You apparently don't
understand what you read in the other discussion --- the steps you are
objecting to are not optional, whether copying a whole tablespace or
only one table.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Andrew Dunstan


On 04/12/2013 10:15 AM, Tom Lane wrote:

Sameer Thakur samthaku...@gmail.com writes:

The proposed tool tries to make migration faster for tables and indices
only by copying their binary data files.

There's 0 chance of making that work, because the two databases wouldn't
have the same notions of committed XIDs.



Yeah. Trying to think way outside the box, could we invent some sort of 
fixup mechanism that could be applied to adopted files? Of course, that 
could slow things down so much that it wouldn't be worth it, but it 
might be a nice research project.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Alvaro Herrera
Andrew Dunstan escribió:
 
 On 04/12/2013 10:15 AM, Tom Lane wrote:
 Sameer Thakur samthaku...@gmail.com writes:
 The proposed tool tries to make migration faster for tables and indices
 only by copying their binary data files.
 There's 0 chance of making that work, because the two databases wouldn't
 have the same notions of committed XIDs.
 
 Yeah. Trying to think way outside the box, could we invent some sort
 of fixup mechanism that could be applied to adopted files? Of
 course, that could slow things down so much that it wouldn't be
 worth it, but it might be a nice research project.

I think the fixup procedure involves freezing Xids (prior to the
transporting), which the OP said he didn't want to do.

If you don't freeze beforehand, there's not enough info in the new
cluster to know which tuples are dead/alive.  Another option would be to
have a private copy of pg_clog/pg_subtrans for the transported
table(s), but that seems very difficult to arrange.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 04/12/2013 10:15 AM, Tom Lane wrote:
 There's 0 chance of making that work, because the two databases wouldn't
 have the same notions of committed XIDs.

 Yeah. Trying to think way outside the box, could we invent some sort of 
 fixup mechanism that could be applied to adopted files?

Well, it wouldn't be that hard to replace XIDs with FrozenXID or
InvalidXID as appropriate, if you had access to the source database's
clog while you did the copying.  It just wouldn't be very fast.

I suppose it would still be faster than a COPY transfer, but I'm not
sure it'd be enough faster to justify the work and the additional
portability hits you'd be taking.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Andres Freund
On 2013-04-12 12:14:24 -0400, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
  On 04/12/2013 10:15 AM, Tom Lane wrote:
  There's 0 chance of making that work, because the two databases wouldn't
  have the same notions of committed XIDs.
 
  Yeah. Trying to think way outside the box, could we invent some sort of 
  fixup mechanism that could be applied to adopted files?
 
 Well, it wouldn't be that hard to replace XIDs with FrozenXID or
 InvalidXID as appropriate, if you had access to the source database's
 clog while you did the copying.  It just wouldn't be very fast.

I think if one goes over the heap and hint bits everything (so the item
pointers don't have to be immediately rewritten), freeze everything and
such it should be doable at about disk speed unless you have a really
fast disk subsystem.
But it still is fairly complicated and I doubt its really necessary.

 I suppose it would still be faster than a COPY transfer, but I'm not
 sure it'd be enough faster to justify the work and the additional
 portability hits you'd be taking.

Using binary copy might already give quite a speedup, Sameer, did you
try that?

Also, do you really need parts of a cluster or would a base backup of
the whole cluster do the trick?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Pavan Deolasee
On Fri, Apr 12, 2013 at 9:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Andrew Dunstan and...@dunslane.net writes:
  On 04/12/2013 10:15 AM, Tom Lane wrote:
  There's 0 chance of making that work, because the two databases wouldn't
  have the same notions of committed XIDs.

  Yeah. Trying to think way outside the box, could we invent some sort of
  fixup mechanism that could be applied to adopted files?

 Well, it wouldn't be that hard to replace XIDs with FrozenXID or
 InvalidXID as appropriate, if you had access to the source database's
 clog while you did the copying.  It just wouldn't be very fast.


Would it be possible to fix the XIDs *after* copying the data files,
potentially on a different server so as to avoid any additional overhead on
the main server ? I guess so, though we will probably need some mechanism
to lock out access to the table (which seems easy), flush all its data
pages to the disk and some way to reliably flush all clog pages as well so
that they can be copied along with the data files. The page LSNs seem to be
easy to handle and can be easily zeroed out outside the server.

I wonder though if this all look like a material for something like
pg_reorg(pack) though some kind of support from the core may be required.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Bruce Momjian
On Fri, Apr 12, 2013 at 10:22:38PM +0530, Pavan Deolasee wrote:
 
 
 
 On Fri, Apr 12, 2013 at 9:44 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
 Andrew Dunstan and...@dunslane.net writes:
  On 04/12/2013 10:15 AM, Tom Lane wrote:
  There's 0 chance of making that work, because the two databases 
 wouldn't
  have the same notions of committed XIDs.
 
  Yeah. Trying to think way outside the box, could we invent some sort of
  fixup mechanism that could be applied to adopted files?
 
 Well, it wouldn't be that hard to replace XIDs with FrozenXID or
 InvalidXID as appropriate, if you had access to the source database's
 clog while you did the copying.  It just wouldn't be very fast.
 
 
 
 Would it be possible to fix the XIDs *after* copying the data files,
 potentially on a different server so as to avoid any additional overhead on 
 the
 main server ? I guess so, though we will probably need some mechanism to lock
 out access to the table (which seems easy), flush all its data pages to the
 disk and some way to reliably flush all clog pages as well so that they can be
 copied along with the data files. The page LSNs seem to be easy to handle and
 can be easily zeroed out outside the server.
 
 I wonder though if this all look like a material for something like pg_reorg
 (pack) though some kind of support from the core may be required.

Uh, now that you mention it, pg_upgrade in non-link mode does
something similer, in that it copies the data files and clog.  You could
use pg_upgrade in non-link mode, run VACUUM FREEZE on the upgraded
cluster, and then copy the data files.

The only problem is that pg_upgrade can't upgrade tablespaces with the
same system catalog version because the tablespace directory names would
conflict.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Well, it wouldn't be that hard to replace XIDs with FrozenXID or
 InvalidXID as appropriate, if you had access to the source database's
 clog while you did the copying.  It just wouldn't be very fast.

If you're doing that in a streaming method, it strikes me that it'd be
plenty fast.

 I suppose it would still be faster than a COPY transfer, but I'm not
 sure it'd be enough faster to justify the work and the additional
 portability hits you'd be taking.

The big win here over a binary COPY is pulling through the indexes as-is
as well- without having to rebuild them.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 I suppose it would still be faster than a COPY transfer, but I'm not
 sure it'd be enough faster to justify the work and the additional
 portability hits you'd be taking.

 The big win here over a binary COPY is pulling through the indexes as-is
 as well- without having to rebuild them.

Meh.  That raises the ante another substantial multiple with respect to
the amount of portability risk (eg, you're now absolutely dependent on
locale sort orders to be identical in both databases).  And I think
you'd have to freeze all updates to the table while you were copying the
table+indexes, if you wanted them to be consistent.

I can't imagine that we'd accept a patch that says to the recipient
database, here are some large binary blobs, please believe that
they represent a valid table and associated indexes.  Oh, and don't you
dare try to actually check them, because that would be slow.

Some other interesting things to think about here would be toast-table
OIDs embedded in toast pointers, data type OIDs embedded in arrays (and
maybe records too, I forget), enum value OIDs, btree vacuum cycle IDs,
GiST NSNs ... not sure what else, but I bet that's not a complete list.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Detach/attach table and index data files from one cluster to another

2013-04-12 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  The big win here over a binary COPY is pulling through the indexes as-is
  as well- without having to rebuild them.

[... lots of reasons this is hard ...]

I agree that it's quite a bit more difficult, to the point that logical
replication which can be selective (eg: give me only table X + indexes)
might end up being the only answer, but otherwise this approach will
likely only be a modest improvement over binary COPY FREEZE- and there
only because we essentially end up skipping the type validation (which
we could just provide as an option, similar to COPY FREEZE...).

Thanks,

Stephen


signature.asc
Description: Digital signature