On Thu, Dec 9, 2010 at 10:05 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I wrote:
>> One fairly simple, if ugly, thing we could do about this is skip calling
>> reduce_dependencies during the first loop if the TOC object is a blob;
>> effectively assuming that nothing could depend on a blob. But that does
>> nothing about the point that we're failing to parallelize blob
>> restoration. Right offhand it seems hard to do much about that without
>> some changes to the archive representation of blobs. Some things that
>> might be worth looking at for 9.1:
>
>> * Add a flag to TOC objects saying "this object has no dependencies",
>> to provide a generalized and principled way to skip the
>> reduce_dependencies loop. This is only a good idea if pg_dump knows
>> that or can cheaply determine it at dump time, but I think it can.
>
> I had further ideas about this part of the problem. First, there's no
> need for a file format change to fix this: parallel restore is already
> groveling over all the dependencies in its fix_dependencies step, so it
> could count them for itself easily enough. Second, the real problem
> here is that reduce_dependencies processing is O(N^2) in the number of
> TOC objects. Skipping it for blobs, or even for all dependency-free
> objects, doesn't make that very much better: the kind of people who
> really need parallel restore are still likely to bump into unreasonable
> processing time. I think what we need to do is make fix_dependencies
> build a reverse lookup list of all the objects dependent on each TOC
> object, so that the searching behavior in reduce_dependencies can be
> eliminated outright. That will take O(N) time and O(N) extra space,
> which is a good tradeoff because you won't care if N is small, while if
> N is large you have got to have it anyway.
>
> Barring objections, I will do this and back-patch into 9.0. There is
> maybe some case for trying to fix 8.4 as well, but since 8.4 didn't
> make a separate TOC entry for each blob, it isn't as exposed to the
> problem. We didn't back-patch the last round of efficiency hacks in
> this area, so I'm thinking it's not necessary here either. Comments?

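For anyone following along, the reverse-lookup idea in the quoted text can be sketched roughly as below. This is only an illustration in Python, not the pg_dump/pg_restore C code: the dict-of-lists representation of TOC entries and the names build_reverse_deps and restore_order are hypothetical, and only reduce_dependencies borrows its name from the discussion. The point it shows is that, once each entry knows who depends on it, finishing an entry touches only its own dependents instead of scanning every remaining entry.

```python
from collections import defaultdict


def build_reverse_deps(deps):
    """One pass over all entries (the role fix_dependencies would play).

    deps maps each entry id to the list of entry ids it depends on.
    Returns (dep_count, rev_deps):
      dep_count[e] = number of not-yet-restored dependencies of e
      rev_deps[d]  = entries that depend on d (the reverse lookup list)
    """
    dep_count = {}
    rev_deps = defaultdict(list)
    for entry, needs in deps.items():
        # Ignore references to ids that are not in the archive at all.
        known = [d for d in needs if d in deps]
        dep_count[entry] = len(known)
        for d in known:
            rev_deps[d].append(entry)
    return dep_count, rev_deps


def reduce_dependencies(done, dep_count, rev_deps):
    """Mark `done` as restored; return entries that just became ready.

    Only the dependents of `done` are visited, so there is no search
    over the whole entry list.
    """
    ready = []
    for dependent in rev_deps.get(done, ()):
        dep_count[dependent] -= 1
        if dep_count[dependent] == 0:
            ready.append(dependent)
    return ready


def restore_order(deps):
    """Serial stand-in for the restore loop: one valid processing order."""
    dep_count, rev_deps = build_reverse_deps(deps)
    ready = [e for e, n in dep_count.items() if n == 0]
    order = []
    while ready:
        entry = ready.pop()
        order.append(entry)
        ready.extend(reduce_dependencies(entry, dep_count, rev_deps))
    return order
```

In this sketch the total work is proportional to the number of entries plus the number of dependency links, and the extra space is the reverse lists themselves, which matches the tradeoff described in the quoted text.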
Ah, that sounds like a much cleaner solution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers