On 2013-11-04 09:38:27 +0200, Heikki Linnakangas wrote: > On 29.10.2013 03:16, Andres Freund wrote: > >Hi, > > > >I've started a valgrind run earlier when trying to run the regression > >tests with valgrind --error-exitcode=122 (to cause the regression tests > >to fail visibly) but it crashed frequently... > > > >One of them was: > >... > >Which seems to indicate a dangling ->rd_smgr pointer, confusing the heck > >out of me because I couldn't see how that'd happen. > > > >Looking a bit closer it seems to me that the fake relcache > >infrastructure seems to neglect the chance that something used the fake > >entry to read something which will have done a RelationOpenSmgr(). Which > >in turn will have set rel->rd_smgr->smgr_owner to rel. > >When we then pfree() the fake relation in FreeFakeRelcacheEntry we have > >a dangling pointer. > > Yeah, that's clearly a bug. > > >diff --git a/src/backend/access/transam/xlogutils.c > >b/src/backend/access/transam/xlogutils.c > >index 5429d5e..ee7c892 100644 > >--- a/src/backend/access/transam/xlogutils.c > >+++ b/src/backend/access/transam/xlogutils.c > >@@ -433,6 +433,7 @@ CreateFakeRelcacheEntry(RelFileNode rnode) > > void > > FreeFakeRelcacheEntry(Relation fakerel) > > { > >+ RelationCloseSmgr(fakerel); > > pfree(fakerel); > > } > > Hmm, I don't think that's a very good solution. Firstly, it will force the > underlying files to be closed, hurting performance. Fake relcache entries > are used in heapam when clearing visibility map bits, which might happen > frequently enough for that to matter.
Well, it will only close them when they actually were smgropen()ed which will not always be the case. Although it probably is in the visibility map case. Feels like premature optimization to me. > Secondly, it will fail if you create > two fake relcache entries for the same relfilenode. Freeing the first will > close the smgr entry, and freeing the second will try to close the same smgr > entry again. We never do that in the current code, but it seems like a > possible source of bugs in the future. I think if we go there we'll need refcounted FakeRelcacheEntry's anyway. Otherwise we'll end up with a relation smgropen()ed multiple times in the same backend and such which doesn't seem like a promising course to me since the smgr itself isn't refcounted. > How about the attached instead? > diff --git a/src/backend/access/transam/xlogutils.c > b/src/backend/access/transam/xlogutils.c > index 5429d5e..f732e71 100644 > --- a/src/backend/access/transam/xlogutils.c > +++ b/src/backend/access/transam/xlogutils.c > @@ -422,7 +422,11 @@ CreateFakeRelcacheEntry(RelFileNode rnode) > rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode; > rel->rd_lockInfo.lockRelId.relId = rnode.relNode; > > - rel->rd_smgr = NULL; > + /* > + * Open the underlying storage, but don't set rd_owner. We want the > + * SmgrRelation to persist after the fake relcache entry is freed. > + */ > + rel->rd_smgr = smgropen(rnode, InvalidBackendId); > > return rel; > } I don't really like that - that will mean we'll leak open smgr handles for every relation touched via smgr during recovery since they will never be closed unless the relation is dropped or such. And in some databases there can be huge amounts of relations. Since recovery is long lived that doesn't seem like a good idea, besides the memory usage it will also bloat smgr's internal hash which actually seems just as likely to hurt performance. I think intentionally not using the owner mechanism also is dangerous - what if we have longer lived fake relcache entries and we've just processed sinval messages? Then we'll have a ->rd_smgr pointing into nowhere. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers