Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-11 Thread Pawel Kozik
Any idea when it will be available in official PostgreSQL release  9.1.x or
9.2.x ?



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Memory-leak-in-BackgroundWriter-and-Checkpointer-tp5757869p5758783.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-11 Thread Stephen Frost
* Pawel Kozik (pawel.ko...@alcatel-lucent.com) wrote:
 Any idea when it will be available in official PostgreSQL release  9.1.x or
 9.2.x ?

Yes, the next set of point releases should include Tom's patch to fix
this leak.

Thanks,

Stephen


signature.asc
Description: Digital signature


[BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Naoya Anzai
Hi All.

I've found a memory-leak bug in PostgreSQL 9.1.9's background 
writer process.

When following parameter is set, it occurs every time CHECKPOINT runs.

 wal_level = hot_standby

I've also confirmed REL9_1_STABLE and it is not fixed yet.

In additional, it also occurs in 9.3.beta1's checkpointer.

I created a patch (attached file) for 9.1.9, 
so could anyone confirm?

Best regards.
---
Naoya Anzai
NEC Soft,Ltd.
http://www.necsoft.com/eng/


standby.patch
Description: Binary data

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Stephen Frost
* Naoya Anzai (anzai-na...@mxu.nes.nec.co.jp) wrote:
 I've found a memory-leak bug in PostgreSQL 9.1.9's background 
 writer process.

This looks legit, but probably not the right approach to fixing it.
Looks like it'd be better to work out a way to use a static variable to
reuse the same memory, ala what GetRunningTransactionData() does, and
avoid having to do allocation while holding all the locks (or at least,
not very often).

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Heikki Linnakangas

On 04.06.2013 15:27, Stephen Frost wrote:

* Naoya Anzai (anzai-na...@mxu.nes.nec.co.jp) wrote:

I've found a memory-leak bug in PostgreSQL 9.1.9's background
writer process.


This looks legit, but probably not the right approach to fixing it.
Looks like it'd be better to work out a way to use a static variable to
reuse the same memory, ala what GetRunningTransactionData() does, and
avoid having to do allocation while holding all the locks (or at least,
not very often).


I can't get too excited about the overhead of a single palloc here. It's 
a fairly heavy operation anyway, and only runs once per checkpoint. And 
we haven't heard any actual complaints of latency hiccups with 
wal_level=hot_standby.


- Heikki


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Andres Freund
On 2013-06-04 16:23:00 +0300, Heikki Linnakangas wrote:
 On 04.06.2013 15:27, Stephen Frost wrote:
 * Naoya Anzai (anzai-na...@mxu.nes.nec.co.jp) wrote:
 I've found a memory-leak bug in PostgreSQL 9.1.9's background
 writer process.
 
 This looks legit, but probably not the right approach to fixing it.
 Looks like it'd be better to work out a way to use a static variable to
 reuse the same memory, ala what GetRunningTransactionData() does, and
 avoid having to do allocation while holding all the locks (or at least,
 not very often).
 
 I can't get too excited about the overhead of a single palloc here. It's a
 fairly heavy operation anyway, and only runs once per checkpoint. And we
 haven't heard any actual complaints of latency hiccups with
 wal_level=hot_standby.

I think we will have to resort to running it more frequently in the not
too far away future, its way to easy to get into a situation where all
of the checkpoints/xl_running_xact's are suboverflowed delaying
consistency considerably.

Seems more consistent with the rest of the code too. But anyway, I am
fine with fixing it either way.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2013-06-04 16:23:00 +0300, Heikki Linnakangas wrote:
  I can't get too excited about the overhead of a single palloc here. It's a
  fairly heavy operation anyway, and only runs once per checkpoint. And we
  haven't heard any actual complaints of latency hiccups with
  wal_level=hot_standby.

Perhaps, but we don't always hear about things that we should- this long
standing memory leak being one of them.

 Seems more consistent with the rest of the code too. But anyway, I am
 fine with fixing it either way.

And this is really the other point- having LogStandbySnapshot() need to
clean up after GetRunningTransactionLocks() but not
GetRunningTransactionData() would strike me as very odd.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Andres Freund (and...@2ndquadrant.com) wrote:
 Seems more consistent with the rest of the code too. But anyway, I am
 fine with fixing it either way.

 And this is really the other point- having LogStandbySnapshot() need to
 clean up after GetRunningTransactionLocks() but not
 GetRunningTransactionData() would strike me as very odd.

Meh.  I'm not impressed with permanently allocating an array large
enough to hold all the locks GetRunningTransactionLocks
might return --- that's potentially much larger than the other array,
and in fact I don't think we have a hard limit on its size at all.
Besides which, it's not like there is *no* cleanup for
GetRunningTransactionData --- it has a lock that has to be released ...

I think the proposed fix is fine code-wise; the real problem here is
crummy commenting.  GetRunningTransactionLocks isn't documented as
returning a palloc'd array, and why the heck do we have a long comment
about its implementation in LogStandbySnapshot?

regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 Seems more consistent with the rest of the code too. But anyway, I am
 fine with fixing it either way.

Patch attached which mirrors what GetRunningTransactionData() (the other
function called from LogStandbySnapshot) does, more-or-less- uses a
static variable to keep track of memory allocated for the data being
returned to the caller.

Looks like this will need to be back-patched to 9.0.  Barring objections
or better ideas, I'll do that in a few hours.

Thanks,

Stephen
colordiff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
new file mode 100644
index 8cd871f..1fb7884
*** a/src/backend/storage/lmgr/lock.c
--- b/src/backend/storage/lmgr/lock.c
*** GetLockStatusData(void)
*** 3401,3415 
   * Returns a list of currently held AccessExclusiveLocks, for use
   * by GetRunningTransactionData().
   */
  xl_standby_lock *
  GetRunningTransactionLocks(int *nlocks)
  {
  	PROCLOCK   *proclock;
  	HASH_SEQ_STATUS seqstat;
  	int			i;
  	int			index;
! 	int			els;
! 	xl_standby_lock *accessExclusiveLocks;
  
  	/*
  	 * Acquire lock on the entire shared lock data structure.
--- 3401,3445 
   * Returns a list of currently held AccessExclusiveLocks, for use
   * by GetRunningTransactionData().
   */
+ #define INIT_STANDBY_LOCK_ARR_SIZE 32	/* Initial return workspace size */
  xl_standby_lock *
  GetRunningTransactionLocks(int *nlocks)
  {
+ 	/* result workspace */
+ 	static xl_standby_locks *accessExclusiveLocks;
+ 
  	PROCLOCK   *proclock;
  	HASH_SEQ_STATUS seqstat;
  	int			i;
  	int			index;
! 
! 	/*
! 	 * Allocating space for all locks is overkill.	We only need space for
! 	 * access exclusive locks, but we can't count the number of locks without
! 	 * locking all of the lock partitions.	We don't really want to do
! 	 * allocations while holding all those locks, so instead do a bit of
! 	 * allocation up front, to hopefully avoid having to do them later.
! 	 */
! 	if (accessExclusiveLocks == NULL)
! 	{
! 		/*
! 		 * First call
! 		 *
! 		 * Note: We are tracking the allocated space for locks in -nlocks,
! 		 * not the actual number of locks found, that's handled with index
! 		 * below. Also, allocating the structure will start us off with space
! 		 * for one, so discount that from the malloc request.
! 		 */
! 		accessExclusiveLocks = (xl_standby_locks *)
! 			malloc(sizeof(xl_standby_locks) +
! 			   ((INIT_STANDBY_LOCK_ARR_SIZE - 1) * sizeof(xl_standby_lock)));
! 		if (accessExclusiveLocks == NULL)
! 			ereport(ERROR,
! 	(errcode(ERRCODE_OUT_OF_MEMORY),
! 	 errmsg(out of memory)));
! 
! 		accessExclusiveLocks-nlocks = INIT_STANDBY_LOCK_ARR_SIZE;
! 	}
  
  	/*
  	 * Acquire lock on the entire shared lock data structure.
*** GetRunningTransactionLocks(int *nlocks)
*** 3419,3433 
  	for (i = 0; i  NUM_LOCK_PARTITIONS; i++)
  		LWLockAcquire(FirstLockMgrLock + i, LW_SHARED);
  
- 	/* Now we can safely count the number of proclocks */
- 	els = hash_get_num_entries(LockMethodProcLockHash);
- 
- 	/*
- 	 * Allocating enough space for all locks in the lock table is overkill,
- 	 * but it's more convenient and faster than having to enlarge the array.
- 	 */
- 	accessExclusiveLocks = palloc(els * sizeof(xl_standby_lock));
- 
  	/* Now scan the tables to copy the data */
  	hash_seq_init(seqstat, LockMethodProcLockHash);
  
--- 3449,3454 
*** GetRunningTransactionLocks(int *nlocks)
*** 3459,3467 
  			if (!TransactionIdIsValid(xid))
  continue;
  
! 			accessExclusiveLocks[index].xid = xid;
! 			accessExclusiveLocks[index].dbOid = lock-tag.locktag_field1;
! 			accessExclusiveLocks[index].relOid = lock-tag.locktag_field2;
  
  			index++;
  		}
--- 3480,3508 
  			if (!TransactionIdIsValid(xid))
  continue;
  
! 			/*
! 			 * Check if we need to allocate more space in our workspace due to
! 			 * the number of access exclusive locks actually found.  If we do
! 			 * run out, double the size of our return area to hopefully avoid
! 			 * having to do this again any time soon.
! 			 */
! 			if (index = accessExclusiveLocks-nlocks)
! 			{
! accessExclusiveLocks-nlocks *= 2;
! 
! accessExclusiveLocks = (xl_standby_locks *)
! 	realloc(accessExclusiveLocks,
! 			sizeof(xl_standby_locks) +
! 			((accessExclusiveLocks-nlocks - 1) *
! 			 sizeof(xl_standby_lock)));
! if (accessExclusiveLocks == NULL)
! 	ereport(ERROR,
! 			(errcode(ERRCODE_OUT_OF_MEMORY),
! 			 errmsg(out of memory)));
! 			}
! 			accessExclusiveLocks-locks[index].xid = xid;
! 			accessExclusiveLocks-locks[index].dbOid = lock-tag.locktag_field1;
! 			accessExclusiveLocks-locks[index].relOid = lock-tag.locktag_field2;
  
  			index++;
  		}
*** GetRunningTransactionLocks(int *nlocks)
*** 3477,3484 
  	for (i = NUM_LOCK_PARTITIONS; --i = 0;)
  		LWLockRelease(FirstLockMgrLock + i);
  
  	*nlocks = index;
! 	return accessExclusiveLocks;
  }

Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Meh.  I'm not impressed with permanently allocating an array large
 enough to hold all the locks GetRunningTransactionLocks
 might return --- that's potentially much larger than the other array,
 and in fact I don't think we have a hard limit on its size at all.

Well, sure, which is why I didn't actually do that- but I did end up
having to make it resize when necessary, which isn't entirely ideal
either.

 Besides which, it's not like there is *no* cleanup for
 GetRunningTransactionData --- it has a lock that has to be released ...

That's true..  I guess my general feeling is that it'd be good to do
this all one way or the other- having it use a static variable into
which we stick the pointer to some reused space for one and then doing a
palloc for the other which needs to be pfree'd struck me as odd.

 I think the proposed fix is fine code-wise; the real problem here is
 crummy commenting.  GetRunningTransactionLocks isn't documented as
 returning a palloc'd array, and why the heck do we have a long comment
 about its implementation in LogStandbySnapshot?

Certainly good questions and better comments would have helped here.  I
can go back and rework the patch either way.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 I think the proposed fix is fine code-wise; the real problem here is
 crummy commenting.  GetRunningTransactionLocks isn't documented as
 returning a palloc'd array, and why the heck do we have a long comment
 about its implementation in LogStandbySnapshot?

 Certainly good questions and better comments would have helped here.  I
 can go back and rework the patch either way.

I'm already working on back-patching the attached.

regards, tom lane

diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 615278b8ca2adf3057e5f21057c3cedfc66480aa..c704412366d5bc4b6512e9ab673855c46f7d993f 100644
*** a/src/backend/storage/ipc/standby.c
--- b/src/backend/storage/ipc/standby.c
*** LogStandbySnapshot(void)
*** 865,880 
  
  	/*
  	 * Get details of any AccessExclusiveLocks being held at the moment.
- 	 *
- 	 * XXX GetRunningTransactionLocks() currently holds a lock on all
- 	 * partitions though it is possible to further optimise the locking. By
- 	 * reference counting locks and storing the value on the ProcArray entry
- 	 * for each backend we can easily tell if any locks need recording without
- 	 * trying to acquire the partition locks and scanning the lock table.
  	 */
  	locks = GetRunningTransactionLocks(nlocks);
  	if (nlocks  0)
  		LogAccessExclusiveLocks(nlocks, locks);
  
  	/*
  	 * Log details of all in-progress transactions. This should be the last
--- 865,875 
  
  	/*
  	 * Get details of any AccessExclusiveLocks being held at the moment.
  	 */
  	locks = GetRunningTransactionLocks(nlocks);
  	if (nlocks  0)
  		LogAccessExclusiveLocks(nlocks, locks);
+ 	pfree(locks);
  
  	/*
  	 * Log details of all in-progress transactions. This should be the last
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 8cd871f4b40d7bfc8b0aaa7650241d5865c79ffe..273c72230274b46ba6a46bb8b295ef5375aaa36d 100644
*** a/src/backend/storage/lmgr/lock.c
--- b/src/backend/storage/lmgr/lock.c
*** GetLockStatusData(void)
*** 3398,3415 
  }
  
  /*
!  * Returns a list of currently held AccessExclusiveLocks, for use
!  * by GetRunningTransactionData().
   */
  xl_standby_lock *
  GetRunningTransactionLocks(int *nlocks)
  {
  	PROCLOCK   *proclock;
  	HASH_SEQ_STATUS seqstat;
  	int			i;
  	int			index;
  	int			els;
- 	xl_standby_lock *accessExclusiveLocks;
  
  	/*
  	 * Acquire lock on the entire shared lock data structure.
--- 3398,3423 
  }
  
  /*
!  * Returns a list of currently held AccessExclusiveLocks, for use by
!  * LogStandbySnapshot().  The result is a palloc'd array,
!  * with the number of elements returned into *nlocks.
!  *
!  * XXX This currently takes a lock on all partitions of the lock table,
!  * but it's possible to do better.  By reference counting locks and storing
!  * the value in the ProcArray entry for each backend we could tell if any
!  * locks need recording without having to acquire the partition locks and
!  * scan the lock table.  Whether that's worth the additional overhead
!  * is pretty dubious though.
   */
  xl_standby_lock *
  GetRunningTransactionLocks(int *nlocks)
  {
+ 	xl_standby_lock *accessExclusiveLocks;
  	PROCLOCK   *proclock;
  	HASH_SEQ_STATUS seqstat;
  	int			i;
  	int			index;
  	int			els;
  
  	/*
  	 * Acquire lock on the entire shared lock data structure.
*** GetRunningTransactionLocks(int *nlocks)
*** 3467,3472 
--- 3475,3482 
  		}
  	}
  
+ 	Assert(index = els);
+ 
  	/*
  	 * And release locks.  We do this in reverse order for two reasons: (1)
  	 * Anyone else who needs more than one of the locks will be trying to lock

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 I'm already working on back-patching the attached.

Works for me,

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [BUGS] Memory-leak in BackgroundWriter(and Checkpointer)

2013-06-04 Thread Naoya Anzai
Tom, Stephen, Heikki and Andres,

You do FAST work!
Thanks for your prompt responses.

---
Naoya Anzai
NEC Soft,Ltd.
http://www.necsoft.com/eng/


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs