Re: DSA failed to allocate memory

2023-07-03 Thread Thomas Munro
Pushed.

I wasn't sure it was worth keeping the test in the tree.  It's here in
the mailing list archives for future reference.




Re: DSA failed to allocate memory

2023-06-13 Thread Thomas Munro
On Mon, Feb 20, 2023 at 5:52 PM Thomas Munro  wrote:
> I'm wondering about this bit in rebin_segment():
>
> +   if (segment_map->header == NULL)
> +   return;
>
> Why would we be rebinning an uninitialised/unused segment?

Answering my own question: because destroy_superblock() can do that.
So I think destroy_superblock() should test for that case, not
rebin_segment().  See attached.
From f0f808595145c0eabb7ccdcc5b8701798d5722d1 Mon Sep 17 00:00:00 2001
From: Liu Dongming 
Date: Fri, 18 Mar 2022 11:49:06 +0800
Subject: [PATCH v4 1/2] Re-bin segment when memory pages are freed.

It's OK to be lazy about re-binning memory segments when allocating,
because that can only leave segments in a bin that's too high.  We'll
search higher bins if necessary while allocating next time, and
also eventually re-bin, so no memory can become unreachable that way.

However, when freeing memory, the largest contiguous range of free pages
might go up, so we should re-bin eagerly to make sure we don't leave the
segment in a bin that is too low for get_best_segment() to find.

The re-binning code is moved into a function of its own, so it can be
called whenever free pages are returned to the segment's free page map.

Back-patch to all supported releases.

Author: Dongming Liu 
Reviewed-by: Robert Haas 
Reviewed-by: Thomas Munro 
Discussion: https://postgr.es/m/CAL1p7e8LzB2LSeAXo2pXCW4%2BRya9s0sJ3G_ReKOU%3DAjSUWjHWQ%40mail.gmail.com
---
 src/backend/utils/mmgr/dsa.c | 65 
 1 file changed, 43 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/mmgr/dsa.c b/src/backend/utils/mmgr/dsa.c
index 7a3781466e..8d1aace40a 100644
--- a/src/backend/utils/mmgr/dsa.c
+++ b/src/backend/utils/mmgr/dsa.c
@@ -418,6 +418,7 @@ static dsa_area *attach_internal(void *place, dsm_segment *segment,
  dsa_handle handle);
 static void check_for_freed_segments(dsa_area *area);
 static void check_for_freed_segments_locked(dsa_area *area);
+static void rebin_segment(dsa_area *area, dsa_segment_map *segment_map);
 
 /*
  * Create a new shared area in a new DSM segment.  Further DSM segments will
@@ -869,7 +870,11 @@ dsa_free(dsa_area *area, dsa_pointer dp)
 		FreePageManagerPut(segment_map->fpm,
 		   DSA_EXTRACT_OFFSET(span->start) / FPM_PAGE_SIZE,
 		   span->npages);
+
+		/* Move segment to appropriate bin if necessary. */
+		rebin_segment(area, segment_map);
 		LWLockRelease(DSA_AREA_LOCK(area));
+
 		/* Unlink span. */
 		LWLockAcquire(DSA_SCLASS_LOCK(area, DSA_SCLASS_SPAN_LARGE),
 	  LW_EXCLUSIVE);
@@ -1858,6 +1863,11 @@ destroy_superblock(dsa_area *area, dsa_pointer span_pointer)
 			segment_map->mapped_address = NULL;
 		}
 	}
+
+	/* Move segment to appropriate bin if necessary. */
+	if (segment_map->header != NULL)
+		rebin_segment(area, segment_map);
+
 	LWLockRelease(DSA_AREA_LOCK(area));
 
 	/*
@@ -2021,28 +2031,7 @@ get_best_segment(dsa_area *area, size_t npages)
 			/* Re-bin it if it's no longer in the appropriate bin. */
 			if (contiguous_pages < threshold)
 			{
-size_t		new_bin;
-
-new_bin = contiguous_pages_to_segment_bin(contiguous_pages);
-
-/* Remove it from its current bin. */
-unlink_segment(area, segment_map);
-
-/* Push it onto the front of its new bin. */
-segment_map->header->prev = DSA_SEGMENT_INDEX_NONE;
-segment_map->header->next =
-	area->control->segment_bins[new_bin];
-segment_map->header->bin = new_bin;
-area->control->segment_bins[new_bin] = segment_index;
-if (segment_map->header->next != DSA_SEGMENT_INDEX_NONE)
-{
-	dsa_segment_map *next;
-
-	next = get_segment_by_index(area,
-segment_map->header->next);
-	Assert(next->header->bin == new_bin);
-	next->header->prev = segment_index;
-}
+rebin_segment(area, segment_map);
 
 /*
  * But fall through to see if it's enough to satisfy this
@@ -2297,3 +2286,35 @@ check_for_freed_segments_locked(dsa_area *area)
 		area->freed_segment_counter = freed_segment_counter;
 	}
 }
+
+/*
+ * Re-bin segment if it's no longer in the appropriate bin.
+ */
+static void
+rebin_segment(dsa_area *area, dsa_segment_map *segment_map)
+{
+	size_t		new_bin;
+	dsa_segment_index segment_index;
+
+	new_bin = contiguous_pages_to_segment_bin(fpm_largest(segment_map->fpm));
+	if (segment_map->header->bin == new_bin)
+		return;
+
+	/* Remove it from its current bin. */
+	unlink_segment(area, segment_map);
+
+	/* Push it onto the front of its new bin. */
+	segment_index = get_segment_index(area, segment_map);
+	segment_map->header->prev = DSA_SEGMENT_INDEX_NONE;
+	segment_map->header->next = area->control->segment_bins[new_bin];
+	segment_map->header->bin = new_bin;
+	area->control->segment_bins[new_bin] = segment_index;
+	if (segment_map->header->next != DSA_SEGMENT_INDEX_NONE)
+	{
+		dsa_segment_map *next;
+
+		next = get_segment_by_index(area, segment_map->header->next);
+		Assert(next->header->bin == new_bin);
+		next->header->prev = segment_index;
+	}
+}

Re: DSA failed to allocate memory

2023-02-19 Thread Thomas Munro
On Fri, Jan 20, 2023 at 11:02 PM Thomas Munro  wrote:
> Yeah.  I think the analysis looks good, but I'll do some testing next
> week with the aim of getting it committed.  Looks like it now needs
> Meson changes, but I'll look after that as my penance.

Here's an updated version that I'm testing...

Changes to the main patch:

* Adjust a few comments
* pgindent
* Explained a bit more in the commit message

I'm wondering about this bit in rebin_segment():

+   if (segment_map->header == NULL)
+   return;

Why would we be rebinning an uninitialised/unused segment?  Does
something in your DSA-client code (I guess you have an extension?) hit
this case?  The tests certainly don't; I'm not sure how the case could
be reached.

Changes to the test:

* Update copyright year
* Size -> size_t
* pgindent
* Add Meson glue
* Re-alphabetise the makefile
* Make sure we get BGWH_STOPPED while waiting for bgworkers to exit
* Background worker main function return type is fixed (void)
* results[1] -> results[FLEXIBLE_ARRAY_MEMBER]
* getpid() -> MyProcPid

I wonder if this code would be easier to understand, but not
materially less efficient, if we re-binned eagerly when allocating
too, so the bin is always correct/optimal.  Checking fpm_largest()
again after allocating should be cheap, I guess (it just reads a
member variable that we already paid the cost of maintaining).  We
don't really seem to amortise much, we just transfer the rebinning
work to the next caller to consider the segment.  I haven't tried out
that theory though.
From 5b6a3626dd0f8bacd7e26e490a20340e7ae6769e Mon Sep 17 00:00:00 2001
From: Liu Dongming 
Date: Fri, 18 Mar 2022 11:49:06 +0800
Subject: [PATCH v3 1/2] Re-bin segment when memory pages are freed.

It's OK to be lazy about re-binning memory segments when allocating,
because that can only leave segments in a bin that's too high.  We'll
search higher bins if necessary while allocating next time, and
also eventually re-bin, so no memory can become unreachable that way.

However, when freeing memory, the largest contiguous range of free pages
might go up, so we should re-bin eagerly to make sure we don't leave the
segment in a bin that is too low for get_best_segment() to find.

The re-binning code is moved into a function of its own, so it can be
called whenever free pages are returned to the segment's free page map.

Author: Dongming Liu 
Reviewed-by: Robert Haas 
Reviewed-by: Thomas Munro 
Discussion: https://postgr.es/m/CAL1p7e8LzB2LSeAXo2pXCW4%2BRya9s0sJ3G_ReKOU%3DAjSUWjHWQ%40mail.gmail.com
---
 src/backend/utils/mmgr/dsa.c | 67 
 1 file changed, 45 insertions(+), 22 deletions(-)

diff --git a/src/backend/utils/mmgr/dsa.c b/src/backend/utils/mmgr/dsa.c
index f5a62061a3..0ef99dcba5 100644
--- a/src/backend/utils/mmgr/dsa.c
+++ b/src/backend/utils/mmgr/dsa.c
@@ -418,6 +418,7 @@ static dsa_area *attach_internal(void *place, dsm_segment *segment,
  dsa_handle handle);
 static void check_for_freed_segments(dsa_area *area);
 static void check_for_freed_segments_locked(dsa_area *area);
+static void rebin_segment(dsa_area *area, dsa_segment_map *segment_map);
 
 /*
  * Create a new shared area in a new DSM segment.  Further DSM segments will
@@ -869,7 +870,11 @@ dsa_free(dsa_area *area, dsa_pointer dp)
 		FreePageManagerPut(segment_map->fpm,
 		   DSA_EXTRACT_OFFSET(span->start) / FPM_PAGE_SIZE,
 		   span->npages);
+
+		/* Move segment to appropriate bin if necessary. */
+		rebin_segment(area, segment_map);
 		LWLockRelease(DSA_AREA_LOCK(area));
+
 		/* Unlink span. */
 		LWLockAcquire(DSA_SCLASS_LOCK(area, DSA_SCLASS_SPAN_LARGE),
 	  LW_EXCLUSIVE);
@@ -1858,6 +1863,10 @@ destroy_superblock(dsa_area *area, dsa_pointer span_pointer)
 			segment_map->mapped_address = NULL;
 		}
 	}
+
+	/* Move segment to appropriate bin if necessary. */
+	rebin_segment(area, segment_map);
+
 	LWLockRelease(DSA_AREA_LOCK(area));
 
 	/*
@@ -2021,28 +2030,7 @@ get_best_segment(dsa_area *area, size_t npages)
 			/* Re-bin it if it's no longer in the appropriate bin. */
 			if (contiguous_pages < threshold)
 			{
-size_t		new_bin;
-
-new_bin = contiguous_pages_to_segment_bin(contiguous_pages);
-
-/* Remove it from its current bin. */
-unlink_segment(area, segment_map);
-
-/* Push it onto the front of its new bin. */
-segment_map->header->prev = DSA_SEGMENT_INDEX_NONE;
-segment_map->header->next =
-	area->control->segment_bins[new_bin];
-segment_map->header->bin = new_bin;
-area->control->segment_bins[new_bin] = segment_index;
-if (segment_map->header->next != DSA_SEGMENT_INDEX_NONE)
-{
-	dsa_segment_map *next;
-
-	next = get_segment_by_index(area,
-segment_map->header->next);
-	Assert(next->header->bin == new_bin);
-	next->header->prev = segment_index;
-}
+rebin_segment(area, segment_map);
 
 /*
  * But fall through to see if it's enough to satisfy this
@@ -2297,3 

Re: DSA failed to allocate memory

2023-01-20 Thread Thomas Munro
On Fri, Jan 20, 2023 at 11:44 AM Tom Lane  wrote:
> Thomas Munro  writes:
> > Thanks for the report, and for working on the fix.  Can you please
> > create a commitfest entry (if you haven't already)?  I plan to look at
> > this soon, after the code freeze.
>
> Hi Thomas, are you still intending to look at this DSA bug fix?
> It's been sitting idle for months.

Yeah.  I think the analysis looks good, but I'll do some testing next
week with the aim of getting it committed.  Looks like it now needs
Meson changes, but I'll look after that as my penance.




Re: DSA failed to allocate memory

2023-01-19 Thread Tom Lane
Thomas Munro  writes:
> Thanks for the report, and for working on the fix.  Can you please
> create a commitfest entry (if you haven't already)?  I plan to look at
> this soon, after the code freeze.

Hi Thomas, are you still intending to look at this DSA bug fix?
It's been sitting idle for months.

regards, tom lane




Re: DSA failed to allocate memory

2022-04-06 Thread Dongming Liu
On Mon, Mar 28, 2022 at 3:53 PM Thomas Munro  wrote:

> Hi Dongming,
>
> Thanks for the report, and for working on the fix.  Can you please
> create a commitfest entry (if you haven't already)?  I plan to look at
> this soon, after the code freeze.

I created a commitfest entry https://commitfest.postgresql.org/38/3607/.
Thanks for your review.

> Are you proposing that the test_dsa module should be added to the
> tree?  If so, some trivial observations: "#ifndef
> HAVE_INT64_TIMESTAMP" isn't needed anymore (see commit b6aa17e0, which
> is in all supported branches), the year should be updated, and we use
> size_t instead of Size in new code.
>
Yes, I think test_dsa is very helpful and necessary for developing
DSA-related features. I have removed the HAVE_INT64_TIMESTAMP-related code.
Most of the code for test_dsa comes from your patch [1], and I added some
test cases.

In addition, I added a few OOM test cases that allocate fixed-size chunks
until allocation fails, run the procedure twice, and compare the amount of
memory allocated each time. These cases fail on the current master branch.
[1]
https://www.postgresql.org/message-id/CAEepm%3D3U7%2BRo7%3DECeQuAZoeFXs8iDVX56NXGCV7z3%3D%2BH%2BWd0Sw%40mail.gmail.com
-- 
Best Regards,
Dongming


0001-Re-bin-segment-when-dsa-memory-is-freed-v2.patch
Description: Binary data


0002-port-test_dsa-v2.patch
Description: Binary data


Re: DSA failed to allocate memory

2022-03-28 Thread Thomas Munro
On Mon, Mar 28, 2022 at 8:14 PM Dongming Liu  wrote:
> On Fri, Mar 18, 2022 at 3:30 PM Dongming Liu  wrote:
>> I'm trying to move segments into appropriate bins in dsa_free().
>> In 0001-Re-bin-segment-when-dsa-memory-is-freed.patch, I extract
>> the re-bin logic into a separate function called rebin_segment and
>> call it to move the segment to the appropriate bin when DSA memory
>> is freed.  Otherwise, because a segment with enough contiguous pages
>> can be left in a lower bin, a suitable segment may not be found when
>> allocating memory.
>>
>> For testing, I ported the test_dsa patch from [1] and added an OOM
>> case that allocates memory until OOM, frees it all, then allocates
>> again, comparing the amount of memory allocated before and after.

Hi Dongming,

Thanks for the report, and for working on the fix.  Can you please
create a commitfest entry (if you haven't already)?  I plan to look at
this soon, after the code freeze.

Are you proposing that the test_dsa module should be added to the
tree?  If so, some trivial observations: "#ifndef
HAVE_INT64_TIMESTAMP" isn't needed anymore (see commit b6aa17e0, which
is in all supported branches), the year should be updated, and we use
size_t instead of Size in new code.




Re: DSA failed to allocate memory

2022-03-28 Thread Dongming Liu
On Fri, Mar 18, 2022 at 3:30 PM Dongming Liu  wrote:

>> So it's OK for a segment to be in a bin that suggests that it has more
>> consecutive free pages than it really does. But it's NOT ok for a
>> segment to be in a bin that suggests it has fewer consecutive pages
>> than it really does. If dsa_free() is putting things back into the
>> wrong place, that's what we need to fix.
>
>
> I'm trying to move segments into appropriate bins in dsa_free().
> In 0001-Re-bin-segment-when-dsa-memory-is-freed.patch, I extract
> the re-bin logic into a separate function called rebin_segment and
> call it to move the segment to the appropriate bin when DSA memory
> is freed.  Otherwise, because a segment with enough contiguous pages
> can be left in a lower bin, a suitable segment may not be found when
> allocating memory.
>
> For testing, I ported the test_dsa patch from [1] and added an OOM
> case that allocates memory until OOM, frees it all, then allocates
> again, comparing the amount of memory allocated before and after.
>
> Any thoughts?
>
> [1]
> https://www.postgresql.org/message-id/CAEepm%3D3U7%2BRo7%3DECeQuAZoeFXs8iDVX56NXGCV7z3%3D%2BH%2BWd0Sw%40mail.gmail.com
>
>
Fix rebin_segment not working on in-place dsa.

-- 
Best Regards,
Dongming


0001-Re-bin-segment-when-dsa-memory-is-freed.patch
Description: Binary data


0002-port-test_dsa.patch
Description: Binary data


Re: DSA failed to allocate memory

2022-03-18 Thread Dongming Liu
>
> So it's OK for a segment to be in a bin that suggests that it has more
> consecutive free pages than it really does. But it's NOT ok for a
> segment to be in a bin that suggests it has fewer consecutive pages
> than it really does. If dsa_free() is putting things back into the
> wrong place, that's what we need to fix.


I'm trying to move segments into appropriate bins in dsa_free().
In 0001-Re-bin-segment-when-dsa-memory-is-freed.patch, I extract
the re-bin logic into a separate function called rebin_segment and
call it to move the segment to the appropriate bin when DSA memory
is freed.  Otherwise, because a segment with enough contiguous pages
can be left in a lower bin, a suitable segment may not be found when
allocating memory.

For testing, I ported the test_dsa patch from [1] and added an OOM
case that allocates memory until OOM, frees it all, then allocates
again, comparing the amount of memory allocated before and after.

Any thoughts?

[1]
https://www.postgresql.org/message-id/CAEepm%3D3U7%2BRo7%3DECeQuAZoeFXs8iDVX56NXGCV7z3%3D%2BH%2BWd0Sw%40mail.gmail.com


0001-port-test_dsa.patch
Description: Binary data


0001-Re-bin-segment-when-dsa-memory-is-freed.patch
Description: Binary data


Re: DSA failed to allocate memory

2022-01-24 Thread Robert Haas
On Mon, Jan 24, 2022 at 4:59 AM Dongming Liu  wrote:
> Maybe we can use one of the following methods to fix it:
> 1. re-bin segment to suitable segment index when called dsa_free
> 2. get_best_segment search all bins

(2) is definitely the wrong idea. The comments say:

/*
 * What is the lowest bin that holds segments that *might* have n contiguous
 * free pages?  There is no point in looking in segments in lower bins; they
 * definitely can't service a request for n free pages.
 */
#define contiguous_pages_to_segment_bin(n) Min(fls(n), DSA_NUM_SEGMENT_BINS - 1)

So it's OK for a segment to be in a bin that suggests that it has more
consecutive free pages than it really does. But it's NOT ok for a
segment to be in a bin that suggests it has fewer consecutive pages
than it really does. If dsa_free() is putting things back into the
wrong place, that's what we need to fix.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




DSA failed to allocate memory

2022-01-24 Thread Dongming Liu
Hi,

I am using dynamic shared memory areas (DSA) to manage some variable-length
shared memory, and I've found that in some cases allocation fails even
though there are enough contiguous free pages.

The steps to reproduce are as follows:
1. create a dsa area with a 1MB DSM segment
2. set its size limit to 1MB
3. allocate 4KB memory until fails
4. free all allocated memory in step 3
5. repeat step 3 and step 4

The first time I run step 3, 240 4KB allocations succeed.  But when I free
everything and allocate again, no memory can be allocated even though there
are 252 contiguous free pages.  IMO, this should not happen, right?

The call stack is as follows:
#0  get_best_segment (area=0x200cc70, npages=16) at dsa.c:1972
#1  0x00b46b36 in ensure_active_superblock (area=0x200cc70, pool=0x7fa7b51f46f0, size_class=33) at dsa.c:1666
#2  0x00b46555 in alloc_object (area=0x200cc70, size_class=33) at dsa.c:1460
#3  0x00b44f05 in dsa_allocate_extended (area=0x200cc70, size=4096, flags=2) at dsa.c:795

I read the relevant code and found that get_best_segment re-bins the segment
into bin 4 during the first run of step 3.  But when we free everything and
run step 3 again, get_best_segment starts searching from the first bin that
*might* have enough contiguous pages, as computed by
contiguous_pages_to_segment_bin(); for a superblock of 16 pages that is
bin 5.  So the second time, get_best_segment searches bins 5 and above, but
the suitable segment has been re-binned into bin 4, which is never checked.
Finally, get_best_segment returns NULL and dsa_allocate_extended returns an
invalid dsa pointer.

Maybe we can use one of the following methods to fix it:
1. re-bin the segment into the appropriate bin when dsa_free is called
2. make get_best_segment search all bins

I wrote some simple test code, attached, to reproduce it.

Any thoughts?

--
Best Regards
Dongming(https://www.aliyun.com/)


0001-add-dsa-test-module.patch
Description: Binary data