Re: Speeding up COPY TO for uuids and arrays

2024-02-26 Thread Ranier Vilela
Em seg., 26 de fev. de 2024 às 02:28, Michael Paquier 
escreveu:

> On Thu, Feb 22, 2024 at 04:42:37PM -0300, Ranier Vilela wrote:
> > Can you share exactly script used to create a table?
>
> Stressing the internals of array_out() for the area of the patch is
> not that difficult, as we want to quote each element that's returned
> in output.
>
> The trick is to have the following to stress the second quoting loop a
> maximum:
> - a high number of rows.
> - a high number of items in the arrays.
> - a *minimum* number of characters in each element of the array, with
> characters that require quoting.
>
> The best test case I can think of to demonstrate the patch would be
> something like that (adjust rows and elts as you see fit):
> -- Number of rows
> \set rows 6
> -- Number of elements
> \set elts 4
> create table tab as
>   with data as (
> select array_agg(a) as array
>   from (
> select '{'::text
>   from generate_series(1, :elts) as int(a)) as index(a))
>   select data.array from data, generate_series(1,:rows);
>
> Then I get:
>array
> ---
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
> (6 rows)
>
> With "\set rows 10" and "\set elts 1", giving 100MB of data
> with 100k rows with 10k elements each, I get for HEAD when data is in
> shared buffers:
> =# copy tab to '/dev/null';
> COPY 10
> Time: 48620.927 ms (00:48.621)
> And with v3:
> =# copy tab to '/dev/null';
> COPY 10
> Time: 47993.183 ms (00:47.993)
>
Thanks Michael, for the script.

It is easier to make comparisons, using the exact same script.

best regards,
Ranier Vilela


Re: Speeding up COPY TO for uuids and arrays

2024-02-25 Thread Michael Paquier
On Thu, Feb 22, 2024 at 04:42:37PM -0300, Ranier Vilela wrote:
> Can you share exactly script used to create a table?

Stressing the internals of array_out() for the area of the patch is
not that difficult, as we want to quote each element that's returned
in output.

The trick is to have the following to stress the second quoting loop a
maximum:
- a high number of rows.
- a high number of items in the arrays.
- a *minimum* number of characters in each element of the array, with
characters that require quoting.

The best test case I can think of to demonstrate the patch would be
something like that (adjust rows and elts as you see fit):
-- Number of rows
\set rows 6
-- Number of elements
\set elts 4
create table tab as
  with data as (
select array_agg(a) as array
  from (
select '{'::text
  from generate_series(1, :elts) as int(a)) as index(a))
  select data.array from data, generate_series(1,:rows);

Then I get:
   array
---
 {"{","{","{","{"}
 {"{","{","{","{"}
 {"{","{","{","{"}
 {"{","{","{","{"}
 {"{","{","{","{"}
 {"{","{","{","{"}
(6 rows)

With "\set rows 10" and "\set elts 1", giving 100MB of data
with 100k rows with 10k elements each, I get for HEAD when data is in
shared buffers:
=# copy tab to '/dev/null';
COPY 10
Time: 48620.927 ms (00:48.621)
And with v3:
=# copy tab to '/dev/null';
COPY 10
Time: 47993.183 ms (00:47.993)

Profiles don't fundamentally change much, array_out() gets a 30.76% ->
29.72% in self runtime, with what looks like a limited impact to me.

With 1k rows and 1M elements, COPY TO gets reduced from 54338.436 ms
to 54129.978 ms, and a 29.51% -> 29.12% increase (looks like noise).

Perhaps I've hit some noise while running this set of tests, but the
impact of the proposed patch looks very limited to me.  If you have a
better set of tests and/or ideas, feel free of course.
--
Michael


signature.asc
Description: PGP signature


re: Speeding up COPY TO for uuids and arrays

2024-02-22 Thread Ranier Vilela
Hi,

On 2024-02-17 17:48:23 +0100, Laurenz Albe wrote:
> As a test case, I created a table with 1 rows, each of which
> had an array of 1 uuids. The table resided in shared buffers.
Can you share exactly script used to create a table?

best regards,

Ranier Vilela


Re: Speeding up COPY TO for uuids and arrays

2024-02-21 Thread Michael Paquier
On Thu, Feb 22, 2024 at 08:16:09AM +0100, Laurenz Albe wrote:
> I'm attaching the remaining patch for the Juli commitfest, if you
> don't get inspired before that.

There are good chances that I get inspired some time next week.  We'll
see ;)
--
Michael


signature.asc
Description: PGP signature


Re: Speeding up COPY TO for uuids and arrays

2024-02-21 Thread Laurenz Albe
On Thu, 2024-02-22 at 10:34 +0900, Michael Paquier wrote:
> This part is done as of 011d60c4352c.  I did not evaluate the rest
> yet.

Thanks!

I'm attaching the remaining patch for the Juli commitfest, if you
don't get inspired before that.

Yours,
Laurenz Albe
From e8346ea88785a763d2bd3f99800ae928b7469f64 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Sat, 17 Feb 2024 17:24:19 +0100
Subject: [PATCH v1 3/3] Small speedup for array_out

Avoid writing zero bytes where it is not necessary.
This offers only a small, but measurable speed gain
for larger arrays.
---
 src/backend/utils/adt/arrayfuncs.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index f3fee54e37..306c5062f7 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -1200,7 +1200,7 @@ array_out(PG_FUNCTION_ARGS)
 	p = retval;
 
 #define APPENDSTR(str)	(strcpy(p, (str)), p += strlen(p))
-#define APPENDCHAR(ch)	(*p++ = (ch), *p = '\0')
+#define APPENDCHAR(ch)	(*p++ = (ch))
 
 	if (needdims)
 		APPENDSTR(dims_str);
@@ -1222,10 +1222,9 @@ array_out(PG_FUNCTION_ARGS)
 char		ch = *tmp;
 
 if (ch == '"' || ch == '\\')
-	*p++ = '\\';
-*p++ = ch;
+	APPENDCHAR('\\');
+APPENDCHAR(ch);
 			}
-			*p = '\0';
 			APPENDCHAR('"');
 		}
 		else
@@ -1248,6 +1247,8 @@ array_out(PG_FUNCTION_ARGS)
 		j = i;
 	} while (j != -1);
 
+	*p = '\0';
+
 #undef APPENDSTR
 #undef APPENDCHAR
 
-- 
2.43.2



Re: Speeding up COPY TO for uuids and arrays

2024-02-21 Thread Michael Paquier
On Wed, Feb 21, 2024 at 02:29:05PM +0900, Michael Paquier wrote:
> That's very good, so I'd like to apply this part.  Let me know if
> there are any objections.

This part is done as of 011d60c4352c.  I did not evaluate the rest
yet.
--
Michael


signature.asc
Description: PGP signature


Re: Speeding up COPY TO for uuids and arrays

2024-02-20 Thread Michael Paquier
On Mon, Feb 19, 2024 at 03:08:45PM +0100, Laurenz Albe wrote:
> On Sat, 2024-02-17 at 12:24 -0800, Andres Freund wrote:
>> I wonder if we should move the core part for converting to hex to numutils.c,
>> we already have code the for the inverse. There does seem to be further
>> optimization potential in the conversion, and that seems better done 
>> somewhere
>> central rather than one type's output function. OTOH, it might not be worth
>> it, given the need to add the dashes.
>
> I thought about it, but then decided not to do that.
> Calling a function that converts the bytes to hex and then adding the
> hyphens will slow down processing, and I think the code savings would be
> minimal at best.

Yeah, I'm not sure either if that's worth doing, the current
conversion code is simple enough.  I'd be curious to hear about ideas
to optimize that more locally, though.

I was curious about the UUID one, and COPYing out 10M rows with a
single UUID attribute brings down the runtime of a COPY from 3.8s to
2.3s here on a simple benchmark, with uuid_out showing up at the top
of profiles easily on HEAD.  Some references for HEAD:
31.63%  5300  postgres  postgres[.] uuid_out
19.79%  3316  postgres  postgres[.] appendStringInfoChar
11.27%  1887  postgres  postgres[.] CopyAttributeOutText
 6.36%  1065  postgres  postgres[.] 
pgstat_progress_update_param

And with the patch for uuid_out:
22.66%  2147  postgres  postgres   [.] CopyAttributeOutText
12.99%  1231  postgres  postgres   [.] uuid_out
11.41%  1081  postgres  postgres   [.] 
pgstat_progress_update_param
 4.79%   454  postgres  postgres   [.] CopyOneRowTo

That's very good, so I'd like to apply this part.  Let me know if
there are any objections.
--
Michael


signature.asc
Description: PGP signature


Re: Speeding up COPY TO for uuids and arrays

2024-02-19 Thread Laurenz Albe
On Sat, 2024-02-17 at 12:24 -0800, Andres Freund wrote:
> On 2024-02-17 17:48:23 +0100, Laurenz Albe wrote:
> > - Patch 0001 speeds up pq_begintypsend with a custom macro.
> >   That brought the binary copy down to 3.7 seconds, which is a
> >   speed gain of 15%.
> 
> Nice win, but I think we can do a bit better than this. Certainly it shouldn't
> be a macro, given the multiple evaluation risks.  I don't think we actually
> need to zero those bytes, in fact that seems more likely to hide bugs than
> prevent them.
>
> I have an old patch series improving performance in this area.

The multiple evaluation danger could be worked around with an automatic
variable, or the macro could just carry a warning (like 
appendStringInfoCharMacro).

But considering the thread in [1], I guess I'll retract that patch; I'm
sure the more invasive improvement you have in mind will do better.

> > - Patch 0001 speeds up uuid_out by avoiding the overhead of
> >   a Stringinfo.  This brings text mode COPY to 19.4 seconds,
> >   which is speed gain of 21%.
> 
> Nice!
> 
> I wonder if we should move the core part for converting to hex to numutils.c,
> we already have code the for the inverse. There does seem to be further
> optimization potential in the conversion, and that seems better done somewhere
> central rather than one type's output function. OTOH, it might not be worth
> it, given the need to add the dashes.

I thought about it, but then decided not to do that.
Calling a function that converts the bytes to hex and then adding the
hyphens will slow down processing, and I think the code savings would be
minimal at best.

> > - Patch 0003 speeds up array_out a bit by avoiding some zero
> >   byte writes.  The measured speed gain is under 2%.
> 
> Makes sense.

Thanks for your interest and approval.  The attached patches are the
renamed, but unmodified patches 2 and 3 from before.

Yours,
Laurenz Albe


 [1]: 
https://postgr.es/m/CAMkU%3D1whbRDUwa4eayD9%2B59K-coxO9senDkPRbTn3cg0pUz4AQ%40mail.gmail.com
From de87d249e3f55c38062e563821af9fa32d15e341 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Sat, 17 Feb 2024 17:23:39 +0100
Subject: [PATCH v1 2/3] Speed up uuid_out

Since the size of the string representation of an uuid is
fixed, there is no benefit in using a StringInfo.
Avoiding the overhead makes the function substantially faster.
---
 src/backend/utils/adt/uuid.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/adt/uuid.c b/src/backend/utils/adt/uuid.c
index 73dfd711c7..b48bbf01e1 100644
--- a/src/backend/utils/adt/uuid.c
+++ b/src/backend/utils/adt/uuid.c
@@ -53,10 +53,11 @@ uuid_out(PG_FUNCTION_ARGS)
 {
 	pg_uuid_t  *uuid = PG_GETARG_UUID_P(0);
 	static const char hex_chars[] = "0123456789abcdef";
-	StringInfoData buf;
+	char	   *buf, *p;
 	int			i;
 
-	initStringInfo();
+	buf = palloc(2 * UUID_LEN + 5);
+	p = buf;
 	for (i = 0; i < UUID_LEN; i++)
 	{
 		int			hi;
@@ -68,16 +69,17 @@ uuid_out(PG_FUNCTION_ARGS)
 		 * ("-"). Therefore, add the hyphens at the appropriate places here.
 		 */
 		if (i == 4 || i == 6 || i == 8 || i == 10)
-			appendStringInfoChar(, '-');
+			*p++ = '-';
 
 		hi = uuid->data[i] >> 4;
 		lo = uuid->data[i] & 0x0F;
 
-		appendStringInfoChar(, hex_chars[hi]);
-		appendStringInfoChar(, hex_chars[lo]);
+		*p++ = hex_chars[hi];
+		*p++ = hex_chars[lo];
 	}
+	*p = '\0';
 
-	PG_RETURN_CSTRING(buf.data);
+	PG_RETURN_CSTRING(buf);
 }
 
 /*
-- 
2.43.2

From e8346ea88785a763d2bd3f99800ae928b7469f64 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Sat, 17 Feb 2024 17:24:19 +0100
Subject: [PATCH v1 3/3] Small speedup for array_out

Avoid writing zero bytes where it is not necessary.
This offers only a small, but measurable speed gain
for larger arrays.
---
 src/backend/utils/adt/arrayfuncs.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index f3fee54e37..306c5062f7 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -1200,7 +1200,7 @@ array_out(PG_FUNCTION_ARGS)
 	p = retval;
 
 #define APPENDSTR(str)	(strcpy(p, (str)), p += strlen(p))
-#define APPENDCHAR(ch)	(*p++ = (ch), *p = '\0')
+#define APPENDCHAR(ch)	(*p++ = (ch))
 
 	if (needdims)
 		APPENDSTR(dims_str);
@@ -1222,10 +1222,9 @@ array_out(PG_FUNCTION_ARGS)
 char		ch = *tmp;
 
 if (ch == '"' || ch == '\\')
-	*p++ = '\\';
-*p++ = ch;
+	APPENDCHAR('\\');
+APPENDCHAR(ch);
 			}
-			*p = '\0';
 			APPENDCHAR('"');
 		}
 		else
@@ -1248,6 +1247,8 @@ array_out(PG_FUNCTION_ARGS)
 		j = i;
 	} while (j != -1);
 
+	*p = '\0';
+
 #undef APPENDSTR
 #undef APPENDCHAR
 
-- 
2.43.2



Re: Speeding up COPY TO for uuids and arrays

2024-02-18 Thread Michael Paquier
On Sat, Feb 17, 2024 at 12:24:33PM -0800, Andres Freund wrote:
> I wonder if we should move the core part for converting to hex to numutils.c,
> we already have code the for the inverse. There does seem to be further
> optimization potential in the conversion, and that seems better done somewhere
> central rather than one type's output function. OTOH, it might not be worth
> it, given the need to add the dashes.

I'd tend to live with the current location of the code, but I'm OK if
people feel differently on this one, so I'm OK with what Laurenz is
proposing. 

>> - Patch 0003 speeds up array_out a bit by avoiding some zero
>>   byte writes.  The measured speed gain is under 2%.
> 
> Makes sense.

+1.
--
Michael


signature.asc
Description: PGP signature


Re: Speeding up COPY TO for uuids and arrays

2024-02-17 Thread Andres Freund
Hi,

On 2024-02-17 17:48:23 +0100, Laurenz Albe wrote:
> - Patch 0001 speeds up pq_begintypsend with a custom macro.
>   That brought the binary copy down to 3.7 seconds, which is a
>   speed gain of 15%.

Nice win, but I think we can do a bit better than this. Certainly it shouldn't
be a macro, given the multiple evaluation risks.  I don't think we actually
need to zero those bytes, in fact that seems more likely to hide bugs than
prevent them.

I have an old patch series improving performance in this area. A big win is to
simply not allocate as much memory for a new stringinfo, when we already know
the upper bound, as we do in many of the send functions.


> - Patch 0001 speeds up uuid_out by avoiding the overhead of
>   a Stringinfo.  This brings text mode COPY to 19.4 seconds,
>   which is speed gain of 21%.

Nice!

I wonder if we should move the core part for converting to hex to numutils.c,
we already have code the for the inverse. There does seem to be further
optimization potential in the conversion, and that seems better done somewhere
central rather than one type's output function. OTOH, it might not be worth
it, given the need to add the dashes.


> - Patch 0003 speeds up array_out a bit by avoiding some zero
>   byte writes.  The measured speed gain is under 2%.

Makes sense.

Greetings,

Andres Freund