Re: Use COPY for populating all pgbench tables

2023-07-24 Thread Tristan Partin
Michael, Once again I appreciate your patience with this patchset. Thanks for your help and reviews. -- Tristan Partin Neon (https://neon.tech)

Re: Use COPY for populating all pgbench tables

2023-07-23 Thread Michael Paquier
On Sun, Jul 23, 2023 at 08:21:51PM +0900, Michael Paquier wrote: > Cool. I have applied the new tests for now to move on with this > thread. I have done a few more things on this patch today, including measurements with a local host and large scaling numbers. One of my hosts was taking for

Re: Use COPY for populating all pgbench tables

2023-07-23 Thread Michael Paquier
On Fri, Jul 21, 2023 at 12:22:06PM -0500, Tristan Partin wrote: > v7 looks good from my perspective. Thanks for working through this patch > with me. Much appreciated. Cool. I have applied the new tests for now to move on with this thread. -- Michael

Re: Use COPY for populating all pgbench tables

2023-07-21 Thread Tristan Partin
On Thu Jul 20, 2023 at 9:14 PM CDT, Michael Paquier wrote: Attached is a v7, with these tests (should be a patch on its own but I'm lazy to split this morning) and some more adjustments that I have done while going through the patch. What do you think? v7 looks good from my perspective.

Re: Use COPY for populating all pgbench tables

2023-07-20 Thread Michael Paquier
On Thu, Jul 20, 2023 at 02:22:51PM -0500, Tristan Partin wrote: > Thanks for your testing Michael. I went ahead and added a test to make sure > that this behavior doesn't regress accidentally, but I am struggling to get > the test to fail using the previous version of this patch. Do you have any >

Re: Use COPY for populating all pgbench tables

2023-07-20 Thread Tristan Partin
On Wed Jul 19, 2023 at 10:07 PM CDT, Michael Paquier wrote: So this patch causes pgbench to not stick with its historical behavior, and the change is incompatible with the comments because the tellers and branches tables don't use NULL for their filler attribute anymore. Great find. This was a

Re: Use COPY for populating all pgbench tables

2023-07-19 Thread Michael Paquier
On Wed, Jul 19, 2023 at 01:03:21PM -0500, Tristan Partin wrote: > Didn't actually include the changes in the previous patch. -initGenerateDataClientSide(PGconn *con) +initBranch(PQExpBufferData *sql, int64 curr) { - PQExpBufferData sql; + /* "filler" column defaults to NULL */ +

Re: Use COPY for populating all pgbench tables

2023-07-19 Thread Tristan Partin
Didn't actually include the changes in the previous patch. -- Tristan Partin Neon (https://neon.tech) From 5b934691b88b3b2c5675bc778b0a10e9eeff3dbe Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Tue, 23 May 2023 11:48:16 -0500 Subject: [PATCH v5] Use COPY instead of INSERT for populating

Re: Use COPY for populating all pgbench tables

2023-07-19 Thread Tristan Partin
On Wed Jul 12, 2023 at 10:52 PM CDT, Michael Paquier wrote: On Wed, Jul 12, 2023 at 09:29:35AM -0500, Tristan Partin wrote: > On Wed Jul 12, 2023 at 1:06 AM CDT, Michael Paquier wrote: >> This would use the freeze option only on pgbench_accounts when no >> partitioning is defined, but my point

Re: Use COPY for populating all pgbench tables

2023-07-12 Thread Michael Paquier
On Wed, Jul 12, 2023 at 09:29:35AM -0500, Tristan Partin wrote: > On Wed Jul 12, 2023 at 1:06 AM CDT, Michael Paquier wrote: >> This would use the freeze option only on pgbench_accounts when no >> partitioning is defined, but my point was a bit different. We could >> use the FREEZE option on the

Re: Use COPY for populating all pgbench tables

2023-07-12 Thread Tristan Partin
On Wed Jul 12, 2023 at 1:06 AM CDT, Michael Paquier wrote: > On Tue, Jul 11, 2023 at 09:46:43AM -0500, Tristan Partin wrote: > > On Tue Jul 11, 2023 at 12:03 AM CDT, Michael Paquier wrote: > >> This seems a bit incorrect because partitioning only applies to > >> pgbench_accounts, no? This change

Re: Use COPY for populating all pgbench tables

2023-07-12 Thread Michael Paquier
On Tue, Jul 11, 2023 at 09:46:43AM -0500, Tristan Partin wrote: > On Tue Jul 11, 2023 at 12:03 AM CDT, Michael Paquier wrote: >> This seems a bit incorrect because partitioning only applies to >> pgbench_accounts, no? This change means that the teller and branch >> tables would not benefit from

Re: Use COPY for populating all pgbench tables

2023-07-11 Thread Tristan Partin
On Tue Jul 11, 2023 at 12:03 AM CDT, Michael Paquier wrote: > On Wed, Jun 14, 2023 at 10:58:06AM -0500, Tristan Partin wrote: > static void > -initGenerateDataClientSide(PGconn *con) > +initBranch(PGconn *con, PQExpBufferData *sql, int64 curr) > +{ > + /* "filler" column defaults to NULL */

Re: Use COPY for populating all pgbench tables

2023-07-10 Thread Michael Paquier
On Wed, Jun 14, 2023 at 10:58:06AM -0500, Tristan Partin wrote: > Again, I forget to actually attach. Holy guacamole. Looks rather OK seen from here, applied 0001 while browsing the series. I have a few comments about 0002. static void -initGenerateDataClientSide(PGconn *con)

Re: Use COPY for populating all pgbench tables

2023-06-14 Thread Tristan Partin
Again, I forget to actually attach. Holy guacamole. -- Tristan Partin Neon (https://neon.tech) From 8c4eb4849b1282f1a0947ddcf3f599e384a5a428 Mon Sep 17 00:00:00 2001 From: Tristan Partin Date: Tue, 23 May 2023 09:21:55 -0500 Subject: [PATCH v2 1/2] Move constant into format string If we are

Re: Use COPY for populating all pgbench tables

2023-06-14 Thread Tristan Partin
Here is a v2. It cleans up the output when printing to a tty. The last "x of y tuples" line gets overwritten now, so the final output looks like:

    dropping old tables...
    creating tables...
    generating data (client-side)...
    vacuuming...
    creating primary keys...
    done in 0.14 s (drop tables 0.01 s,

Re: Use COPY for populating all pgbench tables

2023-06-13 Thread Tristan Partin
On Thu Jun 8, 2023 at 11:38 AM CDT, Tristan Partin wrote: > On Thu Jun 8, 2023 at 12:33 AM CDT, David Rowley wrote: > > On Thu, 8 Jun 2023 at 07:16, Tristan Partin wrote: > > > > > > master: > > > > > > 5000 of 5000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 > > > s)) > > >

Re: Use COPY for populating all pgbench tables

2023-06-13 Thread Tristan Partin
I think I am partial to number 2. Removing a context switch always leads to more productivity. -- Tristan Partin Neon (https://neon.tech)

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Gurjeet Singh
On Fri, Jun 9, 2023 at 6:20 PM Gurjeet Singh wrote: > On Fri, Jun 9, 2023 at 5:42 PM Gregory Smith wrote: > > On Fri, Jun 9, 2023 at 1:25 PM Gurjeet Singh wrote: > >> > >> > $ pgbench -i -I dtGvp -s 500 > >> > >> The steps are severely under-documented in pgbench --help output. > > > > I

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Gurjeet Singh
On Fri, Jun 9, 2023 at 5:42 PM Gregory Smith wrote: > > On Fri, Jun 9, 2023 at 1:25 PM Gurjeet Singh wrote: >> >> > $ pgbench -i -I dtGvp -s 500 >> >> The steps are severely under-documented in pgbench --help output. > > > I agree it's not easy to find information. I just went through

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Gregory Smith
On Fri, Jun 9, 2023 at 1:25 PM Gurjeet Singh wrote: > > $ pgbench -i -I dtGvp -s 500 > > The steps are severely under-documented in pgbench --help output. > I agree it's not easy to find information. I just went through double checking I had the order recently enough to remember what I
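
For readers unfamiliar with the step string in `pgbench -i -I dtGvp`, the sketch below decodes it letter by letter. The letter meanings follow the pgbench documentation for `--init-steps`; the helper name `describe_init_steps` is hypothetical and exists only for this illustration, it is not part of pgbench.

```python
# Illustrative helper (not part of pgbench): decode an --init-steps string.
# The letter meanings are those listed in the pgbench documentation.
INIT_STEPS = {
    "d": "drop tables",
    "t": "create tables",
    "g": "generate data (client-side)",
    "G": "generate data (server-side)",
    "v": "vacuum",
    "p": "create primary keys",
    "f": "create foreign keys",
}


def describe_init_steps(steps: str) -> list[str]:
    """Return the description of each step, in the order it would run."""
    unknown = [c for c in steps if c not in INIT_STEPS]
    if unknown:
        raise ValueError(f"unknown init step(s): {unknown}")
    return [INIT_STEPS[c] for c in steps]


if __name__ == "__main__":
    for description in describe_init_steps("dtGvp"):
        print(description)
```

The only difference between `dtGvp` and the default `dtgvp` is the capital `G`, which asks the server to generate the rows itself rather than having the client send them.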

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Tristan Partin
On Fri Jun 9, 2023 at 12:25 PM CDT, Gurjeet Singh wrote: > On Fri, Jun 9, 2023 at 6:24 AM Gregory Smith wrote: > > > > Unfortunately there's no simple command line option to change just that one > > thing about how pgbench runs. You have to construct a command line that > > documents each and

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Gurjeet Singh
On Fri, Jun 9, 2023 at 6:24 AM Gregory Smith wrote: > > Unfortunately there's no simple command line option to change just that one > thing about how pgbench runs. You have to construct a command line that > documents each and every step you want instead. You probably just want this > form:

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Tristan Partin
David, I think you should submit this patch standalone. I don't see any reason this shouldn't be reviewed and committed when fully fleshed out. -- Tristan Partin Neon (https://neon.tech)

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Tristan Partin
> this problem. Some employees work on a different continent than the > > databases they might be benchmarking. By moving pgbench to use COPY for > > populating all tables, we can reduce some of the time pgbench takes for > > this particular step. > > > > When latency is

Re: Use COPY for populating all pgbench tables

2023-06-09 Thread Gregory Smith
> ... databases they might be benchmarking. By moving pgbench to use COPY for > populating all tables, we can reduce some of the time pgbench takes for > this particular step. > When latency is continent size high, pgbench should be run with server-side table generation instead of using COPY

Re: Use COPY for populating all pgbench tables

2023-06-08 Thread Tristan Partin
On Thu Jun 8, 2023 at 12:33 AM CDT, David Rowley wrote: > On Thu, 8 Jun 2023 at 07:16, Tristan Partin wrote: > > > > master: > > > > 5000 of 5000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 > > s)) > > vacuuming... > > creating primary keys... > > done in 1414.26 s (drop tables

Re: Use COPY for populating all pgbench tables

2023-06-08 Thread Hannu Krosing
I guess that COPY will still be slower than generating the data server-side ( --init-steps=...G... ) ? What I'd really like to see is providing all the pgbench functions also on the server. Specifically the various random(...) functions - random_exponential(...), random_gaussian(...),

Re: Use COPY for populating all pgbench tables

2023-06-07 Thread David Rowley
On Thu, 8 Jun 2023 at 07:16, Tristan Partin wrote: > > master: > > 5000 of 5000 tuples (100%) done (elapsed 260.93 s, remaining 0.00 s)) > vacuuming... > creating primary keys... > done in 1414.26 s (drop tables 0.20 s, create tables 0.82 s, client-side > generate 1280.43 s, vacuum 2.55

Re: Use COPY for populating all pgbench tables

2023-06-07 Thread Tristan Partin
On Tue May 23, 2023 at 12:33 PM CDT, Tristan Partin wrote: > I wanted to come with benchmarks, but unfortunately I won't have them > until next month. I can follow-up in a future email. I finally got around to benchmarking. master: $ ./build/src/bin/pgbench/pgbench -i -s 500 CONNECTION_STRING

Use COPY for populating all pgbench tables

2023-05-23 Thread Tristan Partin
Hello, We (Neon) have noticed that pgbench can be quite slow to populate data in regard to higher latency connections. Higher scale factors exacerbate this problem. Some employees work on a different continent than the databases they might be benchmarking. By moving pgbench to use COPY
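
A rough way to see why latency matters here is to count round trips. The sketch below is a back-of-envelope model, not a measurement. It assumes that a table loaded with per-row INSERT statements pays one network round trip per row, and that a table loaded with COPY pays a small constant number, modeled as one. The per-scale row counts are pgbench's standard ones. The 100 ms round-trip time and the function names are illustrative assumptions, not figures from this thread.

```python
# Back-of-envelope latency model. The RTT below is an assumed value and
# the functions are hypothetical helpers for illustration only.

def rows_per_table(scale: int) -> dict[str, int]:
    """Row counts pgbench creates for a given scale factor."""
    return {
        "pgbench_branches": 1 * scale,
        "pgbench_tellers": 10 * scale,
        "pgbench_accounts": 100_000 * scale,
    }


def latency_cost_seconds(rows: int, rtt_s: float, use_copy: bool) -> float:
    """Network wait only: one round trip per INSERT, about one per COPY."""
    round_trips = 1 if use_copy else rows
    return round_trips * rtt_s


if __name__ == "__main__":
    scale = 500   # same scale factor as the benchmarks in this thread
    rtt_s = 0.1   # assumed 100 ms round trip between client and server
    counts = rows_per_table(scale)
    small_tables = counts["pgbench_branches"] + counts["pgbench_tellers"]
    print("rows in branches + tellers:", small_tables)
    print("INSERT wait (s):", latency_cost_seconds(small_tables, rtt_s, False))
    print("COPY wait (s):", latency_cost_seconds(small_tables, rtt_s, True))
```

The model ignores transfer time and server work, so it only shows how the waiting grows with the row count when each row costs a round trip.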