Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization

Fabien COELHO Thu, 03 Aug 2017 02:40:38 -0700


Hello,

My motivation for this proposal is the same as Robert's. I
understand that an ad-hoc option can solve only part of a bigger problem
and could end up causing a mess. However, it seems to me that a script,
especially one for table initialization, would not be as flexible as we
expect. I mean, even if we provide some meta commands for table
initialization or data loading, these meta commands would work only for
pgbench tables (e.g., pgbench_accounts, pgbench_branches and so on).
If we want to create other tables and load data into them as we like, we
can already do that using psql -f. So an alternative is to have a flexible
style of option, for example --custom-initialize = { [load, create_pkey,
create_fkey, vacuum], ... }. That would solve this in a better way.

Personally, I could be fine with a limited number of long options to adjust pgbench initialization to various needs, e.g. --use-hash-index, --skip-whatever-index, etc.


The flexible --custom-init idea outlined above looks nice as well.

As for a more generic solution, the easy parts are the "CREATE" stuff and the transaction script stuff (existing pgbench scripts).

For the CREATE stuff, the script language is SQL, the command to use it is "psql"...

The real and hard part is to fill tables with meaningful pseudo-random test data which do not violate constraints for any non-trivial schema involving foreign keys and various unique constraints.


The solution for this is SQL for trivial cases, think of:

  "INSERT INTO Foo() SELECT ... FROM generate_series(...);"

For instance the pgbench initialization is really close to:

 psql -v scale=10 <<EOF
   CREATE TABLE ... ;
   INSERT INTO pgbench_accounts(...)
    SELECT ... FROM generate_series(1, 100000 * :scale) AS i;
   INSERT ...
   CREATE INDEX ...;
   VACUUM FULL ANALYZE;
 EOF
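
To make the sketch above a bit more concrete, here is one way the elided statements might be filled in. The column definitions and the per-scale row counts (1 branch, 10 tellers, 100000 accounts per scale unit) follow what the pgbench documentation describes, but this is only an approximation and not what pgbench actually executes. The helper name emit_init_sql is made up for illustration; it only prints the script and never contacts a server.

```shell
# Hypothetical helper: prints a psql script that approximates "pgbench -i".
# The quoted heredoc keeps ":scale" literal so that psql substitutes it later.
emit_init_sql() {
  cat <<'EOF'
CREATE TABLE pgbench_branches(bid int NOT NULL, bbalance int, filler char(88));
CREATE TABLE pgbench_tellers(tid int NOT NULL, bid int, tbalance int, filler char(84));
CREATE TABLE pgbench_accounts(aid int NOT NULL, bid int, abalance int, filler char(84));
CREATE TABLE pgbench_history(tid int, bid int, aid int, delta int, mtime timestamp, filler char(22));

-- One branch per scale unit.
INSERT INTO pgbench_branches(bid, bbalance)
  SELECT i, 0 FROM generate_series(1, :scale) AS i;
-- Ten tellers per branch; integer division maps each teller to its branch.
INSERT INTO pgbench_tellers(tid, bid, tbalance)
  SELECT i, (i - 1) / 10 + 1, 0 FROM generate_series(1, 10 * :scale) AS i;
-- 100000 accounts per branch, assigned to branches the same way.
INSERT INTO pgbench_accounts(aid, bid, abalance, filler)
  SELECT i, (i - 1) / 100000 + 1, 0, '' FROM generate_series(1, 100000 * :scale) AS i;

ALTER TABLE pgbench_branches ADD PRIMARY KEY (bid);
ALTER TABLE pgbench_tellers  ADD PRIMARY KEY (tid);
ALTER TABLE pgbench_accounts ADD PRIMARY KEY (aid);
VACUUM FULL ANALYZE;
EOF
}
```

It could then be run with something like "emit_init_sql | psql -v scale=10". Because every bid is derived from the series value, each teller and account refers to a branch that exists, which is the kind of constraint that stays easy only in such trivial cases.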

And all existing options could probably be implemented easily with the recently added conditional (\if).
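
As a rough illustration of the \if idea, the primary-key step of such a script could be guarded by a psql variable. The variable name skip_pkeys and the helper emit_pkey_sql are assumptions made up for this sketch, not existing pgbench or psql options, and the variable has to be set explicitly on the psql command line for \if to evaluate it.

```shell
# Hypothetical helper: prints the primary-key step of an initialization
# script, wrapped in psql's \if so that it can be skipped on request.
# The quoted heredoc keeps the backslash commands and ":skip_pkeys" literal.
emit_pkey_sql() {
  cat <<'EOF'
\if :skip_pkeys
\echo skipping primary key creation
\else
ALTER TABLE pgbench_branches ADD PRIMARY KEY (bid);
ALTER TABLE pgbench_tellers  ADD PRIMARY KEY (tid);
ALTER TABLE pgbench_accounts ADD PRIMARY KEY (aid);
\endif
EOF
}
```

Usage would be along the lines of "emit_pkey_sql | psql -v skip_pkeys=true" to skip the keys, or "-v skip_pkeys=false" to create them.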

So my 0.02€ is that if something is to be done, I would suggest turning the creation and initialization stuff into a standard "psql" script that could be called from pgbench instead of integrating much more ad-hoc stuff into pgbench.

Note that non-trivial schema initialization requires more general programming, so I do not believe in doing a lot at the pgbench or psql level. The best I could come up with is a data generator which takes as input the schema with added directives on how to generate the various attributes (a tool named "datafiller", which some people use :-).


--
Fabien.
