Hello,

My motivation of this proposal is same as what Robert has. I
understand that ad-hoc option can solve only the part of big problem
and it could be cause of mess. However It seems me that the script
especially for table initialization will not be flexible than we
expected. I mean, even if we provide some meta commands for table
initialization or data loading, these meta commands work for only
pgbench tables (i.g., pgbench_accounts, pgbench_branches and so on).
If we want to create other tables and load data to them as we want we
can do that using psql -f. So an alternative ways is having a flexible
style option for example --custom-initialize = { [load, create_pkey,
create_fkey, vacuum], ... }. That would solve this in a better way.

Personnaly, I could be fine with a limited number of long options to adjust pgbench initialization to various needs, eg --use-hash-index, --skip-whetever-index, etc.

The flexible --custom-init idea outlined above looks nice as well.


As for a more generic solution, the easy part are the "CREATE" stuff and the transaction script stuff (existing pgbench scripts).

For the CREATE stuff, the script language is SQL, the command to use it is "psql"...

The real and hard part is to fill tables with meaningful pseudo-random test data which do not violate constraints for any non trivial schema involving foreign keys and various unique constraints.

The solution for this is SQL for trivial cases, think of:

  "INSERT INTO Foo() SELECT ... FROM generate_series(...);"

For instance the pgbench initialization is really close to:

 psql -Dscale=10 <<EOF
   CREATE TABLE ... ;
   INSERT INTO pgbench_account(...)
    SELECT ... FROM generate_series(1, 100000 * :scale) AS i;
   INSERT ...
   CREATE INDEX ...;
   VACUUM FULL ANALYZE;
 EOF

And all existing options could probably be implemented easilly with the recently added conditional (\if).

So my 0.02€ is that if something is to be done, I would suggest to turn the creation and initialization stuff into a standard "psql" script that could be called from pgbench instead of integrating much more ad-hoc stuff into pgbench.

Note that non trivial schema initialization requires more general programming, so I do not believe in doing a lot at pgbench or psql levels. The best I could come with is a data generator which takes as input the schema with added directives on how to generate the various attributes (tool named "datafiller", that some people use:-).

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to