On 11-09-2018 18:29, Fabien COELHO wrote:
Hello Marina,

Hmm, but we can say the same for serialization or deadlock errors that were not retried (the client test code itself could not run correctly or the SQL sent was somehow wrong, which is also the client's fault), can't we?

I think not.

If a client asks for something "legal", but some other client in
parallel happens to make an incompatible change which results in a
serialization or deadlock error, the clients are not responsible for
the raised errors; it is just that they happen to ask for something
incompatible at the same time. So there is no user error per se, but
the server is reporting its (temporary) inability to process what was
asked for. For these errors, retrying is fine. If the client were
alone, there would be no such errors: you cannot deadlock with
yourself. This is really an isolation issue linked to parallel
execution.
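As an illustration, a minimal script along these lines (after a standard pgbench -i initialization; the small aid range and the file name below are only for the example) produces exactly such retryable errors once two clients run it concurrently, while a single client never hits them:

\set aid random(1, 10)
BEGIN ISOLATION LEVEL REPEATABLE READ;
UPDATE pgbench_accounts SET abalance = abalance + 1 WHERE aid = :aid;
END;

Run with something like "pgbench -n -c 2 -T 10 -f concurrent_update.sql", the client that loses the race on a row gets "ERROR:  could not serialize access due to concurrent update", which is precisely the kind of failure worth retrying.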

You can get other errors that cannot happen with only one client if you use shell commands in meta-commands:

starting vacuum...end.
transaction type: pgbench_meta_concurrent_error.sql
scaling factor: 1
query mode: simple
number of clients: 2
number of threads: 1
number of transactions per client: 10
number of transactions actually processed: 20/20
maximum number of tries: 1
latency average = 6.953 ms
tps = 287.630161 (including connections establishing)
tps = 303.232242 (excluding connections establishing)
statement latencies in milliseconds and failures:
         1.636           0  BEGIN;
         1.497           0  \setshell var mkdir my_directory && echo 1
         0.007           0  \sleep 1 us
         1.465           0  \setshell var rmdir my_directory && echo 1
         1.622           0  END;

starting vacuum...end.
mkdir: cannot create directory ‘my_directory’: File exists
mkdir: could not read result of shell command
client 1 got an error in command 1 (setshell) of script 0; execution of meta-command failed
transaction type: pgbench_meta_concurrent_error.sql
scaling factor: 1
query mode: simple
number of clients: 2
number of threads: 1
number of transactions per client: 10
number of transactions actually processed: 19/20
number of failures: 1 (5.000%)
number of meta-command failures: 1 (5.000%)
maximum number of tries: 1
latency average = 11.782 ms (including failures)
tps = 161.269033 (including connections establishing)
tps = 167.733278 (excluding connections establishing)
statement latencies in milliseconds and failures:
         2.731           0  BEGIN;
         2.909           1  \setshell var mkdir my_directory && echo 1
         0.231           0  \sleep 1 us
         2.366           0  \setshell var rmdir my_directory && echo 1
         2.664           0  END;
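For reference, the script behind these two runs, as reconstructed from the per-statement report above (the attached file may differ in minor details), is essentially:

BEGIN;
\setshell var mkdir my_directory && echo 1
\sleep 1 us
\setshell var rmdir my_directory && echo 1
END;

With two clients, one client's mkdir can run while the directory created by the other client has not yet been removed, which is exactly the "File exists" failure shown above; a single client running this script alone never sees it.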

Or if you use untrusted procedural languages in SQL expressions (see the attached file, included at the end of this message):

starting vacuum...ERROR:  relation "pgbench_branches" does not exist
(ignoring this error and continuing anyway)
ERROR:  relation "pgbench_tellers" does not exist
(ignoring this error and continuing anyway)
ERROR:  relation "pgbench_history" does not exist
(ignoring this error and continuing anyway)
end.
client 1 got an error in command 0 (SQL) of script 0; ERROR: could not create the directory "my_directory": File exists at line 3.
CONTEXT:  PL/Perl anonymous code block

client 1 got an error in command 0 (SQL) of script 0; ERROR: could not create the directory "my_directory": File exists at line 3.
CONTEXT:  PL/Perl anonymous code block

transaction type: pgbench_concurrent_error.sql
scaling factor: 1
query mode: simple
number of clients: 2
number of threads: 1
number of transactions per client: 10
number of transactions actually processed: 18/20
number of failures: 2 (10.000%)
number of serialization failures: 0 (0.000%)
number of deadlock failures: 0 (0.000%)
number of other SQL failures: 2 (10.000%)
maximum number of tries: 1
latency average = 3.282 ms (including failures)
tps = 548.437196 (including connections establishing)
tps = 637.662753 (excluding connections establishing)
statement latencies in milliseconds and failures:
         1.566           2  DO $$

starting vacuum...ERROR:  relation "pgbench_branches" does not exist
(ignoring this error and continuing anyway)
ERROR:  relation "pgbench_tellers" does not exist
(ignoring this error and continuing anyway)
ERROR:  relation "pgbench_history" does not exist
(ignoring this error and continuing anyway)
end.
transaction type: pgbench_concurrent_error.sql
scaling factor: 1
query mode: simple
number of clients: 2
number of threads: 1
number of transactions per client: 10
number of transactions actually processed: 20/20
maximum number of tries: 1
latency average = 2.760 ms
tps = 724.746078 (including connections establishing)
tps = 853.131985 (excluding connections establishing)
statement latencies in milliseconds and failures:
         1.893           0  DO $$

Or if you try to create a function and perhaps replace an existing one:

starting vacuum...end.
client 0 got an error in command 0 (SQL) of script 0; ERROR: duplicate key value violates unique constraint "pg_proc_proname_args_nsp_index" DETAIL: Key (proname, proargtypes, pronamespace)=(my_function, , 2200) already exists.

client 0 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 0 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 1 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

client 0 got an error in command 0 (SQL) of script 0; ERROR: tuple concurrently updated

transaction type: pgbench_create_function.sql
scaling factor: 1
query mode: simple
number of clients: 2
number of threads: 1
number of transactions per client: 10
number of transactions actually processed: 10/20
number of failures: 10 (50.000%)
number of serialization failures: 0 (0.000%)
number of deadlock failures: 0 (0.000%)
number of other SQL failures: 10 (50.000%)
maximum number of tries: 1
latency average = 82.881 ms (including failures)
tps = 12.065492 (including connections establishing)
tps = 12.092216 (excluding connections establishing)
statement latencies in milliseconds and failures:
        82.549          10  CREATE OR REPLACE FUNCTION my_function() RETURNS integer AS 'select 1;' LANGUAGE SQL;

Why not handle client errors that can occur (but may also not occur) in the same way? (For example, always abort the client, or conversely never abort in these cases.) Here's an example of such an error:

client 5 got an error in command 1 (SQL) of script 0; ERROR: division by zero

This is an interesting case. For me we must stop the script because
the client is asking for something "stupid", and retrying the same
thing won't change the outcome: the division will still be by zero. It
is the client's responsibility not to ask for something stupid; the
bench script is buggy and should not submit illegal SQL queries. This
is quite different from submitting something legal which happens to fail.
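For example, a deliberately broken script statement such as

SELECT 1 / 0;

raises "ERROR:  division by zero" on every single execution, whether one client or many are running; no number of retries can make it succeed, so retrying would only repeat the same error.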
...
I'm not sure that having "--debug" imply this option
is useful: as there are two distinct options, shouldn't the user be
allowed to trigger one or the other as they wish?

I'm not sure that the main debugging output will give a good clue of what's happened without full messages about errors, retries and failures...

I'm mostly arguing for letting the user decide what they want.

These lines are quite long - do you suggest wrapping them this way?

Sure, if it is too long, then wrap.

Ok!

The name of the function getTransactionStatus does not seem to correspond fully to what the function does. There is a passthru case which should either be avoided or clearly commented.

I don't quite understand you - do you mean that in fact this function finds out whether we are in a (failed) transaction block or not? Or do you mean that the case of PQTRANS_INTRANS is also ok?...

The former: although the function is named "getTransactionStatus", it
does not really return the "status" of the transaction (aka
PQstatus()?).

Thank you, I'll think about how to improve it. Perhaps the name checkTransactionStatus would be better...

I'd insist, in a comment, that "cnt" does not include "skipped" transactions
(anymore).

If you mean CState.cnt, I'm not sure this is practically useful, because the code only uses the sum of all client transactions, including skipped and failed ones... Maybe we can rename this field to nxacts or total_cnt?

I'm fine with renaming the field if it makes things clearer. They are
all counters, so naming them "cnt" or "total_cnt" does not help much.
Maybe "succeeded" or "success" to show what is really counted?

Perhaps renaming StatsData.cnt is better than just adding a comment to this field. But IMO we have the same problem ("they are all counters, so naming them "cnt" or "total_cnt" does not help much") for CState.cnt, which cannot be named in the same way because it also includes skipped and failed transactions.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
-- the attached script referenced above (presumably pgbench_concurrent_error.sql):
DO $$
  # create and immediately remove a directory; with two clients running
  # this concurrently, mkdir can fail with "File exists"
  my $directory = "my_directory";
  mkdir $directory
    or elog(ERROR, qq{could not create the directory "$directory": $!});
  # sleep very briefly before removing the directory
  select(undef, undef, undef, 0.00000000001);
  rmdir $directory
    or elog(ERROR, qq{could not delete the directory "$directory": $!});
$$ LANGUAGE plperlu;
