Re: [Maria-developers] e92037989f7: MDEV-21117: refine the server binlog-based recovery for semisync

2021-05-26 Thread Sergei Golubchik
Hi, Andrei!

On Apr 27, Andrei Elkin wrote:
> >> +--let $case = B: one engine has committed its transaction branch
> >> +# Hold off after one engine has committed.
> >> +--let $shutdown_timeout=0
> >> +--let $debug_sync_action = "commit_after_run_commit_ordered SIGNAL 
> >> con1_ready WAIT_FOR signal_no_signal"
> >> +# Both debug_sync and debug-dbug are required to make sure Engines 
> >> remember the commit state
> >> +# debug_sync alone will not help.
> >> +--let $restart_parameters = --rpl-semi-sync-slave-enabled=1 
> >> --debug-dbug=d,binlog_truncate_partial_commit
> >
> > in the first review I wrote
> >
> >   this seems to be a rather crude way of faking a partially committed
> >   transaction. better to crash after the first engine has committed,
> >   that'd be much more natural.
> >
> > and you replied
> >
> >   This simulation aimed at (allows for) more complicated recovery time
> >   event sequences.
> >   In this case, indeed, crashing by demand is about of the same efforts.
> >   I can convert to that.
> >
> >   [x]
> 
> I had a rather sobering followup, slipped communicating to you though, sorry.
> 
> It's actually turns out not to be easy. A sequence of execution events
> 
>   E1.prepare(trx), E2.prepare(trx), E1.commit(trx), *crash*
> 
> does not make E1 such as Innodb committed persistently in file.  At
> recovery in E1 trx may be found in prepared state, unless before the
> crash I'd do something like Binlog CheckPoint (BCP) notification request
> and wait of it.

I don't understand, what a binlog checkpoint has to do with it?

> In this just a selected engine would be requested, unlike in BCP event
> case that is generated when both (in this case) have persistently committed.
> (That is we even can't involve BCP event to wait for, and the involvement
> would require some simulation as well).
> 
> So I propose to keep the current method leaving the "honest" crash to
> RQG that Alice has been training the patch with.

Alice was loading the server with transactions and doing `kill -9` at
some point. Many interesting cases that we discuss (for example, the one
above) happen in a very small time window, so I don't know whether
random killing could provide an adequate test coverage.

> >> diff --git a/mysql-test/suite/rpl/t/rpl_semi_sync_fail_over.test 
> >> b/mysql-test/suite/rpl/t/rpl_semi_sync_fail_over.test
> >> --- /dev/null
> >> +++ b/mysql-test/suite/rpl/t/rpl_semi_sync_fail_over.test
> >> @@ -0,0 +1,143 @@
> >> +
> >> +--source include/have_innodb.inc
> >> +--source include/have_debug_sync.inc
> >> +--source include/have_binlog_format_row.inc
> >> +--let $rpl_topology=1->2
> >> +--source include/rpl_init.inc
> >
> > why not to source master-slave.inc if you're using a standard master-slave
> > topology anyway?
> 
> Using numbers it's easier to manipulate with failover operatations.
> Besides it's truly uncomfortable both read and write
> when `master` would be serving the slave role as it happens in the test :-).

I believe I already replied to that earlier.
master-slave.inc creates server_1 and server_2 connections all right.

> >> diff --git a/sql/handler.h b/sql/handler.h
> >> --- a/sql/handler.h
> >> +++ b/sql/handler.h
> >> @@ -873,6 +874,15 @@ typedef struct xid_t XID;
> >>  /* The 'buf' has to have space for at least SQL_XIDSIZE bytes. */
> >>  uint get_sql_xid(XID *xid, char *buf);
> >>  
> >> +/* struct for semisync slave binlog truncate recovery */
> >> +struct xid_recovery_member
> >> +{
> >> +  my_xid xid;
> >> +  uint in_engine_prepare;  // number of engines that have xid prepared
> >> +  bool decided_to_commit;
> >> +  std::pair binlog_coord; // semisync recovery binlog 
> >> offset
> >
> > wouldn't it be clearer to have a struct with named members?
> 
> I am not sure if there's a need for naming. But we do need
> comparison that std::pair provides for granted.
> 
> I could consider (a) new typedef(s) to represent the binlog id and
> offset in the file to make it more readable.
> 
> You say?
> 
> > in fact, I'm somewhat surprised there's no such struct for binlog coords
> > already.
> 
> I had started that back at HB implementation actually, but that struct
> did not attact much of attention.

I'd still suggest a structure, like

struct binlog_coords_t {
  uint binlog_num;
  my_off_t offset;
};

> >
> >> +};
> >> +
> >>  /* for recover() handlerton call */
> >>  #define MIN_XID_LIST_SIZE  128
> >>  #define MAX_XID_LIST_SIZE  (1024*128)
> >> @@ -4820,7 +4830,8 @@ int ha_commit_one_phase(THD *thd, bool all);
> >>  int ha_commit_trans(THD *thd, bool all);
> >>  int ha_rollback_trans(THD *thd, bool all);
> >>  int ha_prepare(THD *thd);
> >> -int ha_recover(HASH *commit_list);
> >> +int ha_recover(HASH *commit_list, MEM_ROOT *mem_root= NULL);
> >> +uint ha_recover_complete(HASH *commit_list, std::pair 
> >> *coord= NULL);
> >
> > is coord a truncation position?
> 
> So NULL is for the normal recovery.
> Non-NULL is an estimate of the 1st to rollback.
> Trx with lesser offsets 

[Maria-developers] JSON_TABLE: on default values

2021-05-26 Thread Sergey Petrunia
Hi Alexey,

At the moment MariaDB requires that the values in DEFAULT clauses are quoted. 
Example:

select * 
from 
  json_table(
'{"intval": 1000}',
'$' columns( 
 col1 int path '$.intval_'
default '100' on empty
   )
   ) as T;

here, "100" must be quoted, otherwise one gets a parse error.  However, the
quoted value is interpreted as an SQL literal.  This looks puzzling.

MySQL-8 also requires that the default value is quoted, but they have a (very
odd) reason for it: they interpret the default value as JSON:

https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/json-table-functions.html
says:

  DEFAULT json_string ON EMPTY: the provided json_string is parsed as JSON, as 
  long as it is valid, and stored instead of the missing value. Column type 
  rules also apply to the default value.

I am not sure why MySQL chose to do this. Looking into the SQL Standard, one 
can 
see:

 ::=
 
  [ PATH  ]
  [  ON EMPTY ]
  [  ON ERROR ]

 ::=
ERROR
  | NULL
  | DEFAULT 

...
This doesn't say whether the  should be interepreted as JSON
or just as a value.  But one can find this passage:
 

Without Feature T826, “General value expression in ON ERROR or ON EMPTY 
clauses”, the  contained in  or 
contained in a  JTRCD shall be a 
 that can be cast to the
data type specified by the  contained in JTRCD without raising an 
exception condition
according to the General Rules of Subclause 6.13, “”.


The important part is:

... shall be a  that can be cast to the data type specified ...

which means it is not JSON. It is just a literal, and literal can be a string
literal (in quotes, 'string') or an integer literal (without quotes) or other
kind of literal.

Btw, Oracle Database allows non-string literals in the default clause:
https://dbfiddle.uk/?rdbms=oracle_18=9af7e43ede77ee285e1a65f1f419d3bd

What are your thoughts on this?
Is MariaDB's behavior intentional? Should we follow the standard and allow all
kinds of literals?  What was the reason for the limitation that default values
are quoted?

BR
 Sergei
-- 
Sergei Petrunia, Software Developer
MariaDB Corporation | Skype: sergefp | Blog: http://petrunia.net


___
Mailing list: https://launchpad.net/~maria-developers
Post to : maria-developers@lists.launchpad.net
Unsubscribe : https://launchpad.net/~maria-developers
More help   : https://help.launchpad.net/ListHelp