Re: [Maria-developers] 880e92a48ba: MDEV-27760 event may non stop replicate in circular semisync setup
Hi, Andrei, I don't see a point in caching the value in memory that is very cheap to calculate per event. I've pushed f8ec4dd38d0 to 10.4-serg three hours ago and Jenkins said it's fine. So I'd suggest we go with it, because we really have to close this blocker and release ES now. And MDEV-27837 will be fixed as a separate bug. queue_event checks global_system_variables.server_id in multiple places, moving only do_accept_own_server_id into Master_info will not make it any better or worse against MDEV-27837. > > As far as I can see, you can calculate it for every event just the same. > > The per event computation is extraneous and not consistent with the > nature of the flag that belongs to the group of events. > > Also notice (or remember) a "FR" > MDEV-27837 disallow `set @@session.server_id` within transaction > so until this bug is fixed it's not 'just the same'. > > I am recomming the patch, assuming that you'll be fine with the Gtid time > only flag computation. > Regards, Sergei VP of MariaDB Server Engineering and secur...@mariadb.org ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Re: [Maria-developers] 880e92a48ba: MDEV-27760 event may non stop replicate in circular semisync setup
Hi, Andrei, On Mar 01, Andrei Elkin wrote: > > On Feb 28, Andrei wrote: > >> revision-id: 880e92a48ba (mariadb-10.6.6-16-g880e92a48ba) > >> parent(s): 4030a9fb2eb > >> author: Andrei > >> committer: Andrei > >> timestamp: 2022-02-21 14:58:13 +0200 > >> message: > >> > >> MDEV-27760 event may non stop replicate in circular semisync setup > >> > >> diff --git a/sql/slave.cc b/sql/slave.cc > >> index c0eef02ca7a..a1f4f4f0081 100644 > >> --- a/sql/slave.cc > >> +++ b/sql/slave.cc > >> @@ -6186,13 +6186,13 @@ static int queue_event(Master_info* mi, const > >> uchar *buf, ulong event_len) > >>bool is_rows_event= false; > >>/* > >> The flag has replicate_same_server_id semantics and is raised to > >> accept > >> -a same-server-id event on the semisync slave, for both the gtid and > >> legacy > >> -connection modes. > >> -Such events can appear as result of this server recovery so the event > >> -was created there and replicated elsewhere right before the crash. At > >> recovery > >> -it could be evicted from the server's binlog. > >> - */ > >> - bool do_accept_own_server_id= false; > >> +a same-server-id event group by the gtid strict mode semisync slave. > >> +Own server-id events can appear as result of this server > >> crash-recovery: > >> +the transaction was created on this server then being master, got > >> replicated > >> +elsewhere right before the crash before commit; > >> +finally at recovery the transaction gets evicted from the server's > >> binlog. > >> + */ > >> + static bool do_accept_own_server_id= false; > > > > No, sorry, it cannot be static. If you need to preserve it between > > events, you can keep it, for example, in Master_info. > > The var is set by Gtid events to serve as a marker of the whole current > Gtid-headed > group of event. > Only next Gtid event may change the current value. > > Why can't it? Why it cannot be static? Because it's not a property of the server, it's a property of a specific replication stream. Technically one can have a multi-source and do_accept_own_server_id could have different values for different masters. > > So, the question now is, why do you need to preserve the value of > > do_accept_own_server_id between queue_event invocations? > > The flag as said above applies to the whole group of events, so it's > preserved at least through so many > queue_event() calls as many event in the group (I count Gtid itself in). As far as I can see, you can calculate it for every event just the same. > >> @@ -6281,6 +6281,8 @@ static int queue_event(Master_info* mi, const uchar > >> *buf, ulong event_len) > >> dbug_rows_event_count = 0; > >>};); > >> #endif > >> + s_id= uint4korr(buf + SERVER_ID_OFFSET); > >> + > >>mysql_mutex_lock(>data_lock); > >> > >>switch (buf[EVENT_TYPE_OFFSET]) { > >> @@ -6722,6 +6724,11 @@ static int queue_event(Master_info* mi, const uchar > >> *buf, ulong event_len) > >> > >> ++mi->events_queued_since_last_gtid; > >> inc_pos= event_len; > >> + > >> +do_accept_own_server_id= > >> + (s_id == global_system_variables.server_id && > >> rpl_semi_sync_slave_enabled > >> + && mi->using_gtid != Master_info::USE_GTID_NO && > >> opt_gtid_strict_mode) ? > >> + true : false; Regards, Sergei VP of MariaDB Server Engineering and secur...@mariadb.org ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp
Re: [Maria-developers] 880e92a48ba: MDEV-27760 event may non stop replicate in circular semisync setup
Hi, Andrei, On Feb 28, Andrei wrote: > revision-id: 880e92a48ba (mariadb-10.6.6-16-g880e92a48ba) > parent(s): 4030a9fb2eb > author: Andrei > committer: Andrei > timestamp: 2022-02-21 14:58:13 +0200 > message: > > MDEV-27760 event may non stop replicate in circular semisync setup > > diff --git a/sql/slave.cc b/sql/slave.cc > index c0eef02ca7a..a1f4f4f0081 100644 > --- a/sql/slave.cc > +++ b/sql/slave.cc > @@ -6186,13 +6186,13 @@ static int queue_event(Master_info* mi, const uchar > *buf, ulong event_len) >bool is_rows_event= false; >/* > The flag has replicate_same_server_id semantics and is raised to accept > -a same-server-id event on the semisync slave, for both the gtid and > legacy > -connection modes. > -Such events can appear as result of this server recovery so the event > -was created there and replicated elsewhere right before the crash. At > recovery > -it could be evicted from the server's binlog. > - */ > - bool do_accept_own_server_id= false; > +a same-server-id event group by the gtid strict mode semisync slave. > +Own server-id events can appear as result of this server crash-recovery: > +the transaction was created on this server then being master, got > replicated > +elsewhere right before the crash before commit; > +finally at recovery the transaction gets evicted from the server's > binlog. > + */ > + static bool do_accept_own_server_id= false; No, sorry, it cannot be static. If you need to preserve it between events, you can keep it, for example, in Master_info. yes, I remember, that's where it was and I asked why it's in Master_info. You could've answered, that you need to preserve the value between queue_event invocations - I didn't realize that at the time. So, the question now is, why do you need to preserve the value of do_accept_own_server_id between queue_event invocations? >/* > FD_q must have been prepared for the first R_a event > inside get_master_version_and_clock() > @@ -6281,6 +6281,8 @@ static int queue_event(Master_info* mi, const uchar > *buf, ulong event_len) > dbug_rows_event_count = 0; >};); > #endif > + s_id= uint4korr(buf + SERVER_ID_OFFSET); > + >mysql_mutex_lock(>data_lock); > >switch (buf[EVENT_TYPE_OFFSET]) { > @@ -6722,6 +6724,11 @@ static int queue_event(Master_info* mi, const uchar > *buf, ulong event_len) > > ++mi->events_queued_since_last_gtid; > inc_pos= event_len; > + > +do_accept_own_server_id= > + (s_id == global_system_variables.server_id && > rpl_semi_sync_slave_enabled > + && mi->using_gtid != Master_info::USE_GTID_NO && > opt_gtid_strict_mode) ? > + true : false; this is a ridiculous pattern bool var = (bool expr) ? true : false; don't do that. >} >break; >/* > @@ -6909,7 +6916,6 @@ static int queue_event(Master_info* mi, const uchar > *buf, ulong event_len) >*/ > >mysql_mutex_lock(log_lock); > - s_id= uint4korr(buf + SERVER_ID_OFFSET); >/* > Write the event to the relay log, unless we reconnected in the middle > of an event group and now need to skip the initial part of the group that > @@ -6955,7 +6961,7 @@ static int queue_event(Master_info* mi, const uchar > *buf, ulong event_len) >else >if ((s_id == global_system_variables.server_id && > !(mi->rli.replicate_same_server_id || > - (do_accept_own_server_id= rpl_semi_sync_slave_enabled))) || > + do_accept_own_server_id)) || >event_that_should_be_ignored(buf) || >/* > the following conjunction deals with IGNORE_SERVER_IDS, if set > Regards, Sergei VP of MariaDB Server Engineering and secur...@mariadb.org ___ Mailing list: https://launchpad.net/~maria-developers Post to : maria-developers@lists.launchpad.net Unsubscribe : https://launchpad.net/~maria-developers More help : https://help.launchpad.net/ListHelp