Hi, Seppo, Jan!
Note, this is 10.2 patch below.
> commit 4b164f176e6
> Author: Seppo Jaakola
> Date: Wed Sep 15 09:16:44 2021 +0300
>
> MDEV-25114 Crash: WSREP: invalid state ROLLED_BACK (FATAL)
I think this should say
MDEV-23328 Server hang due to Galera lock conflict resolution
it'
Hi Sergei,
On Fri, Oct 1, 2021 at 9:05 PM Sergei Golubchik wrote:
> Hi, Seppo, Jan!
>
> Note, this is 10.2 patch below.
>
> >
> > MDEV-25114 Crash: WSREP: invalid state ROLLED_BACK (FATAL)
>
> I think this should say
>
> MDEV-23328 Server hang due to Galera lock conflict resolution
>
>
Sur
Hi, Jan!
On Oct 04, Jan Lindström wrote:
> Hi Sergei,
> > > +/* This is wrapper for wsrep_break_lock in thr_lock.c */
> > > +static int wsrep_thr_abort_thd(void *bf_thd_ptr, void *victim_thd_ptr,
> > > my_bool signal)
> > > +{
> > > + THD* victim_thd= (THD *) victim_thd_ptr;
> > > + /* We need
Hi Sergei,
Answers below:
>
> > > > +/* This is wrapper for wsrep_break_lock in thr_lock.c */
> > > > +static int wsrep_thr_abort_thd(void *bf_thd_ptr, void
> *victim_thd_ptr, my_bool signal)
> > > > +{
> > > > + THD* victim_thd= (THD *) victim_thd_ptr;
> > > > + /* We need to lock THD::LOCK_th
Hi, Jan!
On Oct 06, Jan Lindström wrote:
> >
> > > > > +/* This is wrapper for wsrep_break_lock in thr_lock.c */
> > > > > +static int wsrep_thr_abort_thd(void *bf_thd_ptr, void
> > > > > *victim_thd_ptr, my_bool signal)
> > > > > +{
> > > > > + THD* victim_thd= (THD *) victim_thd_ptr;
> > > > >
Hi Sergei,
Answers to your questions below:
On Wed, Oct 6, 2021 at 5:03 PM Sergei Golubchik wrote:
> Hi, Jan!
>
> On Oct 06, Jan Lindström wrote:
> > >
> > > > > > +/* This is wrapper for wsrep_break_lock in thr_lock.c */
> > > > > > +static int wsrep_thr_abort_thd(void *bf_thd_ptr, void
> *vic
Hi, Jan!
On Oct 06, Jan Lindström wrote:
> > > > >
> > > > > I must say the thr_lock code is not familiar to me but there
> > > > > are mysql_mutex_lock() calls to lock->mutex. After code review
> > > > > it is not clear to me what that mutex is.
> > > >
> > this is for table locks. `lock` is `dat
Hi Sergei,
>
> > if (victim_trx) {
> > const trx_id_t victim_trx_id= victim_trx->id;
> > const longlong victim_thread= thd_get_thread_id(victim_thd);
> > /* This is necessary as correct mutexing order is
> > lock_sys -> trx -> THD::LOCK_thd_data and below
> > function assume
Hi Sergei,
Update on what happens after TOI failure.
> What I mean is, what if KILL would ignore WSREP_TO_ISOLATION_BEGIN
> failure and will just proceed killing? Perhaps if
> WSREP_TO_ISOLATION_BEGIN fails it means that there can be no bf aborts
> anyway? Could you try to find it out?
>
After
Update on disconnect
>
> > // As trx is now referenced it can't go away
>
> Hmm. What happens if the thd that owns this transaction is killed or the
> user disconnects? THD gets freed. What happens to the referenced trx?
>
I created new mtr-tests (galera_disconnect_debug) to try disconnecti
Hi Sergei,
After the QA runs done by Ramesh, we now know that the latest fix candidate,
i.e. what is in bb-10.2-MDEV-25114-galera-v2, is incorrect. The problem is in
wsrep_close_connections(), as it holds LOCK_thread_count while it calls
abort_replicated, which calls wsrep_abort_transaction, and there we use
find
Hi, Jan!
On Oct 10, Jan Lindström wrote:
> Hi Sergei,
> >
> > > if (victim_trx) {
> > > const trx_id_t victim_trx_id= victim_trx->id;
> > > const longlong victim_thread= thd_get_thread_id(victim_thd);
> > > /* This is necessary as correct mutexing order is
> > > lock_sys -> trx -
Hi, Jan!
Great, thanks!
On Oct 11, Jan Lindström wrote:
> Update on disconnect
>
> >
> > > // As trx is now referenced it can't go away
> >
> > Hmm. What happens if the thd that owns this transaction is killed or the
> > user disconnects? THD gets freed. What happens to the referenced trx?
Hi Sergei,
>
> > > trx_rw_is_active needs to be modified to do that, right?
> >
> > No this is current behaviour, I did not change anything on
> > trx_rw_is_active
>
> In xtradb trx_rw_is_active returns bool.
> I think xtradb is still the default innodb in 10.2.
>
> In innobase it returns, indeed
Hi Sergei,
Update on the wsrep_close_connections problem. My suggested fix for this issue
is at
https://github.com/MariaDB/server/commit/99cbe03a44cc95e6f548550df51e7201ebea3b9d
If you have a better solution, please advise.
R: Jan
On Mon, Oct 11, 2021 at 12:52 PM Jan Lindström
wrote:
> Hi Sergei
Hi, Jan!
Here's an idea of the fix:
Let's always use the KILL mutex locking order, that is
victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
For this we need to fix wsrep_abort_transaction(), which is called from the
server, and wsrep_innobase_kill_one_trx(), which is calle
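
The locking order proposed above can be sketched in isolation. What follows is a minimal illustration, not the MariaDB code: `Thd`, `LockSys`, `Trx` and `abort_victim()` are hypothetical stand-ins built on `std::mutex`, and the only thing taken from the thread is the order victim `LOCK_thd_data` -> `lock_sys` mutex -> victim trx mutex. The assumption is that every abort path goes through one helper, so that no two callers can take the same mutexes in opposite orders.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the server types.
struct Trx     { std::mutex mutex; bool must_abort = false; };
struct LockSys { std::mutex mutex; };
struct Thd     { std::mutex LOCK_thd_data; Trx *trx = nullptr; };

// Single entry point for aborting a victim. The mutexes are always taken in
// the order LOCK_thd_data -> lock_sys mutex -> trx mutex, and released in
// reverse order when the guards go out of scope.
// Returns the names of the mutexes in the order they were acquired.
std::vector<std::string> abort_victim(Thd &victim, LockSys &lock_sys)
{
  std::vector<std::string> order;

  std::lock_guard<std::mutex> thd_guard(victim.LOCK_thd_data);
  order.push_back("LOCK_thd_data");

  std::lock_guard<std::mutex> sys_guard(lock_sys.mutex);
  order.push_back("lock_sys->mutex");

  // Assumption: a victim without a transaction needs no trx mutex.
  if (victim.trx == nullptr)
    return order;

  std::lock_guard<std::mutex> trx_guard(victim.trx->mutex);
  order.push_back("trx->mutex");

  victim.trx->must_abort = true;  // mark the victim while all three are held
  return order;
}
```

If both the KILL path and the BF-abort path called a helper like this, neither could hold `lock_sys` while waiting for `LOCK_thd_data`, which is the kind of inversion the proposal is meant to rule out. Whether the real call sites can be restructured this way is exactly what the rest of the thread discusses.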
Hi,
Few questions:
(1) Is this review for a full patch or just problems on
wsrep_abort_transaction ?
(2) In case at wsrep_abort_transaction we do not have a transaction idea is
that we do not anymore want to enter InnoDB i.e. innobase_kill_query,
that is the reason we set MUST_ABORT to wsrep_conf
Hi, Jan!
On Oct 15, Jan Lindström wrote:
> Few questions:
>
> (1) Is this review for a full patch or just problems on
> wsrep_abort_transaction ?
a full patch
> (2) In case at wsrep_abort_transaction we do not have a transaction idea is
> that we do not anymore want to enter InnoDB i.e. innobas
Hi Sergei,
I have implemented PlanE as agreed on branch bb-10.2-MDEV-25114-planE-galera,
and regression testing mostly looks promising. However, I have problems with
MDL locks. For example, the test case galera.galera_toi_lock_exclusive hangs
and I have not yet found out why. I will ask Seppo for help.
Hi Sergei,
This does not seem to work. Consider the following:
CREATE TABLE t1 (id INT PRIMARY KEY) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1);
connection node_2;
SET AUTOCOMMIT=OFF;
START TRANSACTION;
INSERT INTO t1 VALUES (2);
connection node_2a;
ALTER TABLE t1 ADD COLUMN f2 INTEGER, LOCK=EXCLUSIVE;
Hi Sergei,
Your suggestion does not work. There is more than one problem:
(1) wsrep_abort_transaction does not release MDL-lock
(2) innobase_kill_one_trx crashes at wsrep->abort_pre_commit() because
transaction registered inside wsrep has disappeared (this does not happen
if THD::LOCK_thd_data is