Hello,

I'm trying to troubleshoot a performance a replication issue between two 
version of MariaDB.  I know that we are using two EOL versions, but I want to 
rule out all issues before prioritizing migrating to a new version.

I have built a new dedicated replication server (B) to replace the current 
dedicated replica (A).  A and B are both running identical OSes and versions 
and near identical hardware in a RAID 10 configuration.  The main difference is 
that B has newer disks that slightly outperform A.  I benchmarked both A and B 
to validate this.  Both A and B are configured identically (other then 
server-id).

Replica A (and primary)  is running MariaDB 10.3.31.  Replica B is running  
10.3.39.

A few other things to note:

  *   B is not feeding off of the primary, but A -- that is until we sunset A.
  *   A is handing moderate read only traffic
  *   B is just running as a replica with no other responsibilities

I have external monitoring to alert when either hits 500+ 
Seconds_Behind_Master.  Since standing B up, I am seeing it fall behind a few 
times a day.   The interesting thing is that A is not falling behind from the 
primary.

Upon further examination, a scheduled job that executes hundreds of similar 
delete operations off the same table (that has approx. 3 million records) is 
where B is largely getting hung up.  Something like this:

DELETE FROM table WHERE x = 800000 AND y = 40000

The table has an index on just x.

Running ANALYZE FORMAT=JSON SELECT * FROM table WHERE x = 800000 AND y = 40000  
from B show a striking difference in total time spent:

REPLICA B:
{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 490.13,
    "table": {
      "table_name": "table",
      "access_type": "ref",
      "possible_keys": ["table_index"],
      "key": "table_index",
      "key_length": "4",
      "used_key_parts": ["x"],
      "ref": ["const"],
      "r_loops": 1,
      "rows": 795555,
      "r_rows": 31,
      "r_total_time_ms": 490.08,
      "filtered": 100,
      "r_filtered": 100,
      "index_condition": "table.y = '40000'"
    }
  }
}
vs:
REPLICA A:
{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 106.37,
    "table": {
      "table_name": "table",
      "access_type": "ref",
      "possible_keys": ["table_index"],
      "key": "table_index",
      "key_length": "4",
      "used_key_parts": ["x"],
      "ref": ["const"],
      "r_loops": 1,
      "rows": 1146482,
      "r_rows": 31,
      "r_total_time_ms": 106.34,
      "filtered": 100,
      "r_filtered": 100,
      "index_condition": "table.y = '40000'"
    }
  }
}

That is a significant difference performance wise.  From a hardware point of 
view, B should have a slight edge over A.  But the contrary is true by a large 
margin.


As a short term "fix", I added a secondary index: create index table_index2 on 
table(x, y);   This significantly speeds up this operation, and, for now, seems 
to deal with much of the experienced replication lab in B:

SERVER B:
{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 0.253,
    "table": {
      "table_name": "table",
      "access_type": "ref",
      "possible_keys": [
        "table_index",
        "table_index2"
      ],
      "key": "table_index2,
      "key_length": "8",
      "used_key_parts": ["x", "y"],
      "ref": ["const", "const"],
      "r_loops": 1,
      "rows": 31,
      "r_rows": 31,
      "r_total_time_ms": 0.2037,
      "filtered": 100,
      "r_filtered": 100
    }
  }
}

This seems reasonable as a general optimization, but provides little comfort in 
understanding the root cause of my performance issue with B.  I'm hesitant to 
retire A in favor of B until I can get to the bottom of this.

The question I have for the community is: what would you recommend that I do to 
get further perspective on the issue to determine a course of action?

Thanks everyone,

Mike



Confidentiality Notice: The information contained in this electronic message 
and any attachments to this message are intended only for the individual(s) 
addressed in the message and may contain proprietary and confidential 
information. If you are not the intended recipient, you should not disseminate, 
distribute, or copy this e-mail. Please notify the sender and destroy this 
message. WARNING: Computer viruses can be transmitted via email. The recipient 
should scan this email before opening it. The company accepts no liability for 
any damage caused by any virus transmitted by this email.
_______________________________________________
discuss mailing list -- [email protected]
To unsubscribe send an email to [email protected]

Reply via email to