all,

I recently finished upgrading the core database pool for our site from
MySQL 4.0.18 (32-bit) to 5.0.27 (64-bit), but am now experiencing
intermittent replication instability.

We replicate ~20M DMLs/day across 18 DB nodes in three datacenters.  About
once a week I get a 2013 error (error reading packet from server), but
only on the two slaves whose master is in a different datacenter (never once
among intra-datacenter nodes).  This would make me suspect the network
(at least the WAN links/devices), except it never happened once in two years
with 4.0.18.  When it happens I can fix it with a STOP SLAVE / CHANGE
MASTER (to the last executed position) / START SLAVE, but I would like to
find the root of the problem.
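
For reference, the recovery I run is roughly the following (the binlog file
and position are placeholders; the real values come from
Relay_Master_Log_File and Exec_Master_Log_Pos in SHOW SLAVE STATUS):

  STOP SLAVE;
  CHANGE MASTER TO
    MASTER_LOG_FILE = 'mysql-bin.000123',  -- Relay_Master_Log_File
    MASTER_LOG_POS  = 123456789;           -- Exec_Master_Log_Pos
  START SLAVE;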

Is anyone aware of any reported replication stability issues with 5.0.27?  Are
there any my.cnf parameters I can change to minimize the frequency?  Does
this sound like a network issue, and if so, why did 4.0.18 never fail in this
way?
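
On the my.cnf question, in case it helps: I'm guessing the relevant knobs
would be something like the ones below (values are only examples, not what
we actually run), but I'd appreciate pointers on which ones actually matter
here:

  [mysqld]
  # have the slave notice a dead master connection sooner (default 3600)
  slave_net_timeout    = 60
  # seconds between reconnect attempts after the connection drops
  master-connect-retry = 10
  # keep at least as large on the slaves as on the masters
  max_allowed_packet   = 16M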

It's not critical at this point, but it's extremely annoying, so any advice
would be appreciated...
