RE: Replication corruption and 64 bit mysql

2004-06-30 Thread Matthew Kent
For the record/list archives,

The solution seems to have been upgrading the slave to the Fedora Core 2
kernel package kernel-smp-2.6.6-1.435.x86_64.rpm. Which fix in it
affected my case, I'm not sure :)

It has been running fine for 18 hours at high volume!

- Matt

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]



Re: Replication corruption and 64 bit mysql

2004-06-30 Thread Andrew Pattison
I have a feeling the kernel authors rewrote much of the SMP code for 2.6
so that it scales better to 8-processor systems, so I would expect a few
stray bugs in it. You could always downgrade to 2.4 if it doesn't work
out ;-)

Cheers

Andrew.



Replication corruption and 64 bit mysql

2004-06-28 Thread Matthew Kent
After several long days trying to fix this I'm running out of ideas.

Master: Red Hat 7.3, kernel 2.4, MySQL 4.0.20 32-bit (mysql.com rpm)
Slave: Fedora Core 2 64-bit, kernel 2.6.5, MySQL-Max-4.0.20-0 64-bit
(mysql.com rpm)

After a varying amount of time, once a few hundred thousand queries have
run, replication dies with:

<snip>
040625 16:19:12  Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
040625 16:19:12  Error reading relay log event: slave SQL thread aborted because of I/O error
</snip>
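
For context on the message above: 'Event too small' with data_len: 0 and
event_type: 0 is what one would expect if the slave SQL thread reads an
event header made up entirely of zero bytes. The sketch below walks a log
image event by event and reports the first header whose length field is
implausible. It assumes the MySQL 4.0 log layout (a 4-byte magic number,
then events with a 19-byte little-endian header whose length field sits at
offset 9). The names first_bad_event and make_event are illustrative only
and are not part of any MySQL tool.

```python
import struct

BINLOG_MAGIC = b"\xfebin"   # assumed 4-byte magic at the start of the log
HEADER_LEN = 19             # assumed event header size for MySQL 4.0 logs


def make_event(ev_type: int, body: bytes) -> bytes:
    """Build a synthetic event (hypothetical helper, for demonstration only)."""
    # timestamp, type, server_id, event_length, next_position, flags
    header = struct.pack("<IBIIIH", 0, ev_type, 1, HEADER_LEN + len(body), 0, 0)
    return header + body


def first_bad_event(data: bytes):
    """Return (offset, event_type, event_len) of the first implausible header.

    Returns None if every complete header looks sane. A header cut off by
    the end of the data is reported with event_type None.
    """
    if data[:4] != BINLOG_MAGIC:
        raise ValueError("missing binlog magic number")
    pos = 4
    while pos < len(data):
        header = data[pos:pos + HEADER_LEN]
        if len(header) < HEADER_LEN:
            return (pos, None, len(header))
        # Only the first 13 bytes are needed to reach the length field.
        _ts, ev_type, _server_id, ev_len = struct.unpack_from("<IBII", header)
        if ev_len < HEADER_LEN:
            # An event cannot be shorter than its own header.
            return (pos, ev_type, ev_len)
        pos += ev_len
    return None
```

A log consisting of the magic number, one valid event, and then a zeroed
region would be reported at the offset where the zeroed region begins.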

Using instructions from Sasha Pachev
(http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&selm=c400pk%245pd%241%40FreeBSD.csie.NCTU.edu.tw)
I've looked at the binlog on the slave and can confirm that it contains
a large chunk of empty space, and that the query in question is logged
on the master.
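
A minimal sketch of how such an empty region could be located without
paging through the log by hand. It assumes that "empty space" means a long
run of NUL bytes. The function name find_zero_runs and the 64-byte
threshold are hypothetical choices, and short NUL runs are normal inside
valid events, so the threshold may need tuning.

```python
import re


def find_zero_runs(data: bytes, min_len: int = 64):
    """Yield (offset, length) for every run of at least min_len NUL bytes."""
    # Build a bytes regex such as rb"\x00{64,}" for the requested threshold.
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.end() - match.start()
```

To use it, read the log in binary mode (for example with
open(path, "rb").read(), where path is whatever file is being inspected)
and compare the reported offsets with the position at which replication
stopped. This reads the whole file into memory, which is fine for a
sketch but not for very large logs.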

The fun part is that replication does work when I point our 32-bit
master at a different 32-bit slave, so I know it's not a problem with
our old servers, just this fancy new one.

So far I've:

- Tried a different master (we have a pool of 5 similar servers to use
as a master).
- Tried a 32-bit server instead of the 64-bit Max build on the slave
(couldn't get the 64-bit non-Max build to start at all; it would just
dump core).
- Tried swapping the NIC for a different brand.
- Used tcpdump to try to spot any network-level issues.
- Tried pointing the binlogs on the master to another local disk,
separate from the data.
- Examined the changelogs for the NIC drivers.
- Googled this to no end.

All with no luck.

I'm open to suggestions.

I suppose the next step is to install the 32-bit Fedora Core 2 and try
again.

Thanks,

Matthew Kent \ SA \ bravenet.com \ 1-250-954-3203 ext 108
