Christian,
the patch below may fix the problem, if the failure happens while InnoDB
initializes the raw device to zeros.
InnoDB does all normal i/o on the data files using memory buffers
aligned to UNIV_PAGE_SIZE. This is because earlier versions used the Windows
native AIO, which requires aligned memory addresses in i/o operations. But
since the initialization of the files was normal i/o, not AIO, I did not
remember to align the buffer there.
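The alignment step in the patch below can be illustrated with a minimal standalone sketch. It is not the InnoDB code: align_up and alloc_aligned are hypothetical names, and malloc stands in for ut_malloc. The idea is to allocate one alignment unit more than needed (compare 513 pages for a 512-page buffer), round the pointer up to the next boundary, and keep the original pointer for freeing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: round ptr up to the next multiple of align.
   align must be a power of two. */
static void *align_up(void *ptr, size_t align)
{
    uintptr_t p = (uintptr_t)ptr;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((p + align - 1) & ~((uintptr_t)align - 1));
}

/* Hypothetical helper: return n_bytes of usable memory aligned to align.
   *raw_out receives the unaligned pointer, which is the one that must
   be passed to free(). */
static void *alloc_aligned(size_t n_bytes, size_t align, void **raw_out)
{
    /* Over-allocate by one alignment unit so that n_bytes still fit
       after the start address has been rounded up. */
    void *raw = malloc(n_bytes + align);

    *raw_out = raw;
    if (raw == NULL) {
        return NULL;
    }
    return align_up(raw, align);
}
```

A caller would write through the aligned pointer and later call free() on the raw one, which mirrors the buf / buf2 pair in the patch.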
InnoDB has its own read-ahead for the buffer pool. Thus the read-ahead of
the OS is not necessarily needed.
Best regards,
Heikki Tuuri
Innobase Oy
---
InnoDB - transactions, hot backup, and foreign key support for MySQL
See http://www.innodb.com, download MySQL-Max from http://www.mysql.com
ChangeSet
1.1097 02/07/19 08:33:52 [EMAIL PROTECTED] +1 -0
os0file.c:
Align the buffer used in initing a data file to zero; this may be needed
if the data file is actually a raw device
innobase/os/os0file.c
1.39 02/07/19 08:33:36 [EMAIL PROTECTED] +6 -2
Align the buffer used in initing a data file to zero; this may be needed
if the data file is actually a raw device
# This is a BitKeeper patch. What follows are the unified diffs for the
# set of deltas contained in the patch. The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User: heikki
# Host: hundin.mysql.fi
# Root: /home/heikki/mysql3
--- 1.38/innobase/os/os0file.c Mon Jul 8 19:28:42 2002
+++ 1.39/innobase/os/os0file.c Fri Jul 19 08:33:36 2002
@@ -690,6 +690,7 @@
ulint n_bytes;
ibool ret;
byte* buf;
+ byte* buf2;
ulint i;
ut_a(size == (size & 0xFFFFFFFF));
@@ -697,7 +698,10 @@
/* We use a very big 8 MB buffer in writing because Linux may be
extremely slow in fsync on 1 MB writes */
- buf = ut_malloc(UNIV_PAGE_SIZE * 512);
+ buf2 = ut_malloc(UNIV_PAGE_SIZE * 513);
+
+ /* Align the buffer for possible raw i/o */
+ buf = ut_align(buf2, UNIV_PAGE_SIZE);
/* Write buffer full of zeros */
for (i = 0; i < UNIV_PAGE_SIZE * 512; i++) {
@@ -725,7 +729,7 @@
offset += n_bytes;
}
- ut_free(buf);
+ ut_free(buf2);
ret = os_file_flush(file);
----- Original Message -----
From: Christian Jaeger [EMAIL PROTECTED]
Newsgroups: mailing.database.mysql
Sent: Friday, July 19, 2002 6:37 AM
Subject: Innodb and unbuffered raw io on linux?
Hello Heikki and all,
I've already asked about this a year ago, but didn't finish my
investigations then.
What's the status with innodb and *unbuffered raw* io on linux?
The manual describes the use of the newraw and raw options, and I
know these work on disk devices (like /dev/sda8), but this isn't raw
io: it's still cached by the kernel and so takes up RAM in addition to
the InnoDB cache (as well as a bit of CPU to copy the data
between kernel and user space). If you want to do direct IO, you need
to use the 'raw' tool to set up a 'raw character device' mapped to the
disk block device:
cd /dev
mkdir raw
umask 077
mknod rawctl u 162 0
umask 007
mknod raw/raw1 u 162 1
mknod raw/raw2 u 162 2
chgrp mysql raw/raw1
# ^- I'm not sure whether the access rights of the mapped device
# take precedence over those of the original block device, though
raw raw/raw1 sda8
I've tried MySQL with this config:
#innodb_data_file_path=/dev/sda8:1906Mraw - did work, but buffered
innodb_data_file_path=/dev/raw/raw1:1906Mraw
020719 00:59:24 mysqld started
InnoDB: Operating system error number 22 in a file operation.
InnoDB: See http://www.innodb.com/ibman.html for installation help.
InnoDB: Look from section 13.2 at http://www.innodb.com/ibman.html
InnoDB: what the error number means or use the perror program of MySQL.
InnoDB: Cannot continue operation.
020719 00:59:25 mysqld ended
perror 22
Error code 22: Invalid argument
This error code is typical when buffers are not aligned to sector-sized
memory boundaries, which is necessary for unbuffered io to work
on Linux.
I've written an experimental program that shows this and put it here:
http://pflanze.mine.nu/~chris/mysql/o_direct.c
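The alignment requirement described above can be checked before an I/O call is issued. The following is only a sketch and is not taken from the program linked above: direct_io_args_ok is a hypothetical name, and the block size (typically 512 bytes for a disk sector) is an assumption supplied by the caller. Besides the buffer address, Linux raw I/O also expects the file offset and the transfer length to be multiples of the sector size, so the sketch checks all three.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the buffer address, the file offset
   and the transfer length are all multiples of block_size, and 0
   otherwise. A request that fails this check is the kind that tends to
   be rejected with EINVAL (error 22) on an unbuffered device. */
static int direct_io_args_ok(const void *buf, uint64_t offset,
                             size_t len, size_t block_size)
{
    if (block_size == 0) {
        return 0;
    }
    return ((uintptr_t)buf % block_size == 0)
        && (offset % block_size == 0)
        && (len % block_size == 0);
}
```

Calling such a check in a debug build just before each read or write makes an unaligned request visible at the call site rather than as a bare "Invalid argument" from the kernel.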
So I guess InnoDB is not ready for unbuffered io. I'm also guessing
that it's probably not that easy to achieve good performance with
unbuffered io, since you would probably have to do read-ahead and so
on yourself.
I'm also unsure about the current status of rawio in linux (2.4).
Reading on http://oss.sgi.com/projects/rawio/ (under the FAQ), they
say that they have a better implementation than the one from
Stephen Tweedie/Redhat. But the code in kernel 2.4 seems to be only
the one from Stephen Tweedie.
This is what the source code of the 'dd' tool (as found in
Debian/testing) shows, btw:
/* ...
The page alignment is necessary on any linux system that supports
either the SGI raw I/O patch or Stephen Tweedies raw I/O patch.
It is necessary when accessing raw (i.e. character special) disk
devices on Unixware or other SVR4-derived system