Christian, the patch below may fix the problem if it is in the initialization of the raw device to zero.

InnoDB does all normal i/o to/from the data files to memory addresses
aligned by UNIV_PAGE_SIZE. This is because earlier versions used the
Windows native AIO and that requires aligned memory addresses in i/o
operations. But since the initialization of the files was normal i/o,
not AIO, I did not remember to align the buffer there.

InnoDB has its own read-ahead for the buffer pool. Thus the read-ahead
of the OS is not necessarily needed.

Best regards,

Heikki Tuuri
Innobase Oy
---
InnoDB - transactions, hot backup, and foreign key support for MySQL
See http://www.innodb.com, download MySQL-Max from http://www.mysql.com

ChangeSet
  1.1097 02/07/19 08:33:52 [EMAIL PROTECTED] +1 -0
  os0file.c:
    Align the buffer used in initing a data file to zero; this may be
    needed if the data file is actually a raw device

  innobase/os/os0file.c
    1.39 02/07/19 08:33:36 [EMAIL PROTECTED] +6 -2
    Align the buffer used in initing a data file to zero; this may be
    needed if the data file is actually a raw device

# This is a BitKeeper patch.  What follows are the unified diffs for the
# set of deltas contained in the patch.  The rest of the patch, the part
# that BitKeeper cares about, is below these diffs.
# User: heikki
# Host: hundin.mysql.fi
# Root: /home/heikki/mysql3

--- 1.38/innobase/os/os0file.c  Mon Jul  8 19:28:42 2002
+++ 1.39/innobase/os/os0file.c  Fri Jul 19 08:33:36 2002
@@ -690,6 +690,7 @@
     ulint   n_bytes;
     ibool   ret;
     byte*   buf;
+    byte*   buf2;
     ulint   i;

     ut_a(size == (size & 0xFFFFFFFF));
@@ -697,7 +698,10 @@
     /* We use a very big 8 MB buffer in writing because Linux may be
     extremely slow in fsync on 1 MB writes */

-    buf = ut_malloc(UNIV_PAGE_SIZE * 512);
+    buf2 = ut_malloc(UNIV_PAGE_SIZE * 513);
+
+    /* Align the buffer for possible raw i/o */
+    buf = ut_align(buf2, UNIV_PAGE_SIZE);

     /* Write buffer full of zeros */
     for (i = 0; i < UNIV_PAGE_SIZE * 512; i++) {
@@ -725,7 +729,7 @@
         offset += n_bytes;
     }

-    ut_free(buf);
+    ut_free(buf2);

     ret = os_file_flush(file);

----- Original Message -----
From: "Christian Jaeger" <[EMAIL PROTECTED]>
Newsgroups: mailing.database.mysql
Sent: Friday, July 19, 2002 6:37 AM
Subject: Innodb and unbuffered raw io on linux?

> Hello Heikki and all,
>
> I've already asked about this a year ago, but didn't finish my
> investigations then.
>
> What's the status with innodb and *unbuffered raw* io on linux?
>
> The manual describes the use of the "newraw" and "raw" options, and I
> know these work on disk devices (like /dev/sda8), but this isn't raw
> io, it's still cached by the kernel and so takes up RAM additional to
> the cache from innodb (as well as a bit CPU to copy over the data
> between kernel and user space).
> If you want to do direct IO, the use
> of the 'raw' tool to set up a 'raw character device' mapped to the
> disk block device is needed:
>
> cd /dev
> mkdir raw
> umask 077
> mknod rawctl u 162 0
> umask 007
> mknod raw/raw1 u 162 1
> mknod raw/raw2 u 162 2
> chgrp mysql raw/raw1
> # ^- I'm not sure whether the access rights of the mapped device
> # take precedence over those of the original block device, though
> raw raw/raw1 sda8
>
> I've tried Mysql with this config:
> #innodb_data_file_path=/dev/sda8:1906Mraw <- did work, but buffered
> innodb_data_file_path=/dev/raw/raw1:1906Mraw
>
> 020719 00:59:24 mysqld started
> InnoDB: Operating system error number 22 in a file operation.
> InnoDB: See http://www.innodb.com/ibman.html for installation help.
> InnoDB: Look from section 13.2 at http://www.innodb.com/ibman.html
> InnoDB: what the error number means or use the perror program of MySQL.
> InnoDB: Cannot continue operation.
> 020719 00:59:25 mysqld ended
>
> perror 22
> Error code 22: Invalid argument
>
> This error code is typical for when buffers are not aligned to sector
> sized memory boundaries, which is necessary for unbuffered io to work
> on linux.
> I've written an experimental program that shows this and put it here:
> http://pflanze.mine.nu/~chris/mysql/o_direct.c
>
> So I guess Innodb is not ready for unbuffered io. I'm also guessing
> that it's probably not that easy to achieve good performance with
> unbuffered io, since you would probably have to do readahead and so
> on yourself.
>
> I'm also unsure about the current status of rawio in linux (2.4).
> Reading on http://oss.sgi.com/projects/rawio/ (under the FAQ), they
> say that they have a "better" implementation than the one from
> Stephen Tweedie/Redhat. But the code in kernel 2.4 seems to be only
> the one from Stephen Tweedie.
> This is what the source code of the 'dd' tool (as found in
> Debian/testing) shows, btw:
> /* ...
> The page alignment is necessary on any linux system that supports
> either the SGI raw I/O patch or Stephen Tweedies raw I/O patch.
> It is necessary when accessing raw (i.e. character special) disk
> devices on Unixware or other SVR4-derived system. */
>
>
> Hope this helps a bit.
> What do you think about it?
> I could put a bit of time aside for testing (or maybe more, but who
> would pay me?...:)
>
> Cheers,
> Christian.
> --
> Christian Jaeger Programmer & System Engineer +41 1 430 45 26
> ETHLife CMS Project - www.ethlife.ethz.ch/newcms - www.ethlife.ethz.ch
>
> ---------------------------------------------------------------------
> Before posting, please check:
> http://www.mysql.com/manual.php (the manual)
> http://lists.mysql.com/ (the list archive)
>
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail <[EMAIL PROTECTED]>
> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
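
For anyone who wants to try the alignment idea from the patch outside of InnoDB, here is a minimal standalone sketch in plain C. It allocates one page more than needed, rounds the pointer up to the next page boundary, and keeps the original pointer for free(). The names PAGE_SIZE_BYTES, align_up and alloc_zeroed_aligned are hypothetical and are not InnoDB functions; the 16 KB page size is an assumption standing in for UNIV_PAGE_SIZE, and ut_malloc/ut_align may differ in detail from what is shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed stand-in for UNIV_PAGE_SIZE (16 KB); not taken from InnoDB headers. */
#define PAGE_SIZE_BYTES ((size_t)16384)

/* Round ptr up to the next multiple of align.
   align must be a nonzero power of two. (Hypothetical helper.) */
static void *align_up(void *ptr, size_t align)
{
    uintptr_t p = (uintptr_t)ptr;

    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((p + align - 1) & ~((uintptr_t)align - 1));
}

/* Allocate n_pages zero-filled pages starting at a page-aligned address.
   One extra page is allocated so that rounding up never runs past the
   end of the block. *raw_out receives the pointer that must later be
   passed to free(); the returned aligned pointer must NOT be freed.
   Returns NULL if the allocation fails. (Hypothetical helper.) */
static unsigned char *alloc_zeroed_aligned(size_t n_pages, void **raw_out)
{
    void *raw = malloc((n_pages + 1) * PAGE_SIZE_BYTES);
    unsigned char *buf;

    *raw_out = raw;
    if (raw == NULL) {
        return NULL;
    }

    buf = align_up(raw, PAGE_SIZE_BYTES);
    memset(buf, 0, n_pages * PAGE_SIZE_BYTES);
    return buf;
}
```

A caller would pass the returned aligned pointer to its write calls and, when finished, call free() on the pointer stored in *raw_out, in the same way the patch frees buf2 rather than buf.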