Re: Zero Copy IO

2001-04-09 Thread Kai Makisara

This answer is longish, so I am sending it only to linux-scsi to save
linux-kernel bandwidth.

On Sun, 8 Apr 2001, Alex Q Chen wrote:

> I am trying to find a way to pin down user space memory from the kernel, so
> that these user space buffers can be used for direct IO transfer, otherwise
> known as "zero-copy IO".  Searching the Internet and
> reading comments on various newsgroups, it would appear that most
> developers, including Linus himself, don't believe in the benefit of "zero-copy
> IO".  Most of the discussion, however, was based on network card
> drivers.  For certain other drivers, such as the SCSI tape driver, which need to
> handle a great deal of data transfer, it would still seem more
> advantageous to enable zero-copy IO than to copy_from_user() and copy_to_user()
> all the data.  Other OSes such as AIX and OS/2 have kernel functions that
> can be used to accomplish such a task.  Has any ground work been done
> in Linux 2.4 to enable "zero-copy IO"?

Whether zero-copy for tapes is more efficient than the current method
using a kernel buffer depends on several factors and there is no unique
answer. If we want to maximize the usage of a CPU in a multitasking system
and don't care about the speed of the task writing to tape, zero-copy may
be better. This is assuming the transfers are long enough compared to the
increased overhead. (I would like to see someone provide a quantitative
answer as to the minimum transfer size at which zero-copy becomes
advantageous.)

On the other hand, if we want to maximize the throughput of the task
writing to tape, this gets more complicated. The memory-memory transfers
are much faster than the memory-tape buffer transfers. If we have some
spare bus cycles, the extra transfer is not a decisive factor.

The current tape driver uses by default so-called "asynchronous writes".
This means that a SCSI write is started and the write() function does not
wait for it to finish. The status is checked at the next write(). The user
task can use the SCSI transaction time to collect more data for writing.
This is not possible with zero-copy if we want to retain the Unix write
semantics. (Well, we are not completely obeying the semantics with
asynchronous writes because error reporting is delayed. However, this is
the case anyway if we enable drive buffering. Without drive buffering we
don't get decent throughput.)
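As a rough user-space illustration of this pattern (all names here are
hypothetical; the real st driver does this inside the kernel), an
asynchronous write copies the data into a driver buffer, starts the
device transfer, and returns immediately, with the previous transfer's
status reported one call late:

```c
/*
 * User-space sketch of the st driver's "asynchronous write" behaviour.
 * All names are hypothetical. write() copies the data into a driver
 * buffer, kicks off the device transfer, and returns at once; the
 * status of that transfer is only reported at the NEXT write(), which
 * is the deferred error reporting mentioned above.
 */
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define TAPE_BUF_SZ 4096

static char tape_buf[TAPE_BUF_SZ]; /* stands in for the kernel buffer */
static pthread_t worker;
static int in_flight = 0;          /* a device write is outstanding   */
static int last_status = 0;        /* result of the previous transfer */

static void *device_write(void *arg)
{
    (void)arg;                     /* pretend to move tape_buf to tape */
    last_status = 0;               /* always "succeeds" in this toy    */
    return NULL;
}

/* Returns len immediately; a previous failure shows up one call late. */
static int async_write(const char *data, size_t len)
{
    if (in_flight) {
        pthread_join(worker, NULL);   /* wait only if still running     */
        in_flight = 0;
        if (last_status < 0)
            return last_status;       /* deferred error from LAST write */
    }
    memcpy(tape_buf, data, len);      /* the copy zero-copy would avoid */
    in_flight = 1;
    pthread_create(&worker, NULL, device_write, NULL);
    return (int)len;
}
```

The user task can keep filling the next buffer while device_write() runs,
which is exactly the overlap that a strict zero-copy write() would lose.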

In fixed block mode the driver can buffer several write()s worth of data
and this may increase throughput. The same applies to reading in fixed
block mode. Reading in variable block mode is a case where no such
speedup is available.
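A toy model of that fixed-block buffering (block and buffer sizes
invented for illustration): several small write()s accumulate in one
driver buffer and reach the device as a single larger command:

```c
/*
 * Toy model of fixed-block buffering, user space only; sizes are
 * invented. Four 512-byte write()s are coalesced into one 2 KB device
 * command, so eight application writes cost only two "SCSI writes".
 */
#include <assert.h>
#include <string.h>

#define BLK_SZ   512
#define BLKS_BUF 4

static char blk_buf[BLK_SZ * BLKS_BUF];
static size_t blk_fill = 0;
static int scsi_cmds = 0;          /* device commands actually issued */

static size_t fixed_block_write(const char *blk)
{
    memcpy(blk_buf + blk_fill, blk, BLK_SZ);
    blk_fill += BLK_SZ;
    if (blk_fill == sizeof(blk_buf)) {  /* buffer full: one transfer */
        scsi_cmds++;
        blk_fill = 0;
    }
    return BLK_SZ;
}
```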

If we use kernel space buffers, we can increase the parallelism without
changes to the user programs. How far this is useful is another question.
When using zero-copy, a program can use parallelism but this requires
changes to the program (multiple buffers and some kind of asynchronous i/o
framework).

Earlier it was easy to postpone any discussion about using zero-copy
in the drivers. As you have seen from the answers from others, this is
changing, and fairly soon we will have the necessary support for
zero-copy in the kernel. I am not against modifying the tape driver to use
zero-copy, but before doing it I would like to see/do a quantitative
analysis of the advantages and disadvantages in the common and not so
common cases.

Kai


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]



Re: Zero Copy IO

2001-04-09 Thread Jeremy Jackson

Douglas Gilbert wrote:

> "Alex Q Chen" <[EMAIL PROTECTED]> wrote:
>
> > I am trying to find a way to pin down user space
> > memory from the kernel, so that these user space buffers
> > can be used for direct IO transfer, otherwise
> > known as "zero-copy IO".  Searching the
> > Internet and reading comments on various newsgroups,
> > it would appear that most developers, including Linus
> > himself, don't believe in the benefit of "zero-copy
> > IO".  Most of the discussion, however, was based
> > on network card drivers.  For certain other drivers,
> > such as the SCSI tape driver, which need to handle a great
> > deal of data transfer, it would still seem more
> > advantageous to enable zero-copy IO than to copy_from_user()
> > and copy_to_user() all the data.  Other OSes such as AIX
> > and OS/2 have kernel functions that can be used to
> > accomplish such a task.  Has any ground work been done
> > in Linux 2.4 to enable "zero-copy IO"?
>
> Alex,
> The kiobufs mechanism in the 2.4 series is the appropriate
> tool for avoiding copy_from_user() and copy_to_user().
> The definitive driver is in drivers/char/raw.c which
> does synchronous IO to block devices such as disks
> (but is probably not appropriate for tapes).
>
> The SCSI generic (sg) driver supports direct IO. The driver
> in lk 2.4.3 has the direct IO code commented out while
> a version that I'm currently testing (sg 3.1.18 at
> www.torque.net/sg) has its direct IO code activated. I have
> a web page comparing throughput times and CPU utilizations
> at http://www.torque.net/sg/rbuf_tbl.html . My testing
> indicates that the kiobufs mechanism is now working
> quite well. For various reasons I still think that it
> is best to default to indirect IO and let speed hungry
> users enable dio (which is done in sg via procfs). Even
> when the user selects direct IO it should be possible to
> fall back to indirect IO. [Sg does this when a SCSI
> adapter can't support direct IO (e.g. an ISA adapter).]
>
> Since the SCSI tape (st) driver is structurally similar
> to sg, it should be possible to add direct IO support
> to st.
>
> One thing to note is that when you let the user provide
> the buffer for direct IO (e.g. with malloc) then on
> the i386 it won't be contiguous from a bus address POV.
> This means large scatter gather lists (typically with
> each element 4 KB on i386) which can be time consuming
> to load on some SCSI adapters. One way around this would
> be for a driver to provide "malloc/free" like ioctls.

I'm not very knowledgeable, but doesn't the sound driver
use mmap() to do this?  Either way, AGP GART is
basically a paged MMU allowing non-contiguous physical memory
to be made to look contiguous from the AGP *and* from PCI
(on most chipsets?).  Perhaps this would be helpful.

Large contiguous physical allocations seem to be difficult currently,
but would be nice, as they would allow the use of larger MMU pages
on many CPUs.  Someone mentioned that reverse page table support
would be required first...

>
> Doug Gilbert




Re: Zero Copy IO

2001-04-08 Thread Douglas Gilbert

"Alex Q Chen" <[EMAIL PROTECTED]> wrote:

> I am trying to find a way to pin down user space
> memory from the kernel, so that these user space buffers
> can be used for direct IO transfer, otherwise
> known as "zero-copy IO".  Searching the
> Internet and reading comments on various newsgroups,
> it would appear that most developers, including Linus
> himself, don't believe in the benefit of "zero-copy
> IO".  Most of the discussion, however, was based
> on network card drivers.  For certain other drivers,
> such as the SCSI tape driver, which need to handle a great
> deal of data transfer, it would still seem more
> advantageous to enable zero-copy IO than to copy_from_user()
> and copy_to_user() all the data.  Other OSes such as AIX
> and OS/2 have kernel functions that can be used to
> accomplish such a task.  Has any ground work been done
> in Linux 2.4 to enable "zero-copy IO"?

Alex,
The kiobufs mechanism in the 2.4 series is the appropriate
tool for avoiding copy_from_user() and copy_to_user().
The definitive driver is in drivers/char/raw.c which
does synchronous IO to block devices such as disks
(but is probably not appropriate for tapes).
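From memory, the 2.4 kiobuf interface referred to here looks roughly
like the following kernel-space sketch (signatures recalled from the
2.4 sources rather than checked; treat the details as approximate):

```c
/*
 * Kernel-space sketch (not runnable user code) of pinning a user
 * buffer with the 2.4 kiobuf mechanism; error handling omitted.
 */
struct kiobuf *iobuf;

alloc_kiovec(1, &iobuf);                        /* get a kiobuf       */
map_user_kiobuf(READ, iobuf,                    /* pin the user pages */
                (unsigned long)user_addr, len);
/* ... build the scatter-gather list from iobuf->maplist[] and
 * issue the SCSI command against those pinned pages ... */
unmap_kiobuf(iobuf);                            /* unpin              */
free_kiovec(1, &iobuf);
```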

The SCSI generic (sg) driver supports direct IO. The driver
in lk 2.4.3 has the direct IO code commented out while
a version that I'm currently testing (sg 3.1.18 at
www.torque.net/sg) has its direct IO code activated. I have
a web page comparing throughput times and CPU utilizations
at http://www.torque.net/sg/rbuf_tbl.html . My testing
indicates that the kiobufs mechanism is now working
quite well. For various reasons I still think that it 
is best to default to indirect IO and let speed hungry
users enable dio (which is done in sg via procfs). Even 
when the user selects direct IO it should be possible to
fall back to indirect IO. [Sg does this when a SCSI
adapter can't support direct IO (e.g. an ISA adapter).]

Since the SCSI tape (st) driver is structurally similar 
to sg, it should be possible to add direct IO support 
to st.

One thing to note is that when you let the user provide
the buffer for direct IO (e.g. with malloc) then on
the i386 it won't be contiguous from a bus address POV.
This means large scatter gather lists (typically with
each element 4 KB on i386) which can be time consuming
to load on some SCSI adapters. One way around this would
be for a driver to provide "malloc/free" like ioctls.

Doug Gilbert