Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-30 Thread Andrew Grover
On 3/30/06, Kumar Gala <[EMAIL PROTECTED]> wrote:
> I was under the impression that the DMA engine would provide a "sync"
> cpu based memcpy (PIO) if a real HW channel wasn't avail, if this is
> left to the client that's fine.  So how does the client know he
> should use normal memcpy()?

It has to keep track of what DMA channel to use, which it gets when
the channel ADDED callback happens. So it's basically

if (some_client_struct->dma_chan)
dma_memcpy()
else
memcpy()

The async memcpy has the added requirement that at some point the
client must verify the copies have been completed, so doing async
memcopies does require more work on the client's part.

> Sounds good for a start.  Have you given any thoughts on handling
> priorities between clients?
>
> I need to take a look at the latest patches. How would you guys like
> modifications?

Haven't given any thought to priorities yet -- we've been focusing on
getting the 1 client case to perform well. :)

Chris posted a link to this: git://198.78.49.142/~cleech/linux-2.6
branch ioat-2.6.17

So you can post patches against that, or the patches posted here apply
against davem's git tree.

Regards -- Andy
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-30 Thread Kumar Gala


On Mar 29, 2006, at 5:05 PM, Andrew Grover wrote:


On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote:

Do you only get callback when a channel is available?


Yes


How do you
decide to do to provide PIO to the client?


The client is responsible for using any channels it gets, or falling
back to memcpy() if it doesn't get any. (I don't understand how PIO
comes into the picture..?)


I was under the impression that the DMA engine would provide a "sync"  
cpu based memcpy (PIO) if a real HW channel wasn't avail, if this is  
left to the client that's fine.  So how does the client know he  
should use normal memcpy()?



A client should only request multiple channel to handle multiple
concurrent operations.


Correct, if there aren't any CPU concurrency issues then 1 channel
will use the device's full bandwidth (unless some other client has
acquired the other channels and is using them, of course.)

This gets around the problem of DMA clients registering (and  
therefore

not getting) channels simply because they init before the DMA device
is discovered.


What do you expect to happen in a system in which the channels are
over subscribed?

Do you expect the DMA device driver to handle scheduling of channels
between multiple clients?


It does the simplest thing that could possibly work right now:
channels are allocated first come first serve. When there is a need,
it should be straightforward to allow multiple clients to share DMA
channels.


Sounds good for a start.  Have you given any thoughts on handling  
priorities between clients?


I need to take a look at the latest patches. How would you guys like  
modifications?


- k
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-29 Thread Andrew Grover
On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote:
> Do you only get callback when a channel is available?

Yes

> How do you
> decide to do to provide PIO to the client?

The client is responsible for using any channels it gets, or falling
back to memcpy() if it doesn't get any. (I don't understand how PIO
comes into the picture..?)

> A client should only request multiple channel to handle multiple
> concurrent operations.

Correct, if there aren't any CPU concurrency issues then 1 channel
will use the device's full bandwidth (unless some other client has
acquired the other channels and is using them, of course.)

> > This gets around the problem of DMA clients registering (and therefore
> > not getting) channels simply because they init before the DMA device
> > is discovered.
>
> What do you expect to happen in a system in which the channels are
> over subscribed?
>
> Do you expect the DMA device driver to handle scheduling of channels
> between multiple clients?

It does the simplest thing that could possibly work right now:
channels are allocated first come first serve. When there is a need,
it should be straightforward to allow multiple clients to share DMA
channels.

Regards -- Andy
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-28 Thread Kumar Gala


On Mar 28, 2006, at 4:01 PM, Andrew Grover wrote:


On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote:

Also, what do you think about adding an operation type (MEMCPY,  
XOR,

CRYPTO_AES, etc).  We can than validate if the operation type
expected is supported by the devices that exist.


No objections, but this speculative support doesn't need to be in  
our

initial patchset.


I don't consider it speculative.  The patch is for a generic DMA
engine interface.  That interface should encompass all users.  I have
a security/crypto DMA engine that I'd like to front with the generic
DMA interface today.  Also, I believe there is another Intel group
with an XOR engine that had a similar concept called ADMA posted a
while ago.


Please submit patches then. We will be doing another rev of the I/OAT
patch very soon, which you will be able to patch against. Or, once the
patch gets in mainline then we can enhance it. Code in the Linux
kernel is never "done", and the burden of implementing additional
functionality falls on those who want it.


I completely understand that.  However, I think putting something  
into mainline that only works or solves the particular problem you  
have is a bad idea.  I'll provide patches for the changes I'd like to  
see.  However, I figured a little discussion on the subject before I  
went off an spent time on it was worth while.



Can you explain what the semantics are.

It's been a little while since I posted so my thoughts on the subject
are going to take a little while to come back to me :)


Yeah. Basically you register as a DMA client, and say how many DMA
channels you want. Our net_dma patch for example uses multiple
channels to help lock contention. Then when channels are available
(i.e. a DMA device added or another client gives them up) then you get
a callback. If the channel goes away (i.e. DMA device is removed
(theoretically possible but practically never happens) or *you* are
going away and change your request to 0 channels) then you get a
remove callback.


Do you only get callback when a channel is available?  How do you  
decide to do to provide PIO to the client?


A client should only request multiple channel to handle multiple  
concurrent operations.



This gets around the problem of DMA clients registering (and therefore
not getting) channels simply because they init before the DMA device
is discovered.


What do you expect to happen in a system in which the channels are  
over subscribed?


Do you expect the DMA device driver to handle scheduling of channels  
between multiple clients?


- kumar

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-28 Thread Andrew Grover
On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote:

> >> Also, what do you think about adding an operation type (MEMCPY, XOR,
> >> CRYPTO_AES, etc).  We can than validate if the operation type
> >> expected is supported by the devices that exist.
> >
> > No objections, but this speculative support doesn't need to be in our
> > initial patchset.
>
> I don't consider it speculative.  The patch is for a generic DMA
> engine interface.  That interface should encompass all users.  I have
> a security/crypto DMA engine that I'd like to front with the generic
> DMA interface today.  Also, I believe there is another Intel group
> with an XOR engine that had a similar concept called ADMA posted a
> while ago.

Please submit patches then. We will be doing another rev of the I/OAT
patch very soon, which you will be able to patch against. Or, once the
patch gets in mainline then we can enhance it. Code in the Linux
kernel is never "done", and the burden of implementing additional
functionality falls on those who want it.

> Can you explain what the semantics are.
>
> It's been a little while since I posted so my thoughts on the subject
> are going to take a little while to come back to me :)

Yeah. Basically you register as a DMA client, and say how many DMA
channels you want. Our net_dma patch for example uses multiple
channels to help lock contention. Then when channels are available
(i.e. a DMA device added or another client gives them up) then you get
a callback. If the channel goes away (i.e. DMA device is removed
(theoretically possible but practically never happens) or *you* are
going away and change your request to 0 channels) then you get a
remove callback.

This gets around the problem of DMA clients registering (and therefore
not getting) channels simply because they init before the DMA device
is discovered.

Regards -- Andy
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-28 Thread Kumar Gala


On Mar 28, 2006, at 12:44 PM, Andrew Grover wrote:


On 3/16/06, Kumar Gala <[EMAIL PROTECTED]> wrote:

It would seem that when a client registers (or shortly there after
when they call dma_async_client_chan_request()) they would expect to
get the number of channels they need by some given time period.

For example, lets say a client registers but no dma device exists.
They will never get called to be aware of this condition.

I would think most clients would either spin until they have all the
channels they need or fall back to a non-async mechanism.


Clients *are* expected to fall back to non-async if they are not given
channels. The reason it was implemented with callbacks for
added/removed was that the client may be initializing before the
channels are enumerated. For example, the net subsystem will ask for
channels and not get them for a while, until the ioatdma PCI device is
found and its driver loads. In this scenario, we'd like the net
subsystem to be given these channels, instead of them going unused.


Fair, I need to think on that a little more.


Also, what do you think about adding an operation type (MEMCPY, XOR,
CRYPTO_AES, etc).  We can than validate if the operation type
expected is supported by the devices that exist.


No objections, but this speculative support doesn't need to be in our
initial patchset.


I don't consider it speculative.  The patch is for a generic DMA  
engine interface.  That interface should encompass all users.  I have  
a security/crypto DMA engine that I'd like to front with the generic  
DMA interface today.  Also, I believe there is another Intel group  
with an XOR engine that had a similar concept called ADMA posted a  
while ago.


http://marc.theaimsgroup.com/?t=11260312014&r=1&w=2


Shouldn't we also have a dma_async_client_chan_free()?


Well we could just define it to be chan_request(0) but it doesn't seem
to be needed. Also, the allocation mechanism we have for channels is
different from alloc/free's semantics, so it may be best to not muddy
the water in this area.


Can you explain what the semantics are.

It's been a little while since I posted so my thoughts on the subject  
are going to take a little while to come back to me :)


- kumar
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-28 Thread Andrew Grover
On 3/16/06, Kumar Gala <[EMAIL PROTECTED]> wrote:
> It would seem that when a client registers (or shortly there after
> when they call dma_async_client_chan_request()) they would expect to
> get the number of channels they need by some given time period.
>
> For example, lets say a client registers but no dma device exists.
> They will never get called to be aware of this condition.
>
> I would think most clients would either spin until they have all the
> channels they need or fall back to a non-async mechanism.

Clients *are* expected to fall back to non-async if they are not given
channels. The reason it was implemented with callbacks for
added/removed was that the client may be initializing before the
channels are enumerated. For example, the net subsystem will ask for
channels and not get them for a while, until the ioatdma PCI device is
found and its driver loads. In this scenario, we'd like the net
subsystem to be given these channels, instead of them going unused.

> Also, what do you think about adding an operation type (MEMCPY, XOR,
> CRYPTO_AES, etc).  We can than validate if the operation type
> expected is supported by the devices that exist.

No objections, but this speculative support doesn't need to be in our
initial patchset.

> Shouldn't we also have a dma_async_client_chan_free()?

Well we could just define it to be chan_request(0) but it doesn't seem
to be needed. Also, the allocation mechanism we have for channels is
different from alloc/free's semantics, so it may be best to not muddy
the water in this area.

Regards -- Andy
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-16 Thread Kumar Gala

[snip]


+/**
+ * dma_async_client_register - allocate and register a &dma_client
+ * @event_callback: callback for notification of channel addition/ 
removal

+ */
+struct dma_client *dma_async_client_register(dma_event_callback  
event_callback)

+{
+   struct dma_client *client;
+
+   client = kzalloc(sizeof(*client), GFP_KERNEL);
+   if (!client)
+   return NULL;
+
+   INIT_LIST_HEAD(&client->channels);
+   spin_lock_init(&client->lock);
+
+   client->chans_desired = 0;
+   client->chan_count = 0;
+   client->event_callback = event_callback;
+
+   spin_lock(&dma_list_lock);
+   list_add_tail(&client->global_node, &dma_client_list);
+   spin_unlock(&dma_list_lock);
+
+   return client;
+}


It would seem that when a client registers (or shortly there after  
when they call dma_async_client_chan_request()) they would expect to  
get the number of channels they need by some given time period.


For example, lets say a client registers but no dma device exists.   
They will never get called to be aware of this condition.


I would think most clients would either spin until they have all the  
channels they need or fall back to a non-async mechanism.


Also, what do you think about adding an operation type (MEMCPY, XOR,  
CRYPTO_AES, etc).  We can than validate if the operation type  
expected is supported by the devices that exist.



+
+/**
+ * dma_async_client_unregister - unregister a client and free the  
&dma_client

+ * @client:
+ *
+ * Force frees any allocated DMA channels, frees the &dma_client  
memory

+ */
+void dma_async_client_unregister(struct dma_client *client)
+{
+   struct dma_chan *chan;
+
+   if (!client)
+   return;
+
+   rcu_read_lock();
+   list_for_each_entry_rcu(chan, &client->channels, client_node) {
+   dma_client_chan_free(chan);
+   }
+   rcu_read_unlock();
+
+   spin_lock(&dma_list_lock);
+   list_del(&client->global_node);
+   spin_unlock(&dma_list_lock);
+
+   kfree(client);
+   dma_chans_rebalance();
+}
+
+/**
+ * dma_async_client_chan_request - request DMA channels
+ * @client: &dma_client
+ * @number: count of DMA channels requested
+ *
+ * Clients call dma_async_client_chan_request() to specify how many
+ * DMA channels they need, 0 to free all currently allocated.
+ * The resulting allocations/frees are indicated to the client via  
the

+ * event callback.
+ */
+void dma_async_client_chan_request(struct dma_client *client,
+   unsigned int number)
+{
+   client->chans_desired = number;
+   dma_chans_rebalance();
+}
+


Shouldn't we also have a dma_async_client_chan_free()?

[snip]


+/* --- public DMA engine API --- */
+
+struct dma_client *dma_async_client_register(dma_event_callback  
event_callback);

+void dma_async_client_unregister(struct dma_client *client);
+void dma_async_client_chan_request(struct dma_client *client,
+   unsigned int number);
+
+/**
+ * dma_async_memcpy_buf_to_buf - offloaded copy between virtual  
addresses

+ * @chan: DMA channel to offload copy to
+ * @dest: destination address (virtual)
+ * @src: source address (virtual)
+ * @len: length
+ *
+ * Both @dest and @src must be mappable to a bus address according  
to the

+ * DMA mapping API rules for streaming mappings.
+ * Both @dest and @src must stay memory resident (kernel memory or  
locked

+ * user space pages)
+ */
+static inline dma_cookie_t dma_async_memcpy_buf_to_buf(struct  
dma_chan *chan,

+   void *dest, void *src, size_t len)
+{
+   int cpu = get_cpu();
+   per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
+   per_cpu_ptr(chan->local, cpu)->memcpy_count++;
+   put_cpu();
+
+   return chan->device->device_memcpy_buf_to_buf(chan, dest, src, len);
+}


What about renaming the dma_async_memcpy_* to something like  
dma_async_op_* and have them take an additional operation argument.



+
+/**
+ * dma_async_memcpy_buf_to_pg - offloaded copy
+ * @chan: DMA channel to offload copy to
+ * @page: destination page
+ * @offset: offset in page to copy to
+ * @kdata: source address (virtual)
+ * @len: length
+ *
+ * Both @page/@offset and @kdata must be mappable to a bus address  
according

+ * to the DMA mapping API rules for streaming mappings.
+ * Both @page/@offset and @kdata must stay memory resident (kernel  
memory or

+ * locked user space pages)
+ */
+static inline dma_cookie_t dma_async_memcpy_buf_to_pg(struct  
dma_chan *chan,

+   struct page *page, unsigned int offset, void *kdata, size_t len)
+{
+   int cpu = get_cpu();
+   per_cpu_ptr(chan->local, cpu)->bytes_transferred += len;
+   per_cpu_ptr(chan->local, cpu)->memcpy_count++;
+   put_cpu();
+
+   return chan->device->device_memcpy_buf_to_pg(chan, page, offset,
+kdata, len);
+}
+
+/**
+ * dma_async_memcpy_buf_to_pg - offloaded copy
+ * @chan: DMA cha

Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-14 Thread Pavel Machek
Hi!

> --- /dev/null
> +++ b/drivers/dma/dmaengine.c
> @@ -0,0 +1,360 @@
> +/*
> +Copyright(c) 2004 - 2006 Intel Corporation. All rights reserved.
> +
> +This program is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 2 of the License, or (at your option)
> +any later version.
> +
> +This program is distributed in the hope that it will be useful, but WITHOUT
> +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> +more details.
> +
> +You should have received a copy of the GNU General Public License along with
> +this program; if not, write to the Free Software Foundation, Inc., 59
> +Temple Place - Suite 330, Boston, MA  02111-1307, USA.
> +
> +The full GNU General Public License is included in this distribution in the
> +file called LICENSE.
> +*/


Could you use 
/*
 *
 */

comment style, and describe in one or two lines what the source does
in the header?

Pavel
-- 
209:using System.IO;
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-11 Thread Andrew Morton
Chris Leech <[EMAIL PROTECTED]> wrote:
>
> +void dma_async_device_cleanup(struct kref *kref);
>

Declarations go in header files, please.  Or give it static scope.

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-06 Thread Chris Leech
On 3/4/06, Benjamin LaHaise <[EMAIL PROTECTED]> wrote:
> On Fri, Mar 03, 2006 at 01:42:20PM -0800, Chris Leech wrote:
> > +void dma_async_device_unregister(struct dma_device* device)
> > +{
> ...
> > + kref_put(&device->refcount, dma_async_device_cleanup);
> > + wait_for_completion(&device->done);
> > +}
>
> This looks like a bug: device is dereferenced after it is potentially
> freed.

Actually, this is where the code is waiting to make sure it's safe to
free device.  The release function for the kref completes
device->done.  Each of the devices channels holds a reference to the
device.  When a device is unregistered it's channels are removed from
the clients, which hold a reference for each outstanding transaction. 
When all the outstanding transactions complete, the channels kref goes
to 0, and the reference to the device is dropped.  When the device
kref goes to 0 the completion is set, and it's now safe to free the
memory for the device and channel structures.

I have a writeup of the locking and reference counting that I'll
finish and add in as a big comment to the code.

-Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-06 Thread Chris Leech
On 3/3/06, David S. Miller <[EMAIL PROTECTED]> wrote:
> > +static spinlock_t dma_list_lock;
>
> Please use DEFINE_SPINLOCK().
>
> > +static void dma_chan_free_rcu(struct rcu_head *rcu) {
>
> Newline before the brace please.
>
> > +static void dma_async_device_cleanup(struct kref *kref) {
>
> Newline before the brace please.
>
> > +struct dma_chan_percpu
> > +{
>
> Left brace on the same line as "struct dma_chan_percpu" please.
>
> > +struct dma_chan
> > +{
>
> Similarly.
>
> Otherwise this patch looks mostly ok.

Thanks Dave,

I'll apply these and other feedback and get updated patches generated.

- Chris
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-04 Thread Benjamin LaHaise
On Fri, Mar 03, 2006 at 01:42:20PM -0800, Chris Leech wrote:
> +void dma_async_device_unregister(struct dma_device* device)
> +{
...
> + kref_put(&device->refcount, dma_async_device_cleanup);
> + wait_for_completion(&device->done);
> +}

This looks like a bug: device is dereferenced after it is potentially 
freed.

-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem

2006-03-03 Thread David S. Miller
From: Chris Leech <[EMAIL PROTECTED]>
Date: Fri, 03 Mar 2006 13:42:20 -0800

> +static spinlock_t dma_list_lock;

Please use DEFINE_SPINLOCK().

> +static void dma_chan_free_rcu(struct rcu_head *rcu) {

Newline before the brace please.

> +static void dma_async_device_cleanup(struct kref *kref) {

Newline before the brace please.

> +struct dma_chan_percpu
> +{

Left brace on the same line as "struct dma_chan_percpu" please.

> +struct dma_chan
> +{

Similarly.

Otherwise this patch looks mostly ok.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html