Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On 3/30/06, Kumar Gala <[EMAIL PROTECTED]> wrote: > I was under the impression that the DMA engine would provide a "sync" > cpu based memcpy (PIO) if a real HW channel wasn't avail, if this is > left to the client that's fine. So how does the client know he > should use normal memcpy()? It has to keep track of what DMA channel to use, which it gets when the channel ADDED callback happens. So it's basically if (some_client_struct->dma_chan) dma_memcpy() else memcpy() The async memcpy has the added requirement that at some point the client must verify the copies have been completed, so doing async memcopies does require more work on the client's part. > Sounds good for a start. Have you given any thoughts on handling > priorities between clients? > > I need to take a look at the latest patches. How would you guys like > modifications? Haven't given any thought to priorities yet -- we've been focusing on getting the 1 client case to perform well. :) Chris posted a link to this: git://198.78.49.142/~cleech/linux-2.6 branch ioat-2.6.17 So you can post patches against that, or the patches posted here apply against davem's git tree. Regards -- Andy - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On Mar 29, 2006, at 5:05 PM, Andrew Grover wrote: On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote: Do you only get callback when a channel is available? Yes How do you decide to do to provide PIO to the client? The client is responsible for using any channels it gets, or falling back to memcpy() if it doesn't get any. (I don't understand how PIO comes into the picture..?) I was under the impression that the DMA engine would provide a "sync" cpu based memcpy (PIO) if a real HW channel wasn't avail, if this is left to the client that's fine. So how does the client know he should use normal memcpy()? A client should only request multiple channel to handle multiple concurrent operations. Correct, if there aren't any CPU concurrency issues then 1 channel will use the device's full bandwidth (unless some other client has acquired the other channels and is using them, of course.) This gets around the problem of DMA clients registering (and therefore not getting) channels simply because they init before the DMA device is discovered. What do you expect to happen in a system in which the channels are over subscribed? Do you expect the DMA device driver to handle scheduling of channels between multiple clients? It does the simplest thing that could possibly work right now: channels are allocated first come first serve. When there is a need, it should be straightforward to allow multiple clients to share DMA channels. Sounds good for a start. Have you given any thoughts on handling priorities between clients? I need to take a look at the latest patches. How would you guys like modifications? - k - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote: > Do you only get callback when a channel is available? Yes > How do you > decide to do to provide PIO to the client? The client is responsible for using any channels it gets, or falling back to memcpy() if it doesn't get any. (I don't understand how PIO comes into the picture..?) > A client should only request multiple channel to handle multiple > concurrent operations. Correct, if there aren't any CPU concurrency issues then 1 channel will use the device's full bandwidth (unless some other client has acquired the other channels and is using them, of course.) > > This gets around the problem of DMA clients registering (and therefore > > not getting) channels simply because they init before the DMA device > > is discovered. > > What do you expect to happen in a system in which the channels are > over subscribed? > > Do you expect the DMA device driver to handle scheduling of channels > between multiple clients? It does the simplest thing that could possibly work right now: channels are allocated first come first serve. When there is a need, it should be straightforward to allow multiple clients to share DMA channels. Regards -- Andy - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On Mar 28, 2006, at 4:01 PM, Andrew Grover wrote: On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote: Also, what do you think about adding an operation type (MEMCPY, XOR, CRYPTO_AES, etc). We can than validate if the operation type expected is supported by the devices that exist. No objections, but this speculative support doesn't need to be in our initial patchset. I don't consider it speculative. The patch is for a generic DMA engine interface. That interface should encompass all users. I have a security/crypto DMA engine that I'd like to front with the generic DMA interface today. Also, I believe there is another Intel group with an XOR engine that had a similar concept called ADMA posted a while ago. Please submit patches then. We will be doing another rev of the I/OAT patch very soon, which you will be able to patch against. Or, once the patch gets in mainline then we can enhance it. Code in the Linux kernel is never "done", and the burden of implementing additional functionality falls on those who want it. I completely understand that. However, I think putting something into mainline that only works or solves the particular problem you have is a bad idea. I'll provide patches for the changes I'd like to see. However, I figured a little discussion on the subject before I went off an spent time on it was worth while. Can you explain what the semantics are. It's been a little while since I posted so my thoughts on the subject are going to take a little while to come back to me :) Yeah. Basically you register as a DMA client, and say how many DMA channels you want. Our net_dma patch for example uses multiple channels to help lock contention. Then when channels are available (i.e. a DMA device added or another client gives them up) then you get a callback. If the channel goes away (i.e. DMA device is removed (theoretically possible but practically never happens) or *you* are going away and change your request to 0 channels) then you get a remove callback. Do you only get callback when a channel is available? How do you decide to do to provide PIO to the client? A client should only request multiple channel to handle multiple concurrent operations. This gets around the problem of DMA clients registering (and therefore not getting) channels simply because they init before the DMA device is discovered. What do you expect to happen in a system in which the channels are over subscribed? Do you expect the DMA device driver to handle scheduling of channels between multiple clients? - kumar - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On 3/28/06, Kumar Gala <[EMAIL PROTECTED]> wrote: > >> Also, what do you think about adding an operation type (MEMCPY, XOR, > >> CRYPTO_AES, etc). We can than validate if the operation type > >> expected is supported by the devices that exist. > > > > No objections, but this speculative support doesn't need to be in our > > initial patchset. > > I don't consider it speculative. The patch is for a generic DMA > engine interface. That interface should encompass all users. I have > a security/crypto DMA engine that I'd like to front with the generic > DMA interface today. Also, I believe there is another Intel group > with an XOR engine that had a similar concept called ADMA posted a > while ago. Please submit patches then. We will be doing another rev of the I/OAT patch very soon, which you will be able to patch against. Or, once the patch gets in mainline then we can enhance it. Code in the Linux kernel is never "done", and the burden of implementing additional functionality falls on those who want it. > Can you explain what the semantics are. > > It's been a little while since I posted so my thoughts on the subject > are going to take a little while to come back to me :) Yeah. Basically you register as a DMA client, and say how many DMA channels you want. Our net_dma patch for example uses multiple channels to help lock contention. Then when channels are available (i.e. a DMA device added or another client gives them up) then you get a callback. If the channel goes away (i.e. DMA device is removed (theoretically possible but practically never happens) or *you* are going away and change your request to 0 channels) then you get a remove callback. This gets around the problem of DMA clients registering (and therefore not getting) channels simply because they init before the DMA device is discovered. Regards -- Andy - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On Mar 28, 2006, at 12:44 PM, Andrew Grover wrote: On 3/16/06, Kumar Gala <[EMAIL PROTECTED]> wrote: It would seem that when a client registers (or shortly there after when they call dma_async_client_chan_request()) they would expect to get the number of channels they need by some given time period. For example, lets say a client registers but no dma device exists. They will never get called to be aware of this condition. I would think most clients would either spin until they have all the channels they need or fall back to a non-async mechanism. Clients *are* expected to fall back to non-async if they are not given channels. The reason it was implemented with callbacks for added/removed was that the client may be initializing before the channels are enumerated. For example, the net subsystem will ask for channels and not get them for a while, until the ioatdma PCI device is found and its driver loads. In this scenario, we'd like the net subsystem to be given these channels, instead of them going unused. Fair, I need to think on that a little more. Also, what do you think about adding an operation type (MEMCPY, XOR, CRYPTO_AES, etc). We can than validate if the operation type expected is supported by the devices that exist. No objections, but this speculative support doesn't need to be in our initial patchset. I don't consider it speculative. The patch is for a generic DMA engine interface. That interface should encompass all users. I have a security/crypto DMA engine that I'd like to front with the generic DMA interface today. Also, I believe there is another Intel group with an XOR engine that had a similar concept called ADMA posted a while ago. http://marc.theaimsgroup.com/?t=11260312014&r=1&w=2 Shouldn't we also have a dma_async_client_chan_free()? Well we could just define it to be chan_request(0) but it doesn't seem to be needed. Also, the allocation mechanism we have for channels is different from alloc/free's semantics, so it may be best to not muddy the water in this area. Can you explain what the semantics are. It's been a little while since I posted so my thoughts on the subject are going to take a little while to come back to me :) - kumar - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On 3/16/06, Kumar Gala <[EMAIL PROTECTED]> wrote: > It would seem that when a client registers (or shortly there after > when they call dma_async_client_chan_request()) they would expect to > get the number of channels they need by some given time period. > > For example, lets say a client registers but no dma device exists. > They will never get called to be aware of this condition. > > I would think most clients would either spin until they have all the > channels they need or fall back to a non-async mechanism. Clients *are* expected to fall back to non-async if they are not given channels. The reason it was implemented with callbacks for added/removed was that the client may be initializing before the channels are enumerated. For example, the net subsystem will ask for channels and not get them for a while, until the ioatdma PCI device is found and its driver loads. In this scenario, we'd like the net subsystem to be given these channels, instead of them going unused. > Also, what do you think about adding an operation type (MEMCPY, XOR, > CRYPTO_AES, etc). We can than validate if the operation type > expected is supported by the devices that exist. No objections, but this speculative support doesn't need to be in our initial patchset. > Shouldn't we also have a dma_async_client_chan_free()? Well we could just define it to be chan_request(0) but it doesn't seem to be needed. Also, the allocation mechanism we have for channels is different from alloc/free's semantics, so it may be best to not muddy the water in this area. Regards -- Andy - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
[snip] +/** + * dma_async_client_register - allocate and register a &dma_client + * @event_callback: callback for notification of channel addition/ removal + */ +struct dma_client *dma_async_client_register(dma_event_callback event_callback) +{ + struct dma_client *client; + + client = kzalloc(sizeof(*client), GFP_KERNEL); + if (!client) + return NULL; + + INIT_LIST_HEAD(&client->channels); + spin_lock_init(&client->lock); + + client->chans_desired = 0; + client->chan_count = 0; + client->event_callback = event_callback; + + spin_lock(&dma_list_lock); + list_add_tail(&client->global_node, &dma_client_list); + spin_unlock(&dma_list_lock); + + return client; +} It would seem that when a client registers (or shortly there after when they call dma_async_client_chan_request()) they would expect to get the number of channels they need by some given time period. For example, lets say a client registers but no dma device exists. They will never get called to be aware of this condition. I would think most clients would either spin until they have all the channels they need or fall back to a non-async mechanism. Also, what do you think about adding an operation type (MEMCPY, XOR, CRYPTO_AES, etc). We can than validate if the operation type expected is supported by the devices that exist. + +/** + * dma_async_client_unregister - unregister a client and free the &dma_client + * @client: + * + * Force frees any allocated DMA channels, frees the &dma_client memory + */ +void dma_async_client_unregister(struct dma_client *client) +{ + struct dma_chan *chan; + + if (!client) + return; + + rcu_read_lock(); + list_for_each_entry_rcu(chan, &client->channels, client_node) { + dma_client_chan_free(chan); + } + rcu_read_unlock(); + + spin_lock(&dma_list_lock); + list_del(&client->global_node); + spin_unlock(&dma_list_lock); + + kfree(client); + dma_chans_rebalance(); +} + +/** + * dma_async_client_chan_request - request DMA channels + * @client: &dma_client + * @number: count of DMA channels requested + * + * Clients call dma_async_client_chan_request() to specify how many + * DMA channels they need, 0 to free all currently allocated. + * The resulting allocations/frees are indicated to the client via the + * event callback. + */ +void dma_async_client_chan_request(struct dma_client *client, + unsigned int number) +{ + client->chans_desired = number; + dma_chans_rebalance(); +} + Shouldn't we also have a dma_async_client_chan_free()? [snip] +/* --- public DMA engine API --- */ + +struct dma_client *dma_async_client_register(dma_event_callback event_callback); +void dma_async_client_unregister(struct dma_client *client); +void dma_async_client_chan_request(struct dma_client *client, + unsigned int number); + +/** + * dma_async_memcpy_buf_to_buf - offloaded copy between virtual addresses + * @chan: DMA channel to offload copy to + * @dest: destination address (virtual) + * @src: source address (virtual) + * @len: length + * + * Both @dest and @src must be mappable to a bus address according to the + * DMA mapping API rules for streaming mappings. + * Both @dest and @src must stay memory resident (kernel memory or locked + * user space pages) + */ +static inline dma_cookie_t dma_async_memcpy_buf_to_buf(struct dma_chan *chan, + void *dest, void *src, size_t len) +{ + int cpu = get_cpu(); + per_cpu_ptr(chan->local, cpu)->bytes_transferred += len; + per_cpu_ptr(chan->local, cpu)->memcpy_count++; + put_cpu(); + + return chan->device->device_memcpy_buf_to_buf(chan, dest, src, len); +} What about renaming the dma_async_memcpy_* to something like dma_async_op_* and have them take an additional operation argument. + +/** + * dma_async_memcpy_buf_to_pg - offloaded copy + * @chan: DMA channel to offload copy to + * @page: destination page + * @offset: offset in page to copy to + * @kdata: source address (virtual) + * @len: length + * + * Both @page/@offset and @kdata must be mappable to a bus address according + * to the DMA mapping API rules for streaming mappings. + * Both @page/@offset and @kdata must stay memory resident (kernel memory or + * locked user space pages) + */ +static inline dma_cookie_t dma_async_memcpy_buf_to_pg(struct dma_chan *chan, + struct page *page, unsigned int offset, void *kdata, size_t len) +{ + int cpu = get_cpu(); + per_cpu_ptr(chan->local, cpu)->bytes_transferred += len; + per_cpu_ptr(chan->local, cpu)->memcpy_count++; + put_cpu(); + + return chan->device->device_memcpy_buf_to_pg(chan, page, offset, +kdata, len); +} + +/** + * dma_async_memcpy_buf_to_pg - offloaded copy + * @chan: DMA cha
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
Hi! > --- /dev/null > +++ b/drivers/dma/dmaengine.c > @@ -0,0 +1,360 @@ > +/* > +Copyright(c) 2004 - 2006 Intel Corporation. All rights reserved. > + > +This program is free software; you can redistribute it and/or modify it > +under the terms of the GNU General Public License as published by the Free > +Software Foundation; either version 2 of the License, or (at your option) > +any later version. > + > +This program is distributed in the hope that it will be useful, but WITHOUT > +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > +more details. > + > +You should have received a copy of the GNU General Public License along with > +this program; if not, write to the Free Software Foundation, Inc., 59 > +Temple Place - Suite 330, Boston, MA 02111-1307, USA. > + > +The full GNU General Public License is included in this distribution in the > +file called LICENSE. > +*/ Could you use /* * */ comment style, and describe in one or two lines what the source does in the header? Pavel -- 209:using System.IO; - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
Chris Leech <[EMAIL PROTECTED]> wrote: > > +void dma_async_device_cleanup(struct kref *kref); > Declarations go in header files, please. Or give it static scope. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On 3/4/06, Benjamin LaHaise <[EMAIL PROTECTED]> wrote: > On Fri, Mar 03, 2006 at 01:42:20PM -0800, Chris Leech wrote: > > +void dma_async_device_unregister(struct dma_device* device) > > +{ > ... > > + kref_put(&device->refcount, dma_async_device_cleanup); > > + wait_for_completion(&device->done); > > +} > > This looks like a bug: device is dereferenced after it is potentially > freed. Actually, this is where the code is waiting to make sure it's safe to free device. The release function for the kref completes device->done. Each of the devices channels holds a reference to the device. When a device is unregistered it's channels are removed from the clients, which hold a reference for each outstanding transaction. When all the outstanding transactions complete, the channels kref goes to 0, and the reference to the device is dropped. When the device kref goes to 0 the completion is set, and it's now safe to free the memory for the device and channel structures. I have a writeup of the locking and reference counting that I'll finish and add in as a big comment to the code. -Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On 3/3/06, David S. Miller <[EMAIL PROTECTED]> wrote: > > +static spinlock_t dma_list_lock; > > Please use DEFINE_SPINLOCK(). > > > +static void dma_chan_free_rcu(struct rcu_head *rcu) { > > Newline before the brace please. > > > +static void dma_async_device_cleanup(struct kref *kref) { > > Newline before the brace please. > > > +struct dma_chan_percpu > > +{ > > Left brace on the same line as "struct dma_chan_percpu" please. > > > +struct dma_chan > > +{ > > Similarly. > > Otherwise this patch looks mostly ok. Thanks Dave, I'll apply these and other feedback and get updated patches generated. - Chris - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
On Fri, Mar 03, 2006 at 01:42:20PM -0800, Chris Leech wrote: > +void dma_async_device_unregister(struct dma_device* device) > +{ ... > + kref_put(&device->refcount, dma_async_device_cleanup); > + wait_for_completion(&device->done); > +} This looks like a bug: device is dereferenced after it is potentially freed. -ben -- "Time is of no importance, Mr. President, only life is important." Don't Email: <[EMAIL PROTECTED]>. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/8] [I/OAT] DMA memcpy subsystem
From: Chris Leech <[EMAIL PROTECTED]> Date: Fri, 03 Mar 2006 13:42:20 -0800 > +static spinlock_t dma_list_lock; Please use DEFINE_SPINLOCK(). > +static void dma_chan_free_rcu(struct rcu_head *rcu) { Newline before the brace please. > +static void dma_async_device_cleanup(struct kref *kref) { Newline before the brace please. > +struct dma_chan_percpu > +{ Left brace on the same line as "struct dma_chan_percpu" please. > +struct dma_chan > +{ Similarly. Otherwise this patch looks mostly ok. - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html