Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-18 Thread Dan Williams

Ira Snyder wrote:

So, I see a couple of ways of moving forward:
1) Keep my implementation, moving the includes to arch/powerpc/include.
   Do we drop the allocation functions?


+1 for option 1.  Having it under arch/powerpc/include makes it clear 
that it is a powerpc specific api, so keep the allocation routines. 
Copy Kumar on the updated patch as I'll need a ppc-dev ack for carrying 
this file addition through the dmaengine tree.



Thanks for all the input Dan. I finally feel like we're getting
somewhere :)


Thanks for the exchange it always helps to get a good picture of the 
underlying design rationale.


Regards,
Dan
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-18 Thread Ira Snyder
On Thu, Jun 18, 2009 at 11:16:03AM -0700, Dan Williams wrote:
> Ira Snyder wrote:
>> I can do something similar by calling device_prep_dma_memcpy() lots of
>> times. Once for each page in the scatterlist.
>>
>> This has the advantage of not changing the DMAEngine API.
>>
>> This has the disadvantage of not being a single transaction. The DMA
>> controller will get an interrupt after each memcpy() transaction, clean
>> up descriptors, etc.
>>
>
> Why would it need an interrupt per memcpy?  There is a  
> DMA_PREP_INTERRUPT flag to gate interrupt generation to the last  
> descriptor.  This is how async_tx delays callbacks until the last  
> operation in a chain has completed.  Also, I am not currently seeing how  
> your implementation achieves a single hardware transaction.  It still  
> calls fsl_dma_alloc_descriptor() per address pair?
>

Ok, there are a few things here:

1) an interrupt per memcpy

The *software* using DMAEngine doesn't need one interrupt per memcpy.
That is controlled by the DMA_PREP_INTERRUPT flag, exactly as you
describe.

The Freescale DMA driver DOES use one interrupt per async_tx descriptor.
See drivers/dma/fsldma.c dma_init() and append_ld_queue().

The FSL_DMA_MR_EOTIE flag in dma_init() tells the controller to generate
an interrupt at the end of each transfer. A transfer is (potentially
long) list of address pairs / hardware descriptors.

The FSL_DMA_MR_EOSIE flag in append_ld_queue() tells the controller to
generate an interrupt at the end of transferring this hardware
descriptor (AKA segment). The driver combines multiple memcpy operations
into a single transfer. When the driver combines operations, it adds the
EOSIE flag to the descriptor that would-have-been the end of a transfer.
It uses this interrupt to update the DMA cookie, presumably to speed up
users of dma_sync_wait() when there are lots of users sharing the DMA
controller.

Let me try to explain what will happen with some code:

=== Case 1: Two seperate transfers ===

dma_cookie_t t1, t2;
t1 = dma_async_memcpy_buf_to_buf(...);
dma_async_memcpy_issue_pending();

/*
 * some time goes by, the DMA transfer completes,
 * and the controller gets the end-of-transfer interrupt
 */

t2 = dma_async_memcpy_buf_to_buf(...);
dma_async_memcpy_issue_pending();

/*
 * some time goes by, the DMA transfer completes,
 * and the controller gets the end-of-transfer interrupt
 */

This is exactly what I would expect from the API. There are two seperate
transfers, and there are two hardware interrupts. Here is a crude
timeline.

|--||--|---
|  ||  |
t1 start   t1 end, EOT interruptt2 start   t2 end, EOT interrupt

=== Case 2: Two seperate transfers, merged by the driver ===

t1 = dma_async_memcpy_buf_to_buf(...);
dma_async_memcpy_issue_pending();

/*
 * the controller starts executing the transfer, but has not
 * finished yet
 */
t2 = dma_async_memcpy_buf_to_buf(...);

/*
 * append_ld_queue() sets the EOSIE flag on the last hardware
 * descriptor in t1, then sets the next link descriptor to the
 * first descriptor in t2
 */

Now there are two transfers, and again two hardware interrupts. Again,
not really a problem. Every part of the API still works as expected.

|---|---|-|
|   |   | |
t1 startt2 tx_submit()  t1 end, EOS interrupt, t2 start   t2 end, EOT 
interrupt

=== Case 3: Two transfers, merged by the driver ===

t1 = dma_async_memcpy_buf_to_buf(...);
t2 = dma_async_memcpy_buf_to_buf(...);
dma_async_memcpy_issue_pending();

After a very careful reading of the driver, I just noticed that if there
is no transfer in progress (as would be expected on a private channel)
then the EOS interrupt never gets set, and the requests are simply
merged together. This would lead to a timeline like this:

||--|--
||  |
t1 start t1 end, t2 start   t2 end, EOT interrupt

This is perfectly fine as well. It is the behavior I want.

Some more study of the code shows that the Freescale DMA driver will not
halt the channel as long as the channel is busy, meaning that it will
not clear the external start bits during a transfer.

So, in conclusion, I can call memcpy multiple times and have it all
merged into a single transfer by the driver, with a single hardware
interrupt (at the end-of-transfer) and have everything complete without
halting the DMA controller.

2) Single transaction

I think we're using different terms here. I define a single transaction
to be the time while the DMA controller is busy transferring things.

In case #1 above, there are two transfers. In case #2 above, one
transfer, and two interrupts. In case #3 above, one transfer, one
interrupt.

3) Hardware descriptor per address pair

Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-18 Thread Dan Williams

Ira Snyder wrote:

I can do something similar by calling device_prep_dma_memcpy() lots of
times. Once for each page in the scatterlist.

This has the advantage of not changing the DMAEngine API.

This has the disadvantage of not being a single transaction. The DMA
controller will get an interrupt after each memcpy() transaction, clean
up descriptors, etc.



Why would it need an interrupt per memcpy?  There is a 
DMA_PREP_INTERRUPT flag to gate interrupt generation to the last 
descriptor.  This is how async_tx delays callbacks until the last 
operation in a chain has completed.  Also, I am not currently seeing how 
your implementation achieves a single hardware transaction.  It still 
calls fsl_dma_alloc_descriptor() per address pair?


The api currently allows a call to ->prep_* to generate multiple 
internal descriptors for a single input address (i.e. memcpy in the case 
where the transfer length exceeds the hardware maximum).  The slave api 
allows for transfers from a scatterlist to a slave context.  I think 
what is bothering me is that the fsldma slave implementation is passing 
a list of sideband addresses rather than a constant address context like 
the existing dw_dmac, so it is different.  However, I can now see that 
trying to enforce uniformity in this area is counterproductive.  The 
DMA_SLAVE interface will always have irreconcilable differences across 
architectures.



It also has the disadvantage of not being able to change the
controller-specific features I'm using: external start. This feature
lets the "3rd device" control the DMA controller. It looks like the
atmel-mci driver has a similar feature. The DMAEngine API has no way to
expose this type of feature except through DMA_SLAVE.


Yeah, an example of an architecture specific quirk that has no chance of 
being uniformly handled in a generic api.



If it is just the 3 helper routines that are the issue with this patch,
I have no problem dropping them and letting each user re-create them
themselves.


I think this makes the most sense at this point.  We can reserve 
judgement on the approach until the next fsldma-slave user arrives and 
tries to use this implementation for their device.  Can we move the 
header file under arch/powerpc/include?


[..]

A single-transaction scatterlist-to-scatterlist copy would be nice.

One of the major advantages of working with the DMA controller is that
it automatically handles scatter/gather. Almost all DMA controllers have
the feature, but it is totally missing from the high-level API.


The engines I am familiar with (ioatdma and iop-adma) still need a 
hardware descriptor per address pair I do not see how fsldma does this 
any more automatically than those engines?  I could see having a helper 
routine that does the mapping and iterating, but in the end the driver 
still sees multiple calls to ->prep (unless of course you are doing this 
in a DMA_SLAVE context, then anything goes).


Thanks,
Dan
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-17 Thread Ira Snyder
On Wed, Jun 17, 2009 at 10:17:54AM -0700, Dan Williams wrote:
> Ira Snyder wrote:
>>> Using EXPORT_SYMBOL would defeat the purpose of conforming to the
>>> dmaengine api which should allow other subsystems to generically
>>> discover an fsldma resource.
>>>
>>
>> Any driver would still use dma_request_channel(), etc. to get access to
>> a DMA channel. AFAICT, DMA_SLAVE is intended for doing something
>> completely hardware-specific with the DMA controller.
>
> Yes.  Specifically DMA_SLAVE operations imply a pre-established context  
> with a 3rd device (the other 2 devices being system memory and the dma  
> channel), as compared to plain memcpy.  The assumption is that a dma  
> device with this capability may not be shared with other requesters  
> until the context is torn down.
>
> [..]
>> What I've implemented is this: (sorry about the poor drawing)
>>
>> scatterlist   fsl_dma_hw_addr
>> +++---+
>> |  DATA  | >  | DEST1 |
>> |  DATA  | +  +---+
>> |  DATA  | |
>> |  DATA  | |  +---+
>> |  DATA  | +--->  | DEST2 |
>> |  DATA  |+---+
>> ++
>>   .
>>   .
>>   .
>>
>> Of course, the reverse works as well. You can copy from a list of
>> hardware address/length pairs to a scatterlist.
>>
>> So, using my implementation of the DMA_SLAVE feature, you can take a big
>> chunk of data (which is organized into a scatterlist) and DMA it
>> directly to a set of hardware addresses, all in a single, unbroken
>> transaction.
>
> Could the same effect be achieved by calling ->prep_slave_sg multiple  
> times?  Once you need to add dma-driver specific helper routines you are  
> effectively extending the dmaengine interface in an fsldma specific  
> fashion.  Can we reduce this to just the existing primitives?  If it  
> turns out that this is untenable can we extend dmaengine to make this a  
> generic capability?  My preference is the former.
>

I can do something similar by calling device_prep_dma_memcpy() lots of
times. Once for each page in the scatterlist.

This has the advantage of not changing the DMAEngine API.

This has the disadvantage of not being a single transaction. The DMA
controller will get an interrupt after each memcpy() transaction, clean
up descriptors, etc.

It also has the disadvantage of not being able to change the
controller-specific features I'm using: external start. This feature
lets the "3rd device" control the DMA controller. It looks like the
atmel-mci driver has a similar feature. The DMAEngine API has no way to
expose this type of feature except through DMA_SLAVE.

If it is just the 3 helper routines that are the issue with this patch,
I have no problem dropping them and letting each user re-create them
themselves.

>> I've inlined the driver for the FPGA programmer below. I don't think it
>> is appropriate to push into mainline, since it will only work for our
>> board, and nothing else.
>
> If we find that we need to extend the dmaengine interface we will need  
> an in-tree user.  In my opinion, as long as it passes the Voyager test  
> [1] then I do not see why it should be barred from upstream.
>

Yep, I understand the Voyager test. I just didn't think it would be
useful to anyone to try and push it upstream, especially since the
hardware is in-house only.

The virtio-over-pci patches I've posted a while back may benefit from
this as well. They look more likely to go upstream. I've been really
busy and haven't had time to work on them recently, but Grant Likely is
working on them at the moment.

> [..]
>> He also used platform data to get the register addresses. I'm unaware of
>> a way to put arbitrary platform data into the OF tree used on PowerPC.
>>
>
> Anybody else on ppc-dev know if this is possible??
>
>> I didn't want to force other users to implement the allocation routines
>> for the struct fsl_dma_hw_addr themselves, so I provided routines to do
>> so.
>
> It depends on how many users of this feature there ends up being,  
> pushing this into each driver that needs it would not be too horrible  
> especially if it just the three straightforward routines  
> (fsl_dma_slave_append, fsl_dma_slave_free, and fsl_dma_slave_alloc).
>

Right, the routines are very simple. I wouldn't have any problem leaving
it up to users.

> So, the three options in order of preference are:
> 1/ arrange to call ->prep multiple times to handle each dma address

This doesn't seem quite right. I'd end up doing something like:

struct fsl_dma_slave slave;
slave.external_start = true;
slave.address = 0xf0003000;
slave.length = 4096;
chan->private = &slave;

for_each_sg(sgl, sg, sg_len, i) {
device_prep_slave_sg(chan, sg, 1, DMA_TO_DEVICE, 0);
}

That pretty much defeats the purpose of the scatterlist argument,
doesn't it? Also, it is no better than just calling
device_prep_dma_memcpy() lots of times. You don't get a single DMA

Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-17 Thread Dan Williams

Ira Snyder wrote:

Using EXPORT_SYMBOL would defeat the purpose of conforming to the
dmaengine api which should allow other subsystems to generically
discover an fsldma resource.



Any driver would still use dma_request_channel(), etc. to get access to
a DMA channel. AFAICT, DMA_SLAVE is intended for doing something
completely hardware-specific with the DMA controller.


Yes.  Specifically DMA_SLAVE operations imply a pre-established context 
with a 3rd device (the other 2 devices being system memory and the dma 
channel), as compared to plain memcpy.  The assumption is that a dma 
device with this capability may not be shared with other requesters 
until the context is torn down.


[..]

What I've implemented is this: (sorry about the poor drawing)

scatterlist   fsl_dma_hw_addr
+++---+
|  DATA  | >  | DEST1 |
|  DATA  | +  +---+
|  DATA  | |
|  DATA  | |  +---+
|  DATA  | +--->  | DEST2 |
|  DATA  |+---+
++
  .
  .
  .

Of course, the reverse works as well. You can copy from a list of
hardware address/length pairs to a scatterlist.

So, using my implementation of the DMA_SLAVE feature, you can take a big
chunk of data (which is organized into a scatterlist) and DMA it
directly to a set of hardware addresses, all in a single, unbroken
transaction.


Could the same effect be achieved by calling ->prep_slave_sg multiple 
times?  Once you need to add dma-driver specific helper routines you are 
effectively extending the dmaengine interface in an fsldma specific 
fashion.  Can we reduce this to just the existing primitives?  If it 
turns out that this is untenable can we extend dmaengine to make this a 
generic capability?  My preference is the former.



I've inlined the driver for the FPGA programmer below. I don't think it
is appropriate to push into mainline, since it will only work for our
board, and nothing else.


If we find that we need to extend the dmaengine interface we will need 
an in-tree user.  In my opinion, as long as it passes the Voyager test 
[1] then I do not see why it should be barred from upstream.


[..]

He also used platform data to get the register addresses. I'm unaware of
a way to put arbitrary platform data into the OF tree used on PowerPC.



Anybody else on ppc-dev know if this is possible??


I didn't want to force other users to implement the allocation routines
for the struct fsl_dma_hw_addr themselves, so I provided routines to do
so.


It depends on how many users of this feature there ends up being, 
pushing this into each driver that needs it would not be too horrible 
especially if it just the three straightforward routines 
(fsl_dma_slave_append, fsl_dma_slave_free, and fsl_dma_slave_alloc).


So, the three options in order of preference are:
1/ arrange to call ->prep multiple times to handle each dma address
2/ push the functionality down into the individual drivers that need it
3/ up level the functionality into dmaengine to make it generically 
available


Thanks,
Dan

[1]: The Voyager test refers to James Bottomley's maintenance of the 
Voyager x86-sub-architecture.  If at least one person cares about a 
feature, is willing to maintain the code, and it does not impose a 
maintenance burden on other parts of the tree then it can go upstream.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-16 Thread Ira Snyder
On Tue, Jun 16, 2009 at 12:01:40PM -0700, Dan Williams wrote:
> Hi Ira,
>
> Ira Snyder wrote:
>> Use the DMA_SLAVE capability of the DMAEngine API to copy/from a
>> scatterlist into an arbitrary list of hardware address/length pairs.
>>
>> This allows a single DMA transaction to copy data from several different
>> devices into a scatterlist at the same time.
>>
>> This also adds support to enable some controller-specific features such as
>> external start and external pause of a DMA transaction.
>>
>> Signed-off-by: Ira W. Snyder 
>> ---
>>
>> This is a request for comments on this patch. I hunch it is not quite
>> ready for inclusion, though it is certainly ready for review. Correct
>> functioning of this patch depends on the patches submitted earlier.
>>
>> As suggested by Dan Williams, I implemented DMA_SLAVE support for the
>> fsldma controller to allow me to use the hardware to transfer to/from a
>> scatterlist to a list of hardware address/length pairs.
>>
>> I implemented support for the extra features available in the DMA
>> controller, such as external pause and external start. I have not tested
>> the features yet. I am willing to drop the support if everything else
>> looks good.
>>
>> I have implemented helper functions for creating the list of hardware
>> address/length pairs as static inline functions in the linux/fsldma.h
>> header. Should I incorporate these into the driver itself and use
>> EXPORT_SYMBOL()? I've never done this before :)
>
> Using EXPORT_SYMBOL would defeat the purpose of conforming to the  
> dmaengine api which should allow other subsystems to generically  
> discover an fsldma resource.
>

Any driver would still use dma_request_channel(), etc. to get access to
a DMA channel. AFAICT, DMA_SLAVE is intended for doing something
completely hardware-specific with the DMA controller.

>> diff --git a/include/linux/fsldma.h b/include/linux/fsldma.h
>> new file mode 100644
>> index 000..a42dcdd
>> --- /dev/null
>> +++ b/include/linux/fsldma.h
>> @@ -0,0 +1,105 @@
>> +/*
>> + * Freescale MPC83XX / MPC85XX DMA Controller
>> + *
>> + * Copyright (c) 2009 Ira W. Snyder 
>> + *
>> + * This file is licensed under the terms of the GNU General Public License
>> + * version 2. This program is licensed "as is" without any warranty of any
>> + * kind, whether express or implied.
>> + */
>> +
>> +#ifndef __LINUX_FSLDMA_H__
>> +#define __LINUX_FSLDMA_H__
>> +
>> +#include 
>> +
>> +/*
>> + * physical hardware address / length pair for use with the
>> + * DMAEngine DMA_SLAVE API
>> + */
>> +struct fsl_dma_hw_addr {
>> +   struct list_head entry;
>> +
>> +   dma_addr_t address;
>> +   size_t length;
>> +};
>
> Can you explain a bit more why you need the new dma address list, would  
> a struct scatterlist suffice?
>

I don't believe so. A scatterlist only holds page/length pairs. How
would you pass an arbitrary dma_addr_t/length pair in a scatterlist. I
/could/ abuse sg_dma_address() and do something like the following, but
I think you'd be even less inclined to take the patch:

struct scatterlist sg[10];

sg_dma_address(sg) = addr1;
sg_dma_len(sg) = len1;
sg++;
sg_dma_address(sg) = addr2;
sg_dma_len(sg) = len2;

/* and so on */

This would mean that there is a scatterlist with the struct page
pointers set to NULL, which has not had dma_map_sg() run on it. Seems
like abuse to me.

What I've implemented is this: (sorry about the poor drawing)

scatterlist   fsl_dma_hw_addr
+++---+
|  DATA  | >  | DEST1 |
|  DATA  | +  +---+
|  DATA  | |
|  DATA  | |  +---+
|  DATA  | +--->  | DEST2 |
|  DATA  |+---+
++
  .
  .
  .

Of course, the reverse works as well. You can copy from a list of
hardware address/length pairs to a scatterlist.

So, using my implementation of the DMA_SLAVE feature, you can take a big
chunk of data (which is organized into a scatterlist) and DMA it
directly to a set of hardware addresses, all in a single, unbroken
transaction.

I've got an FPGA programmer which needs a ~12MB image dumped to a FIFO
at 0xf0003000 in 4K chunks (all writes must be in the 0xf0003000 to
0xf0004000 range). The programmer is actually in control of the DMA
controller at that time. Internally, the FPGA programmer does some
toggling of pins, etc. which is needed to actually push the image into
the FPGA's themselves.

> In general it is difficult to merge new functionality without an in-tree  
> user.  Can you share the client of this new api?
>

I've inlined the driver for the FPGA programmer below. I don't think it
is appropriate to push into mainline, since it will only work for our
board, and nothing else.

It is pretty simple, but I'm totally open to suggestions for changes. I
used a char device to fill in a scatterlist, then set up the DMA to
0xf0003000 in 4K chunks.

I've got another driver that uses the interface, but t

Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-16 Thread Dan Williams

Hi Ira,

Ira Snyder wrote:

Use the DMA_SLAVE capability of the DMAEngine API to copy/from a
scatterlist into an arbitrary list of hardware address/length pairs.

This allows a single DMA transaction to copy data from several different
devices into a scatterlist at the same time.

This also adds support to enable some controller-specific features such as
external start and external pause of a DMA transaction.

Signed-off-by: Ira W. Snyder 
---

This is a request for comments on this patch. I hunch it is not quite
ready for inclusion, though it is certainly ready for review. Correct
functioning of this patch depends on the patches submitted earlier.

As suggested by Dan Williams, I implemented DMA_SLAVE support for the
fsldma controller to allow me to use the hardware to transfer to/from a
scatterlist to a list of hardware address/length pairs.

I implemented support for the extra features available in the DMA
controller, such as external pause and external start. I have not tested
the features yet. I am willing to drop the support if everything else
looks good.

I have implemented helper functions for creating the list of hardware
address/length pairs as static inline functions in the linux/fsldma.h
header. Should I incorporate these into the driver itself and use
EXPORT_SYMBOL()? I've never done this before :)


Using EXPORT_SYMBOL would defeat the purpose of conforming to the 
dmaengine api which should allow other subsystems to generically 
discover an fsldma resource.



diff --git a/include/linux/fsldma.h b/include/linux/fsldma.h
new file mode 100644
index 000..a42dcdd
--- /dev/null
+++ b/include/linux/fsldma.h
@@ -0,0 +1,105 @@
+/*
+ * Freescale MPC83XX / MPC85XX DMA Controller
+ *
+ * Copyright (c) 2009 Ira W. Snyder 
+ *
+ * This file is licensed under the terms of the GNU General Public License
+ * version 2. This program is licensed "as is" without any warranty of any
+ * kind, whether express or implied.
+ */
+
+#ifndef __LINUX_FSLDMA_H__
+#define __LINUX_FSLDMA_H__
+
+#include 
+
+/*
+ * physical hardware address / length pair for use with the
+ * DMAEngine DMA_SLAVE API
+ */
+struct fsl_dma_hw_addr {
+   struct list_head entry;
+
+   dma_addr_t address;
+   size_t length;
+};


Can you explain a bit more why you need the new dma address list, would 
a struct scatterlist suffice?


In general it is difficult to merge new functionality without an in-tree 
user.  Can you share the client of this new api?


I suspect you can get away without needing these new helper routines. 
Have a look at how Haavard implemented his DMA_SLAVE client in 
drivers/mmc/host/atmel-mci.c.


Haavard ended up needing to add some public structure definitions to 
include/linux, but my preference is to keep this in an 
architecture/platform specific header file location if possible.


Regards,
Dan

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-04 Thread Ira Snyder
On Thu, Jun 04, 2009 at 07:20:26PM +0800, Li Yang-R58472 wrote:
> 
> >> This is a request for comments on this patch. I hunch it is 
> >not quite 
> >> ready for inclusion, though it is certainly ready for 
> >review. Correct 
> >> functioning of this patch depends on the patches submitted earlier.
> >> 
> >> As suggested by Dan Williams, I implemented DMA_SLAVE 
> >support for the 
> >> fsldma controller to allow me to use the hardware to 
> >transfer to/from 
> >> a scatterlist to a list of hardware address/length pairs.
> >> 
> >> I implemented support for the extra features available in the DMA 
> >> controller, such as external pause and external start. I have not 
> >> tested the features yet. I am willing to drop the support if 
> >> everything else looks good.
> >> 
> >> I have implemented helper functions for creating the list of 
> >hardware 
> >> address/length pairs as static inline functions in the 
> >linux/fsldma.h 
> >> header. Should I incorporate these into the driver itself and use 
> >> EXPORT_SYMBOL()? I've never done this before :)
> >> 
> >
> >Any comments on this patch?
> >
> 
> No comment for now.  Didn't get the time to study the DMA_SLAVE thing.
> :(
> 

No problem, I just wanted to make sure that it had been noticed.

The DMA_SLAVE thing is pretty simple: it uses chan->private to pass in
arbitrary parameters to the device_prep_slave_sg() routine. The
device_prep_slave_sg() routine takes a scatterlist and a DMA direction,
then uses the private data to setup the transfer.

When you're ready, submit the transfer and go, just like a normal memcpy
operation. It is basically a way to have a device-specific function in
the generic API.

> >I've tested the external start feature, and it works great. I 
> 
> Good.  Is it working with your custom board?  Can I verify this with my
> reference board?
> 

Yes, I only tested it on my custom board. We use the external DMA
control as part of the programming sequence for some FPGA's on the
board.

If I'm reading the eval board's schematic correctly, the pins for
external DMA control aren't hooked to anything useful at the moment.

If you really want to test it, you'll have to do some modifications to
your board. On page 9 of the schematic (upper right corner) you'll see
that the DMA pins are connected to the GPIO controller on the processor.
You could solder some wires between the first and second three GPIO
pins. Then you can configure SICRL so the first 3 pins are configured
for GPIO, and the second 3 pins are configured for external DMA control.
It is a bit of work, but not too bad.

> >had to set the DMA Request Count (in the mode register) after 
> >the driver was in control. There is no interface for doing 
> >this in the existing driver. I just ioremap()ed the registers 
> >and added the value from my driver after claiming the channel 
> >with dma_request_channel().
> >
> >There are ways to get the DMA controller into a state where 
> >the CB bit stays set no matter what you do. I have only seen 
> >this during an externally controlled transfer, when the 
> >external master does weird things. I would fix this state in 
> >terminate_all(), but I have been unsuccessful in getting the 
> >DMA controller working again after it has been messed up.
> 
> What's the frequency for this to happen?  Is the problem reproducable?

I can only cause it to totally lock up when there is an error, and my
FPGA programmer stops toggling the external DMA control pins in the
middle of a transfer. Nothing I can do will get the DMASR[CB] bit to
zero.

I tried causing the problem with a memory-to-memory transfer, and was
able to cause a similar problem. I programmed to the controller to read
32 bytes from 0xc000, which isn't part of the memory map.

The DMA never completes, and the controller has the following bits set:
DMAMR[CS]==1
DMASR[TE]==1
DMASR[CB]==1

The fsldma driver doesn't seem to detect the error at this point.

I was able to re-start the DMA controller with the following sequence:
DMAMR[CS]=0
DMASR[TE]=1 DMASR[CB]=1 (write 1's to set bits)
DMACDAR=0 DMASAR=0 DMADAR=0 DMABCR=0 DMANDAR=0  (zero all registers)
DMAMR[CS]=1

Now the fsldma driver tells me "transfer error!". The DMA channel is now
in working order, and I can use it again.

So, I can get something similar by doing a "bad" memory-to-memory
transfer, but I cannot completely lock up the DMA controller.

Thanks,
Ira
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


RE: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-04 Thread Li Yang-R58472

>> This is a request for comments on this patch. I hunch it is 
>not quite 
>> ready for inclusion, though it is certainly ready for 
>review. Correct 
>> functioning of this patch depends on the patches submitted earlier.
>> 
>> As suggested by Dan Williams, I implemented DMA_SLAVE 
>support for the 
>> fsldma controller to allow me to use the hardware to 
>transfer to/from 
>> a scatterlist to a list of hardware address/length pairs.
>> 
>> I implemented support for the extra features available in the DMA 
>> controller, such as external pause and external start. I have not 
>> tested the features yet. I am willing to drop the support if 
>> everything else looks good.
>> 
>> I have implemented helper functions for creating the list of 
>hardware 
>> address/length pairs as static inline functions in the 
>linux/fsldma.h 
>> header. Should I incorporate these into the driver itself and use 
>> EXPORT_SYMBOL()? I've never done this before :)
>> 
>
>Any comments on this patch?
>

No comment for now.  Didn't get the time to study the DMA_SLAVE thing.
:(

>I've tested the external start feature, and it works great. I 

Good.  Is it working with your custom board?  Can I verify this with my
reference board?

>had to set the DMA Request Count (in the mode register) after 
>the driver was in control. There is no interface for doing 
>this in the existing driver. I just ioremap()ed the registers 
>and added the value from my driver after claiming the channel 
>with dma_request_channel().
>
>There are ways to get the DMA controller into a state where 
>the CB bit stays set no matter what you do. I have only seen 
>this during an externally controlled transfer, when the 
>external master does weird things. I would fix this state in 
>terminate_all(), but I have been unsuccessful in getting the 
>DMA controller working again after it has been messed up.

What's the frequency for this to happen?  Is the problem reproducable?

- Leo
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [RFC PATCH] fsldma: Add DMA_SLAVE support

2009-06-03 Thread Ira Snyder
On Fri, May 15, 2009 at 03:56:59PM -0700, Ira Snyder wrote:
> Use the DMA_SLAVE capability of the DMAEngine API to copy/from a
> scatterlist into an arbitrary list of hardware address/length pairs.
> 
> This allows a single DMA transaction to copy data from several different
> devices into a scatterlist at the same time.
> 
> This also adds support to enable some controller-specific features such as
> external start and external pause of a DMA transaction.
> 
> Signed-off-by: Ira W. Snyder 
> ---
> 
> This is a request for comments on this patch. I hunch it is not quite
> ready for inclusion, though it is certainly ready for review. Correct
> functioning of this patch depends on the patches submitted earlier.
> 
> As suggested by Dan Williams, I implemented DMA_SLAVE support for the
> fsldma controller to allow me to use the hardware to transfer to/from a
> scatterlist to a list of hardware address/length pairs.
> 
> I implemented support for the extra features available in the DMA
> controller, such as external pause and external start. I have not tested
> the features yet. I am willing to drop the support if everything else
> looks good.
> 
> I have implemented helper functions for creating the list of hardware
> address/length pairs as static inline functions in the linux/fsldma.h
> header. Should I incorporate these into the driver itself and use
> EXPORT_SYMBOL()? I've never done this before :)
> 

Any comments on this patch?

I've tested the external start feature, and it works great. I had to set
the DMA Request Count (in the mode register) after the driver was in
control. There is no interface for doing this in the existing driver. I
just ioremap()ed the registers and added the value from my driver after
claiming the channel with dma_request_channel().

There are ways to get the DMA controller into a state where the CB bit
stays set no matter what you do. I have only seen this during an
externally controlled transfer, when the external master does weird
things. I would fix this state in terminate_all(), but I have been
unsuccessful in getting the DMA controller working again after it has
been messed up.

Ira
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev