Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-27 Thread Stanimir Varbanov
On 02/26/2015 04:33 AM, Mark Brown wrote:
> On Tue, Feb 24, 2015 at 06:08:54PM +0200, Stanimir Varbanov wrote:
> 
>> yes, there is a potential race between atomic_inc and the dma callback. I
>> reordered these calls to save a few checks, and now it has come back to
>> bite me.
> 
>> I can imagine a few options here:
> 
>>  - reorder the dmaengine calls and atomic operations, i.e. call
>> atomic_inc for the rx and tx channels before the corresponding
>> dmaengine_submit and dmaengine_issue_pending.
> 
>>  - have two different dma callbacks and two completions, and wait for
>> both.
> 
>>  - arrange to receive only one dma callback, i.e. from the last transfer
>> when both rx_buf and tx_buf are present.
> 
>>  - let me look for a better solution.
> 
> Any solution which doesn't make use of atomics is likely to be better;
> as I said, they are enormously error prone.  A more common approach is a
> single completion triggering on the RX side (for RX-only or bidirectional
> transfers) or on TX if that's the only thing active.  For most hardware
> you can just use the RX to manage completion, since it must of necessity
> complete at the same time as, or later than, the transmit side; transmit
> often completes early because the DMA completes when the FIFO is full,
> not when the data is on the wire.
> 

yep, that's what I wanted to express in the third option above.

-- 
regards,
Stan


Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-25 Thread Mark Brown
On Tue, Feb 24, 2015 at 06:08:54PM +0200, Stanimir Varbanov wrote:

> yes, there is a potential race between atomic_inc and the dma callback. I
> reordered these calls to save a few checks, and now it has come back to
> bite me.

> I can imagine a few options here:

>  - reorder the dmaengine calls and atomic operations, i.e. call
> atomic_inc for the rx and tx channels before the corresponding
> dmaengine_submit and dmaengine_issue_pending.

>  - have two different dma callbacks and two completions, and wait for
> both.

>  - arrange to receive only one dma callback, i.e. from the last transfer
> when both rx_buf and tx_buf are present.

>  - let me look for a better solution.

Any solution which doesn't make use of atomics is likely to be better;
as I said, they are enormously error prone.  A more common approach is a
single completion triggering on the RX side (for RX-only or bidirectional
transfers) or on TX if that's the only thing active.  For most hardware
you can just use the RX to manage completion, since it must of necessity
complete at the same time as, or later than, the transmit side; transmit
often completes early because the DMA completes when the FIFO is full,
not when the data is on the wire.
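A minimal sketch of that pattern (untested; it assumes spi_qup_prep_sg()
is changed to take the callback as an argument instead of hard-coding it):

static void spi_qup_dma_done(void *data)
{
    struct spi_qup *qup = data;

    complete(&qup->done);
}

static int spi_qup_do_dma(struct spi_master *master, struct spi_transfer *xfer)
{
    struct spi_qup *qup = spi_master_get_devdata(master);
    unsigned long timeout = msecs_to_jiffies(1000);
    int ret;

    reinit_completion(&qup->done);

    if (xfer->rx_buf) {
        /* RX finishes last, so it owns the completion */
        ret = spi_qup_prep_sg(master, xfer, DMA_DEV_TO_MEM,
                              spi_qup_dma_done);
        if (ret)
            return ret;

        dma_async_issue_pending(master->dma_rx);
    }

    if (xfer->tx_buf) {
        /* TX signals the completion only for TX-only transfers */
        ret = spi_qup_prep_sg(master, xfer, DMA_MEM_TO_DEV,
                              xfer->rx_buf ? NULL : spi_qup_dma_done);
        if (ret)
            return ret;

        dma_async_issue_pending(master->dma_tx);
    }

    if (!wait_for_completion_timeout(&qup->done, timeout))
        return -ETIMEDOUT;

    return 0;
}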




Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-24 Thread Andy Gross
On Tue, Feb 24, 2015 at 06:08:54PM +0200, Stanimir Varbanov wrote:



> 
> yes, there is a potential race between atomic_inc and the dma callback. I
> reordered these calls to save a few checks, and now it has come back to
> bite me.
> 
> I can imagine a few options here:
> 
>  - reorder the dmaengine calls and atomic operations, i.e. call
> atomic_inc for the rx and tx channels before the corresponding
> dmaengine_submit and dmaengine_issue_pending.
> 
>  - have two different dma callbacks and two completions, and wait for
> both.

This is probably the better solution.  The only thing you'll have to take
into consideration is that you may not have an RX DMA transaction.
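Roughly like this (a sketch, untested; it assumes tx_done/rx_done
completions are added to struct spi_qup in place of the counter):

static void spi_qup_dma_tx_done(void *data)
{
    struct spi_qup *qup = data;

    complete(&qup->tx_done);
}

static void spi_qup_dma_rx_done(void *data)
{
    struct spi_qup *qup = data;

    complete(&qup->rx_done);
}

static int spi_qup_dma_wait(struct spi_qup *qup, struct spi_transfer *xfer,
                            unsigned long timeout)
{
    /* wait on each direction only if it was actually submitted,
     * since there may be no RX DMA transaction at all */
    if (xfer->tx_buf &&
        !wait_for_completion_timeout(&qup->tx_done, timeout))
        return -ETIMEDOUT;

    if (xfer->rx_buf &&
        !wait_for_completion_timeout(&qup->rx_done, timeout))
        return -ETIMEDOUT;

    return 0;
}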

> 
>  - arrange to receive only one dma callback, i.e. from the last transfer
> when both rx_buf and tx_buf are present.

You use separate channels for RX and TX, so as long as you have separate
callbacks, it shouldn't be a problem.


-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-24 Thread Ivan T. Ivanov

Hi Stan, 

Sorry, I didn't see this on first look.

On Tue, 2015-02-24 at 15:00 +0200, Stanimir Varbanov wrote:



> 
> +static bool spi_qup_can_dma(struct spi_master *master, struct spi_device *spi,
> +                            struct spi_transfer *xfer)
> +{
> +   struct spi_qup *qup = spi_master_get_devdata(master);
> +   size_t dma_align = dma_get_cache_alignment();
> +   int n_words, w_size;
> +
> +   qup->dma_available = 0;
> +
> +   if (xfer->rx_buf && xfer->len % qup->in_blk_sz)
> +       return false;
> +
> +   if (xfer->tx_buf && xfer->len % qup->out_blk_sz)
> +       return false;
> +

Actually we can end up here with tx_buf or rx_buf being NULL, which
voids my previous comments about these pointers.

It would be simpler if you just checked the transfer length.

And better to return false if both are NULL.
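E.g. (a sketch, untested):

    if (!xfer->tx_buf && !xfer->rx_buf)
        return false;

    if (xfer->len % qup->in_blk_sz || xfer->len % qup->out_blk_sz)
        return false;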

> +   if (IS_ERR_OR_NULL(master->dma_rx) || IS_ERR_OR_NULL(master->dma_tx))
> +       return false;
> +
> +   if (!IS_ALIGNED((size_t)xfer->tx_buf, dma_align) ||
> +       !IS_ALIGNED((size_t)xfer->rx_buf, dma_align))
> +       return false;

Testing NULL for alignment is fine, right?

> +   w_size = spi_qup_get_word_sz(xfer);
> +   n_words = xfer->len / w_size;
> +
> +   /* will use fifo mode */
> +   if (n_words <= (qup->in_fifo_sz / sizeof(u32)))
> +       return false;
> +
> +   qup->dma_available = 1;
> +
> +   return true;
> +}
> +

Regards,
Ivan


Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-24 Thread Stanimir Varbanov
On 02/24/2015 03:56 PM, Mark Brown wrote:
> On Tue, Feb 24, 2015 at 03:00:03PM +0200, Stanimir Varbanov wrote:
> 
>> +static void spi_qup_dma_done(void *data)
>> +{
>> +   struct spi_qup *qup = data;
>> +
>> +   if (atomic_dec_and_test(&qup->dma_outstanding))
>> +       complete(&qup->done);
>> +}
> 
> I'm finding it hard to be thrilled about the use of atomics for
> synchronization (they're just generally hard to work with) and...
> 
>> +   cookie = dmaengine_submit(desc);
>> +   ret = dma_submit_error(cookie);
>> +   if (ret)
>> +       return ret;
> 
>> +   atomic_inc(&qup->dma_outstanding);
> 
> ...don't we have two potential races here: one if somehow the DMA manages
> to complete prior to the atomic_inc() (unlikely but that's what race
> conditions are all about really) and one if we are issuing multiple DMAs
> and the early ones complete before the later ones are issued?
> 

yes, there is a potential race between atomic_inc and the dma callback. I
reordered these calls to save a few checks, and now it has come back to
bite me.

I can imagine a few options here:

 - reorder the dmaengine calls and atomic operations, i.e. call
atomic_inc for the rx and tx channels before the corresponding
dmaengine_submit and dmaengine_issue_pending.

 - have two different dma callbacks and two completions, and wait for
both.

 - arrange to receive only one dma callback, i.e. from the last transfer
when both rx_buf and tx_buf are present.

 - let me look for a better solution.

Thanks for the comments.

regards,
Stan





Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-24 Thread Ivan T. Ivanov

Hi Stan,

On Tue, 2015-02-24 at 15:00 +0200, Stanimir Varbanov wrote:
> 



>  #define SPI_MAX_RATE           50000000
> @@ -143,6 +147,11 @@ struct spi_qup {
> int tx_bytes;
> int rx_bytes;
> int qup_v1;
> +
> +   int dma_available;

This is more like 'use dma for this transfer', right?

> +   struct dma_slave_config rx_conf;
> +   struct dma_slave_config tx_conf;
> +   atomic_t dma_outstanding;

Do we really need this one? See below.

>  };
> 



> +
> +static int spi_qup_prep_sg(struct spi_master *master, struct spi_transfer *xfer,
> +                           enum dma_transfer_direction dir)
> +{
> +   struct spi_qup *qup = spi_master_get_devdata(master);
> +   unsigned long flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
> +   struct dma_async_tx_descriptor *desc;
> +   struct scatterlist *sgl;
> +   dma_cookie_t cookie;
> +   unsigned int nents;
> +   struct dma_chan *chan;
> +   int ret;
> +
> +   if (dir == DMA_MEM_TO_DEV) {
> +   chan = master->dma_tx;
> +   nents = xfer->tx_sg.nents;
> +   sgl = xfer->tx_sg.sgl;
> +   } else {
> +   chan = master->dma_rx;
> +   nents = xfer->rx_sg.nents;
> +   sgl = xfer->rx_sg.sgl;
> +   }
> +
> +   desc = dmaengine_prep_slave_sg(chan, sgl, nents, dir, flags);
> +   if (!desc)
> +   return -EINVAL;
> +
> +   desc->callback = spi_qup_dma_done;
> +   desc->callback_param = qup;

What if we attach the callback only to the RX descriptor, and use
dmaengine_tx_status() on the TX channel when waiting for completion?
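Roughly (a sketch, untested; it assumes the TX cookie is stashed in a
hypothetical new qup->tx_cookie field):

    /* in spi_qup_prep_sg(): set a callback only for the RX direction */
    if (dir == DMA_DEV_TO_MEM) {
        desc->callback = spi_qup_dma_done;
        desc->callback_param = qup;
    }

    cookie = dmaengine_submit(desc);
    if (dir == DMA_MEM_TO_DEV)
        qup->tx_cookie = cookie;

    /* in the wait path, after the RX completion has fired */
    if (xfer->tx_buf &&
        dmaengine_tx_status(master->dma_tx, qup->tx_cookie,
                            NULL) != DMA_COMPLETE)
        return -EIO;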

> +
> +   cookie = dmaengine_submit(desc);
> +   ret = dma_submit_error(cookie);
> +   if (ret)
> +   return ret;
> +
> +   atomic_inc(&qup->dma_outstanding);
> +
> +   return 0;
> +}
> +
> +static int spi_qup_do_dma(struct spi_master *master, struct spi_transfer *xfer)
> +{
> +   struct spi_qup *qup = spi_master_get_devdata(master);
> +   int ret;
> +
> +   atomic_set(&qup->dma_outstanding, 0);
> +
> +   reinit_completion(&qup->done);

Redundant, already done in transfer_one().

> +
> +   if (xfer->rx_buf) {

Always true.

> +   ret = spi_qup_prep_sg(master, xfer, DMA_DEV_TO_MEM);
> +   if (ret)
> +   return ret;
> +
> +   dma_async_issue_pending(master->dma_rx);
> +   }
> +
> +   if (xfer->tx_buf) {

Same.

> +   ret = spi_qup_prep_sg(master, xfer, DMA_MEM_TO_DEV);
> +   if (ret)
> +   goto err_rx;
> +
> +   dma_async_issue_pending(master->dma_tx);
> +   }
> +
> +   ret = spi_qup_set_state(qup, QUP_STATE_RUN);
> +   if (ret) {
> +   dev_warn(qup->dev, "cannot set RUN state\n");
> +   goto err_tx;
> +   }
> +
> +   if (!wait_for_completion_timeout(&qup->done, msecs_to_jiffies(1000))) {

transfer_one() calculates the timeout dynamically, based on the transfer
length.

The transition to the RUN state and the wait for completion are already
coded in transfer_one(). With a little rearrangement they could be
removed from here.

> +   ret = -ETIMEDOUT;
> +   goto err_tx;
> +   }
> +
> +   return 0;
> +
> +err_tx:
> +   if (xfer->tx_buf)

Always true.

> +   dmaengine_terminate_all(master->dma_tx);
> +err_rx:
> +   if (xfer->rx_buf)

Same.

> +   dmaengine_terminate_all(master->dma_rx);
> +
> +   return ret;
> +}

I don't see a reason for this function, based on the comments so far :-).
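With the RUN-state transition and the timeout wait moved back into
transfer_one(), roughly this much would remain (a sketch, untested):

static int spi_qup_do_dma(struct spi_master *master, struct spi_transfer *xfer)
{
    int ret;

    if (xfer->rx_buf) {
        ret = spi_qup_prep_sg(master, xfer, DMA_DEV_TO_MEM);
        if (ret)
            return ret;

        dma_async_issue_pending(master->dma_rx);
    }

    if (xfer->tx_buf) {
        ret = spi_qup_prep_sg(master, xfer, DMA_MEM_TO_DEV);
        if (ret)
            return ret;

        dma_async_issue_pending(master->dma_tx);
    }

    return 0;
}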



> 
> @@ -621,10 +881,16 @@ static int spi_qup_probe(struct platform_device *pdev)
> writel_relaxed(0, base + SPI_CONFIG);
> writel_relaxed(SPI_IO_C_NO_TRI_STATE, base + SPI_IO_CONTROL);
> 
> +   ret = spi_qup_init_dma(master, res->start);
> +   if (ret == -EPROBE_DEFER)
> +   goto error;

Better to move the resource allocation before touching the hardware.
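Something like (a sketch of the reordering):

    /* acquire DMA channels first, so -EPROBE_DEFER bails out before
     * the controller registers are touched */
    ret = spi_qup_init_dma(master, res->start);
    if (ret == -EPROBE_DEFER)
        goto error;

    writel_relaxed(0, base + SPI_CONFIG);
    writel_relaxed(SPI_IO_C_NO_TRI_STATE, base + SPI_IO_CONTROL);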

Otherwise it is looking good, and I know that it works :-)

Regards,
Ivan





Re: [PATCH v2] spi: qup: Add DMA capabilities

2015-02-24 Thread Mark Brown
On Tue, Feb 24, 2015 at 03:00:03PM +0200, Stanimir Varbanov wrote:

> +static void spi_qup_dma_done(void *data)
> +{
> +   struct spi_qup *qup = data;
> +
> +   if (atomic_dec_and_test(&qup->dma_outstanding))
> +       complete(&qup->done);
> +}

I'm finding it hard to be thrilled about the use of atomics for
synchronization (they're just generally hard to work with) and...

> +   cookie = dmaengine_submit(desc);
> +   ret = dma_submit_error(cookie);
> +   if (ret)
> +       return ret;

> +   atomic_inc(&qup->dma_outstanding);

...don't we have two potential races here: one if somehow the DMA manages
to complete prior to the atomic_inc() (unlikely but that's what race
conditions are all about really) and one if we are issuing multiple DMAs
and the early ones complete before the later ones are issued?
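For reference, if a counter is kept at all, the accounting has to happen
before anything is submitted. A sketch (untested, and it assumes the
atomic_inc() is dropped from spi_qup_prep_sg()):

static int spi_qup_do_dma(struct spi_master *master, struct spi_transfer *xfer)
{
    struct spi_qup *qup = spi_master_get_devdata(master);
    int ret;

    /* count every expected callback before any descriptor is submitted,
     * so an early-firing callback can neither see a stale counter nor
     * signal completion while the other channel is still being set up */
    atomic_set(&qup->dma_outstanding, !!xfer->rx_buf + !!xfer->tx_buf);

    if (xfer->rx_buf) {
        ret = spi_qup_prep_sg(master, xfer, DMA_DEV_TO_MEM);
        if (ret)
            return ret;

        dma_async_issue_pending(master->dma_rx);
    }

    if (xfer->tx_buf) {
        ret = spi_qup_prep_sg(master, xfer, DMA_MEM_TO_DEV);
        if (ret)
            return ret;

        dma_async_issue_pending(master->dma_tx);
    }

    return 0;
}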




[PATCH v2] spi: qup: Add DMA capabilities

2015-02-24 Thread Stanimir Varbanov
From: Andy Gross 

This patch adds DMA capabilities to the spi-qup driver.  If DMA channels are
present, the QUP will use DMA instead of block mode for transfers to/from SPI
peripherals for transactions larger than the length of a block.

Signed-off-by: Andy Gross 
Signed-off-by: Stanimir Varbanov 
---

This is a reworked version with the review comments addressed:
 - use the SPI core DMA mapping code
 - implement the .can_dma callback
 - use dmaengine APIs to account for deferred probing

First version can be found at [1].

[1] https://lkml.org/lkml/2014/6/26/481

regards,
Stan

 .../devicetree/bindings/spi/qcom,spi-qup.txt |   8 +
 drivers/spi/spi-qup.c                        | 300 +++-
 2 files changed, 293 insertions(+), 15 deletions(-)

diff --git a/Documentation/devicetree/bindings/spi/qcom,spi-qup.txt b/Documentation/devicetree/bindings/spi/qcom,spi-qup.txt
index e2c88df..a0357c7 100644
--- a/Documentation/devicetree/bindings/spi/qcom,spi-qup.txt
+++ b/Documentation/devicetree/bindings/spi/qcom,spi-qup.txt
@@ -33,6 +33,11 @@ Optional properties:
nodes.  If unspecified, a single SPI device without a chip
select can be used.
 
+- dmas: Two DMA channel specifiers following the convention outlined
+in bindings/dma/dma.txt
+- dma-names: Names for the dma channels, if present. There must be at
+least one channel named "tx" for transmit and named "rx" for
+receive.
 
 SPI slave nodes must be children of the SPI master node and can contain
 properties described in Documentation/devicetree/bindings/spi/spi-bus.txt
@@ -51,6 +56,9 @@ Example:
clocks = <&gcc GCC_BLSP2_QUP2_SPI_APPS_CLK>, <&gcc GCC_BLSP2_AHB_CLK>;
clock-names = "core", "iface";
 
+   dmas = <&blsp1_bam 13>, <&blsp2_bam 12>;
+   dma-names = "rx", "tx";
+
pinctrl-names = "default";
pinctrl-0 = <&spi8_default>;
 
diff --git a/drivers/spi/spi-qup.c b/drivers/spi/spi-qup.c
index e7fb5a0..386ae69 100644
--- a/drivers/spi/spi-qup.c
+++ b/drivers/spi/spi-qup.c
@@ -22,6 +22,8 @@
 #include <linux/platform_device.h>
 #include <linux/pm_runtime.h>
 #include <linux/spi/spi.h>
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
 
 #define QUP_CONFIG 0x0000
 #define QUP_STATE  0x0004
@@ -116,6 +118,8 @@
 
 #define SPI_NUM_CHIPSELECTS4
 
+#define SPI_MAX_DMA_XFER   (SZ_64K - 64)
+
 /* high speed mode is when bus rate is greater than 26MHz */
 #define SPI_HS_MIN_RATE        26000000
 #define SPI_MAX_RATE           50000000
@@ -143,6 +147,11 @@ struct spi_qup {
int tx_bytes;
int rx_bytes;
int qup_v1;
+
+   int dma_available;
+   struct dma_slave_config rx_conf;
+   struct dma_slave_config tx_conf;
+   atomic_t dma_outstanding;
 };
 
 
@@ -198,6 +207,16 @@ static int spi_qup_set_state(struct spi_qup *controller, u32 state)
return 0;
 }
 
+static int spi_qup_get_word_sz(struct spi_transfer *xfer)
+{
+   if (xfer->bits_per_word <= 8)
+       return 1;
+
+   if (xfer->bits_per_word <= 16)
+       return 2;
+
+   return 4;
+}
 
 static void spi_qup_fifo_read(struct spi_qup *controller,
                               struct spi_transfer *xfer)
@@ -266,6 +285,101 @@ static void spi_qup_fifo_write(struct spi_qup *controller,
}
 }
 
+static void spi_qup_dma_done(void *data)
+{
+   struct spi_qup *qup = data;
+
+   if (atomic_dec_and_test(&qup->dma_outstanding))
+       complete(&qup->done);
+}
+
+static int spi_qup_prep_sg(struct spi_master *master, struct spi_transfer *xfer,
+                           enum dma_transfer_direction dir)
+{
+   struct spi_qup *qup = spi_master_get_devdata(master);
+   unsigned long flags = DMA_PREP_INTERRUPT | DMA_PREP_FENCE;
+   struct dma_async_tx_descriptor *desc;
+   struct scatterlist *sgl;
+   dma_cookie_t cookie;
+   unsigned int nents;
+   struct dma_chan *chan;
+   int ret;
+
+   if (dir == DMA_MEM_TO_DEV) {
+       chan = master->dma_tx;
+       nents = xfer->tx_sg.nents;
+       sgl = xfer->tx_sg.sgl;
+   } else {
+       chan = master->dma_rx;
+       nents = xfer->rx_sg.nents;
+       sgl = xfer->rx_sg.sgl;
+   }
+
+   desc = dmaengine_prep_slave_sg(chan, sgl, nents, dir, flags);
+   if (!desc)
+       return -EINVAL;
+
+   desc->callback = spi_qup_dma_done;
+   desc->callback_param = qup;
+
+   cookie = dmaengine_submit(desc);
+   ret = dma_submit_error(cookie);
+   if (ret)
+       return ret;
+
+   atomic_inc(&qup->dma_outstanding);
+
+   return 0;
+}
+
+static int spi_qup_do_dma(struct spi_master *master, struct spi_transfer *xfer)
+{
+   struct spi_qup *qup = spi_master_get_devdata(master);
+   int ret;
+