Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-26 Thread Konstantin Dorfman

Hello,

On 11/19/2012 11:34 PM, Per Förlin wrote:

On 11/19/2012 03:32 PM, Per Förlin wrote:

On 11/19/2012 10:48 AM, Konstantin Dorfman wrote:


I'm fine with wait_event() (block request transfers) combined with completion 
(for other transfer), as long as the core.c API is intact. I don't see a reason 
for a new start_data_req()

mmc_start_req() is only used by the mmc block transfers.
Making a test case in mmc_test.c is a good way to stress test the feature (e.g. 
random timing of incoming new requests) and it will show how the API works too.

BR
Per

Right now there are a couple of tests in the mmc_test module that use the async 
request mechanism. After applying my fix (v3 "mmc: fix async request 
mechanism for sequential read scenarios"), these tests were broken.


The attached patch fixes those broken tests and also lets you see exactly 
what the API change is. It is not in the mmc_start_req() signature; it is the new 
context_info field that we need to synchronize with the NEW_REQUEST 
notification. mmc_test does not use the notification, but it is very easy 
to implement.




My main concern is to avoid adding new API to core.c in order to add the 
NEW_REQ feature. My patch is an example of changes to be done in core.c without 
adding new functions.

Now you can review the API changes.





We would like to have a generic capability to handle additional events,
such as HPI/stop flow, in addition to the NEW_REQUEST notification.
Therefore, an event mechanism seems to be a better design than completion.
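As a userspace illustration of why an event mechanism generalizes better than a bare 
completion, here is a minimal sketch (all names such as `eventq` and the `EV_*` flags 
are invented for this example and are not part of the patch): a single wait point whose 
set of events can grow without touching the waiter.

```c
#include <assert.h>
#include <pthread.h>

/* Extensible event set: adding EV_HPI later costs one enum line,
 * whereas a struct completion encodes exactly one wake-up reason. */
enum {
	EV_REQ_DONE = 1u << 0,	/* current transfer finished */
	EV_NEW_REQ  = 1u << 1,	/* block layer queued a new request */
	EV_HPI      = 1u << 2,	/* future event, e.g. HPI/stop flow */
};

struct eventq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned pending;	/* bitmask of posted, unconsumed events */
};

/* Producer side (irq handler / block layer analog): post and wake. */
static void event_post(struct eventq *q, unsigned ev)
{
	pthread_mutex_lock(&q->lock);
	q->pending |= ev;
	pthread_cond_signal(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

/* Waiter side (mmc thread analog): sleep until anything is pending,
 * then return the whole mask and clear it. */
static unsigned event_wait_any(struct eventq *q)
{
	unsigned ev;

	pthread_mutex_lock(&q->lock);
	while (!q->pending)
		pthread_cond_wait(&q->cond, &q->lock);
	ev = q->pending;
	q->pending = 0;
	pthread_mutex_unlock(&q->lock);
	return ev;
}
```

The waiter inspects the returned mask and dispatches per event, which is exactly the 
property a single completion lacks.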

I've looked at the SDIO code and, from what I can understand, right now SDIO
does not use the async request mechanism and works at the 'wait_for_cmd()'
level. This means that work such as exposing MMC core APIs is a major
change and should definitely be a separate design and implementation
effort, while my current patch fixes the mmc thread behavior
and gives better performance for upper layers.

You are right. There is no support in the SDIO implementation for the pre/post/async feature. 
Still, I had it in mind when designing struct mmc_async. It's possible to re-use 
the mmc core parts in order to utilize this support in the SDIO case. I'm not saying we 
should design for SDIO, but we should at least keep the API in a way that makes it potentially 
usable from SDIO too. The API of core.c (start/wait of requests) should make sense without the 
existence of the MMC block driver (block/queue).

One way to test the design is to add a test in mmc_test.c for a pending NEW_REQ. 
In mmc_test.c there is no block.c or queue.c.
If the test case is simple to write in mmc_test.c, it means that the API is at 
a good level.
I can simulate a single NEW_PACKET notification in mmc_test, but this 
will check only correctness; without a real queue of requests we can't 
see or measure the throughput/latency gain of this.
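For the correctness part, the random-timing stress idea can at least be prototyped in 
plain pthreads before touching mmc_test.c; everything below (names, counts, delays) is 
hypothetical and only checks that no NEW_REQ notification is lost, saying nothing about 
throughput or latency:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* One "block layer" thread posts NEW_REQ notifications at random times
 * while the "mmc thread" (the caller) consumes them from a single wait
 * point. Purely a correctness harness: posted must equal consumed. */
struct newreq_q {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	int posted;		/* notifications produced so far */
	int consumed;		/* notifications seen by the waiter */
	bool stop;		/* producer finished */
};

static void *producer(void *arg)
{
	struct newreq_q *q = arg;

	for (int i = 0; i < 100; i++) {
		usleep(rand() % 500);	/* random arrival timing */
		pthread_mutex_lock(&q->lock);
		q->posted++;
		pthread_cond_signal(&q->cond);
		pthread_mutex_unlock(&q->lock);
	}
	pthread_mutex_lock(&q->lock);
	q->stop = true;
	pthread_cond_signal(&q->cond);
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

/* Returns true iff every posted notification was consumed. */
static bool stress_newreq(void)
{
	struct newreq_q q = { PTHREAD_MUTEX_INITIALIZER,
			      PTHREAD_COND_INITIALIZER, 0, 0, false };
	pthread_t t;

	pthread_create(&t, NULL, producer, &q);
	pthread_mutex_lock(&q.lock);
	for (;;) {
		while (q.consumed == q.posted && !q.stop)
			pthread_cond_wait(&q.cond, &q.lock);
		if (q.consumed == q.posted && q.stop)
			break;
		q.consumed++;	/* "fetch" one new request */
	}
	pthread_mutex_unlock(&q.lock);
	pthread_join(t, NULL);
	return q.consumed == q.posted;
}
```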


Kind reminder about patch v3, waiting for your review.

Thanks

--
Konstantin Dorfman
Qualcomm Israel, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
From 240fa5757e9c784679b391ba42ec58e70fe856d9 Mon Sep 17 00:00:00 2001
From: Konstantin Dorfman kdorf...@codeaurora.org
Date: Mon, 26 Nov 2012 11:50:35 +0200
Subject: [RFC/PATCH] mmc_test: fix non-blocking req test to support mmc core
 wait_event mechanism

After introducing the new wait_event synchronization mechanism at the mmc/core/core.c level,
the non-blocking request tests in the mmc_test unit-test module are broken.

This patch fixes the tests by updating the test environment to work
correctly on top of the new wait_event mechanism.

Signed-off-by: Konstantin Dorfman kdorf...@codeaurora.org

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index 759714e..764471f 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -764,7 +764,8 @@ static int mmc_test_check_broken_result(struct mmc_test_card *test,
 static void mmc_test_nonblock_reset(struct mmc_request *mrq,
 struct mmc_command *cmd,
 struct mmc_command *stop,
-struct mmc_data *data)
+struct mmc_data *data,
+struct mmc_context_info *context_info)
 {
 	memset(mrq, 0, sizeof(struct mmc_request));
 	memset(cmd, 0, sizeof(struct mmc_command));
@@ -774,6 +775,7 @@ static void mmc_test_nonblock_reset(struct mmc_request *mrq,
 	mrq->cmd = cmd;
 	mrq->data = data;
 	mrq->stop = stop;
+	mrq->context_info = context_info;
 }
 static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
   struct scatterlist *sg, unsigned sg_len,
@@ -790,6 +792,8 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 	struct mmc_command stop2;
 	struct mmc_data data2;
 
+	struct mmc_context_info context_info;
+
 	struct mmc_test_async_req test_areq[2];
 	struct mmc_async_req *done_areq;
 	struct mmc_async_req *cur_areq = test_areq[0].areq;
@@ -800,14 +804,18 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 	test_areq[0].test = test;
 	test_areq[1].test = test;
 
-	mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
-	

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-20 Thread Konstantin Dorfman

Hello,

On 11/19/2012 11:34 PM, Per Förlin wrote:


 	if (host->areq) {
+		if (!areq)
+			mmc_prefetch_init(host->areq,
+					  &host->areq->mrq->completion);
 		mmc_wait_for_req_done(host, host->areq->mrq);
+		if (!areq) {
+			mmc_prefetch_start(host->areq, false);
+			if (mmc_prefetch_pending(host->areq))
+				return NULL;

In this case, the mmc thread may be unblocked because done() arrived for the
current request and not because of a new request notification. In such a case
we would like the done request to be handled before fetching the new
request.

I agree. If we wake up and there is no pending new request, execution should 
proceed normally by handling the completed request.


In my code, the is_done_rcv flag is used along with the is_new_req flag in
order to differentiate the reason the mmc thread woke up.

In my patch the return value of mmc_prefetch_pending() specifies whether there is a 
pending request (i.e. what should happen next).
If both the current request is done and mmc_prefetch_pending() is true at the same 
time, I see there is a problem with my patch. There needs to be a flag to 
indicate if the current request is done.
There is a race between the done() and NEW_REQUEST events; also, when we get 
both of them, the order is to complete the current request and then to fetch the new one.
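That ordering rule (when both events have fired, complete the current request first, 
then fetch the new one) can be sketched as a userspace analog of the two flags; 
`context_info`, `is_done_rcv` and `is_new_req` follow the patch's naming, but the 
pthread code below is only an illustration, not the kernel implementation:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum wake_reason { WAKE_DONE, WAKE_NEW_REQ };

/* Userspace stand-in for the patch's context_info: one wait point,
 * two possible wake-up reasons. */
struct context_info {
	pthread_mutex_t lock;
	pthread_cond_t wakeup;	/* plays the role of the wait_event() queue */
	bool is_done_rcv;	/* done() arrived for the current request */
	bool is_new_req;	/* block layer queued a new request */
};

/* done() analog: called when the current transfer finishes. */
static void signal_done(struct context_info *ci)
{
	pthread_mutex_lock(&ci->lock);
	ci->is_done_rcv = true;
	pthread_cond_signal(&ci->wakeup);
	pthread_mutex_unlock(&ci->lock);
}

/* NEW_REQUEST analog: called from the request_fn when work arrives. */
static void signal_new_req(struct context_info *ci)
{
	pthread_mutex_lock(&ci->lock);
	ci->is_new_req = true;
	pthread_cond_signal(&ci->wakeup);
	pthread_mutex_unlock(&ci->lock);
}

/* Waiter: block until any event; if both raced in, report the
 * completion first so the done request is handled before fetching. */
static enum wake_reason wait_for_event(struct context_info *ci)
{
	enum wake_reason r;

	pthread_mutex_lock(&ci->lock);
	while (!ci->is_done_rcv && !ci->is_new_req)
		pthread_cond_wait(&ci->wakeup, &ci->lock);
	if (ci->is_done_rcv) {		/* done() wins over NEW_REQUEST */
		ci->is_done_rcv = false;
		r = WAKE_DONE;
	} else {
		ci->is_new_req = false;
		r = WAKE_NEW_REQ;
	}
	pthread_mutex_unlock(&ci->lock);
	return r;
}
```

Even when both flags are raised before the waiter runs, the done event is always 
consumed first, resolving the race in the required order.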




I understand your motivation and the idea of restructuring the patch. It is
still not clear to me how exactly the mmc thread will be awakened on a new
request notification in your version, but I have another problem:


mmc_request_fn() is called and it calls complete(mq->prefetch_completion), which 
wakes up the current thread.
My patch is just an example. The intention is to make the patch cleaner. But I 
may have missed some of the HPI aspects.


Is it the lack of function wrappers that you are referring to in your example?

It's fine to skip the function wrappers if they make no sense.
What I want to avoid is creating new functions for data_req_start and 
data_req_wait.
I would rather add this to the existing functions in core.c and keep the changes in 
block.c and the flow minimal.


As I understand your example, you mean to implement the generic logic at the
core/core.c level by using wrapper functions, and leave the final
implementation for MMC to card/queue.c and for the SDIO layer to
card/..sdio.. (I'm not too familiar with the sdio protocol implementation).
Well, it makes a lot of sense.


I still have plans to add pre/post/async support in the SDIO framework too, 
but I have not got to it yet.
It would be nice to add your NEW_REQ feature along with the other mmc-async 
features, to make it usable from SDIO.



But the devil is in the details: there is a lot of work in
mmc_wait_for_data_req_done(), the done() callback, and also error handling
changes for card/block.c.

Things in core.c could be re-used in the SDIO case. In block.c there is only 
FS-specific implementation, not relevant for SDIO.



Do you think that the wait_event() API does not suit the same semantics as
the completion API?

The waiting mechanism may be wait_event or completion.
I'm not in favor of using both. Better to replace completion with wait_event if 
you prefer.


I'm fine with wait_event() (block request transfers) combined with completion 
(for other transfer), as long as the core.c API is intact. I don't see a reason 
for a new start_data_req()

mmc_start_req() is only used by the mmc block transfers.
The main idea is to change the start_req() waiting point 
(the mmc_wait_for_data_req_done() function) in such a way that external events 
(in the MMC case, coming from the block layer) may wake the MMC context 
from waiting for the current request. This is done by changing the original 
mmc_start_req() function without changing its signature.


May I ask you to look at the [PATCH v2] patch? I think there is no change in 
the core.c API (by core.c API you mean all functions from core.h?).


Also, to use the new request feature of core.c one needs to implement a 
notification similar to mmc_request_fn(), and I do not see difficulties 
in doing that from SDIO.




Making a test case in mmc_test.c is a good way to stress test the feature (e.g. 
random timing of incoming new requests) and it will show how the API works too.

BR
Per


My main concern is to avoid adding new API to core.c in order to add the 
NEW_REQ feature. My patch is an example of changes to be done in core.c without 
adding new functions.



We would like to have a generic capability to handle additional events,
such as HPI/stop flow, in addition to the NEW_REQUEST notification.
Therefore, an event mechanism seems to be a better design than completion.

I've looked at SDIO code and from what I can understand, right now SDIO
is not using async request mechanism and works from 'wait_for_cmd()`
level. This means that such work as exposing MMC core API's is major
change and definitely should be separate design and implementation
effort, while my current patch right now will fix mmc 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-20 Thread Konstantin Dorfman
Correction inside
On Tue, November 20, 2012 6:26 pm, Konstantin Dorfman wrote:
 Hello,

 On 11/19/2012 11:34 PM, Per Förlin wrote:

 	if (host->areq) {
 +		if (!areq)
 +			mmc_prefetch_init(host->areq,
 +					  &host->areq->mrq->completion);
 		mmc_wait_for_req_done(host, host->areq->mrq);
 +		if (!areq) {
 +			mmc_prefetch_start(host->areq, false);
 +			if (mmc_prefetch_pending(host->areq))
 +				return NULL;
 In this case, mmc thread may be unblocked because done() arrived for
 current request and not because new request notification. In such a
 case
 we would like the done request to be handled before fetching the new
 request.
 I agree. We wake up and there is no pending new request execution
 should proceed normally by handling the completed request.

 In my code is_done_rcv flag used along with is_new_req flag in
 order to differentiate the reason for mmc thread awake.
 In my patch the return value of mmc_prefetch_pending() specifies whether
 there is a pending request (i.e. what should happen next).
 If both the current request is done and mmc_prefetch_pending() is true at the
 same time, I see there is a problem with my patch. There needs to be a
 flag to indicate if the current request is done.
 There is race between done() and NEW_REQUEST events, also when we got
 both of them, the order is to complete current and then to fetch new.


 I understand your motivation and idea for re-structure the patch. It
 is
 still not clear for me how exactly mmc thread will be awaken on new
 request notification in your version, but I have another problem:

 mmc_request_fn() is called and it calls
 complete(mq-prefetch_completion) which wakes up the current thread.
 My patch is just an example. The intention is to make the patch
 cleaner. But I may have missed some of the HPI aspects.

 Is it the lack of functions wrappers that you are using in your
 example?
 It's fine to skip the functions wrappers if it makes no sense.
 What I want to avoid is to create new functions for data_req_start and
 data_req_wait.
 I would rather add this to the existing functions in core.c and keep
 changes in block.c and the flow minimal.

 As I understand your example, you mean to implement generic logic on
 core/core.c level by using wrapper functions and leave final
 implementation for MMC to card/queue.c and for SDIO layer to
 card/..sdio.. (I'm not too familiar with sdio protocol
 implementation).
 Well, it is make a lot of sense.

 I still have plans to add pre/post/async support in the SDIO
 framework too but I have not got to it.
 I would be nice to add your NEW_REQ feature along with the other
 mmc-async features, to make it usable from SDIO.


 But the devil is in details - there is a lot of work in
 mmc_wait_for_data_req_done(), done() callback and also error handling
 changes for card/block.c
 Things in core.c could be re-used in the SDIO case. In block.c there
 are only FS specific implementation not relevant for SDIO.


 Do you think, that wait_event() API used not suits the same semantic
 as
 completion API?
 The waiting mechanims may be wait_event or completion.
 I'm not in favor of using both. Better to replace completion with
 wait_event if you prefer.

 I'm fine with wait_event() (block request transfers) combined with
 completion (for other transfer), as long as the core.c API is intact. I
 don't see a reason for a new start_data_req()

 mmc_start_req() is only used by the mmc block transfers.
 The main idea is to change start_req() waiting point
 (mmc_wait_for_data_req_done() function) in such way that external events
 (in the MMC case it is coming from block layer) may awake MMC context
 from waiting for current request. This is done by change the original
 mmc_start_req() function and not changing its signature of
 mmc_start_req().

 May I ask you to see [PATCH v2] patch? I think that is is no change in
 core.c API (by core.c API you mean all functions from core.h?).

[PATCH v3] mmc: fix async request mechanism for sequential read scenarios

 Also to use new request feature of core.c one need to implement
 notification similar to mmc_request_fn() and I do not see difficulties
 to do it from SDIO.


 Making a test case in mmc_test.c is a good way to stress test the
 feature (e.g. random timing of incoming new requests) and it will show
 how the API works too.

 BR
 Per

 My main concern is to avoid adding new API to core.c in order to add
 the NEW_REQ feature. My patch is an example of changes to be done in
 core.c without adding new functions.


 We would like to have a generic capability to handle additional
 events,
 such as HPI/stop flow, in addition to the NEW_REQUEST notification.
 Therefore, an event mechanism seems to be a better design than
 completion.

 I've looked at SDIO code and from what I can understand, right now
 SDIO
 is not using async request mechanism and works from 'wait_for_cmd()`
 level. This means that such 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-19 Thread Konstantin Dorfman
Hello Per,

Thank you for giving me an example, below I'm trying to explain some
logic issues of it and asking you some questions about your vision of
the patch.

On 11/15/2012 06:38 PM, Per Förlin wrote:
 On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
 Hello Per,

 On 11/13/2012 11:10 PM, Per Forlin wrote:
 On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
 kdorf...@codeaurora.org wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch.
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use the current execution
 flow?

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc

 if (arrival_of_new_req) {
   set flag to indicate fetch->new_req
   return NULL;
 }
 -

 in queue.c:
 if (fetch->new_req)
   don't overwrite previous req.

 BR
 Per

 You are talking about using both waiting mechanisms, the old one (wait on
 completion) and the new one (wait_event_interruptible())? But how will the done()
 callback, called in host controller irq context, differentiate
 which one is relevant for the request?

 I think it is better to use single wait point for mmc thread.

 I have sketched a patch to better explain my point. It's not tested and it
 barely compiles :)
 The patch tries to introduce your feature and still keep the same code
 path. And it exposes an API that could be used by SDIO as well.
 The intention of my sketch patch is only to explain what I tried to
 visualize in the pseudo code previously in this thread.
 The outcome of your final patch should be documented here, I think:
 Documentation/mmc/mmc-async-req.txt
 This document is ready; I am attaching it to this mail and it will be included
 in next version of the patch (or a RESEND).

 Here follows my patch sketch:
 
 diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
 index e360a97..08036a1 100644
 --- a/drivers/mmc/card/queue.c
 +++ b/drivers/mmc/card/queue.c
 @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
 		spin_unlock_irq(q->queue_lock);
 
 		if (req || mq->mqrq_prev->req) {
+			if (!req)
+				mmc_prefetch_start(&mq->mqrq_prev->mmc_active,
+						   true);
 			set_current_state(TASK_RUNNING);
 			mq->issue_fn(mq, req);
 		} else {
@@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
 		}
 
 		/* Current request becomes previous request and vice versa. */
-		mq->mqrq_prev->brq.mrq.data = NULL;
-		mq->mqrq_prev->req = NULL;
-		tmp = mq->mqrq_prev;
-		mq->mqrq_prev = mq->mqrq_cur;
-		mq->mqrq_cur = tmp;
+		if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
+			mq->mqrq_prev->brq.mrq.data = NULL;
+			mq->mqrq_prev->req = NULL;
+			tmp = mq->mqrq_prev;
+			mq->mqrq_prev = mq->mqrq_cur;
+			mq->mqrq_cur = tmp;
+		}
 	} while (1);
 	up(&mq->thread_sem);
 
@@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
 		return;
 	}
 
+	if (mq->prefetch_enable) {
+		spin_lock(&mq->prefetch_lock);
+		if (mq->prefetch_completion)
+			complete(mq->prefetch_completion);

Since mq->prefetch_completion init happens only in core.c after
mmc_start_req(NULL), we can miss all new request notifications coming
from fetching the NULL request until then.

+		mq->prefetch_pending = true;
+		spin_unlock(&mq->prefetch_lock);
+	}
+
 	if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
 		wake_up_process(mq->thread);
 }

+static void mmc_req_init(struct mmc_async_req *areq, struct completion *compl)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+
+	spin_lock(&mq->prefetch_lock);
+	mq->prefetch_completion = compl;
+	if (mq->prefetch_pending)
+		complete(mq->prefetch_completion);
+	spin_unlock(&mq->prefetch_lock);
+}
+
+static void mmc_req_start(struct mmc_async_req *areq, bool enable)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+	mq->prefetch_enable = enable;
+}
+
+static bool mmc_req_pending(struct mmc_async_req *areq)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+	return mq->prefetch_pending;
+}
+
 static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
 {
 	struct scatterlist *sg;
@@ -166,6 +204,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct
mmc_card *card,
 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-19 Thread Per Förlin
On 11/19/2012 10:48 AM, Konstantin Dorfman wrote:
 Hello Per,
 
 Thank you for giving me an example, below I'm trying to explain some
 logic issues of it and asking you some questions about your vision of
 the patch.
 
 On 11/15/2012 06:38 PM, Per Förlin wrote:
 On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
 Hello Per,

 On 11/13/2012 11:10 PM, Per Forlin wrote:
 On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
 kdorf...@codeaurora.org wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution 
 flow.

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc

 if (arrival_of_new_req) {
   set flag to indicate fetch->new_req
   return NULL;
 }
 -

 in queue.c:
 if (fetch->new_req)
   don't overwrite previous req.

 BR
 Per

 You are talking about using both waiting mechanisms, old (wait on
 completion) and new - wait_event_interruptible()? But how done()
 callback, called on host controller irq context, will differentiate
 which one is relevant for the request?

 I think it is better to use single wait point for mmc thread.

 I have sketch a patch to better explain my point. It's not tested it
 barely compiles :)
 The patch tries to introduce your feature and still keep the same code
 path. And it exposes an API that could be used by SDIO as well.
 The intention of my sketch patch is only to explain what I tried to
 visualize in the pseudo code previously in this thread.
 The out come of your final patch should be documented here I think:
 Documentation/mmc/mmc-async-req.txt
 This document is ready, attaching it to this mail and will be included
 in next version of the patch (or RESEND).

 Here follows my patch sketch:
 
 diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
 index e360a97..08036a1 100644
 --- a/drivers/mmc/card/queue.c
 +++ b/drivers/mmc/card/queue.c
 @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
 		spin_unlock_irq(q->queue_lock);
 
 		if (req || mq->mqrq_prev->req) {
+			if (!req)
+				mmc_prefetch_start(&mq->mqrq_prev->mmc_active,
+						   true);
 			set_current_state(TASK_RUNNING);
 			mq->issue_fn(mq, req);
 		} else {
@@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
 		}
 
 		/* Current request becomes previous request and vice versa. */
-		mq->mqrq_prev->brq.mrq.data = NULL;
-		mq->mqrq_prev->req = NULL;
-		tmp = mq->mqrq_prev;
-		mq->mqrq_prev = mq->mqrq_cur;
-		mq->mqrq_cur = tmp;
+		if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
+			mq->mqrq_prev->brq.mrq.data = NULL;
+			mq->mqrq_prev->req = NULL;
+			tmp = mq->mqrq_prev;
+			mq->mqrq_prev = mq->mqrq_cur;
+			mq->mqrq_cur = tmp;
+		}
 	} while (1);
 	up(&mq->thread_sem);

 @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
   return;
   }

+	if (mq->prefetch_enable) {
+		spin_lock(&mq->prefetch_lock);
+		if (mq->prefetch_completion)
+			complete(mq->prefetch_completion);
 Since mq->prefetch_completion init happens only in core.c after
 mmc_start_req(NULL), we can miss all new request notifications coming
 from fetching the NULL request until then.
It's a mistake in the patch.
I meant to check mq->prefetch_pending to know whether we need to wait in the first 
place.
That's why mq->prefetch_pending is assigned even if mq->prefetch_completion is 
not set yet.
The patch needs to be updated to take this into account. One could let 
mmc_prefetch_init() return an indication of whether there is already a NEW_REQ pending. 
If that is the case, skip the wait.


+		mq->prefetch_pending = true;
+		spin_unlock(&mq->prefetch_lock);
+	}
+
 	if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
 		wake_up_process(mq->thread);
 }

+static void mmc_req_init(struct mmc_async_req *areq, struct completion *compl)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+
+	spin_lock(&mq->prefetch_lock);
+	mq->prefetch_completion = compl;
+	if (mq->prefetch_pending)
+		complete(mq->prefetch_completion);
+	spin_unlock(&mq->prefetch_lock);
+}
 +
+static void mmc_req_start(struct mmc_async_req *areq, bool enable)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, 
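Per's fix above, letting mmc_prefetch_init() report an already-pending NEW_REQ so the 
caller skips the wait, can be shown with a userspace analog (hypothetical names; a 
plain bool stands in for the kernel completion):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace analog of the prefetch hand-off. The race being fixed:
 * a NEW_REQ can arrive before the waiter registers its completion. */
struct prefetch {
	pthread_mutex_t lock;
	bool pending;		/* NEW_REQ seen (possibly before init) */
	bool *registered;	/* stand-in for mq->prefetch_completion */
};

/* request_fn side: always record the event; complete only if a
 * completion is already registered. */
static void prefetch_notify(struct prefetch *p)
{
	pthread_mutex_lock(&p->lock);
	p->pending = true;		/* never lost, even with no waiter */
	if (p->registered)
		*p->registered = true;	/* complete(mq->prefetch_completion) */
	pthread_mutex_unlock(&p->lock);
}

/* Waiter side: register the completion and learn, atomically, whether
 * a NEW_REQ is already pending; if so, the caller skips waiting. */
static bool prefetch_init(struct prefetch *p, bool *compl)
{
	bool already;

	pthread_mutex_lock(&p->lock);
	p->registered = compl;
	already = p->pending;
	pthread_mutex_unlock(&p->lock);
	return already;
}
```

Because registration and the pending check happen under one lock, the early 
notification window between mmc_start_req(NULL) and init no longer loses events.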

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-19 Thread Per Förlin
On 11/19/2012 03:32 PM, Per Förlin wrote:
 On 11/19/2012 10:48 AM, Konstantin Dorfman wrote:
 Hello Per,

 Thank you for giving me an example, below I'm trying to explain some
 logic issues of it and asking you some questions about your vision of
 the patch.

 On 11/15/2012 06:38 PM, Per Förlin wrote:
 On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
 Hello Per,

 On 11/13/2012 11:10 PM, Per Forlin wrote:
 On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
 kdorf...@codeaurora.org wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution 
 flow.

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc

 if (arrival_of_new_req) {
   set flag to indicate fetch->new_req
   return NULL;
 }
 -

 in queue.c:
 if (fetch->new_req)
   don't overwrite previous req.

 BR
 Per

 You are talking about using both waiting mechanisms, old (wait on
 completion) and new - wait_event_interruptible()? But how done()
 callback, called on host controller irq context, will differentiate
 which one is relevant for the request?

 I think it is better to use single wait point for mmc thread.

 I have sketch a patch to better explain my point. It's not tested it
 barely compiles :)
 The patch tries to introduce your feature and still keep the same code
 path. And it exposes an API that could be used by SDIO as well.
 The intention of my sketch patch is only to explain what I tried to
 visualize in the pseudo code previously in this thread.
 The out come of your final patch should be documented here I think:
 Documentation/mmc/mmc-async-req.txt
 This document is ready, attaching it to this mail and will be included
 in next version of the patch (or RESEND).

 Here follows my patch sketch:
 
 diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
 index e360a97..08036a1 100644
 --- a/drivers/mmc/card/queue.c
 +++ b/drivers/mmc/card/queue.c
 @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
 		spin_unlock_irq(q->queue_lock);
 
 		if (req || mq->mqrq_prev->req) {
+			if (!req)
+				mmc_prefetch_start(&mq->mqrq_prev->mmc_active,
+						   true);
 			set_current_state(TASK_RUNNING);
 			mq->issue_fn(mq, req);
 		} else {
@@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
 		}
 
 		/* Current request becomes previous request and vice versa. */
-		mq->mqrq_prev->brq.mrq.data = NULL;
-		mq->mqrq_prev->req = NULL;
-		tmp = mq->mqrq_prev;
-		mq->mqrq_prev = mq->mqrq_cur;
-		mq->mqrq_cur = tmp;
+		if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
+			mq->mqrq_prev->brq.mrq.data = NULL;
+			mq->mqrq_prev->req = NULL;
+			tmp = mq->mqrq_prev;
+			mq->mqrq_prev = mq->mqrq_cur;
+			mq->mqrq_cur = tmp;
+		}
 	} while (1);
 	up(&mq->thread_sem);

 @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
   return;
   }

+	if (mq->prefetch_enable) {
+		spin_lock(&mq->prefetch_lock);
+		if (mq->prefetch_completion)
+			complete(mq->prefetch_completion);
 Since mq->prefetch_completion init happens only in core.c after
 mmc_start_req(NULL), we can miss all new request notifications coming
 from fetching the NULL request until then.
 It's a mistake in the patch.
 I meant to check mq->prefetch_pending to know whether we need to wait in the 
 first place.
 That's why mq->prefetch_pending is assigned even if mq->prefetch_completion 
 is not set yet.
 The patch needs to be updated to take this into account. One could let 
 mmc_prefetch_init() return an indication of whether there is already a NEW_REQ 
 pending. If that is the case, skip the wait.
 
 
+		mq->prefetch_pending = true;
+		spin_unlock(&mq->prefetch_lock);
+	}
+
 	if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
 		wake_up_process(mq->thread);
 }

+static void mmc_req_init(struct mmc_async_req *areq, struct completion *compl)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+
+	spin_lock(&mq->prefetch_lock);
+	mq->prefetch_completion = compl;
+	if (mq->prefetch_pending)
+		complete(mq->prefetch_completion);
+	spin_unlock(&mq->prefetch_lock);
+}
 +
+static void mmc_req_start(struct mmc_async_req *areq, bool enable)
+{
+	struct mmc_queue *mq =
 +   

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-15 Thread Per Förlin
On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
 Hello Per,
 
 On 11/13/2012 11:10 PM, Per Forlin wrote:
 On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
 kdorf...@codeaurora.org wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution flow.

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc

 if (arrival_of_new_req) {
   set flag to indicate fetch->new_req
   return NULL;
 }
 -

 in queue.c:
 if (fetch->new_req)
   don't overwrite previous req.

 BR
 Per

 You are talking about using both waiting mechanisms, old (wait on
 completion) and new - wait_event_interruptible()? But how done()
 callback, called on host controller irq context, will differentiate
 which one is relevant for the request?

 I think it is better to use single wait point for mmc thread.

 I have sketch a patch to better explain my point. It's not tested it
 barely compiles :)
 The patch tries to introduce your feature and still keep the same code
 path. And it exposes an API that could be used by SDIO as well.
 The intention of my sketch patch is only to explain what I tried to
 visualize in the pseudo code previously in this thread.
 The out come of your final patch should be documented here I think:
 Documentation/mmc/mmc-async-req.txt
 This document is ready, attaching it to this mail and will be included
 in next version of the patch (or RESEND).

 Here follows my patch sketch:
 
 diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
 index e360a97..08036a1 100644
 --- a/drivers/mmc/card/queue.c
 +++ b/drivers/mmc/card/queue.c
 @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
 		spin_unlock_irq(q->queue_lock);
 
 		if (req || mq->mqrq_prev->req) {
+			if (!req)
+				mmc_prefetch_start(&mq->mqrq_prev->mmc_active,
+						   true);
 			set_current_state(TASK_RUNNING);
 			mq->issue_fn(mq, req);
 		} else {
@@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
 		}
 
 		/* Current request becomes previous request and vice versa. */
-		mq->mqrq_prev->brq.mrq.data = NULL;
-		mq->mqrq_prev->req = NULL;
-		tmp = mq->mqrq_prev;
-		mq->mqrq_prev = mq->mqrq_cur;
-		mq->mqrq_cur = tmp;
+		if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
+			mq->mqrq_prev->brq.mrq.data = NULL;
+			mq->mqrq_prev->req = NULL;
+			tmp = mq->mqrq_prev;
+			mq->mqrq_prev = mq->mqrq_cur;
+			mq->mqrq_cur = tmp;
+		}
 	} while (1);
 	up(&mq->thread_sem);

 @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
   return;
   }

+	if (mq->prefetch_enable) {
+		spin_lock(&mq->prefetch_lock);
+		if (mq->prefetch_completion)
+			complete(mq->prefetch_completion);
+		mq->prefetch_pending = true;
+		spin_unlock(&mq->prefetch_lock);
+	}
+
 	if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
 		wake_up_process(mq->thread);
 }

+static void mmc_req_init(struct mmc_async_req *areq, struct completion *compl)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+
+	spin_lock(&mq->prefetch_lock);
+	mq->prefetch_completion = compl;
+	if (mq->prefetch_pending)
+		complete(mq->prefetch_completion);
+	spin_unlock(&mq->prefetch_lock);
+}
+
+static void mmc_req_start(struct mmc_async_req *areq, bool enable)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+	mq->prefetch_enable = enable;
+}
+
+static bool mmc_req_pending(struct mmc_async_req *areq)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+	return mq->prefetch_pending;
+}
+
  static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
  {
   struct scatterlist *sg;
 @@ -166,6 +204,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct
 mmc_card *card,
   int ret;
   struct mmc_queue_req *mqrq_cur = mq-mqrq[0];
   struct mmc_queue_req *mqrq_prev = mq-mqrq[1];
 + spin_lock_init(mq-prefetch_lock);
 + mq-prefetch.wait_req_init = mmc_req_init;
 + mq-prefetch.wait_req_start = mmc_req_start;
 + mq-prefetch.wait_req_pending = mmc_req_pending;
 + mqrq_cur-mmc_active.prefetch = mq-prefetch;
 + 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-14 Thread Konstantin Dorfman
Hello Per,

On 11/13/2012 11:10 PM, Per Forlin wrote:
 On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
 kdorf...@codeaurora.org wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution flow.

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc
 
 if (arrival_of_new_req) {
    set flag to indicate fetch_new_req
   return NULL;
 }
 -
 
 in queue.c:
 if (fetch_new_req)
   don't overwrite previous req.

 BR
 Per

 You are talking about using both waiting mechanisms, the old one (wait on
 completion) and the new one (wait_event_interruptible())? But how will the
 done() callback, called in host controller IRQ context, differentiate
 which one is relevant for the request?
 
 I think it is better to use a single wait point for the mmc thread.
 
 I have sketched a patch to better explain my point. It's not tested; it
 barely compiles :)
 The patch tries to introduce your feature and still keep the same code
 path. And it exposes an API that could be used by SDIO as well.
 The intention of my sketch patch is only to explain what I tried to
 visualize in the pseudo code previously in this thread.
 The outcome of your final patch should be documented here, I think:
 Documentation/mmc/mmc-async-req.txt
This document is ready; I am attaching it to this mail, and it will be
included in the next version of the patch (or a RESEND).
 
 Here follows my patch sketch:
 
 diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
 index e360a97..08036a1 100644
 --- a/drivers/mmc/card/queue.c
 +++ b/drivers/mmc/card/queue.c
 @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
  		spin_unlock_irq(q->queue_lock);
  
  		if (req || mq->mqrq_prev->req) {
 +			if (!req)
 +				mmc_prefetch_start(&mq->mqrq_prev->mmc_active, true);
  			set_current_state(TASK_RUNNING);
  			mq->issue_fn(mq, req);
  		} else {
 @@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
  		}
  
  		/* Current request becomes previous request and vice versa. */
 -		mq->mqrq_prev->brq.mrq.data = NULL;
 -		mq->mqrq_prev->req = NULL;
 -		tmp = mq->mqrq_prev;
 -		mq->mqrq_prev = mq->mqrq_cur;
 -		mq->mqrq_cur = tmp;
 +		if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
 +			mq->mqrq_prev->brq.mrq.data = NULL;
 +			mq->mqrq_prev->req = NULL;
 +			tmp = mq->mqrq_prev;
 +			mq->mqrq_prev = mq->mqrq_cur;
 +			mq->mqrq_cur = tmp;
 +		}
  	} while (1);
  	up(&mq->thread_sem);
  
 @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
  		return;
  	}
  
 +	if (mq->prefetch_enable) {
 +		spin_lock(&mq->prefetch_lock);
 +		if (mq->prefetch_completion)
 +			complete(mq->prefetch_completion);
 +		mq->prefetch_pending = true;
 +		spin_unlock(&mq->prefetch_lock);
 +	}
 +
  	if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
  		wake_up_process(mq->thread);
  }
  
 +static void mmc_req_init(struct mmc_async_req *areq, struct completion *compl)
 +{
 +	struct mmc_queue *mq =
 +		container_of(areq->prefetch, struct mmc_queue, prefetch);
 +
 +	spin_lock(&mq->prefetch_lock);
 +	mq->prefetch_completion = compl;
 +	if (mq->prefetch_pending)
 +		complete(mq->prefetch_completion);
 +	spin_unlock(&mq->prefetch_lock);
 +}
 +
 +static void mmc_req_start(struct mmc_async_req *areq, bool enable)
 +{
 +	struct mmc_queue *mq =
 +		container_of(areq->prefetch, struct mmc_queue, prefetch);
 +	mq->prefetch_enable = enable;
 +}
 +
 +static bool mmc_req_pending(struct mmc_async_req *areq)
 +{
 +	struct mmc_queue *mq =
 +		container_of(areq->prefetch, struct mmc_queue, prefetch);
 +	return mq->prefetch_pending;
 +}
 +
  static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
  {
  	struct scatterlist *sg;
 @@ -166,6 +204,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
  	int ret;
  	struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
  	struct mmc_queue_req *mqrq_prev = &mq->mqrq[1];
 +	spin_lock_init(&mq->prefetch_lock);
 +	mq->prefetch.wait_req_init = mmc_req_init;
 +	mq->prefetch.wait_req_start = mmc_req_start;
 +	mq->prefetch.wait_req_pending = mmc_req_pending;
 +	mqrq_cur->mmc_active.prefetch = &mq->prefetch;
 +	mqrq_prev->mmc_active.prefetch = &mq->prefetch;
  
  	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
  		limit = *mmc_dev(host)->dma_mask;

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-13 Thread Per Forlin
On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
kdorf...@codeaurora.org wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution flow.

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc
 
 if (arrival_of_new_req) {
    set flag to indicate fetch_new_req
   return NULL;
 }
 -
 
 in queue.c:
 if (fetch_new_req)
   don't overwrite previous req.

 BR
 Per

 You are talking about using both waiting mechanisms, the old one (wait on
 completion) and the new one (wait_event_interruptible())? But how will the
 done() callback, called in host controller IRQ context, differentiate
 which one is relevant for the request?
 
 I think it is better to use a single wait point for the mmc thread.

I have sketched a patch to better explain my point. It's not tested; it
barely compiles :)
The patch tries to introduce your feature and still keep the same code
path. And it exposes an API that could be used by SDIO as well.
The intention of my sketch patch is only to explain what I tried to
visualize in the pseudo code previously in this thread.
The outcome of your final patch should be documented here, I think:
Documentation/mmc/mmc-async-req.txt

Here follows my patch sketch:

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index e360a97..08036a1 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
 		spin_unlock_irq(q->queue_lock);
 
 		if (req || mq->mqrq_prev->req) {
+			if (!req)
+				mmc_prefetch_start(&mq->mqrq_prev->mmc_active, true);
 			set_current_state(TASK_RUNNING);
 			mq->issue_fn(mq, req);
 		} else {
@@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
 		}
 
 		/* Current request becomes previous request and vice versa. */
-		mq->mqrq_prev->brq.mrq.data = NULL;
-		mq->mqrq_prev->req = NULL;
-		tmp = mq->mqrq_prev;
-		mq->mqrq_prev = mq->mqrq_cur;
-		mq->mqrq_cur = tmp;
+		if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
+			mq->mqrq_prev->brq.mrq.data = NULL;
+			mq->mqrq_prev->req = NULL;
+			tmp = mq->mqrq_prev;
+			mq->mqrq_prev = mq->mqrq_cur;
+			mq->mqrq_cur = tmp;
+		}
 	} while (1);
 	up(&mq->thread_sem);
 
@@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
 		return;
 	}
 
+	if (mq->prefetch_enable) {
+		spin_lock(&mq->prefetch_lock);
+		if (mq->prefetch_completion)
+			complete(mq->prefetch_completion);
+		mq->prefetch_pending = true;
+		spin_unlock(&mq->prefetch_lock);
+	}
+
 	if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
 		wake_up_process(mq->thread);
 }
 
+static void mmc_req_init(struct mmc_async_req *areq, struct completion *compl)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+
+	spin_lock(&mq->prefetch_lock);
+	mq->prefetch_completion = compl;
+	if (mq->prefetch_pending)
+		complete(mq->prefetch_completion);
+	spin_unlock(&mq->prefetch_lock);
+}
+
+static void mmc_req_start(struct mmc_async_req *areq, bool enable)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+	mq->prefetch_enable = enable;
+}
+
+static bool mmc_req_pending(struct mmc_async_req *areq)
+{
+	struct mmc_queue *mq =
+		container_of(areq->prefetch, struct mmc_queue, prefetch);
+	return mq->prefetch_pending;
+}
+
 static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
 {
 	struct scatterlist *sg;
@@ -166,6 +204,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 	int ret;
 	struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
 	struct mmc_queue_req *mqrq_prev = &mq->mqrq[1];
+	spin_lock_init(&mq->prefetch_lock);
+	mq->prefetch.wait_req_init = mmc_req_init;
+	mq->prefetch.wait_req_start = mmc_req_start;
+	mq->prefetch.wait_req_pending = mmc_req_pending;
+	mqrq_cur->mmc_active.prefetch = &mq->prefetch;
+	mqrq_prev->mmc_active.prefetch = &mq->prefetch;
 
 	if (mmc_dev(host)->dma_mask && *mmc_dev(host)->dma_mask)
 		limit = *mmc_dev(host)->dma_mask;
diff --git a/drivers/mmc/card/queue.h 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-30 Thread Konstantin Dorfman
Hello,

On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,
 
 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution flow.
 
 Is the following flow feasible?
 
 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc
 
 if (arrival_of_new_req) {
    set flag to indicate fetch_new_req
   return NULL;
 }
 -
 
 in queue.c:
 if (fetch_new_req)
   don't overwrite previous req.
 
 BR
 Per

You are talking about using both waiting mechanisms, the old one (wait on
completion) and the new one (wait_event_interruptible())? But how will the
done() callback, called in host controller IRQ context, differentiate
which one is relevant for the request?

I think it is better to use a single wait point for the mmc thread.

Also, in the future we plan to add a second reason to wake up the mmc
thread from waiting: urgent_request, a notification about a next-to-fetch
read request that should be executed out of order, using the new eMMC HPI
feature (to stop the currently running request).

I will publish this patch soon.

Your idea with the fetch_new_req flag is good; I'll try to simplify the
current patch and let's talk again.

Thanks,
-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-30 Thread Konstantin Dorfman
On 10/30/2012 09:45 AM, Per Forlin wrote:
 minor clarification,
 
 On Mon, Oct 29, 2012 at 10:40 PM, Per Forlin per.l...@gmail.com wrote:


 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
  request_is_ongoing (in case of resend the request is never ongoing)
 
   wait_for_both_mmc_and_arrival_of_new_req
 We should wake up if any of the two events occur.
 
 else
   wait_only_for_mmc
 
 if (arrival_of_new_req) {
    set flag to indicate fetch_new_req
   return NULL;
 }
 -
 
 in queue.c:
 if (fetch_new_req)
   don't overwrite previous req.

 
 This is somewhat a subset of your patch. Maybe I'm missing parts
 of the complexity.
 I haven't figured out why a new mmc_start_data_req() is needed; only a
 new mechanism for waiting is required.
You're right, there was no reason to add a new function; we can use
mmc_start_req() and just change the mechanism for waiting. I will fix this
to minimize changes.

-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-30 Thread Per Forlin
Hi,

On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
kdorf...@codeaurora.org wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution flow.

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc
 
 if (arrival_of_new_req) {
    set flag to indicate fetch_new_req
   return NULL;
 }
 -
 
 in queue.c:
 if (fetch_new_req)
   don't overwrite previous req.

 BR
 Per

 You are talking about using both waiting mechanisms, the old one (wait on
 completion) and the new one (wait_event_interruptible())? But how will the
 done() callback, called in host controller IRQ context, differentiate
 which one is relevant for the request?
 
 I think it is better to use a single wait point for the mmc thread.
 
 Also, in the future we plan to add a second reason to wake up the mmc
 thread from waiting: urgent_request, a notification about a next-to-fetch
 read request that should be executed out of order, using the new eMMC HPI
 feature (to stop the currently running request).
 
 I will publish this patch soon.
 
 Your idea with the fetch_new_req flag is good; I'll try to simplify the
 current patch and let's talk again.

I have not thought this through entirely, but it would be nice to add
a new func() in areq to add support for this new mechanism. If the
func() is NULL the normal flow will be executed. This could be used in
the SDIO case too. At least it should be possible to use only the core
API and have it still make sense without an mmc block driver on top.

How to implement this wait function?
There should be one single wait point, I agree.
One could share the mmc completion, perhaps through the new func().
This means both mmc-queue and mmc-host complete the same completion.
When waking up from the completion one needs to stop at the new func
wait point and check if a new request is available. If yes, fetch the
new request, and next time don't re-init the completion (that would
overwrite the mmc-host complete value).

Let's talk again when you have a new patch set.

BR
Per

 Thanks,
 --
 Konstantin Dorfman,
 QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
 Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-29 Thread Per Forlin
Hi,

I would like to move the focus of my concerns from root cause analysis
to the actual patch,
My first reflection is that the patch is relatively complex and some
of the code looks duplicated.
Would it be possible to simplify it and re-use  the current execution flow.

Is the following flow feasible?

in mmc_start_req():
--
if (rqc == NULL && not_resend)
  wait_for_both_mmc_and_arrival_of_new_req
else
  wait_only_for_mmc

if (arrival_of_new_req) {
   set flag to indicate fetch_new_req
  return NULL;
}
-

in queue.c:
if (fetch_new_req)
  don't overwrite previous req.

BR
Per



On Sun, Oct 28, 2012 at 2:12 PM, Konstantin Dorfman
kdorf...@codeaurora.org wrote:
 Hello,

 On 10/26/2012 02:07 PM, Venkatraman S wrote:


 Actually there could be a lot of reasons why the block layer or CFQ would
 not have inserted the request into the queue, i.e. you can see a lot of
 exit paths where blk_peek_request() returns NULL even though there could
 be a request pending in one of the CFQ request queues.
 
 Essentially you need to check what makes blk_fetch_request()
 (cfq_dispatch_requests()) return NULL when there is an element in the
 queue, if at all.


 It is not important why the block layer causes blk_fetch_request() (or
 blk_peek_request()) to return NULL to the MMC layer. What is really
 important is that the MMC layer should always be able to fetch a newly
 arrived request ASAP after the block layer notification (mmc_request()),
 and this is what my fix implements.
 
 And the fix does not change block layer behavior.

 Thanks

 --
 Konstantin Dorfman,
 QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
 Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-28 Thread Konstantin Dorfman
Hello,

On 10/26/2012 02:07 PM, Venkatraman S wrote:

 
 Actually there could be a lot of reasons why the block layer or CFQ would
 not have inserted the request into the queue, i.e. you can see a lot of
 exit paths where blk_peek_request() returns NULL even though there could
 be a request pending in one of the CFQ request queues.
 
 Essentially you need to check what makes blk_fetch_request()
 (cfq_dispatch_requests()) return NULL when there is an element in the
 queue, if at all.
 

It is not important why the block layer causes blk_fetch_request() (or
blk_peek_request()) to return NULL to the MMC layer. What is really
important is that the MMC layer should always be able to fetch a newly
arrived request ASAP after the block layer notification (mmc_request()),
and this is what my fix implements.

And the fix does not change block layer behavior.

Thanks

-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-26 Thread Venkatraman S
On Thu, Oct 25, 2012 at 6:58 PM, Konstantin Dorfman
kdorf...@codeaurora.org wrote:
 On 10/24/2012 07:07 PM, Per Förlin wrote:
 On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
 Hello Per,

 On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
 When mmcqd reports on completion of a request there should be
 a context switch to allow the insertion of the next read ahead BIOs
 to the block layer. Since the mmcqd tries to fetch another request
 immediately after the completion of the previous request it gets NULL
 and starts waiting for the completion of the previous request.
 This wait on completion gives the FS the opportunity to insert the next
 request but the MMC layer is already blocked on the previous request
 completion and is not aware of the new request waiting to be fetched.
 I thought that I could trigger a context switch in order to give
 execution time for FS to add the new request to the MMC queue.
 I made a simple hack to call yield() in case the request gets NULL. I
 thought it may give the FS layer enough time to add a new request to
 the MMC queue. This would not delay the MMC transfer since the yield()
 is done in parallel with an ongoing transfer. Anyway it was just meant
 to be a simple test.

 One yield was not enough. Just for sanity check I added a msleep as
 well and that was enough to let FS add a new request,
 Would it be possible to gain throughput by delaying the fetch of new
 request? To avoid unnecessary NULL requests

 If (ongoing request is read AND size is max read ahead AND new request
 is NULL) yield();

 BR
 Per
 We did the same experiment and it will not give maximum possible
 performance. There is no guarantee that the context switch which was
 manually caused by the MMC layer comes just in time: when it is early,
 the next fetch still results in NULL; when it is late, we miss the
 possibility to fetch/prepare a new request.
 
 Any delay in fetching the new request after it has arrived hurts
 throughput and latency.
 
 The solution we are talking about here will fix not only the situation
 with the FS read-ahead mechanism; it will also remove the penalty of the
 MMC context waiting on completion while a new request arrives.

 Thanks,

 It seems strange that the block layer cannot keep up with relatively slow 
 flash media devices. There must be a limitation on number of outstanding 
 request towards MMC.
 I need to make up my mind if it's the best way to address this issue in the 
 MMC framework or block layer. I have started to look into the block layer 
 code but it will take some time to dig out the relevant parts.

 BR
 Per

 The root cause of the issue is that the current design is an incomplete
 implementation of the well-known producer-consumer problem solution (the
 producer is the block layer, the consumer is the MMC layer).
 Classic definitions state that the buffer is of fixed size; in our case
 we have a queue, so the producer is always capable of putting a new
 request into the queue.
 The consumer context is blocked when both buffers (curr and prev) are
 busy (the first has started its execution on the bus, the second is
 fetched and waiting for the first).
 The producer context is considered blocked when the FS (or another bio
 source) has no requests to put into the queue.
 To maximize performance, two notifications should be used:
 1. The producer notifies the consumer about a new item to process.
 2. The consumer notifies the producer about a free place.
 
 In our case the 2nd notification is not needed since, as I said before,
 there is always free space in the queue.
 The 1st notification does not exist, i.e. the block layer has no way to
 notify the mmc layer that a new request has arrived.

Being nitpicky, I think it contradicts the commit log that you have
for the patch:
Quote
When the block layer notifies the MMC layer on a new request, we check
for the above case where MMC layer is waiting on a previous request
completion and the current request is NULL.
/Quote


 What you are suggesting is to resolve a specific case, where the FS
 READ_AHEAD mechanism's behavior causes delays in producing new requests.
 Probably you can resolve this specific case, but do you have a guarantee
 that this is the only case that causes delays between new request events?
 Flash memory devices these days are constantly improving on all levels:
 NAND, firmware, bus speed and host controller capabilities; this makes
 any yield/sleep/timeout solution only a temporary hack.



Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-26 Thread Venkatraman S
On Thu, Oct 25, 2012 at 8:32 PM, Per Förlin per.for...@stericsson.com wrote:
 On 10/25/2012 03:28 PM, Konstantin Dorfman wrote:
 On 10/24/2012 07:07 PM, Per Förlin wrote:
 On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
 Hello Per,

 On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
 When mmcqd reports on completion of a request there should be
 a context switch to allow the insertion of the next read ahead BIOs
 to the block layer. Since the mmcqd tries to fetch another request
 immediately after the completion of the previous request it gets NULL
 and starts waiting for the completion of the previous request.
 This wait on completion gives the FS the opportunity to insert the next
 request but the MMC layer is already blocked on the previous request
 completion and is not aware of the new request waiting to be fetched.
 I thought that I could trigger a context switch in order to give
 execution time for FS to add the new request to the MMC queue.
 I made a simple hack to call yield() in case the request gets NULL. I
 thought it may give the FS layer enough time to add a new request to
 the MMC queue. This would not delay the MMC transfer since the yield()
 is done in parallel with an ongoing transfer. Anyway it was just meant
 to be a simple test.

 One yield was not enough. Just for sanity check I added a msleep as
 well and that was enough to let FS add a new request,
 Would it be possible to gain throughput by delaying the fetch of new
 request? To avoid unnecessary NULL requests

 If (ongoing request is read AND size is max read ahead AND new request
 is NULL) yield();

 BR
 Per
 We did the same experiment and it will not give maximum possible
 performance. There is no guarantee that the context switch which was
 manually caused by the MMC layer comes just in time: when it was early
 then next fetch still results in NULL, when it was later, then we miss
 possibility to fetch/prepare new request.

 Any delay in fetch of the new request that comes after the new request has
 arrived hits throughput and latency.

 The solution we are talking about here will fix not only situation with FS
 read ahead mechanism, but also it will remove penalty of the MMC context
 waiting on completion while any new request arrives.

 Thanks,

 It seems strange that the block layer cannot keep up with relatively slow 
 flash media devices. There must be a limitation on number of outstanding 
 request towards MMC.
 I need to make up my mind if it's the best way to address this issue in the 
 MMC framework or block layer. I have started to look into the block layer 
 code but it will take some time to dig out the relevant parts.

 BR
 Per

 The root cause of the issue in incompletion of the current design with
 well known producer-consumer problem solution (producer is block layer,
 consumer is mmc layer).
 Classic definitions states that the buffer is fix size, in our case we
 have queue, so Producer always capable to put new request into the queue.
 Consumer context blocked when both buffers (curr and prev) are busy
 (first started its execution on the bus, second is fetched and waiting
 for the first).
 This happens but I thought that the block layer would continue to add request 
 to the MMC queue while the consumer is busy.
 When consumer fetches request from the queue again there should be several 
 requests available in the queue, but there is only one.

 Producer context considered to be blocked when FS (or others bio
 sources) has no requests to put into queue.
 Does the block layer ever wait for outstanding request to finish? Could this 
 be another reason why the producer doesn't add new requests on the MMC queue?


Actually there could a lot of reasons why block layer or CFQ would not have
inserted the request into the queue. i.e. you can see a lot of exit paths
where blk_peek_request returns NULL, even though there could be any request
pending in one of the CFQ requests queues.

Essentially you need to check what makes blk_fetch_request
(cfq_dispatch_requests() ) return NULL when there is an element in
queue, if at all.

 To maximize performance there are 2 notifications should be used:
 1. Producer notifies Consumer about new item to proceed.
 2. Consumer notifies Producer about free place.

 In our case 2nd notification not need since as I said before - it is
 always free space in the queue.
 There is no such notification as 1st, i.e. block layer has no way to
 notify mmc layer about new request arrived.

 What you suggesting is to resolve specific case, when FS READ_AHEAD
 mechanism behavior causes delays in producing new requests.
 Probably you can resolve this specific case, but do you have guarantee
 that this is only case that causes delays between new requests events?
 Flash memory devices these days constantly improved on all levels: NAND,
 firmware, bus speed and host controller capabilities, this makes any
 yield/sleep/timeouts solution only temporary hacks.
 I never meant yield or sleep to be a 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-25 Thread Per Förlin
On 10/25/2012 03:28 PM, Konstantin Dorfman wrote:
 On 10/24/2012 07:07 PM, Per Förlin wrote:
 On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
 Hello Per,

 On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
 When mmcqd reports on completion of a request there should be
 a context switch to allow the insertion of the next read ahead BIOs
 to the block layer. Since the mmcqd tries to fetch another request
 immediately after the completion of the previous request it gets NULL
 and starts waiting for the completion of the previous request.
 This wait on completion gives the FS the opportunity to insert the next
 request but the MMC layer is already blocked on the previous request
 completion and is not aware of the new request waiting to be fetched.
 I thought that I could trigger a context switch in order to give
 execution time for FS to add the new request to the MMC queue.
 I made a simple hack to call yield() in case the request gets NULL. I
 thought it may give the FS layer enough time to add a new request to
 the MMC queue. This would not delay the MMC transfer since the yield()
 is done in parallel with an ongoing transfer. Anyway it was just meant
 to be a simple test.

 One yield was not enough. Just for sanity check I added a msleep as
 well and that was enough to let FS add a new request,
 Would it be possible to gain throughput by delaying the fetch of new
 request? To avoid unnecessary NULL requests

 If (ongoing request is read AND size is max read ahead AND new request
 is NULL) yield();

 BR
 Per
 We did the same experiment and it will not give maximum possible
 performance. There is no guarantee that the context switch which was
 manually caused by the MMC layer comes just in time: when it was early
 then next fetch still results in NULL, when it was later, then we miss
 possibility to fetch/prepare new request.

 Any delay in fetch of the new request that comes after the new request has
 arrived hits throughput and latency.

 The solution we are talking about here will fix not only situation with FS
 read ahead mechanism, but also it will remove penalty of the MMC context
 waiting on completion while any new request arrives.

 Thanks,

 It seems strange that the block layer cannot keep up with relatively slow 
 flash media devices. There must be a limitation on number of outstanding 
 request towards MMC.
 I need to make up my mind if it's the best way to address this issue in the 
 MMC framework or block layer. I have started to look into the block layer 
 code but it will take some time to dig out the relevant parts.

 BR
 Per

 The root cause of the issue in incompletion of the current design with
 well known producer-consumer problem solution (producer is block layer,
 consumer is mmc layer).
 Classic definitions states that the buffer is fix size, in our case we
 have queue, so Producer always capable to put new request into the queue.
 Consumer context blocked when both buffers (curr and prev) are busy
 (first started its execution on the bus, second is fetched and waiting
 for the first).
This happens, but I thought that the block layer would continue to add
requests to the MMC queue while the consumer is busy.
When the consumer fetches a request from the queue again, there should be
several requests available in the queue, but there is only one.

 The producer context is considered blocked when the FS (or other bio
 sources) has no requests to put into the queue.
Does the block layer ever wait for an outstanding request to finish? Could 
this be another reason why the producer doesn't add new requests to the MMC 
queue?

 To maximize performance, two notifications should be used:
 1. The producer notifies the consumer about a new item to process.
 2. The consumer notifies the producer about free space.

 In our case the 2nd notification is not needed since, as I said before,
 there is always free space in the queue.
 The 1st notification does not exist, i.e. the block layer has no way to
 notify the MMC layer that a new request has arrived.
 
 What you are suggesting resolves a specific case, in which the behavior
 of the FS READ_AHEAD mechanism causes delays in producing new requests.
 You can probably resolve this specific case, but do you have a guarantee
 that it is the only one that causes delays between new-request events?
 Flash memory devices are constantly improving at all levels: NAND,
 firmware, bus speed and host controller capabilities; this makes any
 yield/sleep/timeout solution only a temporary hack.
I never meant yield or sleep to be a permanent fix. I was only curious how 
it would affect the performance, in order to gain a better understanding of 
the root cause.
My impression is that even if the SD card is very slow you will see the same 
effect. The behavior of the block layer in this case is not related to the 
speed of the flash memory.
On a slow card the MMC-queue runs empty just like it does for a fast eMMC.
According to you the block layer should have a better chance to feed the MMC 
queue if the card is slow (more 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-24 Thread Konstantin Dorfman
Hello Per,

On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
 When mmcqd reports on completion of a request there should be
 a context switch to allow the insertion of the next read ahead BIOs
 to the block layer. Since the mmcqd tries to fetch another request
 immediately after the completion of the previous request it gets NULL
 and starts waiting for the completion of the previous request.
 This wait on completion gives the FS the opportunity to insert the next
 request but the MMC layer is already blocked on the previous request
 completion and is not aware of the new request waiting to be fetched.
 I thought that I could trigger a context switch in order to give
 execution time for FS to add the new request to the MMC queue.
 I made a simple hack to call yield() in case the request gets NULL. I
 thought it may give the FS layer enough time to add a new request to
 the MMC queue. This would not delay the MMC transfer since the yield()
 is done in parallel with an ongoing transfer. Anyway it was just meant
 to be a simple test.

 One yield was not enough. Just as a sanity check I added a msleep as
 well, and that was enough to let the FS add a new request.
 Would it be possible to gain throughput by delaying the fetch of a new
 request? To avoid unnecessary NULL requests.

 If (ongoing request is read AND size is max read ahead AND new request
 is NULL) yield();

 BR
 Per
We did the same experiment, and it will not give the maximum possible
performance. There is no guarantee that a context switch caused manually
by the MMC layer comes just in time: if it comes too early, the next
fetch still returns NULL; if it comes too late, we miss the opportunity
to fetch/prepare the new request.

Any delay in fetching the new request after it has arrived hurts both
throughput and latency.

The solution we are talking about here will fix not only the situation
with the FS read-ahead mechanism; it will also remove the penalty of the
MMC context waiting on a completion while a new request arrives.

Thanks,

-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

--
To unsubscribe from this list: send the line unsubscribe linux-mmc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-24 Thread Per Förlin
On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
 Hello Per,

 On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
 When mmcqd reports on completion of a request there should be
 a context switch to allow the insertion of the next read ahead BIOs
 to the block layer. Since the mmcqd tries to fetch another request
 immediately after the completion of the previous request it gets NULL
 and starts waiting for the completion of the previous request.
 This wait on completion gives the FS the opportunity to insert the next
 request but the MMC layer is already blocked on the previous request
 completion and is not aware of the new request waiting to be fetched.
 I thought that I could trigger a context switch in order to give
 execution time for FS to add the new request to the MMC queue.
 I made a simple hack to call yield() in case the request gets NULL. I
 thought it may give the FS layer enough time to add a new request to
 the MMC queue. This would not delay the MMC transfer since the yield()
 is done in parallel with an ongoing transfer. Anyway it was just meant
 to be a simple test.

 One yield was not enough. Just as a sanity check I added a msleep as
 well, and that was enough to let the FS add a new request.
 Would it be possible to gain throughput by delaying the fetch of a new
 request? To avoid unnecessary NULL requests.

 If (ongoing request is read AND size is max read ahead AND new request
 is NULL) yield();

 BR
 Per
 We did the same experiment, and it will not give the maximum possible
 performance. There is no guarantee that a context switch caused manually
 by the MMC layer comes just in time: if it comes too early, the next
 fetch still returns NULL; if it comes too late, we miss the opportunity
 to fetch/prepare the new request.

 Any delay in fetching the new request after it has arrived hurts both
 throughput and latency.

 The solution we are talking about here will fix not only the situation
 with the FS read-ahead mechanism; it will also remove the penalty of the
 MMC context waiting on a completion while a new request arrives.

 Thanks,


It seems strange that the block layer cannot keep up with relatively slow flash 
media devices. There must be a limitation on the number of outstanding requests 
towards MMC.
I need to make up my mind whether it is best to address this issue in the MMC 
framework or in the block layer. I have started to look into the block layer 
code, but it will take some time to dig out the relevant parts.

BR
Per



Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-21 Thread Per Forlin
On Mon, Oct 15, 2012 at 5:36 PM, Konstantin Dorfman
kdorf...@codeaurora.org wrote:
 The main assumption of the async request design is that the file
 system adds block requests to the block device queue asynchronously
 without waiting for completion (see the Rationale section of
 https://wiki.linaro.org/WorkingGroups/Kernel/Specs
 /StoragePerfMMC-async-req).

 We found out that in case of sequential read operations this is not
 the case, due to the read ahead mechanism.
 When mmcqd reports on completion of a request there should be
 a context switch to allow the insertion of the next read ahead BIOs
 to the block layer. Since the mmcqd tries to fetch another request
 immediately after the completion of the previous request it gets NULL
 and starts waiting for the completion of the previous request.
 This wait on completion gives the FS the opportunity to insert the next
 request but the MMC layer is already blocked on the previous request
 completion and is not aware of the new request waiting to be fetched.
I thought that I could trigger a context switch in order to give
execution time for FS to add the new request to the MMC queue.
I made a simple hack to call yield() in case the request gets NULL. I
thought it may give the FS layer enough time to add a new request to
the MMC queue. This would not delay the MMC transfer since the yield()
is done in parallel with an ongoing transfer. Anyway it was just meant
to be a simple test.

One yield was not enough. Just as a sanity check I added a msleep as
well, and that was enough to let the FS add a new request.
Would it be possible to gain throughput by delaying the fetch of a new
request? To avoid unnecessary NULL requests.

If (ongoing request is read AND size is max read ahead AND new request
is NULL) yield();

BR
Per


 This patch fixes the above behavior and allows the async request mechanism
 to work in case of sequential read scenarios.
 The main idea is to replace the blocking wait for a completion with an
 event driven mechanism and add a new event of new_request.
 When the block layer notifies the MMC layer on a new request, we check
 for the above case where MMC layer is waiting on a previous request
 completion and the current request is NULL.
 In such a case the new_request event will be triggered to wakeup
 the waiting thread. MMC layer will then fetch the new request
 and after its preparation will go back to waiting on the completion.

 Our tests showed that this fix improves the read sequential throughput
 by 16%.

 Signed-off-by: Konstantin Dorfman kdorf...@codeaurora.org

 diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
 index 172a768..4d6431b 100644
 --- a/drivers/mmc/card/block.c
 +++ b/drivers/mmc/card/block.c
 @@ -112,17 +112,6 @@ struct mmc_blk_data {

  static DEFINE_MUTEX(open_lock);

 -enum mmc_blk_status {
 -   MMC_BLK_SUCCESS = 0,
 -   MMC_BLK_PARTIAL,
 -   MMC_BLK_CMD_ERR,
 -   MMC_BLK_RETRY,
 -   MMC_BLK_ABORT,
 -   MMC_BLK_DATA_ERR,
 -   MMC_BLK_ECC_ERR,
 -   MMC_BLK_NOMEDIUM,
 -};
 -
  module_param(perdev_minors, int, 0444);
  MODULE_PARM_DESC(perdev_minors, "Minors numbers to allocate per device");

 @@ -1224,6 +1213,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req 
 *mqrq,
 }

 mqrq->mmc_active.mrq = &brq->mrq;
 +   mqrq->mmc_active.mrq->sync_data = &mq->sync_data;
 mqrq->mmc_active.err_check = mmc_blk_err_check;

 mmc_queue_bounce_pre(mqrq);
 @@ -1284,9 +1274,12 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
 struct request *rqc)
 areq = &mq->mqrq_cur->mmc_active;
 } else
 areq = NULL;
 -   areq = mmc_start_req(card->host, areq, (int *) &status);
 -   if (!areq)
 +   areq = mmc_start_data_req(card->host, areq, (int *)&status);
 +   if (!areq) {
 +   if (status == MMC_BLK_NEW_REQUEST)
 +   return status;
 return 0;
 +   }

 mq_rq = container_of(areq, struct mmc_queue_req, mmc_active);
 brq = &mq_rq->brq;
 @@ -1295,6 +1288,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
 struct request *rqc)
 mmc_queue_bounce_post(mq_rq);

 switch (status) {
 +   case MMC_BLK_NEW_REQUEST:
 +   BUG_ON(1); /* should never get here */
 case MMC_BLK_SUCCESS:
 case MMC_BLK_PARTIAL:
 /*
 @@ -1367,7 +1362,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
 struct request *rqc)
  * prepare it again and resend.
  */
 mmc_blk_rw_rq_prep(mq_rq, card, disable_multi, mq);
 -   mmc_start_req(card->host, &mq_rq->mmc_active, NULL);
 +   mmc_start_data_req(card->host, &mq_rq->mmc_active,
 +   

[PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-15 Thread Konstantin Dorfman
The main assumption of the async request design is that the file
system adds block requests to the block device queue asynchronously
without waiting for completion (see the Rationale section of
https://wiki.linaro.org/WorkingGroups/Kernel/Specs
/StoragePerfMMC-async-req).

We found out that in case of sequential read operations this is not
the case, due to the read ahead mechanism.
When mmcqd reports on completion of a request there should be
a context switch to allow the insertion of the next read ahead BIOs
to the block layer. Since the mmcqd tries to fetch another request
immediately after the completion of the previous request it gets NULL
and starts waiting for the completion of the previous request.
This wait on completion gives the FS the opportunity to insert the next
request but the MMC layer is already blocked on the previous request
completion and is not aware of the new request waiting to be fetched.

This patch fixes the above behavior and allows the async request mechanism
to work in case of sequential read scenarios.
The main idea is to replace the blocking wait for a completion with an
event driven mechanism and add a new event of new_request.
When the block layer notifies the MMC layer on a new request, we check
for the above case where MMC layer is waiting on a previous request
completion and the current request is NULL.
In such a case the new_request event will be triggered to wakeup
the waiting thread. MMC layer will then fetch the new request
and after its preparation will go back to waiting on the completion.

Our tests showed that this fix improves the read sequential throughput
by 16%.

Signed-off-by: Konstantin Dorfman kdorf...@codeaurora.org

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 172a768..4d6431b 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -112,17 +112,6 @@ struct mmc_blk_data {
 
 static DEFINE_MUTEX(open_lock);
 
-enum mmc_blk_status {
-   MMC_BLK_SUCCESS = 0,
-   MMC_BLK_PARTIAL,
-   MMC_BLK_CMD_ERR,
-   MMC_BLK_RETRY,
-   MMC_BLK_ABORT,
-   MMC_BLK_DATA_ERR,
-   MMC_BLK_ECC_ERR,
-   MMC_BLK_NOMEDIUM,
-};
-
 module_param(perdev_minors, int, 0444);
 MODULE_PARM_DESC(perdev_minors, "Minors numbers to allocate per device");
 
@@ -1224,6 +1213,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
}
 
mqrq->mmc_active.mrq = &brq->mrq;
+   mqrq->mmc_active.mrq->sync_data = &mq->sync_data;
mqrq->mmc_active.err_check = mmc_blk_err_check;
 
mmc_queue_bounce_pre(mqrq);
@@ -1284,9 +1274,12 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
areq = &mq->mqrq_cur->mmc_active;
} else
areq = NULL;
-   areq = mmc_start_req(card->host, areq, (int *) &status);
-   if (!areq)
+   areq = mmc_start_data_req(card->host, areq, (int *)&status);
+   if (!areq) {
+   if (status == MMC_BLK_NEW_REQUEST)
+   return status;
return 0;
+   }

mq_rq = container_of(areq, struct mmc_queue_req, mmc_active);
brq = &mq_rq->brq;
@@ -1295,6 +1288,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
mmc_queue_bounce_post(mq_rq);
 
switch (status) {
+   case MMC_BLK_NEW_REQUEST:
+   BUG_ON(1); /* should never get here */
case MMC_BLK_SUCCESS:
case MMC_BLK_PARTIAL:
/*
@@ -1367,7 +1362,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
 * prepare it again and resend.
 */
mmc_blk_rw_rq_prep(mq_rq, card, disable_multi, mq);
-   mmc_start_req(card->host, &mq_rq->mmc_active, NULL);
+   mmc_start_data_req(card->host, &mq_rq->mmc_active,
+   (int *)&status);
}
} while (ret);
 
@@ -1382,7 +1378,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
  start_new_req:
if (rqc) {
mmc_blk_rw_rq_prep(mq->mqrq_cur, card, 0, mq);
-   mmc_start_req(card->host, &mq->mqrq_cur->mmc_active, NULL);
+   mmc_start_data_req(card->host, &mq->mqrq_cur->mmc_active, NULL);
}
 
return 0;
@@ -1426,9 +1422,10 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct 
request *req)
}
 
 out:
-   if (!req)
+   if (!req && (ret != MMC_BLK_NEW_REQUEST))
/* release host only when there are no more requests */
mmc_release_host(card->host);
+
return ret;
 }
 
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index e360a97..65cecf2 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -67,7 +67,8 @@