Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-26 Thread Konstantin Dorfman

Hello,

On 11/19/2012 11:34 PM, Per Förlin wrote:

On 11/19/2012 03:32 PM, Per Förlin wrote:

On 11/19/2012 10:48 AM, Konstantin Dorfman wrote:


I'm fine with wait_event() (block request transfers) combined with completion 
(for other transfers), as long as the core.c API is intact. I don't see a reason 
for a new start_data_req()

mmc_start_req() is only used by the mmc block transfers.
Making a test case in mmc_test.c is a good way to stress test the feature (e.g. 
random timing of incoming new requests) and it will show how the API works too.

BR
Per

Right now there are a couple of tests in the mmc_test module that use the async 
request mechanism. After applying my fix (v3 "mmc: fix async request 
mechanism for sequential read scenarios"), these tests were broken.


The attached patch fixes those broken tests, and it also shows exactly 
what the API change is. It is not in the mmc_start_req() signature; it is the new 
context_info field that we need to synchronize with the NEW_REQUEST 
notification. mmc_test does not use the notification, but it is very easy 
to implement.




My main concern is to avoid adding new API to core.c in order to add the 
NEW_REQ feature. My patch is an example of the changes to be done in core.c without 
adding new functions.

Now you can review the API changes.





We would like to have a generic capability to handle additional events,
such as HPI/stop flow, in addition to the NEW_REQUEST notification.
Therefore, an event mechanism seems to be a better design than completion.

I've looked at the SDIO code and from what I can understand, right now SDIO
does not use the async request mechanism and works at the wait_for_cmd()
level. This means that work such as exposing MMC core APIs is a major
change and should definitely be a separate design and implementation
effort, while my current patch will fix the mmc thread behavior right now
and give better performance to the upper layers.

You are right. There is no support in the SDIO implementation for the pre/post/async feature. 
Still, I had it in mind when designing "struct mmc_async". It's possible to re-use 
the mmc core parts in order to utilize this support in the SDIO case. I'm not saying we 
should design for SDIO, but we should at least keep the API in a way that it's potentially 
usable from SDIO too. The API of core.c (start/wait of requests) should make sense without the 
existence of the MMC block driver (block/queue).

One way to test the design is to add a test in mmc_test.c for a pending NEW_REQ. 
In mmc_test.c there is no block.c or queue.c.
If the test case is simple to write in mmc_test.c, it means that the API is at 
a good level.
I can simulate a single NEW_PACKET notification in mmc_test, but this 
will check only correctness; without a real queue of requests we can't 
see/measure the throughput/latency gain of this.


Kind reminder about patch v3, waiting for your review.

Thanks

--
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
From 240fa5757e9c784679b391ba42ec58e70fe856d9 Mon Sep 17 00:00:00 2001
From: Konstantin Dorfman 
Date: Mon, 26 Nov 2012 11:50:35 +0200
Subject: [RFC/PATCH] mmc_test: fix non-blocking req test to support mmc core
 wait_event mechanism

After introducing the new wait_event synchronization mechanism at the mmc/core/core.c level,
the non-blocking request tests in the mmc_test unit-test module are broken.

This patch fixes the tests by updating the test environment to work
correctly on top of the new wait_event mechanism.

Signed-off-by: Konstantin Dorfman 

diff --git a/drivers/mmc/card/mmc_test.c b/drivers/mmc/card/mmc_test.c
index 759714e..764471f 100644
--- a/drivers/mmc/card/mmc_test.c
+++ b/drivers/mmc/card/mmc_test.c
@@ -764,7 +764,8 @@ static int mmc_test_check_broken_result(struct mmc_test_card *test,
 static void mmc_test_nonblock_reset(struct mmc_request *mrq,
 struct mmc_command *cmd,
 struct mmc_command *stop,
-struct mmc_data *data)
+struct mmc_data *data,
+struct mmc_context_info *context_info)
 {
 	memset(mrq, 0, sizeof(struct mmc_request));
 	memset(cmd, 0, sizeof(struct mmc_command));
@@ -774,6 +775,7 @@ static void mmc_test_nonblock_reset(struct mmc_request *mrq,
 	mrq->cmd = cmd;
 	mrq->data = data;
 	mrq->stop = stop;
+	mrq->context_info = context_info;
 }
 static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
   struct scatterlist *sg, unsigned sg_len,
@@ -790,6 +792,8 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 	struct mmc_command stop2;
 	struct mmc_data data2;
 
+	struct mmc_context_info context_info;
+
 	struct mmc_test_async_req test_areq[2];
 	struct mmc_async_req *done_areq;
 	struct mmc_async_req *cur_areq = &test_areq[0].areq;
@@ -800,14 +804,18 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 	test_areq[0].test = test;
 	test_areq[1].test = test;
 
-	mmc_test_nonblock_reset(&mrq1, &cmd1, &stop1, &data1);
-	mmc_test_nonblock_reset(&mrq2, &cmd2, &stop2, &data2);
+	mem

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-20 Thread Konstantin Dorfman
Correction inside
On Tue, November 20, 2012 6:26 pm, Konstantin Dorfman wrote:
> Hello,
>
> On 11/19/2012 11:34 PM, Per Förlin wrote:
>
>>>if (host->areq) {
>>> + if (!areq)
>>> + mmc_prefetch_init(host->areq,
>>> +
>>> &host->areq->mrq->completion);
>>>mmc_wait_for_req_done(host, host->areq->mrq);
>>> + if (!areq) {
>>> + mmc_prefetch_start(host->areq, false);
>>> + if (mmc_prefetch_pending(host->areq))
>>> + return NULL;
 In this case, mmc thread may be unblocked because done() arrived for
 current request and not because new request notification. In such a
 case
 we would like the done request to be handled before fetching the new
 request.
>>> I agree. We wake up and there is no pending new request; execution
>>> should proceed normally by handling the completed request.
>>>
 In my code is_done_rcv flag used along with is_new_req flag in
 order to differentiate the reason for mmc thread awake.
>>> In my patch the return value of "mmc_prefetch_pending()" specifies if
>>> there is a pending request.
>>> If both the current request completing and mmc_prefetch_pending happen at the
>>> same time, I see there is a problem with my patch. There needs to be a
>>> flag to indicate if the current request is done.
> There is a race between done() and NEW_REQUEST events; also, when we get
> both of them, the order is to complete the current request and then to fetch the new one.
>
>
>> I understand your motivation and idea for restructuring the patch. It is
>> still not clear to me how exactly the mmc thread will be awakened on a new
>> request notification in your version, but I have another problem:
>>
> mmc_request_fn() is called and it calls
> complete(mq->prefetch_completion) which wakes up the current thread.
> My patch is just an example. The intention is to make the patch
> cleaner. But I may have missed some of the HPI aspects.

 Is it the lack of functions wrappers that you are using in your
 example?
>>> It's fine to skip the functions wrappers if it makes no sense.
>>> What I want to avoid is to create new functions for data_req_start and
>>> data_req_wait.
>>> I would rather add this to the existing functions in core.c and keep
>>> changes in block.c and the flow minimal.
>>>
 As I understand your example, you mean to implement generic logic on
 core/core.c level by using wrapper functions and leave final
 implementation for MMC to card/queue.c and for SDIO layer to
 card/..sdio.. (I'm not too familiar with sdio protocol
 implementation).
 Well, it makes a lot of sense.

>>> I still have "plans" to add pre/post/async support in the SDIO
>>> framework too but I have not got to it.
>>> It would be nice to add your NEW_REQ feature along with the other
>>> mmc-async features, to make it usable from SDIO.
>>>
>>>
 But the devil is in details - there is a lot of work in
 mmc_wait_for_data_req_done(), done() callback and also error handling
 changes for card/block.c
>>> Things in core.c could be re-used in the SDIO case. In block.c there
>>> are only FS specific implementation not relevant for SDIO.
>>>

 Do you think that the wait_event() API does not suit the same semantics as
 the completion API?
>>> The waiting mechanism may be wait_event or completion.
>>> I'm not in favor of using both. Better to replace completion with
>>> wait_event if you prefer.
>>>
>> I'm fine with wait_event() (block request transfers) combined with
>> completion (for other transfer), as long as the core.c API is intact. I
>> don't see a reason for a new start_data_req()
>>
>> mmc_start_req() is only used by the mmc block transfers.
> The main idea is to change the start_req() waiting point
> (the mmc_wait_for_data_req_done() function) in such a way that external events
> (in the MMC case, coming from the block layer) may wake the MMC context
> from waiting for the current request. This is done by changing the original
> mmc_start_req() function without changing its signature.
>
> May I ask you to look at the [PATCH v2] patch? I think there is no change in
> the core.c API (by core.c API you mean all functions from core.h?).
>
[PATCH v3] mmc: fix async request mechanism for sequential read scenarios

> Also, to use the new request feature of core.c one needs to implement a
> notification similar to mmc_request_fn(), and I do not see difficulties
> in doing it from SDIO.
>
>
>> Making a test case in mmc_test.c is a good way to stress test the
>> feature (e.g. random timing of incoming new requests) and it will show
>> how the API works too.
>>
>> BR
>> Per
>>
>>> My main concern is to avoid adding new API to core.c in order to add
>>> the NEW_REQ feature. My patch is an example of changes to be done in
>>> core.c without adding new functions.
>>>

 We would like to 

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-20 Thread Konstantin Dorfman

Hello,

On 11/19/2012 11:34 PM, Per Förlin wrote:


   if (host->areq) {
+ if (!areq)
+ mmc_prefetch_init(host->areq,
+   &host->areq->mrq->completion);
   mmc_wait_for_req_done(host, host->areq->mrq);
+ if (!areq) {
+ mmc_prefetch_start(host->areq, false);
+ if (mmc_prefetch_pending(host->areq))
+ return NULL;

In this case, mmc thread may be unblocked because done() arrived for
current request and not because new request notification. In such a case
we would like the done request to be handled before fetching the new
request.

I agree. We wake up and there is no pending new request; execution should 
proceed normally by handling the completed request.


In my code is_done_rcv flag used along with is_new_req flag in
order to differentiate the reason for mmc thread awake.

In my patch the return value of "mmc_prefetch_pending()" specifies if there is a 
pending request.
If both the current request completing and mmc_prefetch_pending happen at the same 
time, I see there is a problem with my patch. There needs to be a flag to 
indicate if the current request is done.
There is a race between done() and NEW_REQUEST events; also, when we get 
both of them, the order is to complete the current request and then to fetch the new one.




I understand your motivation and idea for restructuring the patch. It is
still not clear to me how exactly the mmc thread will be awakened on a new
request notification in your version, but I have another problem:


mmc_request_fn() is called and it calls complete(mq->prefetch_completion), which 
wakes up the current thread.
My patch is just an example. The intention is to make the patch cleaner, but I 
may have missed some of the HPI aspects.


Is it the lack of function wrappers that you are using in your example?

It's fine to skip the function wrappers if they make no sense.
What I want to avoid is creating new functions for data_req_start and 
data_req_wait.
I would rather add this to the existing functions in core.c and keep the changes in 
block.c and the flow minimal.


As I understand your example, you mean to implement generic logic at the
core/core.c level by using wrapper functions and leave the final
implementation for MMC to card/queue.c and for the SDIO layer to
card/..sdio.. (I'm not too familiar with the sdio protocol implementation).
Well, it makes a lot of sense.


I still have "plans" to add pre/post/async support in the SDIO framework too, 
but I have not got to it yet.
It would be nice to add your NEW_REQ feature along with the other mmc-async 
features, to make it usable from SDIO.



But the devil is in the details - there is a lot of work in
mmc_wait_for_data_req_done(), the done() callback, and also error handling
changes for card/block.c

Things in core.c could be re-used in the SDIO case. In block.c there is only 
FS-specific implementation not relevant for SDIO.



Do you think that the wait_event() API does not suit the same semantics as the
completion API?

The waiting mechanism may be wait_event or completion.
I'm not in favor of using both. Better to replace completion with wait_event if 
you prefer.


I'm fine with wait_event() (block request transfers) combined with completion 
(for other transfers), as long as the core.c API is intact. I don't see a reason 
for a new start_data_req()

mmc_start_req() is only used by the mmc block transfers.
The main idea is to change the start_req() waiting point 
(the mmc_wait_for_data_req_done() function) in such a way that external events 
(in the MMC case, coming from the block layer) may wake the MMC context 
from waiting for the current request. This is done by changing the original 
mmc_start_req() function without changing its signature.


May I ask you to look at the [PATCH v2] patch? I think there is no change in the 
core.c API (by core.c API you mean all functions from core.h?).


Also, to use the new request feature of core.c one needs to implement a 
notification similar to mmc_request_fn(), and I do not see difficulties 
in doing it from SDIO.




Making a test case in mmc_test.c is a good way to stress test the feature (e.g. 
random timing of incoming new requests) and it will show how the API works too.

BR
Per


My main concern is to avoid adding new API to core.c in order to add the 
NEW_REQ feature. My patch is an example of changes to be done in core.c without 
adding new functions.



We would like to have a generic capability to handle additional events,
such as HPI/stop flow, in addition to the NEW_REQUEST notification.
Therefore, an event mechanism seems to be a better design than completion.

I've looked at the SDIO code and from what I can understand, right now SDIO
does not use the async request mechanism and works at the wait_for_cmd()
level. This means that work such as exposing MMC core APIs is a major
change and should definitely be a separate design and implementation
effort, while my current patch right now wi

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-19 Thread Per Förlin
On 11/19/2012 03:32 PM, Per Förlin wrote:
> On 11/19/2012 10:48 AM, Konstantin Dorfman wrote:
>> Hello Per,
>>
>> Thank you for giving me an example, below I'm trying to explain some
>> logic issues of it and asking you some questions about your vision of
>> the patch.
>>
>> On 11/15/2012 06:38 PM, Per Förlin wrote:
>>> On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
 Hello Per,

 On 11/13/2012 11:10 PM, Per Forlin wrote:
> On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
>  wrote:
>> Hello,
>>
>> On 10/29/2012 11:40 PM, Per Forlin wrote:
>>> Hi,
>>>
>>> I would like to move the focus of my concerns from root cause analysis
>>> to the actual patch,
>>> My first reflection is that the patch is relatively complex and some
>>> of the code looks duplicated.
>>> Would it be possible to simplify it and re-use  the current execution 
>>> flow.
>>>
>>> Is the following flow feasible?
>>>
>>> in mmc_start_req():
>>> --
>>> if (rqc == NULL && not_resend)
>>>   wait_for_both_mmc_and_arrival_of_new_req
>>> else
>>>   wait_only_for_mmc
>>>
>>> if (arrival_of_new_req) {
>>>set flag to indicate fetch-new_req
>>>   return NULL;
>>> }
>>> -
>>>
>>> in queue.c:
>>> if (fetch-new_req)
>>>   don't overwrite previous req.
>>>
>>> BR
>>> Per
>>
>> You are talking about using both waiting mechanisms, old (wait on
>> completion) and new - wait_event_interruptible()? But how done()
>> callback, called on host controller irq context, will differentiate
>> which one is relevant for the request?
>>
>> I think it is better to use single wait point for mmc thread.
>
> I have sketched a patch to better explain my point. It's not tested; it
> barely compiles :)
> The patch tries to introduce your feature and still keep the same code
> path. And it exposes an API that could be used by SDIO as well.
> The intention of my sketch patch is only to explain what I tried to
> visualize in the pseudo code previously in this thread.
> The outcome of your final patch should be documented here, I think:
> Documentation/mmc/mmc-async-req.txt
 This document is ready, attaching it to this mail and will be included
 in next version of the patch (or RESEND).
>
> Here follows my patch sketch:
> 
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index e360a97..08036a1 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
>   spin_unlock_irq(q->queue_lock);
>
>   if (req || mq->mqrq_prev->req) {
> + if (!req)
> + 
> mmc_prefetch_start(&mq->mqrq_prev->mmc_active, true);
>   set_current_state(TASK_RUNNING);
>   mq->issue_fn(mq, req);
>   } else {
> @@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
>   }
>
>   /* Current request becomes previous request and vice versa. 
> */
> - mq->mqrq_prev->brq.mrq.data = NULL;
> - mq->mqrq_prev->req = NULL;
> - tmp = mq->mqrq_prev;
> - mq->mqrq_prev = mq->mqrq_cur;
> - mq->mqrq_cur = tmp;
> + if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
> + mq->mqrq_prev->brq.mrq.data = NULL;
> + mq->mqrq_prev->req = NULL;
> + tmp = mq->mqrq_prev;
> + mq->mqrq_prev = mq->mqrq_cur;
> + mq->mqrq_cur = tmp;
> + }
>   } while (1);
>   up(&mq->thread_sem);
>
> @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
>   return;
>   }
>
> + if (mq->prefetch_enable) {
> + spin_lock(&mq->prefetch_lock);
> + if (mq->prefetch_completion)
> + complete(mq->prefetch_completion);
>> Since mq->prefetch_completion init happens only in core.c after
>> mmc_start_req(NULL), we can miss all new request notifications coming
>> from fetching NULL request until then.
> It's a mistake in the patch.
> I meant to check mq->prefetch_pending to know whether we need to wait in the 
> first place.
> That's why mq->prefetch_pending is assigned even if mq->prefetch_completion 
> is not set yet.
> The patch needs to be updated to take this into account. One could let 
> mmc_prefetch_init() return an indication of whether there is already a NEW_REQ 
> pending. If this is the case, skip the wait.
> 
> 
> + mq->prefetch_pending = true;
> + spin_unlock(&mq->

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-19 Thread Per Förlin
On 11/19/2012 10:48 AM, Konstantin Dorfman wrote:
> Hello Per,
> 
> Thank you for giving me an example, below I'm trying to explain some
> logic issues of it and asking you some questions about your vision of
> the patch.
> 
> On 11/15/2012 06:38 PM, Per Förlin wrote:
>> On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
>>> Hello Per,
>>>
>>> On 11/13/2012 11:10 PM, Per Forlin wrote:
 On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
  wrote:
> Hello,
>
> On 10/29/2012 11:40 PM, Per Forlin wrote:
>> Hi,
>>
>> I would like to move the focus of my concerns from root cause analysis
>> to the actual patch,
>> My first reflection is that the patch is relatively complex and some
>> of the code looks duplicated.
>> Would it be possible to simplify it and re-use  the current execution 
>> flow.
>>
>> Is the following flow feasible?
>>
>> in mmc_start_req():
>> --
>> if (rqc == NULL && not_resend)
>>   wait_for_both_mmc_and_arrival_of_new_req
>> else
>>   wait_only_for_mmc
>>
>> if (arrival_of_new_req) {
>>set flag to indicate fetch-new_req
>>   return NULL;
>> }
>> -
>>
>> in queue.c:
>> if (fetch-new_req)
>>   don't overwrite previous req.
>>
>> BR
>> Per
>
> You are talking about using both waiting mechanisms, old (wait on
> completion) and new - wait_event_interruptible()? But how done()
> callback, called on host controller irq context, will differentiate
> which one is relevant for the request?
>
> I think it is better to use single wait point for mmc thread.

 I have sketched a patch to better explain my point. It's not tested; it
 barely compiles :)
 The patch tries to introduce your feature and still keep the same code
 path. And it exposes an API that could be used by SDIO as well.
 The intention of my sketch patch is only to explain what I tried to
 visualize in the pseudo code previously in this thread.
 The outcome of your final patch should be documented here, I think:
 Documentation/mmc/mmc-async-req.txt
>>> This document is ready, attaching it to this mail and will be included
>>> in next version of the patch (or RESEND).

 Here follows my patch sketch:
 
 diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
 index e360a97..08036a1 100644
 --- a/drivers/mmc/card/queue.c
 +++ b/drivers/mmc/card/queue.c
 @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
   spin_unlock_irq(q->queue_lock);

   if (req || mq->mqrq_prev->req) {
 + if (!req)
 + 
 mmc_prefetch_start(&mq->mqrq_prev->mmc_active, true);
   set_current_state(TASK_RUNNING);
   mq->issue_fn(mq, req);
   } else {
 @@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
   }

   /* Current request becomes previous request and vice versa. 
 */
 - mq->mqrq_prev->brq.mrq.data = NULL;
 - mq->mqrq_prev->req = NULL;
 - tmp = mq->mqrq_prev;
 - mq->mqrq_prev = mq->mqrq_cur;
 - mq->mqrq_cur = tmp;
 + if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
 + mq->mqrq_prev->brq.mrq.data = NULL;
 + mq->mqrq_prev->req = NULL;
 + tmp = mq->mqrq_prev;
 + mq->mqrq_prev = mq->mqrq_cur;
 + mq->mqrq_cur = tmp;
 + }
   } while (1);
   up(&mq->thread_sem);

 @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
   return;
   }

 + if (mq->prefetch_enable) {
 + spin_lock(&mq->prefetch_lock);
 + if (mq->prefetch_completion)
 + complete(mq->prefetch_completion);
> Since mq->prefetch_completion init happens only in core.c after
> mmc_start_req(NULL), we can miss all new request notifications coming
> from fetching NULL request until then.
It's a mistake in the patch.
I meant to check mq->prefetch_pending to know whether we need to wait in the first 
place.
That's why mq->prefetch_pending is assigned even if mq->prefetch_completion is 
not set yet.
The patch needs to be updated to take this into account. One could let 
mmc_prefetch_init() return an indication of whether there is already a NEW_REQ pending. 
If this is the case, skip the wait.


 + mq->prefetch_pending = true;
 + spin_unlock(&mq->prefetch_lock);
 + }
 +
   if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
   wake_up_process(mq->thread);
  }

 +static void mmc

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-19 Thread Konstantin Dorfman
Hello Per,

Thank you for giving me an example, below I'm trying to explain some
logic issues of it and asking you some questions about your vision of
the patch.

On 11/15/2012 06:38 PM, Per Förlin wrote:
> On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
>> Hello Per,
>>
>> On 11/13/2012 11:10 PM, Per Forlin wrote:
>>> On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
>>>  wrote:
 Hello,

 On 10/29/2012 11:40 PM, Per Forlin wrote:
> Hi,
>
> I would like to move the focus of my concerns from root cause analysis
> to the actual patch,
> My first reflection is that the patch is relatively complex and some
> of the code looks duplicated.
> Would it be possible to simplify it and re-use  the current execution 
> flow.
>
> Is the following flow feasible?
>
> in mmc_start_req():
> --
> if (rqc == NULL && not_resend)
>   wait_for_both_mmc_and_arrival_of_new_req
> else
>   wait_only_for_mmc
>
> if (arrival_of_new_req) {
>set flag to indicate fetch-new_req
>   return NULL;
> }
> -
>
> in queue.c:
> if (fetch-new_req)
>   don't overwrite previous req.
>
> BR
> Per

 You are talking about using both waiting mechanisms, old (wait on
 completion) and new - wait_event_interruptible()? But how done()
 callback, called on host controller irq context, will differentiate
 which one is relevant for the request?

 I think it is better to use single wait point for mmc thread.
>>>
>>> I have sketched a patch to better explain my point. It's not tested; it
>>> barely compiles :)
>>> The patch tries to introduce your feature and still keep the same code
>>> path. And it exposes an API that could be used by SDIO as well.
>>> The intention of my sketch patch is only to explain what I tried to
>>> visualize in the pseudo code previously in this thread.
>>> The outcome of your final patch should be documented here, I think:
>>> Documentation/mmc/mmc-async-req.txt
>> This document is ready, attaching it to this mail and will be included
>> in next version of the patch (or RESEND).
>>>
>>> Here follows my patch sketch:
>>> 
>>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>>> index e360a97..08036a1 100644
>>> --- a/drivers/mmc/card/queue.c
>>> +++ b/drivers/mmc/card/queue.c
>>> @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
>>>   spin_unlock_irq(q->queue_lock);
>>>
>>>   if (req || mq->mqrq_prev->req) {
>>> + if (!req)
>>> + 
>>> mmc_prefetch_start(&mq->mqrq_prev->mmc_active, true);
>>>   set_current_state(TASK_RUNNING);
>>>   mq->issue_fn(mq, req);
>>>   } else {
>>> @@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
>>>   }
>>>
>>>   /* Current request becomes previous request and vice versa. */
>>> - mq->mqrq_prev->brq.mrq.data = NULL;
>>> - mq->mqrq_prev->req = NULL;
>>> - tmp = mq->mqrq_prev;
>>> - mq->mqrq_prev = mq->mqrq_cur;
>>> - mq->mqrq_cur = tmp;
>>> + if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
>>> + mq->mqrq_prev->brq.mrq.data = NULL;
>>> + mq->mqrq_prev->req = NULL;
>>> + tmp = mq->mqrq_prev;
>>> + mq->mqrq_prev = mq->mqrq_cur;
>>> + mq->mqrq_cur = tmp;
>>> + }
>>>   } while (1);
>>>   up(&mq->thread_sem);
>>>
>>> @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
>>>   return;
>>>   }
>>>
>>> + if (mq->prefetch_enable) {
>>> + spin_lock(&mq->prefetch_lock);
>>> + if (mq->prefetch_completion)
>>> + complete(mq->prefetch_completion);
Since mq->prefetch_completion init happens only in core.c after
mmc_start_req(NULL), we can miss all new request notifications coming
from fetching NULL request until then.
>>> + mq->prefetch_pending = true;
>>> + spin_unlock(&mq->prefetch_lock);
>>> + }
>>> +
>>>   if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
>>>   wake_up_process(mq->thread);
>>>  }
>>>
>>> +static void mmc_req_init(struct mmc_async_req *areq, struct completion 
>>> *compl)
>>> +{
>>> + struct mmc_queue *mq =
>>> + container_of(areq->prefetch, struct mmc_queue, prefetch);
>>> +
>>> + spin_lock(&mq->prefetch_lock);
>>> + mq->prefetch_completion = compl;
>>> + if (mq->prefetch_pending)
>>> + complete(mq->prefetch_completion);
>>> + spin_unlock(&mq->prefetch_lock);
>>> +}
>>> +
>>> +static void mmc_req_start(struct mmc_async_req *areq, bool enable)
>>> +{
>>> + struct mmc_queue *mq =
>>> + container_of(areq->prefet

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-15 Thread Per Förlin
On 11/14/2012 04:15 PM, Konstantin Dorfman wrote:
> Hello Per,
> 
> On 11/13/2012 11:10 PM, Per Forlin wrote:
>> On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
>>  wrote:
>>> Hello,
>>>
>>> On 10/29/2012 11:40 PM, Per Forlin wrote:
 Hi,

 I would like to move the focus of my concerns from root cause analysis
 to the actual patch,
 My first reflection is that the patch is relatively complex and some
 of the code looks duplicated.
 Would it be possible to simplify it and re-use  the current execution flow.

 Is the following flow feasible?

 in mmc_start_req():
 --
 if (rqc == NULL && not_resend)
   wait_for_both_mmc_and_arrival_of_new_req
 else
   wait_only_for_mmc

 if (arrival_of_new_req) {
set flag to indicate fetch-new_req
   return NULL;
 }
 -

 in queue.c:
 if (fetch-new_req)
   don't overwrite previous req.

 BR
 Per
>>>
>>> You are talking about using both waiting mechanisms, old (wait on
>>> completion) and new - wait_event_interruptible()? But how done()
>>> callback, called on host controller irq context, will differentiate
>>> which one is relevant for the request?
>>>
>>> I think it is better to use single wait point for mmc thread.
>>
>> I have sketched a patch to better explain my point. It's not tested; it
>> barely compiles :)
>> The patch tries to introduce your feature and still keep the same code
>> path. And it exposes an API that could be used by SDIO as well.
>> The intention of my sketch patch is only to explain what I tried to
>> visualize in the pseudo code previously in this thread.
>> The out come of your final patch should be documented here I think:
>> Documentation/mmc/mmc-async-req.txt
> This document is ready, attaching it to this mail and will be included
> in next version of the patch (or RESEND).
>>
>> Here follows my patch sketch:
>> 
>> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
>> index e360a97..08036a1 100644
>> --- a/drivers/mmc/card/queue.c
>> +++ b/drivers/mmc/card/queue.c
>> @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
>>   spin_unlock_irq(q->queue_lock);
>>
>>   if (req || mq->mqrq_prev->req) {
>> + if (!req)
>> + mmc_prefetch_start(&mq->mqrq_prev->mmc_active, 
>> true);
>>   set_current_state(TASK_RUNNING);
>>   mq->issue_fn(mq, req);
>>   } else {
>> @@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
>>   }
>>
>>   /* Current request becomes previous request and vice versa. */
>> - mq->mqrq_prev->brq.mrq.data = NULL;
>> - mq->mqrq_prev->req = NULL;
>> - tmp = mq->mqrq_prev;
>> - mq->mqrq_prev = mq->mqrq_cur;
>> - mq->mqrq_cur = tmp;
>> + if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
>> + mq->mqrq_prev->brq.mrq.data = NULL;
>> + mq->mqrq_prev->req = NULL;
>> + tmp = mq->mqrq_prev;
>> + mq->mqrq_prev = mq->mqrq_cur;
>> + mq->mqrq_cur = tmp;
>> + }
>>   } while (1);
>>   up(&mq->thread_sem);
>>
>> @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
>>   return;
>>   }
>>
>> + if (mq->prefetch_enable) {
>> + spin_lock(&mq->prefetch_lock);
>> + if (mq->prefetch_completion)
>> + complete(mq->prefetch_completion);
>> + mq->prefetch_pending = true;
>> + spin_unlock(&mq->prefetch_lock);
>> + }
>> +
>>   if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
>>   wake_up_process(mq->thread);
>>  }
>>
>> +static void mmc_req_init(struct mmc_async_req *areq, struct completion 
>> *compl)
>> +{
>> + struct mmc_queue *mq =
>> + container_of(areq->prefetch, struct mmc_queue, prefetch);
>> +
>> + spin_lock(&mq->prefetch_lock);
>> + mq->prefetch_completion = compl;
>> + if (mq->prefetch_pending)
>> + complete(mq->prefetch_completion);
>> + spin_unlock(&mq->prefetch_lock);
>> +}
>> +
>> +static void mmc_req_start(struct mmc_async_req *areq, bool enable)
>> +{
>> + struct mmc_queue *mq =
>> + container_of(areq->prefetch, struct mmc_queue, prefetch);
>> + mq->prefetch_enable = enable;
>> +}
>> +
>> +static bool mmc_req_pending(struct mmc_async_req *areq)
>> +{
>> + struct mmc_queue *mq =
>> + container_of(areq->prefetch, struct mmc_queue, prefetch);
>> + return mq->prefetch_pending;
>> +}
>> +
>>  static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
>>  {
>>   struct scatterlist *sg;
>> @@ -166,6 +204,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct
>> mmc_card *card,
>>   

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-14 Thread Konstantin Dorfman
Hello Per,

On 11/13/2012 11:10 PM, Per Forlin wrote:
> On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
>  wrote:
>> Hello,
>>
>> On 10/29/2012 11:40 PM, Per Forlin wrote:
>>> Hi,
>>>
>>> I would like to move the focus of my concerns from root cause analysis
>>> to the actual patch,
>>> My first reflection is that the patch is relatively complex and some
>>> of the code looks duplicated.
>>> Would it be possible to simplify it and re-use  the current execution flow.
>>>
>>> Is the following flow feasible?
>>>
>>> in mmc_start_req():
>>> --
>>> if (rqc == NULL && not_resend)
>>>   wait_for_both_mmc_and_arrival_of_new_req
>>> else
>>>   wait_only_for_mmc
>>>
>>> if (arrival_of_new_req) {
>>>set flag to indicate fetch-new_req
>>>   return NULL;
>>> }
>>> -
>>>
>>> in queue.c:
>>> if (fetch-new_req)
>>>   don't overwrite previous req.
>>>
>>> BR
>>> Per
>>
>> You are talking about using both waiting mechanisms, the old one (wait on
>> completion) and the new one (wait_event_interruptible())? But how would the done()
>> callback, called in host controller irq context, differentiate
>> which one is relevant for the request?
>>
>> I think it is better to use a single wait point for the mmc thread.
> 
> I have sketched a patch to better explain my point. It's not tested; it
> barely compiles :)
> The patch tries to introduce your feature and still keep the same code
> path. And it exposes an API that could be used by SDIO as well.
> The intention of my sketch patch is only to explain what I tried to
> visualize in the pseudo code previously in this thread.
> The outcome of your final patch should be documented here, I think:
> Documentation/mmc/mmc-async-req.txt
This document is ready; I am attaching it to this mail and it will be included
in the next version of the patch (or a RESEND).
> 
> Here follows my patch sketch:
> 
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index e360a97..08036a1 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
>   spin_unlock_irq(q->queue_lock);
> 
>   if (req || mq->mqrq_prev->req) {
> + if (!req)
> + mmc_prefetch_start(&mq->mqrq_prev->mmc_active, 
> true);
>   set_current_state(TASK_RUNNING);
>   mq->issue_fn(mq, req);
>   } else {
> @@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
>   }
> 
>   /* Current request becomes previous request and vice versa. */
> - mq->mqrq_prev->brq.mrq.data = NULL;
> - mq->mqrq_prev->req = NULL;
> - tmp = mq->mqrq_prev;
> - mq->mqrq_prev = mq->mqrq_cur;
> - mq->mqrq_cur = tmp;
> + if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
> + mq->mqrq_prev->brq.mrq.data = NULL;
> + mq->mqrq_prev->req = NULL;
> + tmp = mq->mqrq_prev;
> + mq->mqrq_prev = mq->mqrq_cur;
> + mq->mqrq_cur = tmp;
> + }
>   } while (1);
>   up(&mq->thread_sem);
> 
> @@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
>   return;
>   }
> 
> + if (mq->prefetch_enable) {
> + spin_lock(&mq->prefetch_lock);
> + if (mq->prefetch_completion)
> + complete(mq->prefetch_completion);
> + mq->prefetch_pending = true;
> + spin_unlock(&mq->prefetch_lock);
> + }
> +
>   if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
>   wake_up_process(mq->thread);
>  }
> 
> +static void mmc_req_init(struct mmc_async_req *areq, struct completion 
> *compl)
> +{
> + struct mmc_queue *mq =
> + container_of(areq->prefetch, struct mmc_queue, prefetch);
> +
> + spin_lock(&mq->prefetch_lock);
> + mq->prefetch_completion = compl;
> + if (mq->prefetch_pending)
> + complete(mq->prefetch_completion);
> + spin_unlock(&mq->prefetch_lock);
> +}
> +
> +static void mmc_req_start(struct mmc_async_req *areq, bool enable)
> +{
> + struct mmc_queue *mq =
> + container_of(areq->prefetch, struct mmc_queue, prefetch);
> + mq->prefetch_enable = enable;
> +}
> +
> +static bool mmc_req_pending(struct mmc_async_req *areq)
> +{
> + struct mmc_queue *mq =
> + container_of(areq->prefetch, struct mmc_queue, prefetch);
> + return mq->prefetch_pending;
> +}
> +
>  static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
>  {
>   struct scatterlist *sg;
> @@ -166,6 +204,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct
> mmc_card *card,
>   int ret;
>   struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
>   struct mmc_queue_req *mqrq_prev = &mq->mqrq[1];
> + spin_lock_init(&mq->prefetch_lock);
> + mq->prefetch.wai

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-11-13 Thread Per Forlin
On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
 wrote:
> Hello,
>
> On 10/29/2012 11:40 PM, Per Forlin wrote:
>> Hi,
>>
>> I would like to move the focus of my concerns from root cause analysis
>> to the actual patch,
>> My first reflection is that the patch is relatively complex and some
>> of the code looks duplicated.
>> Would it be possible to simplify it and re-use  the current execution flow.
>>
>> Is the following flow feasible?
>>
>> in mmc_start_req():
>> --
>> if (rqc == NULL && not_resend)
>>   wait_for_both_mmc_and_arrival_of_new_req
>> else
>>   wait_only_for_mmc
>>
>> if (arrival_of_new_req) {
>>set flag to indicate fetch-new_req
>>   return NULL;
>> }
>> -
>>
>> in queue.c:
>> if (fetch-new_req)
>>   don't overwrite previous req.
>>
>> BR
>> Per
>
> You are talking about using both waiting mechanisms, the old one (wait on
> completion) and the new one (wait_event_interruptible())? But how would the done()
> callback, called in host controller irq context, differentiate
> which one is relevant for the request?
>
> I think it is better to use a single wait point for the mmc thread.

I have sketched a patch to better explain my point. It's not tested; it
barely compiles :)
The patch tries to introduce your feature and still keep the same code
path. And it exposes an API that could be used by SDIO as well.
The intention of my sketch patch is only to explain what I tried to
visualize in the pseudo code previously in this thread.
The outcome of your final patch should be documented here, I think:
Documentation/mmc/mmc-async-req.txt

Here follows my patch sketch:

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index e360a97..08036a1 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -66,6 +66,8 @@ static int mmc_queue_thread(void *d)
spin_unlock_irq(q->queue_lock);

if (req || mq->mqrq_prev->req) {
+   if (!req)
+   mmc_prefetch_start(&mq->mqrq_prev->mmc_active, 
true);
set_current_state(TASK_RUNNING);
mq->issue_fn(mq, req);
} else {
@@ -79,11 +81,13 @@ static int mmc_queue_thread(void *d)
}

/* Current request becomes previous request and vice versa. */
-   mq->mqrq_prev->brq.mrq.data = NULL;
-   mq->mqrq_prev->req = NULL;
-   tmp = mq->mqrq_prev;
-   mq->mqrq_prev = mq->mqrq_cur;
-   mq->mqrq_cur = tmp;
+   if (!mmc_prefetch_pending(&mq->mqrq_prev->mmc_active)) {
+   mq->mqrq_prev->brq.mrq.data = NULL;
+   mq->mqrq_prev->req = NULL;
+   tmp = mq->mqrq_prev;
+   mq->mqrq_prev = mq->mqrq_cur;
+   mq->mqrq_cur = tmp;
+   }
} while (1);
up(&mq->thread_sem);

@@ -109,10 +113,44 @@ static void mmc_request_fn(struct request_queue *q)
return;
}

+   if (mq->prefetch_enable) {
+   spin_lock(&mq->prefetch_lock);
+   if (mq->prefetch_completion)
+   complete(mq->prefetch_completion);
+   mq->prefetch_pending = true;
+   spin_unlock(&mq->prefetch_lock);
+   }
+
if (!mq->mqrq_cur->req && !mq->mqrq_prev->req)
wake_up_process(mq->thread);
 }

+static void mmc_req_init(struct mmc_async_req *areq, struct completion *compl)
+{
+   struct mmc_queue *mq =
+   container_of(areq->prefetch, struct mmc_queue, prefetch);
+
+   spin_lock(&mq->prefetch_lock);
+   mq->prefetch_completion = compl;
+   if (mq->prefetch_pending)
+   complete(mq->prefetch_completion);
+   spin_unlock(&mq->prefetch_lock);
+}
+
+static void mmc_req_start(struct mmc_async_req *areq, bool enable)
+{
+   struct mmc_queue *mq =
+   container_of(areq->prefetch, struct mmc_queue, prefetch);
+   mq->prefetch_enable = enable;
+}
+
+static bool mmc_req_pending(struct mmc_async_req *areq)
+{
+   struct mmc_queue *mq =
+   container_of(areq->prefetch, struct mmc_queue, prefetch);
+   return mq->prefetch_pending;
+}
+
 static struct scatterlist *mmc_alloc_sg(int sg_len, int *err)
 {
struct scatterlist *sg;
@@ -166,6 +204,12 @@ int mmc_init_queue(struct mmc_queue *mq, struct
mmc_card *card,
int ret;
struct mmc_queue_req *mqrq_cur = &mq->mqrq[0];
struct mmc_queue_req *mqrq_prev = &mq->mqrq[1];
+   spin_lock_init(&mq->prefetch_lock);
+   mq->prefetch.wait_req_init = mmc_req_init;
+   mq->prefetch.wait_req_start = mmc_req_start;
+   mq->prefetch.wait_req_pending = mmc_req_pending;
+   mqrq_cur->mmc_active.prefetch = &mq->prefetch;
+   mqrq_prev->mmc_active.prefetch = &mq->prefetch;

if (mmc_dev(host)->dma_mask && *mmc_dev(host

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-30 Thread Per Forlin
Hi,

On Tue, Oct 30, 2012 at 1:19 PM, Konstantin Dorfman
 wrote:
> Hello,
>
> On 10/29/2012 11:40 PM, Per Forlin wrote:
>> Hi,
>>
>> I would like to move the focus of my concerns from root cause analysis
>> to the actual patch,
>> My first reflection is that the patch is relatively complex and some
>> of the code looks duplicated.
>> Would it be possible to simplify it and re-use  the current execution flow.
>>
>> Is the following flow feasible?
>>
>> in mmc_start_req():
>> --
>> if (rqc == NULL && not_resend)
>>   wait_for_both_mmc_and_arrival_of_new_req
>> else
>>   wait_only_for_mmc
>>
>> if (arrival_of_new_req) {
>>set flag to indicate fetch-new_req
>>   return NULL;
>> }
>> -
>>
>> in queue.c:
>> if (fetch-new_req)
>>   don't overwrite previous req.
>>
>> BR
>> Per
>
> You are talking about using both waiting mechanisms, the old one (wait on
> completion) and the new one (wait_event_interruptible())? But how would the done()
> callback, called in host controller irq context, differentiate
> which one is relevant for the request?
>
> I think it is better to use a single wait point for the mmc thread.
>
> There is also a future plan to add a second reason to wake up the mmc thread from
> waiting: an urgent_request, i.e. a notification about a next-to-fetch
> read request that should be executed out of order, using the new eMMC HPI
> feature (to stop the currently running request).
>
> I will publish this patch soon.
>
> Your idea with the fetch_new_req flag is good; I'll try to simplify the current
> patch and let's talk again.
>
I have not thought this through entirely. But it would be nice to add
a new func() in areq to add support for this new mechanism. If the
func() is NULL the normal flow will be executed. This could be used in
the SDIO case too. At least it should be possible to use only the core
API and have it still make sense without an mmc block driver on top.

How to implement this wait function?
There should be one single wait point, I agree.
One could share the mmc completion, perhaps through the new func().
This means both the mmc-queue and the mmc-host complete the same
completion.
When waking up from the completion one needs to stop the new
func-wait-point and check if a new request is available. If yes, fetch
the new request and next time don't re-init the completion (that would
overwrite the mmc-host complete value).

Let's talk again when you have a new patch set.

BR
Per

> Thanks,
> --
> Konstantin Dorfman,
> QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
> Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-30 Thread Konstantin Dorfman
On 10/30/2012 09:45 AM, Per Forlin wrote:
> minor clarification,
> 
> On Mon, Oct 29, 2012 at 10:40 PM, Per Forlin  wrote:

>>
>> Is the following flow feasible?
>>
>> in mmc_start_req():
>> --
>> if (rqc == NULL && not_resend)
> && request_is_ongoing (in case of resend the request is never ongoing)
> 
>>   wait_for_both_mmc_and_arrival_of_new_req
> We should wake up if either of the two events occurs.
> 
>> else
>>   wait_only_for_mmc
>>
>> if (arrival_of_new_req) {
>>set flag to indicate fetch-new_req
>>   return NULL;
>> }
>> -
>>
>> in queue.c:
>> if (fetch-new_req)
>>   don't overwrite previous req.
>>
> 
> This is somewhat a subset of your patch. Maybe I'm missing parts
> of the complexity.
> I haven't figured out why a new mmc_start_data_req() is needed. A new
> mechanism for waiting is required.
You're right, there was no reason to add a new function; we can use
mmc_start_req() and just change the waiting mechanism. I will fix this
to minimize changes.

-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-30 Thread Konstantin Dorfman
Hello,

On 10/29/2012 11:40 PM, Per Forlin wrote:
> Hi,
> 
> I would like to move the focus of my concerns from root cause analysis
> to the actual patch,
> My first reflection is that the patch is relatively complex and some
> of the code looks duplicated.
> Would it be possible to simplify it and re-use  the current execution flow.
> 
> Is the following flow feasible?
> 
> in mmc_start_req():
> --
> if (rqc == NULL && not_resend)
>   wait_for_both_mmc_and_arrival_of_new_req
> else
>   wait_only_for_mmc
> 
> if (arrival_of_new_req) {
>set flag to indicate fetch-new_req
>   return NULL;
> }
> -
> 
> in queue.c:
> if (fetch-new_req)
>   don't overwrite previous req.
> 
> BR
> Per

You are talking about using both waiting mechanisms, the old one (wait on
completion) and the new one (wait_event_interruptible())? But how would the done()
callback, called in host controller irq context, differentiate
which one is relevant for the request?

I think it is better to use a single wait point for the mmc thread.

There is also a future plan to add a second reason to wake up the mmc thread from
waiting: an urgent_request, i.e. a notification about a next-to-fetch
read request that should be executed out of order, using the new eMMC HPI
feature (to stop the currently running request).

I will publish this patch soon.

Your idea with the fetch_new_req flag is good; I'll try to simplify the current
patch and let's talk again.

Thanks,
-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-30 Thread Per Forlin
minor clarification,

On Mon, Oct 29, 2012 at 10:40 PM, Per Forlin  wrote:
> Hi,
>
> I would like to move the focus of my concerns from root cause analysis
> to the actual patch,
> My first reflection is that the patch is relatively complex and some
> of the code looks duplicated.
> Would it be possible to simplify it and re-use  the current execution flow.
>
> Is the following flow feasible?
>
> in mmc_start_req():
> --
> if (rqc == NULL && not_resend)
&& request_is_ongoing (in case of resend the request is never ongoing)

>   wait_for_both_mmc_and_arrival_of_new_req
We should wake up if either of the two events occurs.

> else
>   wait_only_for_mmc
>
> if (arrival_of_new_req) {
>set flag to indicate fetch-new_req
>   return NULL;
> }
> -
>
> in queue.c:
> if (fetch-new_req)
>   don't overwrite previous req.
>

This is somewhat a subset of your patch. Maybe I'm missing parts
of the complexity.
I haven't figured out why a new mmc_start_data_req() is needed. A new
mechanism for waiting is required.

BR
Per

> BR
> Per
>
>
>
> On Sun, Oct 28, 2012 at 2:12 PM, Konstantin Dorfman
>  wrote:
>> Hello,
>>
>> On 10/26/2012 02:07 PM, Venkatraman S wrote:
>>
>>>
>>> Actually there could a lot of reasons why block layer or CFQ would not have
>>> inserted the request into the queue. i.e. you can see a lot of exit paths
>>> where blk_peek_request returns NULL, even though there could be any request
>>> pending in one of the CFQ requests queues.
>>>
>>> Essentially you need to check what makes blk_fetch_request
>>> (cfq_dispatch_requests() ) return NULL when there is an element in
>>> queue, if at all.
>>>
>>
>> This is not important why block layer causes to blk_fetch_request() (or
>> blk_peek_request()) to return NULL to the MMC layer, but what is really
>> important - MMC layer should always be able to fetch the new arrived
>> request ASAP, after block layer notification (mmc_request() ) and this
>> is what my fix goes to implement.
>>
>> And the fix is not changing block layer behavior.
>>
>> Thanks
>>
>> --
>> Konstantin Dorfman,
>> QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
>> Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-29 Thread Per Forlin
Hi,

I would like to move the focus of my concerns from root cause analysis
to the actual patch,
My first reflection is that the patch is relatively complex and some
of the code looks duplicated.
Would it be possible to simplify it and re-use  the current execution flow.

Is the following flow feasible?

in mmc_start_req():
--
if (rqc == NULL && not_resend)
  wait_for_both_mmc_and_arrival_of_new_req
else
  wait_only_for_mmc

if (arrival_of_new_req) {
   set flag to indicate fetch-new_req
  return NULL;
}
-

in queue.c:
if (fetch-new_req)
  don't overwrite previous req.

BR
Per



On Sun, Oct 28, 2012 at 2:12 PM, Konstantin Dorfman
 wrote:
> Hello,
>
> On 10/26/2012 02:07 PM, Venkatraman S wrote:
>
>>
>> Actually there could a lot of reasons why block layer or CFQ would not have
>> inserted the request into the queue. i.e. you can see a lot of exit paths
>> where blk_peek_request returns NULL, even though there could be any request
>> pending in one of the CFQ requests queues.
>>
>> Essentially you need to check what makes blk_fetch_request
>> (cfq_dispatch_requests() ) return NULL when there is an element in
>> queue, if at all.
>>
>
> This is not important why block layer causes to blk_fetch_request() (or
> blk_peek_request()) to return NULL to the MMC layer, but what is really
> important - MMC layer should always be able to fetch the new arrived
> request ASAP, after block layer notification (mmc_request() ) and this
> is what my fix goes to implement.
>
> And the fix is not changing block layer behavior.
>
> Thanks
>
> --
> Konstantin Dorfman,
> QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
> Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-28 Thread Konstantin Dorfman
Hello,

On 10/26/2012 02:07 PM, Venkatraman S wrote:

> 
> Actually there could a lot of reasons why block layer or CFQ would not have
> inserted the request into the queue. i.e. you can see a lot of exit paths
> where blk_peek_request returns NULL, even though there could be any request
> pending in one of the CFQ requests queues.
> 
> Essentially you need to check what makes blk_fetch_request
> (cfq_dispatch_requests() ) return NULL when there is an element in
> queue, if at all.
> 

It is not important why the block layer causes blk_fetch_request() (or
blk_peek_request()) to return NULL to the MMC layer; what is really
important is that the MMC layer should always be able to fetch the newly
arrived request ASAP after the block layer notification (mmc_request()),
and this is what my fix implements.

And the fix does not change block layer behavior.

Thanks

-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-28 Thread Konstantin Dorfman
Hello Venkatraman,
You are very welcome with any ideas,

On 10/26/2012 01:55 PM, Venkatraman S wrote:

> 
> Being nitpicky, I think it contradicts with the commit log that you have
> for the patch..
> 
> When the block layer notifies the MMC layer on a new request, we check
> for the above case where MMC layer is waiting on a previous request
> completion and the current request is NULL.
> 
> 

Can you be more specific? What is the contradiction? Can you suggest a better
patch description?

What I'm trying to say here:

the patch modifies the block layer notification (the mmc_request() callback)
for a new incoming request in such a way that, when the MMC layer is blocked
waiting for completion of the previous request (in core/core.c), it unblocks
the MMC layer and rolls back in the mmc_queue_thread code up to the blk_fetch()
point. This permits the MMC layer to start processing the newly
notified request immediately (and not after completion of the currently
running request).

Does it make sense?


-- 
Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-28 Thread Konstantin Dorfman
On 10/25/2012 05:02 PM, Per Förlin wrote:
> On 10/25/2012 03:28 PM, Konstantin Dorfman wrote:
>> On 10/24/2012 07:07 PM, Per Förlin wrote:
>>> On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
 Hello Per,

 On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
>> When mmcqt reports on completion of a request there should be
>> a context switch to allow the insertion of the next read ahead BIOs
>> to the block layer. Since the mmcqd tries to fetch another request
>> immediately after the completion of the previous request it gets NULL
>> and starts waiting for the completion of the previous request.
>> This wait on completion gives the FS the opportunity to insert the next
>> request but the MMC layer is already blocked on the previous request
>> completion and is not aware of the new request waiting to be fetched.
> I thought that I could trigger a context switch in order to give
> execution time for FS to add the new request to the MMC queue.
> I made a simple hack to call yield() in case the request gets NULL. I
> thought it may give the FS layer enough time to add a new request to
> the MMC queue. This would not delay the MMC transfer since the yield()
> is done in parallel with an ongoing transfer. Anyway it was just meant
> to be a simple test.
>
> One yield was not enough. Just for sanity check I added a msleep as
> well and that was enough to let FS add a new request,
> Would it be possible to gain throughput by delaying the fetch of new
> request? Too avoid unnecessary NULL requests
>
> If (ongoing request is read AND size is max read ahead AND new request
> is NULL) yield();
>
> BR
> Per
 We did the same experiment and it will not give maximum possible
 performance. There is no guarantee that the context switch which was
 manually caused by the MMC layer comes just in time: when it was early
 then next fetch still results in NULL, when it was later, then we miss
 possibility to fetch/prepare new request.

 Any delay in fetch of the new request that comes after the new request has
 arrived hits throughput and latency.

 The solution we are talking about here will fix not only situation with FS
 read ahead mechanism, but also it will remove penalty of the MMC context
 waiting on completion while any new request arrives.

 Thanks,

>>> It seems strange that the block layer cannot keep up with relatively slow 
>>> flash media devices. There must be a limitation on number of outstanding 
>>> request towards MMC.
>>> I need to make up my mind if it's the best way to address this issue in the 
>>> MMC framework or block layer. I have started to look into the block layer 
>>> code but it will take some time to dig out the relevant parts.
>>>
>>> BR
>>> Per
>>>
>> The root cause of the issue is the incompleteness of the current design with
>> respect to the well-known producer-consumer problem solution (the producer is
>> the block layer, the consumer is the mmc layer).
>> The classic definition states that the buffer is of fixed size; in our case we
>> have a queue, so the producer is always capable of putting a new request into
>> the queue. The consumer context is blocked when both buffers (curr and prev)
>> are busy (the first has started its execution on the bus, the second is
>> fetched and waiting for the first).
> This happens but I thought that the block layer would continue to add request 
> to the MMC queue while the consumer is busy.
> When consumer fetches request from the queue again there should be several 
> requests available in the queue, but there is only one.

The MMC layer tries to fetch the next request immediately after starting the
previous one, but the block layer has no context in which to submit a new
request; after fetching a NULL request the MMC layer blocks and has no chance
to repeat the fetch until the current request completes.

> 
>> Producer context considered to be blocked when FS (or others bio
>> sources) has no requests to put into queue.
> Does the block layer ever wait for outstanding request to finish? Could this 
> be another reason why the producer doesn't add new requests on the MMC queue?
> 
In the sequential read case, the FS/read-ahead layer generates the next read
request only after the completion of the current request is returned to it.
It may depend on the READ_AHEAD : mmc_req_size ratio, and also on other
layers (above the FS) and user-space access patterns. You cannot expect
always-"good" access patterns coming to the MMC layer.

> I never meant yield or sleep to be a permanent fix. I was only curious of how 
> if would affect the performance in order to gain a better knowledge of the 
> root cause.
> My impression is that even if the SD card is very slow you will see the same 
> affect. The behavior of the block layer in this case is not related to the 
> speed for the flash memory.
> On a slow card the MMC-queue runs empty just like it does for a fast eMMC.
> According to you the block laye

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-26 Thread Venkatraman S
On Thu, Oct 25, 2012 at 8:32 PM, Per Förlin  wrote:
> On 10/25/2012 03:28 PM, Konstantin Dorfman wrote:
>> On 10/24/2012 07:07 PM, Per Förlin wrote:
>>> On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
 Hello Per,

 On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
>> When mmcqt reports on completion of a request there should be
>> a context switch to allow the insertion of the next read ahead BIOs
>> to the block layer. Since the mmcqd tries to fetch another request
>> immediately after the completion of the previous request it gets NULL
>> and starts waiting for the completion of the previous request.
>> This wait on completion gives the FS the opportunity to insert the next
>> request but the MMC layer is already blocked on the previous request
>> completion and is not aware of the new request waiting to be fetched.
> I thought that I could trigger a context switch in order to give
> execution time for FS to add the new request to the MMC queue.
> I made a simple hack to call yield() in case the request gets NULL. I
> thought it may give the FS layer enough time to add a new request to
> the MMC queue. This would not delay the MMC transfer since the yield()
> is done in parallel with an ongoing transfer. Anyway it was just meant
> to be a simple test.
>
> One yield was not enough. Just for sanity check I added a msleep as
> well and that was enough to let FS add a new request,
> Would it be possible to gain throughput by delaying the fetch of new
> request? Too avoid unnecessary NULL requests
>
> If (ongoing request is read AND size is max read ahead AND new request
> is NULL) yield();
>
> BR
> Per
 We did the same experiment and it will not give maximum possible
 performance. There is no guarantee that the context switch which was
 manually caused by the MMC layer comes just in time: when it was early
 then next fetch still results in NULL, when it was later, then we miss
 possibility to fetch/prepare new request.

 Any delay in fetch of the new request that comes after the new request has
 arrived hits throughput and latency.

 The solution we are talking about here will fix not only situation with FS
 read ahead mechanism, but also it will remove penalty of the MMC context
 waiting on completion while any new request arrives.

 Thanks,

>>> It seems strange that the block layer cannot keep up with relatively slow 
>>> flash media devices. There must be a limitation on number of outstanding 
>>> request towards MMC.
>>> I need to make up my mind if it's the best way to address this issue in the 
>>> MMC framework or block layer. I have started to look into the block layer 
>>> code but it will take some time to dig out the relevant parts.
>>>
>>> BR
>>> Per
>>>
>> The root cause of the issue is the incompleteness of the current design with
>> respect to the well-known producer-consumer problem solution (the producer is
>> the block layer, the consumer is the mmc layer).
>> The classic definition states that the buffer is of fixed size; in our case we
>> have a queue, so the producer is always capable of putting a new request into
>> the queue. The consumer context is blocked when both buffers (curr and prev)
>> are busy (the first has started its execution on the bus, the second is
>> fetched and waiting for the first).
> This happens but I thought that the block layer would continue to add request 
> to the MMC queue while the consumer is busy.
> When consumer fetches request from the queue again there should be several 
> requests available in the queue, but there is only one.
>
>> Producer context considered to be blocked when FS (or others bio
>> sources) has no requests to put into queue.
> Does the block layer ever wait for outstanding request to finish? Could this 
> be another reason why the producer doesn't add new requests on the MMC queue?
>

Actually there could be a lot of reasons why the block layer or CFQ would not
have inserted the request into the queue, i.e. you can see a lot of exit paths
where blk_peek_request returns NULL even though there could be a request
pending in one of the CFQ request queues.

Essentially you need to check what makes blk_fetch_request
(cfq_dispatch_requests()) return NULL when there is an element in the
queue, if at all.

>> To maximize performance there are 2 notifications that should be used:
>> 1. The producer notifies the consumer about a new item to process.
>> 2. The consumer notifies the producer about free space.
>>
>> In our case the 2nd notification is not needed since, as I said before, there
>> is always free space in the queue.
>> There is no equivalent of the 1st, i.e. the block layer has no way to
>> notify the mmc layer that a new request has arrived.
>>
>> What you are suggesting is to resolve a specific case, when the FS READ_AHEAD
>> mechanism behavior causes delays in producing new requests.
>> Probably you can resolve this specific case, but do you have guarantee
>> that this is only case that caus

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-26 Thread Venkatraman S
On Thu, Oct 25, 2012 at 6:58 PM, Konstantin Dorfman
 wrote:
> On 10/24/2012 07:07 PM, Per Förlin wrote:
>> On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
>>> Hello Per,
>>>
>>> On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
> When mmcqt reports on completion of a request there should be
> a context switch to allow the insertion of the next read ahead BIOs
> to the block layer. Since the mmcqd tries to fetch another request
> immediately after the completion of the previous request it gets NULL
> and starts waiting for the completion of the previous request.
> This wait on completion gives the FS the opportunity to insert the next
> request but the MMC layer is already blocked on the previous request
> completion and is not aware of the new request waiting to be fetched.
 I thought that I could trigger a context switch in order to give
 execution time for FS to add the new request to the MMC queue.
 I made a simple hack to call yield() in case the request gets NULL. I
 thought it may give the FS layer enough time to add a new request to
 the MMC queue. This would not delay the MMC transfer since the yield()
 is done in parallel with an ongoing transfer. Anyway it was just meant
 to be a simple test.

 One yield was not enough. Just for sanity check I added a msleep as
 well and that was enough to let FS add a new request,
 Would it be possible to gain throughput by delaying the fetch of new
 request? Too avoid unnecessary NULL requests

 If (ongoing request is read AND size is max read ahead AND new request
 is NULL) yield();

 BR
 Per
>>> We did the same experiment and it will not give maximum possible
>>> performance. There is no guarantee that the context switch which was
>>> manually caused by the MMC layer comes just in time: when it was early
>>> then next fetch still results in NULL, when it was later, then we miss
>>> possibility to fetch/prepare new request.
>>>
>>> Any delay in fetch of the new request that comes after the new request has
>>> arrived hits throughput and latency.
>>>
>>> The solution we are talking about here will fix not only situation with FS
>>> read ahead mechanism, but also it will remove penalty of the MMC context
>>> waiting on completion while any new request arrives.
>>>
>>> Thanks,
>>>
>> It seems strange that the block layer cannot keep up with relatively slow 
>> flash media devices. There must be a limitation on number of outstanding 
>> request towards MMC.
>> I need to make up my mind if it's the best way to address this issue in the 
>> MMC framework or block layer. I have started to look into the block layer 
>> code but it will take some time to dig out the relevant parts.
>>
>> BR
>> Per
>>
> The root cause of the issue in incompletion of the current design with
> well known producer-consumer problem solution (producer is block layer,
> consumer is mmc layer).
> Classic definitions states that the buffer is fix size, in our case we
> have queue, so Producer always capable to put new request into the queue.
> Consumer context blocked when both buffers (curr and prev) are busy
> (first started its execution on the bus, second is fetched and waiting
> for the first).
> Producer context considered to be blocked when FS (or others bio
> sources) has no requests to put into queue.
> To maximize performance there are 2 notifications should be used:
> 1. Producer notifies Consumer about new item to proceed.
> 2. Consumer notifies Producer about free place.
>
> In our case 2nd notification not need since as I said before - it is
> always free space in the queue.
> There is no such notification as 1st, i.e. block layer has no way to
> notify mmc layer about new request arrived.

Being nitpicky, I think it contradicts the commit log that you have
for the patch:

When the block layer notifies the MMC layer on a new request, we check
for the above case where MMC layer is waiting on a previous request
completion and the current request is NULL.


>
> What you suggesting is to resolve specific case, when FS READ_AHEAD
> mechanism behavior causes delays in producing new requests.
> Probably you can resolve this specific case, but do you have guarantee
> that this is only case that causes delays between new requests events?
> Flash memory devices these days constantly improved on all levels: NAND,
> firmware, bus speed and host controller capabilities, this makes any
> yield/sleep/timeouts solution only temporary hacks.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-25 Thread Per Förlin
On 10/25/2012 03:28 PM, Konstantin Dorfman wrote:
> On 10/24/2012 07:07 PM, Per Förlin wrote:
>> On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
>>> Hello Per,
>>>
>>> On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
> When mmcqt reports on completion of a request there should be
> a context switch to allow the insertion of the next read ahead BIOs
> to the block layer. Since the mmcqd tries to fetch another request
> immediately after the completion of the previous request it gets NULL
> and starts waiting for the completion of the previous request.
> This wait on completion gives the FS the opportunity to insert the next
> request but the MMC layer is already blocked on the previous request
> completion and is not aware of the new request waiting to be fetched.
 I thought that I could trigger a context switch in order to give
 execution time for FS to add the new request to the MMC queue.
 I made a simple hack to call yield() in case the request gets NULL. I
 thought it may give the FS layer enough time to add a new request to
 the MMC queue. This would not delay the MMC transfer since the yield()
 is done in parallel with an ongoing transfer. Anyway it was just meant
 to be a simple test.

 One yield was not enough. Just for sanity check I added a msleep as
 well and that was enough to let FS add a new request,
 Would it be possible to gain throughput by delaying the fetch of new
 request? Too avoid unnecessary NULL requests

 If (ongoing request is read AND size is max read ahead AND new request
 is NULL) yield();

 BR
 Per
>>> We did the same experiment and it will not give maximum possible
>>> performance. There is no guarantee that the context switch which was
>>> manually caused by the MMC layer comes just in time: when it was early
>>> then next fetch still results in NULL, when it was later, then we miss
>>> possibility to fetch/prepare new request.
>>>
>>> Any delay in fetch of the new request that comes after the new request has
>>> arrived hits throughput and latency.
>>>
>>> The solution we are talking about here will fix not only situation with FS
>>> read ahead mechanism, but also it will remove penalty of the MMC context
>>> waiting on completion while any new request arrives.
>>>
>>> Thanks,
>>>
>> It seems strange that the block layer cannot keep up with relatively slow 
>> flash media devices. There must be a limitation on number of outstanding 
>> request towards MMC.
>> I need to make up my mind if it's the best way to address this issue in the 
>> MMC framework or block layer. I have started to look into the block layer 
>> code but it will take some time to dig out the relevant parts.
>>
>> BR
>> Per
>>
> The root cause of the issue in incompletion of the current design with
> well known producer-consumer problem solution (producer is block layer,
> consumer is mmc layer).
> Classic definitions states that the buffer is fix size, in our case we
> have queue, so Producer always capable to put new request into the queue.
> Consumer context blocked when both buffers (curr and prev) are busy
> (first started its execution on the bus, second is fetched and waiting
> for the first).
This happens, but I thought that the block layer would continue to add
requests to the MMC queue while the consumer is busy.
When the consumer fetches a request from the queue again there should be
several requests available, but there is only one.

> Producer context considered to be blocked when FS (or others bio
> sources) has no requests to put into queue.
Does the block layer ever wait for outstanding requests to finish? Could this
be another reason why the producer doesn't add new requests to the MMC queue?

> To maximize performance there are 2 notifications should be used:
> 1. Producer notifies Consumer about new item to proceed.
> 2. Consumer notifies Producer about free place.
> 
> In our case 2nd notification not need since as I said before - it is
> always free space in the queue.
> There is no such notification as 1st, i.e. block layer has no way to
> notify mmc layer about new request arrived.
> 
> What you suggesting is to resolve specific case, when FS READ_AHEAD
> mechanism behavior causes delays in producing new requests.
> Probably you can resolve this specific case, but do you have guarantee
> that this is only case that causes delays between new requests events?
> Flash memory devices these days constantly improved on all levels: NAND,
> firmware, bus speed and host controller capabilities, this makes any
> yield/sleep/timeouts solution only temporary hacks.
I never meant yield or sleep to be a permanent fix. I was only curious how
it would affect the performance, in order to gain a better knowledge of the
root cause.
My impression is that even if the SD card is very slow you will see the same
effect. The behavior of the block layer in this case is not related to the
s

Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-25 Thread Konstantin Dorfman
On 10/24/2012 07:07 PM, Per Förlin wrote:
> On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
>> Hello Per,
>>
>> On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
 When mmcqt reports on completion of a request there should be
 a context switch to allow the insertion of the next read ahead BIOs
 to the block layer. Since the mmcqd tries to fetch another request
 immediately after the completion of the previous request it gets NULL
 and starts waiting for the completion of the previous request.
 This wait on completion gives the FS the opportunity to insert the next
 request but the MMC layer is already blocked on the previous request
 completion and is not aware of the new request waiting to be fetched.
>>> I thought that I could trigger a context switch in order to give
>>> execution time for FS to add the new request to the MMC queue.
>>> I made a simple hack to call yield() in case the request gets NULL. I
>>> thought it may give the FS layer enough time to add a new request to
>>> the MMC queue. This would not delay the MMC transfer since the yield()
>>> is done in parallel with an ongoing transfer. Anyway it was just meant
>>> to be a simple test.
>>>
>>> One yield was not enough. Just for sanity check I added a msleep as
>>> well and that was enough to let FS add a new request,
>>> Would it be possible to gain throughput by delaying the fetch of new
>>> request? Too avoid unnecessary NULL requests
>>>
>>> If (ongoing request is read AND size is max read ahead AND new request
>>> is NULL) yield();
>>>
>>> BR
>>> Per
>> We did the same experiment and it will not give maximum possible
>> performance. There is no guarantee that the context switch which was
>> manually caused by the MMC layer comes just in time: when it was early
>> then next fetch still results in NULL, when it was later, then we miss
>> possibility to fetch/prepare new request.
>>
>> Any delay in fetch of the new request that comes after the new request has
>> arrived hits throughput and latency.
>>
>> The solution we are talking about here will fix not only situation with FS
>> read ahead mechanism, but also it will remove penalty of the MMC context
>> waiting on completion while any new request arrives.
>>
>> Thanks,
>>
> It seems strange that the block layer cannot keep up with relatively slow 
> flash media devices. There must be a limitation on number of outstanding 
> request towards MMC.
> I need to make up my mind if it's the best way to address this issue in the 
> MMC framework or block layer. I have started to look into the block layer 
> code but it will take some time to dig out the relevant parts.
>
> BR
> Per
>
The root cause of the issue is that the current design does not fully
implement the well-known producer-consumer solution (the producer being
the block layer, the consumer the MMC layer).
The classic definition assumes a fixed-size buffer; in our case we have a
queue, so the producer is always able to put a new request into it.
The consumer context is blocked when both buffers (curr and prev) are busy
(the first has started executing on the bus, the second is fetched and
waiting for the first).
The producer context is considered blocked when the FS (or other bio
sources) has no requests to put into the queue.
To maximize performance, two notifications should be used:
1. The producer notifies the consumer about a new item to process.
2. The consumer notifies the producer about free space.

In our case the 2nd notification is not needed since, as I said before,
there is always free space in the queue.
There is nothing like the 1st notification, i.e. the block layer has no
way to notify the MMC layer that a new request has arrived.

What you are suggesting resolves a specific case, where the behavior of
the FS READ_AHEAD mechanism causes delays in producing new requests.
You can probably resolve this specific case, but do you have a guarantee
that this is the only case that causes delays between new-request events?
Flash memory devices these days are constantly improving on all levels
(NAND, firmware, bus speed and host controller capabilities), which makes
any yield/sleep/timeout solution only a temporary hack.

Thanks
-- 

Konstantin Dorfman,
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center,
Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-24 Thread Per Förlin
On 10/24/2012 11:41 AM, Konstantin Dorfman wrote:
> Hello Per,
>
> On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
>>> When mmcqt reports on completion of a request there should be
>>> a context switch to allow the insertion of the next read ahead BIOs
>>> to the block layer. Since the mmcqd tries to fetch another request
>>> immediately after the completion of the previous request it gets NULL
>>> and starts waiting for the completion of the previous request.
>>> This wait on completion gives the FS the opportunity to insert the next
>>> request but the MMC layer is already blocked on the previous request
>>> completion and is not aware of the new request waiting to be fetched.
>> I thought that I could trigger a context switch in order to give
>> execution time for FS to add the new request to the MMC queue.
>> I made a simple hack to call yield() in case the request gets NULL. I
>> thought it may give the FS layer enough time to add a new request to
>> the MMC queue. This would not delay the MMC transfer since the yield()
>> is done in parallel with an ongoing transfer. Anyway it was just meant
>> to be a simple test.
>>
>> One yield was not enough. Just for sanity check I added a msleep as
>> well and that was enough to let FS add a new request,
>> Would it be possible to gain throughput by delaying the fetch of new
>> request? Too avoid unnecessary NULL requests
>>
>> If (ongoing request is read AND size is max read ahead AND new request
>> is NULL) yield();
>>
>> BR
>> Per
> We did the same experiment and it will not give maximum possible
> performance. There is no guarantee that the context switch which was
> manually caused by the MMC layer comes just in time: when it was early
> then next fetch still results in NULL, when it was later, then we miss
> possibility to fetch/prepare new request.
>
> Any delay in fetch of the new request that comes after the new request has
> arrived hits throughput and latency.
>
> The solution we are talking about here will fix not only situation with FS
> read ahead mechanism, but also it will remove penalty of the MMC context
> waiting on completion while any new request arrives.
>
> Thanks,
>

It seems strange that the block layer cannot keep up with relatively slow
flash media devices. There must be a limitation on the number of outstanding
requests towards MMC.
I need to make up my mind whether this issue is best addressed in the MMC
framework or in the block layer. I have started to look into the block layer
code but it will take some time to dig out the relevant parts.

BR
Per



Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-24 Thread Konstantin Dorfman
Hello Per,

On Mon, October 22, 2012 1:02 am, Per Forlin wrote:
>> When mmcqt reports on completion of a request there should be
>> a context switch to allow the insertion of the next read ahead BIOs
>> to the block layer. Since the mmcqd tries to fetch another request
>> immediately after the completion of the previous request it gets NULL
>> and starts waiting for the completion of the previous request.
>> This wait on completion gives the FS the opportunity to insert the next
>> request but the MMC layer is already blocked on the previous request
>> completion and is not aware of the new request waiting to be fetched.
> I thought that I could trigger a context switch in order to give
> execution time for FS to add the new request to the MMC queue.
> I made a simple hack to call yield() in case the request gets NULL. I
> thought it may give the FS layer enough time to add a new request to
> the MMC queue. This would not delay the MMC transfer since the yield()
> is done in parallel with an ongoing transfer. Anyway it was just meant
> to be a simple test.
>
> One yield was not enough. Just for sanity check I added a msleep as
> well and that was enough to let FS add a new request,
> Would it be possible to gain throughput by delaying the fetch of new
> request? Too avoid unnecessary NULL requests
>
> If (ongoing request is read AND size is max read ahead AND new request
> is NULL) yield();
>
> BR
> Per
We did the same experiment and it will not give the maximum possible
performance. There is no guarantee that a context switch manually caused
by the MMC layer comes just in time: if it is too early, the next fetch
still results in NULL; if it is too late, we miss the opportunity to
fetch/prepare the new request.

Any delay in fetching a new request after it has arrived hurts throughput
and latency.

The solution we are talking about here will fix not only the situation
with the FS read-ahead mechanism; it will also remove the penalty of the
MMC context waiting on a completion while a new request arrives.

Thanks,




Re: [PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-21 Thread Per Forlin
On Mon, Oct 15, 2012 at 5:36 PM, Konstantin Dorfman
 wrote:
> The main assumption of the async request design is that the file
> system adds block requests to the block device queue asynchronously
> without waiting for completion (see the Rationale section of
> https://wiki.linaro.org/WorkingGroups/Kernel/Specs
> /StoragePerfMMC-async-req).
>
> We found out that in case of sequential read operations this is not
> the case, due to the read ahead mechanism.
> When mmcqt reports on completion of a request there should be
> a context switch to allow the insertion of the next read ahead BIOs
> to the block layer. Since the mmcqd tries to fetch another request
> immediately after the completion of the previous request it gets NULL
> and starts waiting for the completion of the previous request.
> This wait on completion gives the FS the opportunity to insert the next
> request but the MMC layer is already blocked on the previous request
> completion and is not aware of the new request waiting to be fetched.
I thought that I could trigger a context switch in order to give
execution time for FS to add the new request to the MMC queue.
I made a simple hack to call yield() in case the request gets NULL. I
thought it may give the FS layer enough time to add a new request to
the MMC queue. This would not delay the MMC transfer since the yield()
is done in parallel with an ongoing transfer. Anyway it was just meant
to be a simple test.

One yield was not enough. Just as a sanity check I added an msleep as
well, and that was enough to let the FS add a new request.
Would it be possible to gain throughput by delaying the fetch of the new
request, to avoid unnecessary NULL requests?

If (ongoing request is read AND size is max read ahead AND new request
is NULL) yield();

BR
Per

>
> This patch fixes the above behavior and allows the async request mechanism
> to work in case of sequential read scenarios.
> The main idea is to replace the blocking wait for a completion with an
> event driven mechanism and add a new event of new_request.
> When the block layer notifies the MMC layer on a new request, we check
> for the above case where MMC layer is waiting on a previous request
> completion and the current request is NULL.
> In such a case the new_request event will be triggered to wakeup
> the waiting thread. MMC layer will then fetch the new request
> and after its preparation will go back to waiting on the completion.
>
> Our tests showed that this fix improves the read sequential throughput
> by 16%.
>
> Signed-off-by: Konstantin Dorfman 
>
> diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
> index 172a768..4d6431b 100644
> --- a/drivers/mmc/card/block.c
> +++ b/drivers/mmc/card/block.c
> @@ -112,17 +112,6 @@ struct mmc_blk_data {
>
>  static DEFINE_MUTEX(open_lock);
>
> -enum mmc_blk_status {
> -   MMC_BLK_SUCCESS = 0,
> -   MMC_BLK_PARTIAL,
> -   MMC_BLK_CMD_ERR,
> -   MMC_BLK_RETRY,
> -   MMC_BLK_ABORT,
> -   MMC_BLK_DATA_ERR,
> -   MMC_BLK_ECC_ERR,
> -   MMC_BLK_NOMEDIUM,
> -};
> -
>  module_param(perdev_minors, int, 0444);
>  MODULE_PARM_DESC(perdev_minors, "Minors numbers to allocate per device");
>
> @@ -1224,6 +1213,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req 
> *mqrq,
> }
>
> mqrq->mmc_active.mrq = &brq->mrq;
> +   mqrq->mmc_active.mrq->sync_data = &mq->sync_data;
> mqrq->mmc_active.err_check = mmc_blk_err_check;
>
> mmc_queue_bounce_pre(mqrq);
> @@ -1284,9 +1274,12 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
> struct request *rqc)
> areq = &mq->mqrq_cur->mmc_active;
> } else
> areq = NULL;
> -   areq = mmc_start_req(card->host, areq, (int *) &status);
> -   if (!areq)
> +   areq = mmc_start_data_req(card->host, areq, (int *)&status);
> +   if (!areq) {
> +   if (status == MMC_BLK_NEW_REQUEST)
> +   return status;
> return 0;
> +   }
>
> mq_rq = container_of(areq, struct mmc_queue_req, mmc_active);
> brq = &mq_rq->brq;
> @@ -1295,6 +1288,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
> struct request *rqc)
> mmc_queue_bounce_post(mq_rq);
>
> switch (status) {
> +   case MMC_BLK_NEW_REQUEST:
> +   BUG_ON(1); /* should never get here */
> case MMC_BLK_SUCCESS:
> case MMC_BLK_PARTIAL:
> /*
> @@ -1367,7 +1362,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
> struct request *rqc)
>  * prepare it again and resend.
>  */
> mmc_blk_rw_rq_prep(mq_rq, card, disable_multi, mq);
> -   mmc_start_req(card->host, &mq_rq->mmc_active, NULL);
> mmc_start_data_req(card->host, &mq_rq->mmc_active,
> (int *)&status);

[PATCH v1] mmc: fix async request mechanism for sequential read scenarios

2012-10-15 Thread Konstantin Dorfman
The main assumption of the async request design is that the file
system adds block requests to the block device queue asynchronously
without waiting for completion (see the Rationale section of
https://wiki.linaro.org/WorkingGroups/Kernel/Specs
/StoragePerfMMC-async-req).

We found out that in case of sequential read operations this is not
the case, due to the read ahead mechanism.
When mmcqd reports on completion of a request there should be
a context switch to allow the insertion of the next read ahead BIOs
to the block layer. Since the mmcqd tries to fetch another request
immediately after the completion of the previous request it gets NULL
and starts waiting for the completion of the previous request.
This wait on completion gives the FS the opportunity to insert the next
request but the MMC layer is already blocked on the previous request
completion and is not aware of the new request waiting to be fetched.

This patch fixes the above behavior and allows the async request mechanism
to work in case of sequential read scenarios.
The main idea is to replace the blocking wait for a completion with an
event driven mechanism and add a new event of new_request.
When the block layer notifies the MMC layer on a new request, we check
for the above case where MMC layer is waiting on a previous request
completion and the current request is NULL.
In such a case the new_request event will be triggered to wake up
the waiting thread. The MMC layer will then fetch the new request
and, after its preparation, go back to waiting on the completion.

Our tests showed that this fix improves sequential read throughput
by 16%.

Signed-off-by: Konstantin Dorfman 

diff --git a/drivers/mmc/card/block.c b/drivers/mmc/card/block.c
index 172a768..4d6431b 100644
--- a/drivers/mmc/card/block.c
+++ b/drivers/mmc/card/block.c
@@ -112,17 +112,6 @@ struct mmc_blk_data {
 
 static DEFINE_MUTEX(open_lock);
 
-enum mmc_blk_status {
-   MMC_BLK_SUCCESS = 0,
-   MMC_BLK_PARTIAL,
-   MMC_BLK_CMD_ERR,
-   MMC_BLK_RETRY,
-   MMC_BLK_ABORT,
-   MMC_BLK_DATA_ERR,
-   MMC_BLK_ECC_ERR,
-   MMC_BLK_NOMEDIUM,
-};
-
 module_param(perdev_minors, int, 0444);
 MODULE_PARM_DESC(perdev_minors, "Minors numbers to allocate per device");
 
@@ -1224,6 +1213,7 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
}
 
mqrq->mmc_active.mrq = &brq->mrq;
+   mqrq->mmc_active.mrq->sync_data = &mq->sync_data;
mqrq->mmc_active.err_check = mmc_blk_err_check;
 
mmc_queue_bounce_pre(mqrq);
@@ -1284,9 +1274,12 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
areq = &mq->mqrq_cur->mmc_active;
} else
areq = NULL;
-   areq = mmc_start_req(card->host, areq, (int *) &status);
-   if (!areq)
+   areq = mmc_start_data_req(card->host, areq, (int *)&status);
+   if (!areq) {
+   if (status == MMC_BLK_NEW_REQUEST)
+   return status;
return 0;
+   }
 
mq_rq = container_of(areq, struct mmc_queue_req, mmc_active);
brq = &mq_rq->brq;
@@ -1295,6 +1288,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
mmc_queue_bounce_post(mq_rq);
 
switch (status) {
+   case MMC_BLK_NEW_REQUEST:
+   BUG_ON(1); /* should never get here */
case MMC_BLK_SUCCESS:
case MMC_BLK_PARTIAL:
/*
@@ -1367,7 +1362,8 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
 * prepare it again and resend.
 */
mmc_blk_rw_rq_prep(mq_rq, card, disable_multi, mq);
-   mmc_start_req(card->host, &mq_rq->mmc_active, NULL);
+   mmc_start_data_req(card->host, &mq_rq->mmc_active,
+   (int *)&status);
}
} while (ret);
 
@@ -1382,7 +1378,7 @@ static int mmc_blk_issue_rw_rq(struct mmc_queue *mq, 
struct request *rqc)
  start_new_req:
if (rqc) {
mmc_blk_rw_rq_prep(mq->mqrq_cur, card, 0, mq);
-   mmc_start_req(card->host, &mq->mqrq_cur->mmc_active, NULL);
+   mmc_start_data_req(card->host, &mq->mqrq_cur->mmc_active, NULL);
}
 
return 0;
@@ -1426,9 +1422,10 @@ static int mmc_blk_issue_rq(struct mmc_queue *mq, struct 
request *req)
}
 
 out:
-   if (!req)
+   if (!req && (ret != MMC_BLK_NEW_REQUEST))
/* release host only when there are no more requests */
mmc_release_host(card->host);
+
return ret;
 }
 
diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index e360a97..65cecf2 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@