On 08/24/2014 02:11 PM, Richard Weinberger wrote:
> On 23.08.2014 19:43, Thorsten Knabe wrote:
>> Hi Richard.
>>
>> On 08/23/2014 05:34 PM, Richard Weinberger wrote:
>>> Hi!
>>>
>>> On 23.08.2014 15:47, Thorsten Knabe wrote:
>>>> From: Thorsten Knabe <li...@thorsten-knabe.de>
>>>>
>>>> UML: UBD: Fix for processes stuck in D state forever in UserModeLinux.
>>>>
>>>> Starting with Linux 3.12 processes get stuck in D state forever in
>>>> UserModeLinux under sync heavy workloads. This bug was introduced by
>>>> commit 805f11a0d5 (um: ubd: Add REQ_FLUSH suppport).
>>>> Fix bug by adding a check if FLUSH request was successfully submitted to
>>>> the I/O thread and keeping the FLUSH request on the request queue on
>>>> submission failures.
>>>>
>>>> Fixes: 805f11a0d5 (um: ubd: Add REQ_FLUSH suppport)
>>>> Signed-off-by: Thorsten Knabe <li...@thorsten-knabe.de>
>>>
>>> Thanks a lot for hunting this issue down.
>>>
>>>> ---
>>>> Patch applies to 3.16.1.
>>>>
>>>> diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
>>>> index 3716e69..b7d2840 100644
>>>> --- a/arch/um/drivers/ubd_kern.c
>>>> +++ b/arch/um/drivers/ubd_kern.c
>>>> @@ -1277,7 +1277,7 @@ static void do_ubd_request(struct request_queue *q)
>>>>
>>>>      while(1){
>>>>          struct ubd *dev = q->queuedata;
>>>> -        if(dev->end_sg == 0){
>>>> +        if(dev->request == NULL){
>>>
>>> Why do we need this specific change?
>>
>> This change is required, because for FLUSH requests dev->end_sg is
>> initialized to 0 by blk_rq_map_sg() a few lines above, as FLUSH requests
>> have no data blocks attached to themselves.
>
> You meant "below"? Looks like I really miss something here.
> At the bottom of the while(1) loop we have
> dev->end_sg = 0;
> dev->request = NULL;

No. The problematic line is:
    dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
blk_rq_map_sg() returns 0 for REQ_FLUSH requests, because they have no
associated data blocks. Hence on the next iteration of the while(1) loop:
    if(dev->end_sg == 0){
will be true, even if the request has not been successfully submitted to
the I/O thread in the previous iteration of the while(1) loop, and a new
request will be fetched:
        struct request *req = blk_fetch_request(q);
        if(req == NULL)
            return;

        dev->request = req;
        dev->rq_pos = blk_rq_pos(req);
        dev->start_sg = 0;
        dev->end_sg = blk_rq_map_sg(q, req, dev->sg);
    }

Thus the REQ_FLUSH request gets lost and will never be submitted to the
I/O thread. There will be no matching answer from the I/O thread, and the
lost REQ_FLUSH request will never complete...

Regards
Thorsten

>
> Thanks,
> //richard
>

-- 
___
 |        | /                 E-Mail: li...@thorsten-knabe.de
 |horsten |/\nabe                WWW: http://linux.thorsten-knabe.de
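
The lost-request scenario described above can be reproduced in a small
user-space model. This is only a sketch under stated assumptions, not the
driver code: struct fake_req, struct fake_queue, struct fake_dev, run_once()
and count_lost() are hypothetical names, nsegs stands in for the value
blk_rq_map_sg() would return, and a failed submission is modelled as an
early return from the request function followed by a later restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the block layer and driver state. */
struct fake_req {
    int  nsegs;      /* stand-in for blk_rq_map_sg(); 0 for a FLUSH */
    bool submitted;  /* set once the request reached the I/O thread */
};

struct fake_queue {
    struct fake_req *reqs;
    size_t len;
    size_t next;     /* index of the next request to fetch */
};

struct fake_dev {
    struct fake_req *request;  /* request currently in progress */
    int end_sg;                /* mapped segments of that request */
};

enum guard { GUARD_END_SG, GUARD_REQUEST };

/* One invocation of the request function. It returns when the queue is
 * empty or when a submission fails (failures_left > 0). */
static void run_once(struct fake_dev *dev, struct fake_queue *q,
                     enum guard g, int *failures_left)
{
    for (;;) {
        bool idle = (g == GUARD_END_SG) ? (dev->end_sg == 0)
                                        : (dev->request == NULL);
        if (idle) {
            if (q->next == q->len)
                return;
            dev->request = &q->reqs[q->next++];
            dev->end_sg = dev->request->nsegs;
        }
        if (*failures_left > 0) {
            (*failures_left)--;
            return;  /* submission failed, retry on the next restart */
        }
        dev->request->submitted = true;
        dev->end_sg = 0;
        dev->request = NULL;
    }
}

/* Queues one FLUSH followed by one data request, lets the first
 * `failures` submissions fail, and counts requests never submitted. */
int count_lost(enum guard g, int failures)
{
    struct fake_req reqs[] = {
        { .nsegs = 0 },  /* FLUSH: no data blocks */
        { .nsegs = 2 },  /* ordinary request with two segments */
    };
    struct fake_queue q = { reqs, 2, 0 };
    struct fake_dev dev = { NULL, 0 };
    int lost = 0;

    /* Each call models one restart of the request function. */
    for (int i = 0; i < 8; i++)
        run_once(&dev, &q, g, &failures);

    for (size_t i = 0; i < q.len; i++)
        if (!reqs[i].submitted)
            lost++;
    return lost;
}
```

In this model, count_lost(GUARD_END_SG, 1) reports one lost request (the
FLUSH whose first submission failed), while count_lost(GUARD_REQUEST, 1)
reports none, because the pending FLUSH is retried instead of being
replaced by the next fetched request.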