On 03/09/2016 01:55 PM, Paolo Bonzini wrote:
> 
> 
> On 09/03/2016 13:21, Christian Borntraeger wrote:
>> I have some random crashes at startup 
>>                 
>>                 Stack trace of thread 48326:
>>                 #0  0x000002aa2e0cce46 bdrv_co_do_rw (qemu-system-s390x)
>>                 #1  0x000002aa2e159e8e coroutine_trampoline (qemu-system-s390x)
>>                 #2  0x000003ffbc35150a __makecontext_ret (libc.so.6)
>>
>>
>> that I was able to bisect.
>> commit 2906cddfecff21af20eedab43288b485a679f9ac does crash regularly, 
>> 2906cddfecff21af20eedab43288b485a679f9ac^ does not.
>>
>> I will try to find somebody that looks into that - unless you have an idea.
> 
> The only random idea is to move
> 
>     vblk->dataplane_started = true
> 
> to the beginning of virtio_blk_data_plane_start rather than the end.
> 
> Paolo

FWIW, it seems that this patch randomly triggers one of three failures:
this error, the "tracked_request_begin" issue that I reported yesterday,
or some early read issues from the bootloader.
Using 2906cddfecff21af20eedab43288b485a679f9ac^ seems to work all the time.
Moving vblk->dataplane_started = true around does not help; it still
triggers all 3 types of bugs, e.g.

Thread 1 (Thread 0x3ffaabff910 (LWP 32782)):
#0  0x0000000010329a70 in bdrv_co_do_rw (opaque=0x0) at /home/cborntra/REPOS/qemu/block/io.c:2170
#1  0x00000000103b2e7a in coroutine_trampoline (i0=1023, i1=-2147470992) at /home/cborntra/REPOS/qemu/util/coroutine-ucontext.c:79
#2  0x000003ffac85150a in __makecontext_ret () from /lib64/libc.so.6
#2  0x000003ffac85150a in __makecontext_ret () from /lib64/libc.so.6
(gdb) list
2165    
2166    /* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
2167    static void coroutine_fn bdrv_co_do_rw(void *opaque)
2168    {
2169        BlockAIOCBCoroutine *acb = opaque;
2170        BlockDriverState *bs = acb->common.bs;
2171    
2172        if (!acb->is_write) {
2173            acb->req.error = bdrv_co_do_readv(bs, acb->req.sector,
2174                acb->req.nb_sectors, acb->req.qiov, acb->req.flags);
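
Frame #0 shows opaque=0x0, so the load of acb->common.bs at line 2170 is
a plain NULL dereference. A minimal sketch of that access pattern follows.
The Fake* types and acb_get_bs() are hypothetical stand-ins, not the QEMU
definitions, and the NULL check is only there to make the failing case
visible. It is not a proposed fix, since it would not explain why the
coroutine was entered with a NULL argument in the first place.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the QEMU types in the listing above;
 * only the fields needed for the illustration are modelled. */
typedef struct { int id; } FakeBDS;
typedef struct { FakeBDS *bs; } FakeCommon;
typedef struct { FakeCommon common; bool is_write; } FakeACB;

/* Mirrors the first two statements of bdrv_co_do_rw(), plus a guard.
 * Returns NULL instead of faulting when opaque is NULL. */
FakeBDS *acb_get_bs(void *opaque)
{
    FakeACB *acb = opaque;

    if (acb == NULL) {
        /* The code in the listing has no such check, so with
         * opaque=0x0 it faults on the next statement. */
        return NULL;
    }
    return acb->common.bs;
}
```

Calling acb_get_bs(NULL) returns NULL here, whereas the unguarded code in
the listing crashes at the same point.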



I will try to find somebody to work on this.