Re: [Qemu-devel] qemu-2.8-rc4 is broken

Pavel Dovgalyuk Thu, 12 Jan 2017 00:09:07 -0800

> From: Alex Bennée [mailto:alex.ben...@linaro.org]
> >> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> >> >
> >> > Yes, this option helps.
> >> > Thank you.
> >>
> >> Good news.  This can be fixed in 2.8.1 once someone finds a solution.
> >
> > It seems that something still goes wrong.
> > I'm using this workaround, but there is a kind of deadlock in translation.
> > call_rcu_thread hangs at some moment in qemu_event_wait.
> >
> > As far as I understand, it is used by QHT in translate-all.c.
> > I can't get more information yet, because logging makes everything too slow.
> 
> There are a number of users of RCU, but for QHT I think it only gets
> activated when it needs to re-size its hash table on insertion of new
> TranslationBlocks.
> 
> Can you get a backtrace of all threads when it deadlocks?


Sorry, this is another problem which occurs only in icount replay mode:
1. cpu_handle_exception tries to force an exception when it cannot occur,
   because all the planned instructions have already been executed:
    } else if (replay_has_exception()
               && cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
        /* try to cause an exception pending in the log */
        cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0), true);
        *ret = -1;
        return true;

2. tb_find calls tb_gen_code, which cannot allocate a new translation block,
   so it calls tb_flush (which only queues the flush) and cpu_loop_exit
3. cpu_loop_exit returns to the infinite loop of cpu_exec and the condition
            if (cpu_handle_exception(cpu, &ret)) {
                break;
            }
   is checked again causing an infinite loop.

The TB cache is never flushed because that break is never executed, and the
real work of tb_flush is done outside this loop.
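
The three steps above can be modelled with a small standalone program. This is
only a sketch under stated assumptions, not QEMU code: every toy_* name is
hypothetical, longjmp stands in for cpu_loop_exit, and the drain_on_restart
flag is not a patch proposed in this thread. It only shows what changes once
the queued flush is allowed to run. The restart limit exists so that the model
terminates instead of spinning forever.

```c
#include <setjmp.h>
#include <stdbool.h>

/* Toy model of the livelock; all names are hypothetical stand-ins. */
struct toy_cpu {
    jmp_buf jmp_env;        /* restart point, models the jump target in cpu_exec */
    bool tb_cache_full;     /* tb_gen_code cannot allocate a new block */
    bool flush_queued;      /* tb_flush was requested but not yet performed */
    bool exception_pending; /* replay log has an exception, no instructions left */
};

/* Like tb_flush as described above: only queues the work. */
static void toy_tb_flush(struct toy_cpu *cpu)
{
    cpu->flush_queued = true;
}

/* The real work of the flush, normally done outside the execution loop. */
static void toy_run_queued_work(struct toy_cpu *cpu)
{
    if (cpu->flush_queued) {
        cpu->flush_queued = false;
        cpu->tb_cache_full = false;
    }
}

/* Step 2: a full cache queues a flush and jumps back (cpu_loop_exit). */
static void toy_tb_find(struct toy_cpu *cpu)
{
    if (cpu->tb_cache_full) {
        toy_tb_flush(cpu);
        longjmp(cpu->jmp_env, 1);
    }
}

/* Step 1: try to force the pending exception, which needs a block. */
static bool toy_handle_exception(struct toy_cpu *cpu)
{
    if (cpu->exception_pending) {
        toy_tb_find(cpu);            /* does not return if the cache is full */
        cpu->exception_pending = false;
        return true;
    }
    return false;
}

/*
 * Step 3: the loop re-checks the same condition after every restart.
 * Returns the number of restarts needed to deliver the exception,
 * or -1 if max_restarts was reached without progress.
 */
int toy_cpu_exec(struct toy_cpu *cpu, int max_restarts, bool drain_on_restart)
{
    volatile int restarts = 0;

    if (setjmp(cpu->jmp_env) != 0) {
        restarts++;
        if (restarts >= max_restarts) {
            return -1;               /* no progress: the livelock described above */
        }
        if (drain_on_restart) {
            toy_run_queued_work(cpu);
        }
    }
    for (;;) {
        if (toy_handle_exception(cpu)) {
            break;
        }
        break; /* a real loop would execute translation blocks here */
    }
    return restarts;
}
```

With tb_cache_full and exception_pending both set, toy_cpu_exec(&cpu, 1000,
false) hits the restart limit while flush_queued stays set. Passing true for
drain_on_restart lets the flush run, and the exception is delivered after a
single restart.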

Pavel Dovgalyuk
