> From: Alex Bennée [mailto:alex.ben...@linaro.org]
> >> From: Stefan Hajnoczi [mailto:stefa...@gmail.com]
> >> >
> >> > Yes, this option helps.
> >> > Thank you.
> >>
> >> Good news.  This can be fixed in 2.8.1 once someone finds a solution.
> >
> > It seems that something still goes wrong.
> > I'm using this workaround, but there is a kind of deadlock in translation.
> > call_rcu_thread hangs at some moment in qemu_event_wait.
> >
> > As far as I understand, it is used by QHT in translate-all.c.
> > I can't get more information yet, because logging makes everything too slow.
>
> There are a number of users of RCU bit for QHT I think it only gets
> activated when it needs to re-size its hash table on insertion of new
> TranslationBlocks.
>
> Can you get a backtrace of all threads when it deadlocks?

Sorry, this is another problem, which occurs only in icount replay mode:

1. cpu_handle_exception tries to force an exception when it cannot occur,
   because all the planned instructions have already been executed:

        } else if (replay_has_exception()
                   && cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
            /* try to cause an exception pending in the log */
            cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0), true);
            *ret = -1;
            return true;

2. tb_find calls tb_gen_code, which cannot allocate a new translation block,
   so it calls tb_flush (which only queues the flush) and cpu_loop_exit.

3. cpu_loop_exit returns to the infinite loop in cpu_exec, and the condition

        if (cpu_handle_exception(cpu, &ret)) {
            break;
        }

   is checked again, causing an infinite loop.

The TB cache is never flushed, because we never reach that break, and the
real work of tb_flush is done outside this loop.

Pavel Dovgalyuk