Il 18/10/2012 17:12, Peter Maydell ha scritto: > On 18 October 2012 16:01, Paolo Bonzini <pbonz...@redhat.com> wrote: >> Il 18/10/2012 16:28, Peter Maydell ha scritto: >>> The function vnc_stop_worker_thread() is buggy, beacuse it tries to >>> delete jobs from the worker thread's queue but the worker thread itself >>> will not cope with this happening (it would end up trying to remove >>> an already-removed list item from its queue list). Fortunately >>> nobody ever calls vnc_stop_worker_thread(), so we can fix this by >>> simply deleting all the untested racy code. >> >> Note that there is just one queue. The queue global == the arg argument >> of vnc_worker_thread == the queue argument of vnc_worker_thread_loop. >> So I'm not sure I follow your reasoning. > > vnc_worker_thread_loop does this: > lock queue > get pointer to first job in queue > unlock queue > do stuff with job > lock queue > QTAILQ_REMOVE(&queue->jobs, job, next) > unlock queue
Uh, right... the QTAILQ_REMOVE looks completely misplaced, otoh vnc_has_job relies on that. > g_free(job) > > vnc_jobs_clear does this: > lock queue > QTAILQ_REMOVE each job from the queue > unlock queue > > So two problems: > (1) any job removed by vnc_jobs_clear is never g_free()d > (2) if vnc_jobs_clear removes job X from the queue while the worker loop > is in its "do stuff with job" phase, then the worker loop will > subsequently try to QTAILQ_REMOVE an object from a list it is not > in, which will probably crash or otherwise misbehave ... but vnc_jobs_clear should be dead code because vnc_jobs_join is called first. So the "right fix" would include anyway the half of your patch that deletes vnc_jobs_clear, and would revert the other half... > Here's a third which I just noticed: > (3) vnc_stop_worker thread sets queue->exit to true, and then does > a number of things with 'queue'. However, as soon as you've set > queue->exit you can't touch 'queue' again, because the worker > thread might (if you were unlucky) be just about to do the > 'if (queue->exit) { return -1; }', which will cause it to destroy > the condvar and muteux and free the queue memory. In particular, > even doing the 'vnc_unlock_queue()' in vnc_stop_worker_thread() > is unsafe, because the worker thread does its exit-check without > the lock held, so it could destroy the mutex before you unlocked it. Right, the solution for this is to move the destruction to the caller, because now we have joinable QemuThreads and those are a better way to synchronize. Paolo >> So the bug may be that we never call vnc_stop_worker_thread from >> vnc_disconnect_finish. BTW vnc_jobs_join is called there so we could >> just assert that the queue is empty... > > Yes, actually trying to find somewhere to make a clean shutdown > of the worker thread and fixing all the bugs in the shutdown > code would be the other approach. That seems like hard work to me :-) > > -- PMM > >