Am 26.03.21 um 15:53 schrieb Jens Axboe:
> On 3/26/21 8:45 AM, Stefan Metzmacher wrote:
>> Am 26.03.21 um 15:43 schrieb Stefan Metzmacher:
>>> Am 26.03.21 um 15:38 schrieb Jens Axboe:
>>>> On 3/26/21 7:59 AM, Jens Axboe wrote:
>>>>> On 3/26/21 7:54 AM, Jens Axboe wrote:
>>>>>>> The KILL after STOP deadlock still exists.
>>>>>>
>>>>>> In which tree? Sounds like you're still on the old one with that
>>>>>> incremental you sent, which wasn't complete.
>>>>>>
>>>>>>> Does io_wq_manager() exits without cleaning up on SIGKILL?
>>>>>>
>>>>>> No, it should kill up in all cases. I'll try your stop + kill, I just
>>>>>> tested both of them separately and didn't observe anything. I also ran
>>>>>> your io_uring-cp example (and found a bug in the example, fixed and
>>>>>> pushed), fwiw.
>>>>>
>>>>> I can reproduce this one! I'll take a closer look.
>>>>
>>>> OK, that one is actually pretty straight forward - we rely on cleaning
>>>> up on exit, but for fatal cases, get_signal() will call do_exit() for us
>>>> and never return. So we might need a special case in there to deal with
>>>> that, or some other way of ensuring that fatal signal gets processed
>>>> correctly for IO threads.
>>>
>>> And if (fatal_signal_pending(current)) doesn't prevent get_signal() from 
>>> being called?
>>
>> Ah, we're still in the first get_signal() from SIGSTOP, correct?
> 
> Yes exactly, we're waiting in there being stopped. So we either need to
> check to something ala:
> 
> relock:
> +     if (current->flags & PF_IO_WORKER && fatal_signal_pending(current))
> +             return false;
> 
> to catch it upfront and from the relock case, or add:
> 
>       fatal:
> +             if (current->flags & PF_IO_WORKER)
> +                     return false;
> 
> to catch it in the fatal section.
> 

Or something like io_uring_files_cancel()

Maybe change current->pf_io_worker with a generic current->io_thread
structure which, has exit hooks, as well as
io_wq_worker_sleeping() and io_wq_worker_running().

Maybe create_io_thread would take such an structure
as argument instead of a single function pointer.

struct io_thread_description {
        const char *name;
        int (*thread_fn)(struct io_thread_description *);
        void (*sleeping_fn)((struct io_thread_description *);
        void (*running_fn)((struct io_thread_description *);
        void (*exit_fn)((struct io_thread_description *);
};

And then
struct io_wq_manager {
        struct io_thread_description description;
        ... manager specific stuff...
};

metze

Reply via email to