On 2021/2/10 1:30, Martin Wilck wrote:
> On Tue, 2021-02-09 at 09:36 +0800, lixiaokeng wrote:
>>
>>>
>>> I still don't fully understand. Above you said "this coredump
>>> doesn't
>>> seem to appear any more". Am I understanding correctly that you
>>> observed *other* core dumps instead?
>
>>>
>>
>> No, it is not "instead".
>> As shown in https://www.spinics.net/lists/dm-devel/msg45293.html,
>> there are some different crashes in multipathd with no code change.
>> When thread cancellation is blocked during
>> udev_monitor_receive_device(),
>> no crash happens in udev_monitor_receive_device(), but the others
>> still occur.
>
> Now I got it, eventually :-) Thanks for the clarification. Would it be
> possible for you to categorize the different issues and provide core
> dumps?
>
> You mentioned in the systemd issue that you were playing around with
> the gcc -fexceptions flag. As I remarked there - how did it get set in
> the first place? What distro are you using?
>>>
>>> The "best" solution would probably be to generally disallow
>>> cancellation, and only run pthread_testcancel() at certain points
>>> in
>>> the code where we might block (and know that being cancelled would
>>> be
>>> safe). That would not only make multipathd safer from crashing, it
>>> would also enable us to remove hundreds of ugly
>>> pthread_cleanup_push()/pop() calls from our code.
>>>
>>> Finding all these points would be a challenge though, and if we
>>> don't
>>> find them, we risk hanging on exit again, which is bad too, and was
>>> just recently improved.
>>
>> Do you mean some patches have been made to solve this problem?
>
> No. I could hack up some, but it will take some time.
>
Hi Martin,
Is there any progress on this problem?
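
To make sure I understand the approach quoted above (disallow
cancellation in general and only call pthread_testcancel() at points
that are known to be safe), here is a minimal sketch. It is not
multipathd code: worker(), critical_work(), run_cancel_demo() and the
two counters are hypothetical names used only for illustration. Note
that pthread_testcancel() has no effect while the cancel state is
disabled, so the sketch assumes the state is enabled briefly around
that call.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical counters, only here to observe the behaviour. */
static atomic_int units_started;
static atomic_int units_finished;

/* Hypothetical stand-in for code that must not be interrupted. */
static void critical_work(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

	atomic_fetch_add(&units_started, 1);
	/* nanosleep() is a cancellation point, but cancellation is
	 * disabled here, so a pending cancel request stays pending. */
	nanosleep(&ts, NULL);
	atomic_fetch_add(&units_finished, 1);
}

static void *worker(void *arg)
{
	int old;

	(void)arg;
	/* Disallow cancellation for the whole thread by default. */
	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);

	for (;;) {
		critical_work();

		/* The only place where being cancelled is assumed safe. */
		pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &old);
		pthread_testcancel();
		pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
	}
	return NULL;
}

/* Returns 0 if the worker was cancelled only between work units. */
int run_cancel_demo(void)
{
	pthread_t tid;
	void *ret;

	atomic_store(&units_started, 0);
	atomic_store(&units_finished, 0);

	if (pthread_create(&tid, NULL, worker, NULL) != 0)
		return -1;

	/* Wait until the worker is inside (or past) its first unit. */
	while (atomic_load(&units_started) == 0)
		sched_yield();

	pthread_cancel(tid);
	if (pthread_join(tid, &ret) != 0)
		return -1;

	/* Every unit that was started must also have finished. */
	if (ret != PTHREAD_CANCELED)
		return -1;
	return atomic_load(&units_started) ==
	       atomic_load(&units_finished) ? 0 : -1;
}
```

Calling run_cancel_demo() from a small main() (built with -pthread)
should show that the cancel request is only acted upon between work
units, so no pthread_cleanup_push()/pop() pair is needed inside
critical_work(). The hard part remains what you described: finding
every place where the thread may block, so that exit does not hang.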
Regards
Lixiaokeng
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel