> On Feb 6, 2025, at 09:25, Richard Biener <[email protected]> wrote:
>
> On Thu, Feb 6, 2025 at 2:57 PM Qing Zhao <[email protected]> wrote:
>>
>>
>>
>>> On Feb 6, 2025, at 04:36, Richard Biener <[email protected]> wrote:
>>>
>>> On Wed, Feb 5, 2025 at 4:40 PM Qing Zhao <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>>> On Feb 5, 2025, at 07:46, Richard Biener <[email protected]>
>>>>> wrote:
>>>>>
>>>>> On Tue, Feb 4, 2025 at 10:12 PM Qing Zhao <[email protected]> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> One of our big application segv in libgcc code while unwinding the stack.
>>>>>>
>>>>>> This is a random crash while the application throws a c++ exception and
>>>>>> unwinds the stack. Incidents are random and never can be reproduced by
>>>>>> any
>>>>>> test case.
>>>>>>
>>>>>> The libgcc that is used is based on GCC 8.
>>>>>>
>>>>>> Fortunately, we got some information through the stack trace, and
>>>>>> narrowed
>>>>>> down the crash is in the routine "init_object" of
>>>>>> libgcc/unwind-dw2-pde.c:
>>>>>>
>>>>>> 777 count = classify_object_over_fdes (ob, ob->u.single);
>>>>>>
>>>>>> when loading the 2nd parameter of the call to
>>>>>> "classify_object_over_fdes".
>>>>>> i.e, the address that is pointed by "ob->u.single" is invalid.
>>>>>>
>>>>>> And the outer caller we can track back is the following line 1066 of the
>>>>>> routine
>>>>>> "_Unwind_Find_FDE":
>>>>>>
>>>>>> 1060 /* Classify and search the objects we've not yet processed. */
>>>>>> 1061 while ((ob = unseen_objects))
>>>>>> 1062 {
>>>>>> 1063 struct object **p;
>>>>>> 1064
>>>>>> 1065 unseen_objects = ob->next;
>>>>>> 1066 f = search_object (ob, pc);
>>>>>>
>>>>>> Then we suspected that the construction of the static variable
>>>>>> "unseen_objects"
>>>>>> might have some bug.
>>>>>>
>>>>>> Then after studying the source code that constructs "unseen_objects" in
>>>>>> libgcc/unwind-dw2-pde.c, I found an issue in the routines
>>>>>> "__register_frame"
>>>>>> and "__register_frame_table" as following:
>>>>>>
>>>>>> 127 void
>>>>>> 128 __register_frame (void *begin)
>>>>>> 129 {
>>>>>> 130 struct object *ob;
>>>>>> 131
>>>>>> 132 /* If .eh_frame is empty, don't register at all. */
>>>>>> 133 if (*(uword *) begin == 0)
>>>>>> 134 return;
>>>>>> 135
>>>>>> 136 ob = malloc (sizeof (struct object));
>>>>>> 137 __register_frame_info (begin, ob);
>>>>>> 138 }
>>>>>>
>>>>>> 181 void
>>>>>> 182 __register_frame_table (void *begin)
>>>>>> 183 {
>>>>>> 184 struct object *ob = malloc (sizeof (struct object));
>>>>>> 185 __register_frame_info_table (begin, ob);
>>>>>> 186 }
>>>>>> 187
>>>>>>
>>>>>> In the above, one obvious issue in the source code is at line 136, line
>>>>>> 137,
>>>>>> and line 184 and line 185: the return value of call to "malloc" is not
>>>>>> checked
>>>>>> against NULL before it was passed as the second parameter of the
>>>>>> follwoing call.
>>>>>>
>>>>>> This might cause unpredicted random behavior.
>>>>>>
>>>>>> I checked the latest trunk GCC libgcc, the same issue is there too.
>>>>>>
>>>>>> This patch is for the latest trunk GCC.
>>>>>>
>>>>>> Please let me know any comments on this?
>>>>>
>>>>> I think I've seen this elsewhere —
>>>>
>>>> Do you mean that you saw this problem (malloc return NULL during
>>>> register_frame) previously?
>>>
>>> It was probably reported to us (SUSE) from a customer/partner, also
>>> from a (llvm?) JIT context.
>>
>> Okay.
>> So, that bug has not been resolved yet?
>
> It was resolved from our side IIRC as being a problem in the
> application with the JIT and
> otherwise a feature request (better unwind API for this use case,
> allowing a failure mode).
Okay, I see.
Thanks.
Qing
>
> Richard.
>
>> Qing
>>>
>>>>> the issue is the unwind register API does
>>>>> not allow for failures but I also think calling abort() is bad.
>>>>
>>>> Okay, I see.
>>>>
>>>>>
>>>>> Are you calling this from a JIT context or so?
>>>>
>>>> I will ask the bug submitter on this to make sure.
>>>>
>>>>> We're assuming that at program
>>>>> start malloc() will not fail.
>>>>
>>>> So, the routine __register_frame is only called at the _start_ of a
>>>> program?
>>>>>
>>>>> The proper solution is probably to add an alternate ABI which has a way
>>>>> to fail
>>>>> during registration.
>>>>
>>>> Any suggestion on this (or any other routine I can refer to?)
>>>>
>>>>
>>>> Thanks a lot for the help.
>>>>
>>>> Qing
>>>>>
>>>>> Richard.
>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> Qing
>>>>>>
>>>>>>
>>>>>> =======================
>>>>>>
>>>>>> Fix two issues in libgcc/unwind-dw2-fde.c:
>>>>>>
>>>>>> A. The returned value of call to malloc is not checked against NULL.
>>>>>> This might cause unpredicted behavior during stack unwinding.
>>>>>>
>>>>>> B. Check for null begin parameter (as well as pointer to null) in
>>>>>> __register_frame_info_table_bases and __register_frame_table.
>>>>>>
>>>>>> libgcc/ChangeLog:
>>>>>>
>>>>>> * unwind-dw2-fde.c (__register_frame): Check the return value of
>>>>>> call
>>>>>> to malloc.
>>>>>> (__register_frame_info_table_bases): Check for null begin parameter.
>>>>>> (__register_frame_table): Check the return value of call to malloc,
>>>>>> Check for null begin parameter.
>>>>>> ---
>>>>>> libgcc/unwind-dw2-fde.c | 12 ++++++++++++
>>>>>> 1 file changed, 12 insertions(+)
>>>>>>
>>>>>> diff --git a/libgcc/unwind-dw2-fde.c b/libgcc/unwind-dw2-fde.c
>>>>>> index cdfd3974c99..f0bc29d682a 100644
>>>>>> --- a/libgcc/unwind-dw2-fde.c
>>>>>> +++ b/libgcc/unwind-dw2-fde.c
>>>>>> @@ -169,6 +169,8 @@ __register_frame (void *begin)
>>>>>> return;
>>>>>>
>>>>>> ob = malloc (sizeof (struct object));
>>>>>> + if (ob == NULL)
>>>>>> + abort ();
>>>>>> __register_frame_info (begin, ob);
>>>>>> }
>>>>>>
>>>>>> @@ -180,6 +182,10 @@ void
>>>>>> __register_frame_info_table_bases (void *begin, struct object *ob,
>>>>>> void *tbase, void *dbase)
>>>>>> {
>>>>>> + /* If .eh_frame is empty, don't register at all. */
>>>>>> + if ((const uword *) begin == 0 || *(const uword *) begin == 0)
>>>>>> + return;
>>>>>> +
>>>>>> ob->pc_begin = (void *)-1;
>>>>>> ob->tbase = tbase;
>>>>>> ob->dbase = dbase;
>>>>>> @@ -210,7 +216,13 @@ __register_frame_info_table (void *begin, struct
>>>>>> object *ob)
>>>>>> void
>>>>>> __register_frame_table (void *begin)
>>>>>> {
>>>>>> + /* If .eh_frame is empty, don't register at all. */
>>>>>> + if (*(uword *) begin == 0)
>>>>>> + return;
>>>>>> +
>>>>>> struct object *ob = malloc (sizeof (struct object));
>>>>>> + if (ob == NULL)
>>>>>> + abort ();
>>>>>> __register_frame_info_table (begin, ob);
>>>>>> }
>>>>>>
>>>>>> --
>>>>>> 2.31.1