> On Jun 3, 2024, at 02:29, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> On Fri, May 31, 2024 at 11:23 PM Qing Zhao <qing.z...@oracle.com> wrote:
>> 
>> 
>> 
>>> On May 23, 2024, at 07:46, Richard Biener <richard.guent...@gmail.com> 
>>> wrote:
>>> 
>>> On Wed, May 22, 2024 at 8:53 PM Qing Zhao <qing.z...@oracle.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On May 22, 2024, at 03:38, Richard Biener <richard.guent...@gmail.com> 
>>>>> wrote:
>>>>> 
>>>>> On Tue, May 21, 2024 at 11:36 PM David Malcolm <dmalc...@redhat.com> 
>>>>> wrote:
>>>>>> 
>>>>>> On Tue, 2024-05-21 at 15:13 +0000, Qing Zhao wrote:
>>>>>>> Thanks for the comments and suggestions.
>>>>>>> 
>>>>>>>> On May 15, 2024, at 10:00, David Malcolm <dmalc...@redhat.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>>>>>>>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>>>>>>>> 
>>>>>>>>>> -Warray-bounds is an important option to enable linux kernal to
>>>>>>>>>> keep
>>>>>>>>>> the array out-of-bound errors out of the source tree.
>>>>>>>>>> 
>>>>>>>>>> However, due to the false positive warnings reported in
>>>>>>>>>> PR109071
>>>>>>>>>> (-Warray-bounds false positive warnings due to code duplication
>>>>>>>>>> from
>>>>>>>>>> jump threading), -Warray-bounds=1 cannot be added on by
>>>>>>>>>> default.
>>>>>>>>>> 
>>>>>>>>>> Although it's impossible to elinimate all the false positive
>>>>>>>>>> warnings
>>>>>>>>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>>>>>>>>>> documentation says "always out of bounds"), we should minimize
>>>>>>>>>> the
>>>>>>>>>> false positive warnings in -Warray-bounds=1.
>>>>>>>>>> 
>>>>>>>>>> The root reason for the false positive warnings reported in
>>>>>>>>>> PR109071 is:
>>>>>>>>>> 
>>>>>>>>>> When the thread jump optimization tries to reduce the # of
>>>>>>>>>> branches
>>>>>>>>>> inside the routine, sometimes it needs to duplicate the code
>>>>>>>>>> and
>>>>>>>>>> split into two conditional pathes. for example:
>>>>>>>>>> 
>>>>>>>>>> The original code:
>>>>>>>>>> 
>>>>>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>>>>>>>>> {
>>>>>>>>>> if (index >= 4)
>>>>>>>>>>  warn ();
>>>>>>>>>> *ptr = 0;
>>>>>>>>>> *val = sg->vals[index];
>>>>>>>>>> if (index >= 4)
>>>>>>>>>>  warn ();
>>>>>>>>>> *ptr = *val;
>>>>>>>>>> 
>>>>>>>>>> return;
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> With the thread jump, the above becomes:
>>>>>>>>>> 
>>>>>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>>>>>>>>> {
>>>>>>>>>> if (index >= 4)
>>>>>>>>>>  {
>>>>>>>>>>    warn ();
>>>>>>>>>>    *ptr = 0;         // Code duplications since "warn" does
>>>>>>>>>> return;
>>>>>>>>>>    *val = sg->vals[index];   // same this line.
>>>>>>>>>>                              // In this path, since it's
>>>>>>>>>> under
>>>>>>>>>> the condition
>>>>>>>>>>                              // "index >= 4", the compiler
>>>>>>>>>> knows
>>>>>>>>>> the value
>>>>>>>>>>                              // of "index" is larger then 4,
>>>>>>>>>> therefore the
>>>>>>>>>>                              // out-of-bound warning.
>>>>>>>>>>    warn ();
>>>>>>>>>>  }
>>>>>>>>>> else
>>>>>>>>>>  {
>>>>>>>>>>    *ptr = 0;
>>>>>>>>>>    *val = sg->vals[index];
>>>>>>>>>>  }
>>>>>>>>>> *ptr = *val;
>>>>>>>>>> return;
>>>>>>>>>> }
>>>>>>>>>> 
>>>>>>>>>> We can see, after the thread jump optimization, the # of
>>>>>>>>>> branches
>>>>>>>>>> inside
>>>>>>>>>> the routine "sparx5_set" is reduced from 2 to 1, however,  due
>>>>>>>>>> to
>>>>>>>>>> the
>>>>>>>>>> code duplication (which is needed for the correctness of the
>>>>>>>>>> code),
>>>>>>>>>> we
>>>>>>>>>> got a false positive out-of-bound warning.
>>>>>>>>>> 
>>>>>>>>>> In order to eliminate such false positive out-of-bound warning,
>>>>>>>>>> 
>>>>>>>>>> A. Add one more flag for GIMPLE: is_splitted.
>>>>>>>>>> B. During the thread jump optimization, when the basic blocks
>>>>>>>>>> are
>>>>>>>>>> duplicated, mark all the STMTs inside the original and
>>>>>>>>>> duplicated
>>>>>>>>>> basic blocks as "is_splitted";
>>>>>>>>>> C. Inside the array bound checker, add the following new
>>>>>>>>>> heuristic:
>>>>>>>>>> 
>>>>>>>>>> If
>>>>>>>>>> 1. the stmt is duplicated and splitted into two conditional
>>>>>>>>>> paths;
>>>>>>>>>> +  2. the warning level < 2;
>>>>>>>>>> +  3. the current block is not dominating the exit block
>>>>>>>>>> Then not report the warning.
>>>>>>>>>> 
>>>>>>>>>> The false positive warnings are moved from -Warray-bounds=1 to
>>>>>>>>>> -Warray-bounds=2 now.
>>>>>>>>>> 
>>>>>>>>>> Bootstrapped and regression tested on both x86 and aarch64.
>>>>>>>>>> adjusted
>>>>>>>>>> -Warray-bounds-61.c due to the false positive warnings.
>>>>>>>>>> 
>>>>>>>>>> Let me know if you have any comments and suggestions.
>>>>>>>>> 
>>>>>>>>> At the last Cauldron I talked with David Malcolm about these kind
>>>>>>>>> of
>>>>>>>>> issues and thought of instead of suppressing diagnostics to
>>>>>>>>> record
>>>>>>>>> how a block was duplicated.  For jump threading my idea was to
>>>>>>>>> record
>>>>>>>>> the condition that was proved true when entering the path and do
>>>>>>>>> this
>>>>>>>>> by recording the corresponding locations
>>>>>>> 
>>>>>>> Is only recording the location for the TRUE path  enough?
>>>>>>> We might need to record the corresponding locations for both TRUE and
>>>>>>> FALSE paths since the VRP might be more accurate on both paths.
>>>>>>> Is only recording the location is enough?
>>>>>>> Do we need to record the pointer to the original condition stmt?
>>>>>> 
>>>>>> Just to be clear: I don't plan to work on this myself (I have far too
>>>>>> much already to work on...); I'm assuming Richard Biener is
>>>>>> hoping/planning to implement this himself.
>>>>> 
>>>>> While I think some of this might be an improvement to those vast
>>>>> number of "false positive" diagnostics we have from (too) late diagnostic
>>>>> passes I do not have the cycles to work on this.
>>>> 
>>>> I can study a little bit more on how to improve the diagnostics for PR 
>>>> 109071 first.
>>>> 
>>>> FYI, I just updated PR109071’s description as: Need more context for 
>>>> -Warray-bounds warnings due to code duplication from jump threading.
>>>> 
>>>> As we already agreed, this is NOT a false positive. It caught a real bug 
>>>> in linux kernel that need to be patched in the kernel source. (See 
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109071#c11)
>>>> 
>>>> In order to add more context to the diagnostics to help the end user 
>>>> locate the bug accurately in their source code, compiler needs:
>>>> 
>>>> 1. During jump threading phase, record the following information for each 
>>>> duplicated STMT:
>>>>       A. A pointer to the COND stmt;
>>>>       B. True/false for such COND
>>>> 2. During array out-of-bound checking phase, whenever locate an 
>>>> out-of-bound case,
>>>>       A. Use a rich_location for the diagnostic;
>>>>       B. Create an instance of diagnositic_path, add events to this 
>>>> diagnostic_path based on the COND stmt, True/False info recorded in the 
>>>> STMT;
>>>>       C. Call richloc.set_path() to associate the path with the 
>>>> rich_location;
>>>>       D. Then issue warning with this rich_location instead of the current 
>>>> regular location.
>>>> 
>>>> Any comment and suggestion to the above?
>>> 
>>> I was originally thinking of using location_adhoc_data to store 1.A
>>> and 1.B as a common bit to associate to each
>>> copied stmt.  IIRC location_adhoc_data->data is the stmts associated
>>> lexical block so we'd need to stuff another
>>> 'data' field into this struct, like a "copy history" where we could
>>> then even record copied-of-copy-of-X.
>>> locataion_adhoc_data seems imperfectly packed right now, with a 32bit
>>> hole before 'data' and 32bit unused
>>> at its end, so we might get away without memory use by re-ordering
>>> discriminator before 'data' ...
>> 
>> Like this?
>> 
>> diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
>> index e6e2b0897572..ee344f91333b 100644
>> --- a/libcpp/include/line-map.h
>> +++ b/libcpp/include/line-map.h
>> @@ -761,8 +761,9 @@ struct GTY(()) maps_info_macro {
>> struct GTY(()) location_adhoc_data {
>>   location_t locus;
>>   source_range src_range;
>> -  void * GTY((skip)) data;
>>   unsigned discriminator;
>> +  void * GTY((skip)) data;
>> +  void * GTY((skip)) copy_history;
>> };
>>   struct htab;
> 
> Yes.
> 
>> How about the copy_history? Do we need a new data structure (like the 
>> following, or other suggestion) for this? Where should I add this new data 
>> structure?
> 
> As it needs to be managed by libcpp it should be in this very same file.
Okay.
> 
>> struct copy_history {
>>  location_t condition;
>>  Bool is_true_path;
>> }
> 
> I think we want a pointer to the previous copy-of state as well in case a stmt
> was copied twice.  We'll see whether a single (condition) location
> plus edge flag
> is sufficient.  I'd say we should plan for an enum to indicate the duplication
> reason at least (jump threading, unswitching, unrolling come to my mind).  For
> jump threading being able to say "when <condition> is true/false" is probably
> good enough, though it might not always be easy to identify a single condition
> here given a threading path starts at an incoming edge to a CFG merge and
> will usually end with the outgoing edge of a condition that we can statically
> evaluate.  The condition controlling the path entry might not be determined
> fully by a single condition location.

Shall a linked list be enough to construct this path?

struct copy_from {
 location_t condition;
 bool is_true_path;
 enum optimization;
 structure copy_from * prev_copy;
}

Since I am not very familiar with the more complicated jump-threading 
transformation.might need to study a little bit more there. 
> 
> Possibly building a full "diagnostic path" object at threading time might be
> the only way to record all the facts, but that's also going to be more
> expensive.

The above linked list might help here?

> I can think of having a -fdiagnostic-try-to-explain-harder option to
> clarify confusing
> diagnostics people could use on-demand?
Yes, this is a good idea.

Qing
> 
> Richard.
> 
>> Qing
>>> 
>>> Richard.
>>> 
>>>> Thanks.
>>>> 
>>>> Qing
>>>> 
>>>> 
>>>>> 
>>>>> Richard.
>>>>> 
>>>>>> But feel free to ask me any questions about the diagnostic_path
>>>>>> machinery within the diagnostics subsystem.
>>>>>> 
>>>>>> [...snip...]
>>>>>> 
>>>>>> Regards
>>>>>> Dave


Reply via email to