> On May 23, 2024, at 07:46, Richard Biener <richard.guent...@gmail.com> wrote:
> 
> On Wed, May 22, 2024 at 8:53 PM Qing Zhao <qing.z...@oracle.com> wrote:
>> 
>> 
>> 
>>> On May 22, 2024, at 03:38, Richard Biener <richard.guent...@gmail.com> 
>>> wrote:
>>> 
>>> On Tue, May 21, 2024 at 11:36 PM David Malcolm <dmalc...@redhat.com> wrote:
>>>> 
>>>> On Tue, 2024-05-21 at 15:13 +0000, Qing Zhao wrote:
>>>>> Thanks for the comments and suggestions.
>>>>> 
>>>>>> On May 15, 2024, at 10:00, David Malcolm <dmalc...@redhat.com>
>>>>>> wrote:
>>>>>> 
>>>>>> On Tue, 2024-05-14 at 15:08 +0200, Richard Biener wrote:
>>>>>>> On Mon, 13 May 2024, Qing Zhao wrote:
>>>>>>> 
>>>>>>>> -Warray-bounds is an important option to enable linux kernal to
>>>>>>>> keep
>>>>>>>> the array out-of-bound errors out of the source tree.
>>>>>>>> 
>>>>>>>> However, due to the false positive warnings reported in
>>>>>>>> PR109071
>>>>>>>> (-Warray-bounds false positive warnings due to code duplication
>>>>>>>> from
>>>>>>>> jump threading), -Warray-bounds=1 cannot be added on by
>>>>>>>> default.
>>>>>>>> 
>>>>>>>> Although it's impossible to elinimate all the false positive
>>>>>>>> warnings
>>>>>>>> from -Warray-bounds=1 (See PR104355 Misleading -Warray-bounds
>>>>>>>> documentation says "always out of bounds"), we should minimize
>>>>>>>> the
>>>>>>>> false positive warnings in -Warray-bounds=1.
>>>>>>>> 
>>>>>>>> The root reason for the false positive warnings reported in
>>>>>>>> PR109071 is:
>>>>>>>> 
>>>>>>>> When the thread jump optimization tries to reduce the # of
>>>>>>>> branches
>>>>>>>> inside the routine, sometimes it needs to duplicate the code
>>>>>>>> and
>>>>>>>> split into two conditional pathes. for example:
>>>>>>>> 
>>>>>>>> The original code:
>>>>>>>> 
>>>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>>>>>>> {
>>>>>>>> if (index >= 4)
>>>>>>>>   warn ();
>>>>>>>> *ptr = 0;
>>>>>>>> *val = sg->vals[index];
>>>>>>>> if (index >= 4)
>>>>>>>>   warn ();
>>>>>>>> *ptr = *val;
>>>>>>>> 
>>>>>>>> return;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> With the thread jump, the above becomes:
>>>>>>>> 
>>>>>>>> void sparx5_set (int * ptr, struct nums * sg, int index)
>>>>>>>> {
>>>>>>>> if (index >= 4)
>>>>>>>>   {
>>>>>>>>     warn ();
>>>>>>>>     *ptr = 0;         // Code duplications since "warn" does
>>>>>>>> return;
>>>>>>>>     *val = sg->vals[index];   // same this line.
>>>>>>>>                               // In this path, since it's
>>>>>>>> under
>>>>>>>> the condition
>>>>>>>>                               // "index >= 4", the compiler
>>>>>>>> knows
>>>>>>>> the value
>>>>>>>>                               // of "index" is larger then 4,
>>>>>>>> therefore the
>>>>>>>>                               // out-of-bound warning.
>>>>>>>>     warn ();
>>>>>>>>   }
>>>>>>>> else
>>>>>>>>   {
>>>>>>>>     *ptr = 0;
>>>>>>>>     *val = sg->vals[index];
>>>>>>>>   }
>>>>>>>> *ptr = *val;
>>>>>>>> return;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> We can see, after the thread jump optimization, the # of
>>>>>>>> branches
>>>>>>>> inside
>>>>>>>> the routine "sparx5_set" is reduced from 2 to 1, however,  due
>>>>>>>> to
>>>>>>>> the
>>>>>>>> code duplication (which is needed for the correctness of the
>>>>>>>> code),
>>>>>>>> we
>>>>>>>> got a false positive out-of-bound warning.
>>>>>>>> 
>>>>>>>> In order to eliminate such false positive out-of-bound warning,
>>>>>>>> 
>>>>>>>> A. Add one more flag for GIMPLE: is_splitted.
>>>>>>>> B. During the thread jump optimization, when the basic blocks
>>>>>>>> are
>>>>>>>>  duplicated, mark all the STMTs inside the original and
>>>>>>>> duplicated
>>>>>>>>  basic blocks as "is_splitted";
>>>>>>>> C. Inside the array bound checker, add the following new
>>>>>>>> heuristic:
>>>>>>>> 
>>>>>>>> If
>>>>>>>>  1. the stmt is duplicated and splitted into two conditional
>>>>>>>> paths;
>>>>>>>> +  2. the warning level < 2;
>>>>>>>> +  3. the current block is not dominating the exit block
>>>>>>>> Then not report the warning.
>>>>>>>> 
>>>>>>>> The false positive warnings are moved from -Warray-bounds=1 to
>>>>>>>> -Warray-bounds=2 now.
>>>>>>>> 
>>>>>>>> Bootstrapped and regression tested on both x86 and aarch64.
>>>>>>>> adjusted
>>>>>>>> -Warray-bounds-61.c due to the false positive warnings.
>>>>>>>> 
>>>>>>>> Let me know if you have any comments and suggestions.
>>>>>>> 
>>>>>>> At the last Cauldron I talked with David Malcolm about these kind
>>>>>>> of
>>>>>>> issues and thought of instead of suppressing diagnostics to
>>>>>>> record
>>>>>>> how a block was duplicated.  For jump threading my idea was to
>>>>>>> record
>>>>>>> the condition that was proved true when entering the path and do
>>>>>>> this
>>>>>>> by recording the corresponding locations
>>>>> 
>>>>> Is only recording the location for the TRUE path  enough?
>>>>> We might need to record the corresponding locations for both TRUE and
>>>>> FALSE paths since the VRP might be more accurate on both paths.
>>>>> Is only recording the location is enough?
>>>>> Do we need to record the pointer to the original condition stmt?
>>>> 
>>>> Just to be clear: I don't plan to work on this myself (I have far too
>>>> much already to work on...); I'm assuming Richard Biener is
>>>> hoping/planning to implement this himself.
>>> 
>>> While I think some of this might be an improvement to those vast
>>> number of "false positive" diagnostics we have from (too) late diagnostic
>>> passes I do not have the cycles to work on this.
>> 
>> I can study a little bit more on how to improve the diagnostics for PR 
>> 109071 first.
>> 
>> FYI, I just updated PR109071’s description as: Need more context for 
>> -Warray-bounds warnings due to code duplication from jump threading.
>> 
>> As we already agreed, this is NOT a false positive. It caught a real bug in 
>> linux kernel that need to be patched in the kernel source. (See 
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109071#c11)
>> 
>> In order to add more context to the diagnostics to help the end user locate 
>> the bug accurately in their source code, compiler needs:
>> 
>> 1. During jump threading phase, record the following information for each 
>> duplicated STMT:
>>        A. A pointer to the COND stmt;
>>        B. True/false for such COND
>> 2. During array out-of-bound checking phase, whenever locate an out-of-bound 
>> case,
>>        A. Use a rich_location for the diagnostic;
>>        B. Create an instance of diagnositic_path, add events to this 
>> diagnostic_path based on the COND stmt, True/False info recorded in the STMT;
>>        C. Call richloc.set_path() to associate the path with the 
>> rich_location;
>>        D. Then issue warning with this rich_location instead of the current 
>> regular location.
>> 
>> Any comment and suggestion to the above?
> 
> I was originally thinking of using location_adhoc_data to store 1.A
> and 1.B as a common bit to associate to each
> copied stmt.  IIRC location_adhoc_data->data is the stmts associated
> lexical block so we'd need to stuff another
> 'data' field into this struct, like a "copy history" where we could
> then even record copied-of-copy-of-X.
> locataion_adhoc_data seems imperfectly packed right now, with a 32bit
> hole before 'data' and 32bit unused
> at its end, so we might get away without memory use by re-ordering
> discriminator before 'data' ...

Like this?

diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h
index e6e2b0897572..ee344f91333b 100644
--- a/libcpp/include/line-map.h
+++ b/libcpp/include/line-map.h
@@ -761,8 +761,9 @@ struct GTY(()) maps_info_macro {
 struct GTY(()) location_adhoc_data {
   location_t locus;
   source_range src_range;
-  void * GTY((skip)) data;
   unsigned discriminator;
+  void * GTY((skip)) data;
+  void * GTY((skip)) copy_history;
 };
   struct htab;

How about the copy_history? Do we need a new data structure (like the 
following, or other suggestion) for this? Where should I add this new data 
structure?

struct copy_history {
  location_t condition;
  Bool is_true_path;
}

Qing
> 
> Richard.
> 
>> Thanks.
>> 
>> Qing
>> 
>> 
>>> 
>>> Richard.
>>> 
>>>> But feel free to ask me any questions about the diagnostic_path
>>>> machinery within the diagnostics subsystem.
>>>> 
>>>> [...snip...]
>>>> 
>>>> Regards
>>>> Dave


Reply via email to