On Mar 20, 2023, at 3:06 PM, David Malcolm via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> c-c++-common/diagnostic-format-sarif-file-4.c is a test case for
> quoting non-ASCII source code in a SARIF diagnostic log.
>
> The SARIF standard mandates that .sarif files are UTF-8 encoded.
>
> PR testsuite/105959 notes that the test case fails when the system
> encoding is not UTF-8, such as when the "make" invocation is prefixed
> with LC_ALL=C, whereas it works in a UTF-8 locale.
>
> The root cause is that dg-scan opens the file for reading using the
> "system" encoding; I believe it is falling back to treating all files as
> effectively ISO 8859-1 in a non-UTF-8 locale.
>
> This patch fixes things by adding a mechanism to dg-scan to allow
> callers to (optionally) specify an encoding to use when reading the
> file, and updating scan-sarif-file (and the -not variant) to always
> use UTF-8 when calling dg-scan, fixing the test case with LC_ALL=C.
>
> OK for trunk?

Ok.
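
For anyone following along, here is a rough illustration of the failure mode described above. It is not the patch itself (dg-scan is Tcl, and this is Python). The helper name read_text and the sample string are made up for the example; the only point is that UTF-8 bytes decoded as ISO 8859-1 no longer match a non-ASCII pattern, while an explicit UTF-8 read does.

```python
import os
import tempfile


def read_text(path, encoding=None):
    """Read a whole file as text.

    Hypothetical helper: encoding=None uses the locale-dependent default,
    which is roughly the situation the patch description attributes to dg-scan.
    """
    with open(path, encoding=encoding) as f:
        return f.read()


# Hypothetical stand-in for a .sarif log that quotes non-ASCII source code.
sample = 'const char *s = "文字化け";'
with tempfile.NamedTemporaryFile("wb", suffix=".sarif", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))  # SARIF files are UTF-8 encoded
    path = tmp.name

# Decoding the UTF-8 bytes as ISO 8859-1 succeeds but yields mojibake,
# so a scan for the original text cannot match.
as_latin1 = read_text(path, encoding="iso-8859-1")

# Forcing UTF-8 recovers the text regardless of the locale in effect.
as_utf8 = read_text(path, encoding="utf-8")
os.unlink(path)

print("文字化け" in as_latin1)
print("文字化け" in as_utf8)
```

Passing the encoding explicitly at the call site, rather than changing the default for every scan, matches the optional, opt-in approach the patch description takes for scan-sarif-file.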