On Mon, 14 Nov 2022 at 11:38, Alejandro Colomar via Gcc <[email protected]> wrote:
>
> Hi Andrew!
>
> On 11/13/22 23:12, Andrew Pinski wrote:
> > On Sun, Nov 13, 2022 at 1:57 PM Alejandro Colomar via Gcc
> > <[email protected]> wrote:
> >>
> >> Hi!
> >>
> >> I'd like to get warnings if I write the following code:
> >>
> >> char foo[3] = "foo";
> >
> > This should be easy to add as it is already part of the -Wc++-compat
> > option as for C++ it is invalid code.
> >
> > <source>:2:19: warning: initializer-string for array of 'char' is too long
> > 2 | char two[2] = "foo"; // 'f' 'o'
> > | ^~~~~
> > <source>:3:19: warning: initializer-string for array of 'char' is too
> > long for C++ [-Wc++-compat]
> > 3 | char three[3] = "foo"; // 'f' 'o' 'o'
> > | ^~~~~
> >
> >
> > ... (for your more complex case [though I needed to modify one of the
> > strings to be exactly 8 characters])
> >
> > <source>:5:7: warning: initializer-string for array of 'char' is too
> > long for C++ [-Wc++-compat]
> > 5 | "01234567",
> > | ^~~~~~~~~~
> >
> > else if (warn_cxx_compat
> >          && compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
> >   warning_at (init_loc, OPT_Wc___compat,
> >               ("initializer-string for array of %qT "
> >                "is too long for C++"), typ1);
> >
> > That is the current code which emits this warning, so it is just a
> > matter of adding an option to c-family/c.opt, having c++-compat
> > enable it, and using that new option here.
> >
> > Thanks,
> > Andrew Pinski
>
> Great! I'd like to implement it myself, as I've never written any GCC code
> yet, so it's interesting to me. If you recall any (hopefully recent) case
> where a similar thing happened (the warning was already implemented and only
> needed a name), it might help me check how it was done.
`git log gcc/c-family/c.opt` will show loads of changes adding warnings.
>
> BTW, I had another idea to add a suffix to string literals to make them
> unterminated:
>
> char foo[3] = "foo"u; // OK
> char bar[4] = "bar"; // OK
>
> char baz[4] = "baz"u; // Warning: initializer is too short.
> char etc[3] = "etc"; // Warning: unterminated string.
>
> Is that doable? Do you think it makes sense?
IMHO no. This is not useful enough to justify a language extension; it's
an incredibly niche use case. Your suggested syntax also looks very
confusing alongside UTF-16 string literals (u"..."), and is not sufficiently
distinct from a normal string literal to be obvious when quickly
reading the code. People expect string literals in C to be
null-terminated, so a subtle suffix that changes that would be a
bug farm.
You can do {'b', 'a', 'z'} if you want an explicitly unterminated array of char.
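
As a rough sketch of that brace-initializer approach (the helper name field_equals is made up for illustration and is not from GCC or any library), the array below holds exactly three bytes with no terminator, so code that reads it has to carry the width explicitly instead of relying on strlen():

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Explicitly unterminated: exactly three bytes, no trailing '\0'. */
const char baz[3] = {'b', 'a', 'z'};

/* Ordinary string literal initializer: four bytes, including the '\0'. */
const char bar[4] = "bar";

/*
 * Hypothetical helper: compare a fixed-width, unterminated field against
 * a normal C string without ever reading past the end of the field.
 */
int field_equals(const char *field, size_t width, const char *s)
{
    assert(field != NULL && s != NULL);
    /* Check the length first so memcmp never reads past the end of s. */
    return strlen(s) == width && memcmp(field, s, width) == 0;
}
```

A call such as field_equals(baz, sizeof baz, "baz") compares all three bytes and nothing more. Passing baz to strlen() or printf("%s") would read past the array, which is the kind of mistake the proposed warning is meant to catch.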