> I would like to fix
> QASV(char) to mean QASV(QChar(char)), not redefine char literals as
> UTF-8 and break many more users (QASV is relatively new; QChar(char) and
> QString::arg(char) are there since before Qt 4).
>
> What do you think?

+1 for this proposal.

I do not think that QASV(char) providing a broken UTF-8 sequence makes sense.
We also have

  QString s('\xe4'); // calls QString(QChar) c-tor, using an implicit
                     // QChar(char) c-tor

producing "ä", not an invalid UTF-8 sequence.
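To make the byte-level difference concrete, here is a minimal sketch in plain C++ (no Qt), assuming the Latin-1 interpretation described above. The helper name latin1CharToUtf8 is hypothetical and not a Qt API. It only shows that the Latin-1 char 0xE4, once understood as U+00E4, needs two bytes in UTF-8, so the lone byte 0xE4 cannot be a valid UTF-8 sequence by itself.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of Qt: interpret `c` as a Latin-1 character
// and return its UTF-8 encoding. Latin-1 byte values equal the Unicode code
// points U+0000..U+00FF, so at most two UTF-8 bytes are needed.
inline std::string latin1CharToUtf8(char c)
{
    const auto b = static_cast<unsigned char>(c);
    std::string out;
    if (b < 0x80) {
        // US-ASCII subset: identical in Latin-1 and UTF-8.
        out.push_back(static_cast<char>(b));
    } else {
        // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
        out.push_back(static_cast<char>(0xC0 | (b >> 6)));
        out.push_back(static_cast<char>(0x80 | (b & 0x3F)));
    }
    return out;
}

// Small usage check: 'ä' (Latin-1 0xE4) becomes the two bytes C3 A4.
inline void checkLatin1ToUtf8()
{
    assert(latin1CharToUtf8('\xe4') == "\xC3\xA4");
    assert(latin1CharToUtf8('a') == "a");
}
```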

------------------------------

Ivan Solovev
Senior Software Engineer

The Qt Company GmbH
Erich-Thilo-Str. 10
12489 Berlin, Germany
ivan.solo...@qt.io
www.qt.io

Geschäftsführer: Mika Pälsi,
Juha Varelius, Jouni Lintunen
Sitz der Gesellschaft: Berlin,
Registergericht: Amtsgericht
Charlottenburg, HRB 144331 B

________________________________________
From: Development <development-boun...@qt-project.org> on behalf of Marc Mutz 
via Development <development@qt-project.org>
Sent: Monday, June 10, 2024 2:39 PM
To: development@qt-project.org
Subject: [Development] Are char literals L1 or U8 in Qt?

Hi,

TL;DR:
- QASV(char) is UTF-8, but QChar(char) is L1
- propose to fix QASV, not QChar
   - iow: char literals remain L1, not become UTF-8
     - but char[] remains UTF-8
- propose to deprecate char and char[] literals for u8 and _L1 in Qt 7
   (= make QT_NO_CAST_FROM_ASCII the default)


While porting QString::arg() to QAnyStringView¹, I've noticed that
QAnyStringView(char) is producing a 1-byte UTF-8 sequence (which is
invalid unless the character is from the US-ASCII subset), while
QChar(char) is producing a valid 1-codepoint UTF-16
"sequence", interpreting the ctor argument as L1.
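The two interpretations can be sketched in plain C++ without Qt. Both function names below are hypothetical and only model the behaviour described in this mail; they are not the actual QChar or QAnyStringView implementation.

```cpp
#include <optional>

// Models the L1 reading (what this mail says QChar(char) does): every byte
// value 0x00..0xFF maps to the code point with the same number.
constexpr char32_t decodeLatin1(char c)
{
    return static_cast<unsigned char>(c);
}

// Models the UTF-8 reading of a single char: a lone byte is a complete,
// valid sequence only if it is in the US-ASCII range 0x00..0x7F.
constexpr std::optional<char32_t> decodeSingleByteUtf8(char c)
{
    const auto b = static_cast<unsigned char>(c);
    if (b < 0x80)
        return char32_t{b};
    return std::nullopt; // lead or continuation byte without its partners
}

// For US-ASCII both readings agree; for '\xe4' only the L1 reading is valid.
static_assert(decodeLatin1('a') == U'a');
static_assert(*decodeSingleByteUtf8('a') == U'a');
static_assert(decodeLatin1('\xe4') == char32_t{0xE4});
static_assert(!decodeSingleByteUtf8('\xe4'));
```

The inconsistency only shows for chars outside US-ASCII, which is presumably why it went unnoticed until the arg() test cases exercised such values.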

Since QASV is supposed to make _one_ function replace all relevant
overload _sets_, incl. QChar ones², this inconsistency is creating
problems (first found by arg() test cases failing after porting to QASV).

As the original author, I can confirm that the intent was to match
whatever QChar does, so I consider the current QASV(char) behaviour to
be buggy.

OTOH, an argument can be made that, since char[] is considered UTF-8 in
Qt, so should `char`, and I think no-one is considering anything else
when it comes to the result of QUtf8StringView::first(1). But this is
about char literals.

C++ solves this by banning non-US-ASCII u8'' literals.
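As a small illustration of that rule, assuming a compiler in C++17 mode or later (u8 character literals do not exist before C++17): a u8'' literal is accepted only if the character fits into a single UTF-8 code unit. The helper fitsInOneUtf8CodeUnit is hypothetical and just restates the condition.

```cpp
// US-ASCII character: one UTF-8 code unit, so the literal is well-formed.
// Its type is char in C++17 and char8_t from C++20 on.
constexpr auto asciiLiteral = u8'a';
static_assert(asciiLiteral == 0x61);

// U+00E4 needs two UTF-8 code units, so the following line is expected to
// be rejected by the compiler if uncommented:
// constexpr auto nonAsciiLiteral = u8'\u00e4';

// Hypothetical helper restating the condition: only code points below
// U+0080 are encoded as exactly one UTF-8 code unit.
constexpr bool fitsInOneUtf8CodeUnit(char32_t cp)
{
    return cp < 0x80;
}

static_assert(fitsInOneUtf8CodeUnit(U'a'));
static_assert(!fitsInOneUtf8CodeUnit(char32_t{0xE4}));
```

No such compile-time check exists for plain char literals like '\xe4'.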

For Qt, my plan was to wait until we can depend on C++20's char8_t and
then eventually make QT_NO_CAST_FROM_ASCII the default (and keep u8""
and _L1 working implicitly). We will then have the same problem for
char8_t, but the standard has kinda decided for us: char8_t is always
UTF-8, incl. single chars (but the language bans incompatible literals,
something we can't do for char, which is one of the reasons I think
QT_NO_CAST_FROM_ASCII (or, rather, _FROM_CHAR) should be the default
over the medium term).

Since there are four bugs³ in QString::arg() that are all fixed by the
existing patch chain porting the whole thing to QAnyStringView, and
since the medium-term goal is to deprecate use of char for characters
and char[] for strings (QT_ASCII_WARN), anyway, I would like to fix
QASV(char) to mean QASV(QChar(char)), not redefine char literals as
UTF-8 and break many more users (QASV is relatively new; QChar(char) and
QString::arg(char) are there since before Qt 4).

What do you think?

Thanks,
Marc

¹ chain ending in https://codereview.qt-project.org/c/qt/qtbase/+/562895

² See https://www.qt.io/blog/qstringview-diaries-qanystringview:
> First, it would need to accept anything that the overload sets above would
> accept, too, to wit:
>
[...]
>     QChar, or anything that implicitly converts to it (within reason;
>     QChar's ctors are a mess)
[...]

³ to wit:
- https://bugreports.qt.io/browse/QTBUG-126053 (char8_t)
- https://bugreports.qt.io/browse/QTBUG-126054 (wchar_t)
- https://bugreports.qt.io/browse/QTBUG-126055 (qfloat16)
- https://bugreports.qt.io/browse/QTBUG-125588 (char16_t)
- and the issue at hand:
   https://bugreports.qt.io/browse/QTBUG-125730 (char)

--
Marc Mutz <marc.m...@qt.io> (he/his)
Principal Software Engineer

The Qt Company
Erich-Thilo-Str. 10 12489
Berlin, Germany
www.qt.io

Geschäftsführer: Mika Pälsi, Juha Varelius, Jouni Lintunen
Sitz der Gesellschaft: Berlin,
Registergericht: Amtsgericht Charlottenburg,
HRB 144331 B
--
Development mailing list
Development@qt-project.org
https://lists.qt-project.org/listinfo/development