Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-25 Thread lars.knoll
On 5/24/12 11:38 PM, ext Thiago Macieira thiago.macie...@intel.com
wrote:

On Thursday, 24 May 2012 13.15.16, 1+1=2 wrote:
 If we want bootstrap tools such as qmake to support utf8

Why would we want to?

One reason could be paths/filenames that are encoded in the .pro file. If
we interpret it as latin1, we can't support international filenames.

And we are now assuming utf8 for all our C++ and QML source code. One
could argue that assuming the same encoding for .pro files would be only
consistent.

Cheers,
Lars

___
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development


Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-25 Thread Thiago Macieira
On Friday, 25 May 2012 06.27.05, lars.kn...@nokia.com wrote:
 On 5/24/12 11:38 PM, ext Thiago Macieira thiago.macie...@intel.com wrote:
 On Thursday, 24 May 2012 13.15.16, 1+1=2 wrote:
  If we want bootstrap tools such as qmake to support utf8
 
 Why would we want to?

 One reason could be paths/filenames that are encoded in the .pro file. If
 we interpret it as latin1, we can't support international filenames.

If we are reading the .pro file as Latin 1 and we encode the filenames as Latin
1 when calling the POSIX functions, it will work just fine. So I don't see a
problem on Linux.

As for Windows, writing an 8-bit path is a mess anyway. Is the .pro file
supposed to be encoded in UTF-8 or in the Windows legacy ANSI encoding?

 And we are now assuming utf8 for all our C++ and QML source code. One
 could argue that assuming the same encoding for .pro files would be only
 consistent.

I agree: we'd assume that the encoding of the .pro file is UTF-8.

However, at this point I'd simply recommend ignoring the problem. Let's not
try and modify qmake any more than we really must for getting 5.0 out.

--
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
 Intel Sweden AB - Registration Number: 556189-6027
 Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden




Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-25 Thread lars.knoll
On 5/25/12 9:37 AM, ext Thiago Macieira thiago.macie...@intel.com
wrote:

On Friday, 25 May 2012 06.27.05, lars.kn...@nokia.com wrote:
 On 5/24/12 11:38 PM, ext Thiago Macieira thiago.macie...@intel.com wrote:
 On Thursday, 24 May 2012 13.15.16, 1+1=2 wrote:
  If we want bootstrap tools such as qmake to support utf8
 
 Why would we want to?
 
 One reason could be paths/filenames that are encoded in the .pro file. If
 we interpret it as latin1, we can't support international filenames.

If we are reading the .pro file as Latin 1 and we encode the filenames as
Latin 1 when calling the POSIX functions, it will work just fine. So I don't
see a problem on Linux.

You're right, since toLocal8Bit() is equivalent to toLatin1() in bootstrap
mode.

As for Windows, writing an 8-bit path is a mess anyway. Is the .pro file
supposed to be encoded in UTF-8 or in the Windows legacy ANSI encoding?

Yeah, this is somewhat messy, as the Win32 APIs take wchar_t strings (i.e. UTF-16).

 And we are now assuming utf8 for all our C++ and QML source code. One
 could argue that assuming the same encoding for .pro files would be only
 consistent.

I agree: we'd assume that the encoding of the .pro file is UTF-8.

However, at this point I'd simply recommend ignoring the problem. Let's not
try and modify qmake any more than we really must for getting 5.0 out.

Ok, agree with that. It's certainly not worse in 5.0 than it has been for
years :)

Cheers,
Lars



Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-24 Thread 1+1=2
On Thu, May 24, 2012 at 2:51 AM,  lars.kn...@nokia.com wrote:
 Not true. We do compile qutfcodec into qmake and the bootstrap tools, so
 QString::fromUtf8() does work.  With the change of QString(const char *)
 to convert from utf8 I would assume that qmake at least partially uses
 utf8 by now.

 I think it makes sense to also require utf8 encoding of .pro files to be
 consistent. Also, that's the only way we can get non latin paths to work.


Yes, QUtf8::convertFromUnicode() and QUtf8::convertToUnicode() which
are used by QString::from/toUtf8() can be used in bootstrap tools
directly, but they can't be used through QTextStream.

At present, when QT_NO_TEXTCODEC is defined, QTextStream uses
QString::fromLatin1() / QString::toLocal8Bit()
to convert from/to bytes, which does not look right. So I think it
would be better to replace them with QUtf8::convertFromUnicode() and
QUtf8::convertToUnicode().

Any suggestions?

Debao


Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-24 Thread Thiago Macieira
On Thursday, 24 May 2012 10.05.50, 1+1=2 wrote:
 At present, when QT_NO_TEXTCODEC is defined, QTextStream uses
 QString::fromLatin1() / QString::toLocal8Bit()
 to convert from/to bytes, which does not look right. So I think it
 would be better to replace them with QUtf8::convertFromUnicode() and
 QUtf8::convertToUnicode().

Keep them Latin1. At least the tools will not screw up user data if they can't
figure out the locale.





Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-24 Thread 1+1=2
On Thu, May 24, 2012 at 10:26 AM, Thiago Macieira
thiago.macie...@intel.com wrote:
 On Thursday, 24 May 2012 10.05.50, 1+1=2 wrote:
 At present, when QT_NO_TEXTCODEC is defined, QTextStream uses
 QString::fromLatin1() / QString::toLocal8Bit()
 to convert from/to bytes, which does not look right. So I think it
 would be better to replace them with QUtf8::convertFromUnicode() and
 QUtf8::convertToUnicode().

 Keep them Latin1. At least the tools will not screw up user data if they can't
 figure out the locale.


The problem is that QTextStream behaves differently depending on
whether QT_NO_TEXTCODEC is defined or not.

When QT_NO_TEXTCODEC isn't defined, the default codec is
QTextCodec::codecForLocale(), but it can be changed using
QTextStream::setCodec().

When QT_NO_TEXTCODEC is defined, QString::fromLatin1()
/ QString::toLocal8Bit() are used, and they can't be changed by users. I
think this is broken for non-latin1-locale users. If they want to
read/write the same file, it will indeed screw up user data ;-)
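
The worry in this paragraph is about reading with one codec and writing with another. A minimal sketch of that failure mode in plain C++, without Qt: `decodeLatin1` and `encodeUtf8` are hypothetical helper names used only for illustration, not Qt functions. Bytes that were UTF-8 on disk are decoded as Latin-1 and then written back through a UTF-8 encoder.

```cpp
#include <cassert>
#include <string>

// Decode bytes as Latin-1: every byte becomes the code point with the same value.
std::u32string decodeLatin1(const std::string &bytes) {
    std::u32string out;
    for (unsigned char b : bytes)
        out.push_back(static_cast<char32_t>(b));
    return out;
}

// Encode code points as UTF-8 (1 to 4 bytes per code point).
std::string encodeUtf8(const std::u32string &text) {
    std::string out;
    for (char32_t cp : text) {
        assert(cp <= 0x10FFFF);  // outside the Unicode range is a caller error
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For the two UTF-8 bytes of "é" (0xC3 0xA9), `encodeUtf8(decodeLatin1(...))` yields the four bytes 0xC3 0x83 0xC2 0xA9, so the data written no longer matches the data read. Pure ASCII input is unaffected. This only illustrates the mismatched case; whether it occurs depends on which conversion the write side really uses.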

Even if we need to keep qmake project files latin1 only, we can
use QFile directly instead of QTextStream, as the features of
QTextStream aren't used by qmake. For example

Line 1825 of qmake/project.cpp:

    QFile qfile(file);
    if(qfile.open(QIODevice::ReadOnly)) {
        QTextStream stream(&qfile);
        while(!stream.atEnd()) {
            ret += split_value_list(stream.readLine().trimmed());
            if(!singleLine)
                ret += "\n";
        }
        qfile.close();
    }


Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-24 Thread Thiago Macieira
On Thursday, 24 May 2012 12.05.47, 1+1=2 wrote:
 When QT_NO_TEXTCODEC isn't defined, the default codec is
 QTextCodec::codecForLocale(), but it can be changed using
 QTextStream::setCodec().

I don't see the need to change then. It should remain the locale.

 When QT_NO_TEXTCODEC is defined, QString::fromLatin1()
 / QString::toLocal8Bit() are used, and they can't be changed by users. I
 think this is broken for non-latin1-locale users. If they want to
 read/write the same file, it will indeed screw up user data ;-)

You misunderstand Latin 1 then. If you read the contents with Latin1 and write
using Latin1, you get exactly what you had before. No data is lost.

That's a good behaviour, since there is no QTextCodec to change the codec.

For user applications (that is, other than qmake, moc, uic, etc.) that turned
off QTextCodec, there's nothing that we can do. If you turn off the concept of
codecs, then you can't change it.
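
The round-trip property claimed above can be checked mechanically. A small sketch in plain C++, without Qt: `decodeLatin1`, `encodeLatin1` and `roundTripsAllBytes` are hypothetical names for illustration only. Latin-1 maps each byte value 0..255 to the code point with the same number, so decoding and re-encoding gives back the original bytes regardless of the encoding they were really written in.

```cpp
#include <cassert>
#include <string>

// Decode bytes as Latin-1: every byte becomes the code point with the same value.
std::u32string decodeLatin1(const std::string &bytes) {
    std::u32string out;
    for (unsigned char b : bytes)
        out.push_back(static_cast<char32_t>(b));
    return out;
}

// Encode code points as Latin-1: every code point becomes the byte with the same value.
std::string encodeLatin1(const std::u32string &text) {
    std::string out;
    for (char32_t cp : text) {
        assert(cp <= 0xFF);  // Latin-1 can only represent U+0000..U+00FF
        out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
    }
    return out;
}

// Check the round trip for a buffer containing every possible byte value once.
bool roundTripsAllBytes() {
    std::string all;
    for (int b = 0; b < 256; ++b)
        all.push_back(static_cast<char>(static_cast<unsigned char>(b)));
    return encodeLatin1(decodeLatin1(all)) == all;
}
```

`roundTripsAllBytes()` returns true, and the UTF-8 bytes of a name such as "café.pro" pass through `encodeLatin1(decodeLatin1(...))` unchanged. The text held in memory in between is not the intended characters, but nothing is lost as long as the tool only passes the data through.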





Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-24 Thread 1+1=2
On Thu, May 24, 2012 at 12:47 PM, Thiago Macieira
thiago.macie...@intel.com wrote:
 You misunderstand Latin 1 then. If you read the contents with Latin1 and write
 using Latin1, you get exactly what you had before. No data is lost.


I know Latin1; I just forgot that QString::toLocal8Bit() always equals
QString::toLatin1() when QT_NO_TEXTCODEC is defined. ;-)

Debao.


Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-24 Thread 1+1=2
On Thu, May 24, 2012 at 12:47 PM, Thiago Macieira
thiago.macie...@intel.com wrote:
 For user applications (that is, other than qmake, moc, uic, etc.) that turned
 off QTextCodec, there's nothing that we can do. If you turn off the concept of
 codecs, then you can't change it.

Then:

If we want bootstrap tools such as qmake to support utf8, we must use
QUtf8::convert{From/To}Unicode() directly,
as QString::{from/to}Utf8() cannot carry ConverterState information
and QTextStream is nearly useless in bootstrap mode.
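
Why converter state matters can be shown with a sketch in plain C++. This is not Qt's QUtf8 code: `Utf8State` and `decodeChunk` are hypothetical names, and the struct only plays the role that the thread attributes to ConverterState. When a file is read in chunks, a multi-byte UTF-8 sequence can be split across two reads, so the decoder has to remember the partial sequence between calls.

```cpp
#include <cassert>
#include <string>

// State carried from one chunk to the next.
struct Utf8State {
    char32_t partial = 0;  // bits of the code point collected so far
    int remaining = 0;     // continuation bytes still expected
};

// Decode one chunk of UTF-8 and append the complete code points to `out`.
// Malformed input becomes U+FFFD; this sketch does not reject overlong
// forms or surrogate code points.
void decodeChunk(const std::string &chunk, Utf8State &state, std::u32string &out) {
    assert(state.remaining >= 0 && state.remaining <= 3);
    for (unsigned char b : chunk) {
        if (state.remaining > 0) {
            if ((b & 0xC0) == 0x80) {
                // Continuation byte: shift in six more bits.
                state.partial = (state.partial << 6) | (b & 0x3F);
                if (--state.remaining == 0)
                    out.push_back(state.partial);
                continue;
            }
            // The sequence was cut short: emit a replacement character
            // and fall through to treat `b` as the start of a new one.
            out.push_back(0xFFFD);
            state.remaining = 0;
        }
        if (b < 0x80) {
            out.push_back(b);
        } else if ((b & 0xE0) == 0xC0) {
            state.partial = b & 0x1F;
            state.remaining = 1;
        } else if ((b & 0xF0) == 0xE0) {
            state.partial = b & 0x0F;
            state.remaining = 2;
        } else if ((b & 0xF8) == 0xF0) {
            state.partial = b & 0x07;
            state.remaining = 3;
        } else {
            out.push_back(0xFFFD);  // stray continuation or invalid lead byte
        }
    }
}
```

If the three bytes of U+4E2D (0xE4 0xB8 0xAD) arrive as one chunk of two bytes and one chunk of one byte, passing the same `Utf8State` to both calls produces the single code point U+4E2D. Starting each call with a fresh state instead loses the first two bytes and turns the third into U+FFFD, which is the situation a stateless from/to conversion is in.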


[Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-23 Thread 1+1=2
Hi all,

It might be a bit late for 5.0.

But Qt 5 has started to enforce that source code files must be UTF-8.
So perhaps it makes sense to change the default codec of QTextStream to
UTF-8.
This won't break many things, as the old behavior can be obtained by
calling QTextStream::setCodec().
I have pushed a commit for it: https://codereview.qt-project.org/#change,26985

In addition, this will improve the Unicode handling of Qt
tools that use bootstrap. Otherwise, tools such as qmake cannot
deal with UTF-8 encoded files.

Any ideas?

Regards,

Debao


Re: [Development] Should we change the default codec of QTextStream to UTF-8?

2012-05-23 Thread Thiago Macieira
On Wednesday, 23 May 2012 13.30.05, 1+1=2 wrote:
 In addition, this will improve the Unicode handling of Qt
 tools that use bootstrap. Otherwise, tools such as qmake cannot
 deal with UTF-8 encoded files.

That is irrelevant. qmake operates on Latin1 exclusively because it doesn't
even include QTextCodec.


