On 2020-09-14 14:20, Michael Van Canneyt wrote:
On Mon, 14 Sep 2020, Tomas Hajny via fpc-pascal wrote:
On 2020-09-14 13:39, Michael Van Canneyt wrote:
On Mon, 14 Sep 2020, Tomas Hajny via fpc-pascal wrote:
On 2020-09-12 23:03, Tomas Hajny wrote:
On 2020-09-12 18:51, Jonas Maebe via fpc-pascal wrote:
On 12/09/2020 18:44, Sven Barth via fpc-pascal wrote:
Jonas Maebe via fpc-pascal <fpc-pascal@lists.freepascal.org> wrote on
Sat, 12 Sep 2020, 17:47:
.
.
1) Wouldn't it be better if shortstrings are treated the same way as
ansistrings with CP_ACP? This would make a difference only during
assignments to strings with different codepages. Since strings with
different codepages didn't exist in the past (and in the current
situation they are simply broken), this change shouldn't break
compatibility hopefully.
No idea what to advise here. I would think shortstring is ASCII or OEM
codepage, not even ANSI :/
As far as I'm concerned, there's no difference between OEM and ANSI (or
ISO 8859-x for that matter) _unless_ somebody targets Win32/Win64 and
never anything else. From this point of view, there's no reason why
shortstrings should always be OEM. Historically, they were simply used
with the default character set of the particular operating environment.
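To make question 1 concrete, here is a minimal sketch of assignments between strings with different codepages. It is not taken from the RTL: the program and the type name TCP850String are made up for illustration, and it assumes FPC 3.0 or later. The printed values depend on the platform and compiler settings, so none are shown.

```pascal
program CodePageAssign;
{$mode objfpc}{$H+}
{$ifdef unix}
uses
  cwstring; // widestring manager, needed for codepage conversions on Unix
{$endif}

type
  // Hypothetical name: an ansistring type declared with codepage 850 (OEM).
  TCP850String = type AnsiString(850);

var
  Oem: TCP850String;
  Utf8: UTF8String;
  Short: ShortString;
begin
  Oem := 'abc';
  // Both sides carry a declared codepage, so the RTL can convert on assignment.
  Utf8 := Oem;
  // ShortString has no codepage field, so the codepage of the source
  // is not recorded anywhere after this assignment.
  Short := Oem;

  WriteLn('Oem codepage:  ', StringCodePage(Oem));
  WriteLn('Utf8 codepage: ', StringCodePage(Utf8));
  WriteLn('Short length:  ', Length(Short));
end.
```

Treating shortstrings like CP_ACP ansistrings, as proposed in question 1, would only change what happens in assignments like the last one.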
2) Shouldn't WriteLn with an untyped string constant parameter result
in calling some Unicode based version of WriteLn rather than the
shortstring overloaded version (since the constant is stored in UTF-16
internally)?
What is the codepage of a constant string ? Should this not be used ?
That's what I wrote - internally, the (untyped) constant strings are
stored in UTF-16.
3) Shouldn't we try to make the output of Write with and without
unit Crt compatible to each other? If we do so, what should be the
encoding used for output redirected to a file - should it use
DefaultSystemCodePage, or scpConsoleCP, or what (remember that this
question doesn't exist with unit Crt, because unit Crt isn't
compatible with redirection).
I think this last one is in fact 3 questions:
- What to do if output is redirected externally ? (IMHO nothing)
There's no "nothing". Every text file record has an attribute stating
the codepage used for that text file. The question is which codepage
should be assigned there under which cases.
Why do you want to assign one ? It's the responsibility of the user to
set the correct codepage.
As far as I can see it is initialized to the default one when the file
is opened:
{$ifdef FPC_HAS_FEATURE_ANSISTRINGS}
{ if no codepage is yet assigned then assign default ansi codepage }
TextRec(t).CodePage:=TranslatePlaceholderCP(TextRec(t).CodePage);
{$else FPC_HAS_FEATURE_ANSISTRINGS}
I see no need to change this ?
Please, have a look at OpenStdIO (implemented at the end of text.inc).
That one changes the default codepage for Input, Output and StdErr, but
it doesn't care about possible redirection and/or piping to some other
application (not necessarily written in FPC). That approach has both
advantages and disadvantages. The advantage is that "type
<redirected_file>" will give correct results on the console. The
disadvantage is that opening the redirected file with any Win32 GUI
application (let's say notepad.exe) will result in garbage. I don't say
that it is necessarily bad, but it should be documented at least if we
want to keep it that way.
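For reference, the codepage recorded in a text file record can be inspected and overridden from user code. The following is a minimal sketch rather than a proposal for the RTL. It assumes FPC 3.0 or later, where GetTextCodePage and SetTextCodePage are available in the System unit. The program name is made up, and the printed values depend on the platform and console setup, so none are shown.

```pascal
program ShowTextCodePage;
{$mode objfpc}{$H+}
{$ifdef unix}
uses
  cwstring; // widestring manager, needed for codepage conversions on Unix
{$endif}

begin
  // Codepage the RTL uses as the default for CP_ACP strings in this process.
  WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);

  // Codepage currently stored in the text record of standard output.
  WriteLn('Output codepage:       ', GetTextCodePage(Output));

  // Explicit override by the user; CP_UTF8 is just an example choice.
  SetTextCodePage(Output, CP_UTF8);
  WriteLn('Output after override: ', GetTextCodePage(Output));
end.
```

Running it once on the console and once with the output redirected to a file is one way to check what the standard handles are initialized to in each case.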
- What to do if output is redirected internally ? (IMHO, the codepage
should be kept)
Kept from what?
Things like AssignCRT() should not set the codepage on the passed text
record.
AssignCrt doesn't do that. The translation is performed when calling the
console output functions (on MS Windows).
- Whether and how to extend Crt so it works with Unicode.
(Since Crt is legacy, I would not touch it; you'd need to rewrite it
as Unicode.)
That is a different question and I don't want to raise that one. My
question is simply whether WriteLn (shortstring) should behave
differently with and without Crt.
I think it will differ since Crt is not codepage aware. If you want it
to work the same you'll have to make Crt codepage (and hence unicode)
aware.
As I mentioned, Crt is currently more codepage aware than the System
unit as far as output to the console is concerned, because Crt
provides correct output even for shortstrings (unlike the System unit).
Tomas
_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal