On 12.12.2015 16:47, Bart wrote:
On 12/11/15, Sven Barth <pascaldra...@googlemail.com> wrote:
Not necessarily. You can use SetCodePage() to change the code page of
the string without triggering a codepage conversion by using the third
parameter which is a Boolean that tells the function to either do a
conversion (True; default) or not (False). You'd then need to declare
the UTF8* routines as RawByteString and explicitly handle the type
conversion.
That's not really an option since it will break every single program
using those functions.
AFAIK the Utf8* functions assume their input is UTF8 encoded (they do
not check), so something like this should work?
{$ifndef NO_CP_RTL}
procedure Utf8Delete(var S: Utf8String; StartCharIndex, CharCount: PtrInt); overload;
var
  Temp: String;
begin
  // nothing to delete, and S[1] must not be touched on an empty string
  if S = '' then
    Exit;
  SetLength(Temp, Length(S));
  Move(S[1], Temp[1], Length(S));
  // next step might not be needed?
  SetCodePage(RawByteString(Temp), CP_UTF8, False);
  UTF8Delete(Temp, StartCharIndex, CharCount);
  SetLength(S, Length(Temp));
  // Temp may be empty if everything was deleted
  if Temp <> '' then
    Move(Temp[1], S[1], Length(Temp));
end;
{$endif}
Anyhow, as stated before, there should be no need to use the type
Utf8String in Lazarus programs.
Jonas has given me the following as a possible solution:
=== code begin ===
procedure UTF8Delete(var s: UTF8String; StartCharIndex, CharCount: PtrInt);
begin
  ...
end;

procedure UTF8Delete(var s: String; StartCharIndex, CharCount: PtrInt);
var
  orgcp: tsystemcodepage;
  tmp: utf8string;
begin
  orgcp:=StringCodePage(s);
  { change code page without converting the data }
  SetCodePage(RawByteString(s),CP_UTF8,false);
  tmp:=s;
  { keep refcount to 1 if it was 1, to avoid unnecessary copies }
  s:='';
  UTF8Delete(tmp,StartCharIndex,CharCount);
  { same as above }
  s:=tmp;
  tmp:='';
  { restore the original code page label, again without converting }
  SetCodePage(RawByteString(s),orgcp,false);
end;
=== code end ===
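
To illustrate what passing False as the third parameter of SetCodePage()
does, here is a minimal sketch. The program name, the variable Raw and the
1252 starting label are made up for illustration, and it assumes FPC 3.0 or
later. The point is that only the code page label of the string changes,
while the bytes stay exactly as they are.

```pascal
program SetCodePageSketch;
{$mode objfpc}{$H+}

var
  Raw: RawByteString;
begin
  // Two bytes that happen to be the UTF-8 encoding of U+00E9.
  SetLength(Raw, 2);
  Raw[1] := #$C3;
  Raw[2] := #$A9;

  // Hypothetical starting point: the string carries a non-UTF-8 label.
  SetCodePage(Raw, 1252, False);
  WriteLn('label before: ', StringCodePage(Raw), ', length: ', Length(Raw));

  // Relabel as UTF-8. Convert=False means the bytes are not transcoded,
  // so the length and content are expected to stay the same.
  SetCodePage(Raw, CP_UTF8, False);
  WriteLn('label after:  ', StringCodePage(Raw), ', length: ', Length(Raw));
end.
```

With the default (True) the RTL would instead transcode the data from the
old code page to the new one, which is exactly what the wrapper above tries
to avoid.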
Regards,
Sven