Re: Wide characters support in D

2010-06-10 Thread Jer
Ruslan Nikolaev wrote: Note: I posted this already on runtime D list, but I think that list was a wrong one for this question. Sorry for duplication :-) Hi. I am new to D. It looks like D supports 3 types of characters: char, wchar, dchar. This is cool, It's wrong, actually.

Re: Wide characters support in D

2010-06-09 Thread Simen kjaeraas
Pelle pelle.mans...@gmail.com wrote: On 06/08/2010 08:20 PM, Ruslan Nikolaev wrote: No. New messages are definitely not created by me. You can verify it here: http://blog.gmane.org/gmane.comp.lang.d.general You can easily see that in none of the top posts (except for the first one) my

Re: Wide characters support in D

2010-06-09 Thread Steven Schveighoffer
On Wed, 09 Jun 2010 07:22:17 -0400, Simen kjaeraas simen.kja...@gmail.com wrote: Pelle pelle.mans...@gmail.com wrote: On 06/08/2010 08:20 PM, Ruslan Nikolaev wrote: No. New messages are definitely not created by me. You can verify it here: http://blog.gmane.org/gmane.comp.lang.d.general

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message news:mailman.127.1275974825.24349.digitalmar...@puremagic.com... True. But even simple string handling is faster for UTF-16. The time required to read 2 bytes from a UTF-16 string is the same as to read 1 byte from UTF-8. Generally, we have to

Re: Wide characters support in D

2010-06-08 Thread Kagamin
Walter Bright wrote: The problem with wchar's is that everyone forgets about surrogate pairs. Most UTF-16 programs in the wild, including nearly all Java programs, are broken with regard to surrogate pairs. I'm afraid it will be pretty hard to show the bug. I don't know whether Java is
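To make the surrogate-pair point above concrete, here is a minimal sketch in D. The helper names `codeUnits` and `codePoints` are made up for this illustration and do not come from the thread; `walkLength` is from Phobos' `std.range` and, for a `wstring`, walks decoded code points rather than 16-bit units.

```d
import std.range : walkLength;

// Hypothetical helpers, named for this illustration only.
// .length on a wstring counts 16-bit UTF-16 code units.
size_t codeUnits(wstring s) { return s.length; }

// walkLength iterates a wstring by decoded code point (dchar).
size_t codePoints(wstring s) { return s.walkLength; }

void main()
{
    import std.stdio : writeln;

    // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual
    // Plane, so UTF-16 encodes it as a surrogate pair.
    wstring clef = "\U0001D11E"w;

    writeln(codeUnits(clef));  // 2
    writeln(codePoints(clef)); // 1
}
```

Code that treats `.length` as a character count gives the right answer for text inside the BMP and only goes wrong on input like this, which is probably why such bugs are hard to demonstrate in practice.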

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
Maybe lousy is too strong a word, but aside from compatibility with other libs/software that use it (which I'll address separately), UTF-16 is not particularly useful compared to UTF-8 and UTF-32: ... I tried to avoid commenting this because I am afraid we'll stray away from the main

Re: Wide characters support in D

2010-06-08 Thread dennis luehring
please use the Reply Button On 08.06.2010 08:50, Ruslan Nikolaev wrote: Maybe lousy is too strong a word, but aside from compatibility with other libs/software that use it (which I'll address separately), UTF-16 is not particularly useful compared to UTF-8 and UTF-32: ... I tried to avoid

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message news:mailman.128.1275979841.24349.digitalmar...@puremagic.com... Secondly, Java and Windows adopted 16-bit encodings back when many people were still under the mistaken impression that would allow them to hold any character in one

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Nick Sabalausky a...@a.a wrote in message news:huktq1$8t...@digitalmars.com... Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message news:mailman.128.1275979841.24349.digitalmar...@puremagic.com... In addition, C# has been released already when UTF-16 became variable length. Right,

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
I'm well aware why UTF-32 is useful. Earlier, you had started out saying that there should only be one string type, the OS-native type. Now you're changing your tune and saying that we do need multiple types. No. From the very beginning I said it would also be nice to have some

Re: Wide characters support in D

2010-06-08 Thread Andrei Alexandrescu
On 06/08/2010 03:12 AM, Nick Sabalausky wrote: Nick Sabalausky a...@a.a wrote in message news:huktq1$8t...@digitalmars.com... Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message news:mailman.128.1275979841.24349.digitalmar...@puremagic.com... In addition, C# has been released already when

Re: Wide characters support in D

2010-06-08 Thread Michel Fortin
On 2010-06-08 04:15:50 -0400, Ruslan Nikolaev nruslan_de...@yahoo.com said: No. From the very beginning I said it would also be nice to have some builtin function for conversion to dchar. That means it would be nice to have function that converts from tchar (regardless of its width) to

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
Is this what you want?

    version (utf16)
        alias wchar tchar;
    else
        alias char tchar;

    alias immutable(tchar)[] tstring;

    import std.utf;

    unittest {
        tstring tstr = "hello";
        dstring dstr = toUTF32(tstr);
    }

Yes, I think

Re: Wide characters support in D

2010-06-08 Thread Michel Fortin
On 2010-06-08 09:22:02 -0400, Ruslan Nikolaev nruslan_de...@yahoo.com said: you don't need to provide instances for every other character type, and at the same time - use native character encoding available on system. My opinion is thinking this will work is a fallacy. Here's why...

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
Generally Linux systems use UTF-8 so I guess the system encoding there will be UTF-8. But then if you start to use QT you have to use UTF-16, but you might have to intermix UTF-8 to work with other libraries in the backend (libraries which are not necessarily D libraries, nor system

Re: Wide characters support in D

2010-06-08 Thread dennis luehring
please stop top-posting - just click on the post you want to reply to and then click reply - you're flooding the newsgroup root with replies ... On 08.06.2010 17:11, Ruslan Nikolaev wrote: Generally Linux systems use UTF-8 so I guess the system encoding there will be UTF-8. But then if you

Re: Wide characters support in D

2010-06-08 Thread Yao G.
Every time you reply to somebody, a new message is created. It's kinda difficult to follow this discussion when you need to look at more than 15 separate messages about the same issue. Please check your news client or something. Yao G. On Tue, 08 Jun 2010 10:11:34 -0500, Ruslan Nikolaev

Re: Wide characters support in D

2010-06-08 Thread Walter Bright
Ruslan Nikolaev wrote: No. From the very beginning I said it would also be nice to have some builtin function for conversion to dchar. That means it would be nice to have function that converts from tchar (regardless of its width) to UTF-32. The reason was always clear - you normally don't need

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
Every time you reply to somebody, a new message is created. It's kinda difficult to follow this discussion when you need to look at more than 15 separate messages about the same issue. Please check your news client or something. Yao G. Sorry for that, I did not know there was some problem

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
...@digitalmars.com wrote: From: Walter Bright newshou...@digitalmars.com Subject: Re: Wide characters support in D To: digitalmars-d@puremagic.com Date: Tuesday, June 8, 2010, 8:36 PM Ruslan Nikolaev wrote: No. From the very beginning I said it would also be nice to have some builtin function

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Andrei Alexandrescu seewebsiteforem...@erdani.org wrote in message news:hul65q$o9...@digitalmars.com... On 06/08/2010 03:12 AM, Nick Sabalausky wrote: Nick Sabalausky a...@a.a wrote in message news:huktq1$8t...@digitalmars.com... Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
dennis luehring dl.so...@gmx.net wrote in message news:hulqni$1ss...@digitalmars.com... please stop top-posting - just click on the post you want to reply to and then click reply - you're flooding the newsgroup root with replies ... On 08.06.2010 17:11, Ruslan Nikolaev wrote: Generally Linux

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
: Nick Sabalausky a...@a.a Subject: Re: Wide characters support in D To: digitalmars-d@puremagic.com Date: Tuesday, June 8, 2010, 9:50 PM dennis luehring dl.so...@gmx.net wrote in message news:hulqni$1ss...@digitalmars.com... please stop top-posting - just click on the post you want to reply

Re: Wide characters support in D

2010-06-08 Thread dennis luehring
but - there are several others using the web-interface and you're the only power-top-poster around - maybe you should switch over to thunderbird or something --- On Tue, 6/8/10, Nick Sabalausky a...@a.a wrote: From: Nick Sabalausky a...@a.a Subject: Re: Wide characters support in D To: digitalmars-d

Re: Wide characters support in D

2010-06-08 Thread Ruslan Nikolaev
to others' comments. Ruslan. --- On Tue, 6/8/10, dennis luehring dl.so...@gmx.net wrote: From: dennis luehring dl.so...@gmx.net Subject: Re: Wide characters support in D To: digitalmars-d@puremagic.com Date: Tuesday, June 8, 2010, 10:11 PM On 08.06.2010 19:55, Ruslan Nikolaev wrote: Yeah

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message news:mailman.134.1276019725.24349.digitalmar...@puremagic.com... Yeah... Exactly. I just verified our posts via web interface. Why did he blame me for top posting (at least it can be inferred from that my message has been addressed to)? I

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
dennis luehring dl.so...@gmx.net wrote in message news:hum3fc$2dp...@digitalmars.com... On 08.06.2010 20:20, Ruslan Nikolaev wrote: No. New messages are definitely not created by me. You can verify it here: http://blog.gmane.org/gmane.comp.lang.d.general You can easily see that in none

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Nick Sabalausky a...@a.a wrote in message news:hum6c8$2j0...@digitalmars.com... dennis luehring dl.so...@gmx.net wrote in message news:hum3fc$2dp...@digitalmars.com... On 08.06.2010 20:20, Ruslan Nikolaev wrote: No. New messages are definitely not created by me. You can verify it here:

Re: Wide characters support in D

2010-06-08 Thread bearophile
Walter Bright: The problem with dchar's is strings of them consume memory at a prodigious rate. Warning: lazy musings ahead. I hope we'll soon have computers with 200+ GB of RAM where using strings that use less than 32-bit chars is in most cases a premature optimization (like today is

Re: Wide characters support in D

2010-06-08 Thread Walter Bright
bearophile wrote: Walter Bright: The problem with dchar's is strings of them consume memory at a prodigious rate. Warning: lazy musings ahead. I hope we'll soon have computers with 200+ GB of RAM where using strings that use less than 32-bit chars is in most cases a premature optimization

Re: Wide characters support in D

2010-06-08 Thread Rainer Deyke
On 6/8/2010 13:57, bearophile wrote: I hope we'll soon have computers with 200+ GB of RAM where using strings that use less than 32-bit chars is in most cases a premature optimization (like today is often a silly optimization to use arrays of 16-bit ints instead of 32-bit or 64-bit ints. Only

Re: Wide characters support in D

2010-06-08 Thread Pelle
On 06/08/2010 08:20 PM, Ruslan Nikolaev wrote: No. New messages are definitely not created by me. You can verify it here: http://blog.gmane.org/gmane.comp.lang.d.general You can easily see that in none of the top posts (except for the first one) my name appears first. In fact, you have just

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Rainer Deyke rain...@eldwood.com wrote in message news:humes8$s...@digitalmars.com... On 6/8/2010 13:57, bearophile wrote: I hope we'll soon have computers with 200+ GB of RAM where using strings that use less than 32-bit chars is in most cases a premature optimization (like today is often a

Re: Wide characters support in D

2010-06-08 Thread Nick Sabalausky
Nick Sabalausky a...@a.a wrote in message news:humfrk$2g...@digitalmars.com... Rainer Deyke rain...@eldwood.com wrote in message news:humes8$s...@digitalmars.com... On 6/8/2010 13:57, bearophile wrote: I hope we'll soon have computers with 200+ GB of RAM where using strings that use less

Wide characters support in D

2010-06-07 Thread Ruslan Nikolaev
Note: I posted this already on runtime D list, but I think that list was a wrong one for this question. Sorry for duplication :-) Hi. I am new to D. It looks like D supports 3 types of characters: char, wchar, dchar. This is cool, however, I have some questions about it: 1. When we have 2

Re: Wide characters support in D

2010-06-07 Thread Simen kjaeraas
Ruslan Nikolaev nruslan_de...@yahoo.com wrote: 1. When we have 2 methods (one with wchar[] and another with char[]), how D will determine which one to use if I pass a string "hello world"? String literals in D(2) are of type immutable(char)[] (char[] in D1) by default, and thus will be handled

Re: Wide characters support in D

2010-06-07 Thread Ali Çehreli
Ruslan Nikolaev wrote: 1. When we have 2 methods (one with wchar[] and another with char[]), how D will determine which one to use if I pass a string "hello world"? I asked the same question on the D.learn group recently. Literals like that don't have a particular encoding. The programmer

Re: Wide characters support in D

2010-06-07 Thread Robert Clipsham
On 07/06/10 22:48, Ruslan Nikolaev wrote: Note: I posted this already on runtime D list, but I think that list was a wrong one for this question. Sorry for duplication :-) Hi. I am new to D. It looks like D supports 3 types of characters: char, wchar, dchar. This is cool, however, I have some

Re: Wide characters support in D

2010-06-07 Thread justin
This doesn't answer all your questions and suggestions, but here goes. In answer to #1, "Hello world" is a literal of type char[] (or string). If you want to use UTF-16 or 32, use "Hello world"w and "Hello world"d respectively. In partial answer to #2 and #3, it's generally pretty easy to adapt a
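As a small sketch of the literal suffixes mentioned above: the overload set `encodingOf` below is hypothetical and exists only to show which overload each suffixed literal selects. Only suffixed literals are used here, since how an unsuffixed literal resolves against such an overload set is exactly what the thread is debating.

```d
// Hypothetical overload set, named for this illustration only.
string encodingOf(string s)  { return "UTF-8"; }
string encodingOf(wstring s) { return "UTF-16"; }
string encodingOf(dstring s) { return "UTF-32"; }

void main()
{
    // The c, w and d postfixes fix the literal's type, so each call
    // matches exactly one overload.
    assert(encodingOf("Hello world"c) == "UTF-8");
    assert(encodingOf("Hello world"w) == "UTF-16");
    assert(encodingOf("Hello world"d) == "UTF-32");
}
```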

Re: Wide characters support in D

2010-06-07 Thread Ruslan Nikolaev
Ok, ok... that was just a suggestion... Thanks for the reply about the "Hello world" representation. Was postfix w and d added initially or just recently? I did not know about it. I thought D does automatic conversion for string literals. Yes, templates may help. However, that unnecessary make code

Re: Wide characters support in D

2010-06-07 Thread Walter Bright
Ruslan Nikolaev wrote: Note: I posted this already on runtime D list, Although D is designed to be fairly agnostic about character types, in practice I recommend the following: 1. Use the string type for strings, it's char[] on D1 and immutable(char)[] on D2. 2. Use dchar's to hold

Re: Wide characters support in D

2010-06-07 Thread Ruslan Nikolaev
their code to be completely broken. Thanks, Ruslan. --- On Tue, 6/8/10, Ruslan Nikolaev nruslan_de...@yahoo.com wrote: From: Ruslan Nikolaev nruslan_de...@yahoo.com Subject: Re: Wide characters support in D To: digitalmars.D digitalmars-d@puremagic.com Date: Tuesday, June 8, 2010, 3:16 AM

Re: Wide characters support in D

2010-06-07 Thread Steven Schveighoffer
On Mon, 07 Jun 2010 17:48:09 -0400, Ruslan Nikolaev nruslan_de...@yahoo.com wrote: Note: I posted this already on runtime D list, but I think that list was a wrong one for this question. Sorry for duplication :-) Hi. I am new to D. It looks like D supports 3 types of characters: char,

Re: Wide characters support in D

2010-06-07 Thread Nick Sabalausky
Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message news:mailman.122.1275952601.24349.digitalmar...@puremagic.com... Ok, ok... that was just a suggestion... Thanks for the reply about the "Hello world" representation. Was postfix w and d added initially or just recently? I did not know about it.

Re: Wide characters support in D

2010-06-07 Thread Walter Bright
Ruslan Nikolaev wrote: Just one more addition: it is possible to have built-in function that converts multibyte (or multiword) char sequence (even though in my proposal it can be of different size) to dchar (UTF-32) character. Again, my only point is that it would be nice to have something

Re: Wide characters support in D

2010-06-07 Thread Ruslan Nikolaev
It only generates code for the types that are actually needed. If, for instance, your program never uses anything except UTF-8, then only one version of the function will be made - the UTF-8 version. If you don't use every char type, then it doesn't generate it for every char type -
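A minimal sketch of the kind of template being discussed, assuming a function written once over the character type. The name `codePointCount` is invented for this example. In this program the template is only ever used with `char`, so per the claim quoted above only that instance would be generated; the program's output cannot confirm what the compiler emits, so take that part on the poster's word.

```d
import std.range : walkLength;

// Hypothetical helper: written once, usable with char, wchar or dchar.
// Each distinct Char used at a call site is a separate instantiation.
size_t codePointCount(Char)(const(Char)[] s)
{
    // Narrow strings are walked by decoded code point, not by code unit.
    return s.walkLength;
}

void main()
{
    // Only the char instance is requested anywhere in this program.
    string s = "h\u00E9llo"; // 'é' takes two UTF-8 code units
    assert(s.length == 6);
    assert(codePointCount(s) == 5);
}
```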

Re: Wide characters support in D

2010-06-07 Thread Ali Çehreli
Steven Schveighoffer wrote: a function that takes a char[] can also take a dchar[] if it is sent through a converter (i.e. toUtf8 on Tango I think). In Phobos, there are text, wtext, and dtext in std.conv: /** Convenience functions for converting any number and types of arguments into
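For reference, a short sketch of the three `std.conv` functions named above. The argument values are arbitrary; each function concatenates the textual form of its arguments and differs only in the string type it returns.

```d
import std.conv : text, wtext, dtext;

void main()
{
    string  a = text("answer: ", 42);   // UTF-8 result
    wstring b = wtext("answer: ", 42);  // UTF-16 result
    dstring c = dtext("answer: ", 42);  // UTF-32 result

    assert(a == "answer: 42");
    assert(b == "answer: 42"w);
    assert(c == "answer: 42"d);
}
```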

Re: Wide characters support in D

2010-06-07 Thread Jesse Phillips
On Mon, 07 Jun 2010 19:26:02 -0700, Ruslan Nikolaev wrote: It only generates code for the types that are actually needed. If, for instance, your program never uses anything except UTF-8, then only one version of the function will be made - the UTF-8 version. If you don't use every char type,

Re: Wide characters support in D

2010-06-07 Thread Ruslan Nikolaev
--- On Tue, 6/8/10, Jesse Phillips jessekphillip...@gmail.com wrote: I think you really need to look more into what templates are and do. Excuse me? Unless templates are something different in D (I can't be 100% sure since I am new to D), it should be the case. At least in C++, that would be

Re: Wide characters support in D

2010-06-07 Thread Ruslan Nikolaev
Yes, to clarify what I suggest, I can put it as follows (2 possibilities): 1. Have a special standardized type tchar and tstring. Then, system libraries as well as users can use this type unless they want to do something special. There can be a compiler switch to change tchar width

Re: Wide characters support in D

2010-06-07 Thread BCS
Hello Ruslan, --- On Tue, 6/8/10, Jesse Phillips jessekphillip...@gmail.com wrote: I think you really need to look more into what templates are and do. As I said, for libraries you need to compile every commonly used instance, so that user will not be burdened with this overhead. You only

Re: Wide characters support in D

2010-06-07 Thread Ruslan Nikolaev
You only need to do that where you are shipping closed source and for that, it should be trivial to get the compiler to generate all three versions. You will also need to do it in open source projects if you want to include generated template code into dynamic library as opposed to user's

Re: Wide characters support in D

2010-06-07 Thread Nick Sabalausky
Ruslan Nikolaev nruslan_de...@yahoo.com wrote in message news:mailman.124.1275963971.24349.digitalmar...@puremagic.com... Nick wrote: It only generates code for the types that are actually needed. If, for instance, your program never uses anything except UTF-8, then only one version of the