Re: AA and struct with const member
On Tuesday, 28 December 2021 at 01:45:42 UTC, Era Scarecrow wrote: Success! So to summarize, either work with a pointer, or drop the const... Of course, casting the const away was the first thing I did, but I think this is not very clean :D
Re: AA and struct with const member
On Tuesday, 28 December 2021 at 06:38:03 UTC, Tejas wrote: On Tuesday, 28 December 2021 at 01:45:42 UTC, Era Scarecrow wrote: On Monday, 27 December 2021 at 19:38:38 UTC, frame wrote: [...] const/immutable members are to be set/assigned at instantiation. Most likely the problem is a bug and sounds like [...] The workaround is okay, but I think we should file a bug report for this. This is very ~~stupid~~ undesirable behaviour. I agree. I'll just wait to see if somebody can explain why this isn't a bug, or is wanted behaviour, or a known issue.
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 21:38:03 UTC, Era Scarecrow wrote: Well, to add functionality with, say, ANSI, you entered an escape code and then stuff like offset, color, effect, etc. UTF-8 automatically has escape codes, being anything 128 or over, so as long as the terminal understands it, it should be what's handling it. https://www.robvanderwoude.com/ansi.php In the end it's all just a binary string of 1's and 0's. Thanks for that post!! I already knew about some of these escape codes, but a full list of them will come in handy ;)
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 14:47:51 UTC, Kagamin wrote: https://utf8everywhere.org/ - this is advice from a Windows programmer, I use it too. Windows allocates a per-thread buffer and when you call, say, WriteConsoleA, it first transcodes the string to UTF-16 in the buffer and calls WriteConsoleW; you would do something like that. That's awesome! Like I said to Adam, I will not officially write code for Windows myself (at least for now), so it will probably be up to the contributors to decide anyway. Though knowing that there will not be compatibility problems with the latest versions of Windows is just nice. Thanks a lot for the info man!
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 14:30:55 UTC, Adam D Ruppe wrote: Most unix things do utf-8 more often than not, but technically you are supposed to check the locale and change the terminal settings to do it right. Cool! I mean, I don't plan on supporting legacy systems, so I think we're fine if the up-to-date systems fully support UTF-8 as the default. You should ALWAYS use the -W suffix functions on Windows when available, and pass them utf-16 encoded strings. There's a bunch of Windows things taking utf-8 nowadays too, but utf-16 is what they standardized on back in the 1990s, so it gives you a lot of compatibility. The Windows OS will convert to other things for you if you use utf-16 consistently. That's pretty nice. In this case it's even better because, at least for now, I will not work on Windows myself, since making the library work on Linux is a bit of a challenge in itself. So I will wait for any contributors to work on that; they will probably know how Windows converts UTF-8 to UTF-16 and they will be able to do tests. Also, I plan to support only Windows 10/11 64-bit officially, so just like with Unix, I don't mind if legacy systems don't work. The Windows API is an absolute pleasure to work with next to much of the trash you're forced to deal with on Linux. What??? Don't crash my dreams sempai!!! I mean, this may sound stupid, but which kind of API are you referring to? Do you mean system library stuff (like "unistd.h" for Linux and "windows.h" for Windows) or low-level system calls?
Re: How to print unicode characters (no library)?
On 27.12.21 15:23, Adam D Ruppe wrote: Let's look at: "Hello \n"; [...] Finally, there's "string", which is utf-8, meaning each element is 8 bits, but again, there is a buffer you need to build up to get the code points you feed into that VM. [...] H, e, l, l, o, &lt;space&gt;, &lt;start of sequence, 3 MORE elements&gt;, &lt;2 MORE elements&gt;, &lt;1 MORE element&gt;, &lt;final work-in-progress element&gt; [...] Notice how each element here told you how many elements are left. This is encoded into the bit pattern and is part of why it took 4 elements instead of just three; there's some error-checking redundancy in there. This is a nice part of the design, allowing you to validate a utf-8 stream more reliably and even recover if you jumped somewhere in the middle of a multi-byte sequence. It's actually just the first byte that tells you how many are in the sequence. The continuation bytes don't have redundancies for that. To recover from the middle of a sequence, you just skip the orphaned continuation bytes one at a time.
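The first-byte rule can be sketched in a few lines of D (a minimal illustration written for this reply, not code from the thread):

```d
// How the first byte of a UTF-8 sequence announces the sequence length.
// Continuation bytes all match the pattern 10xxxxxx, so recovering from
// the middle of a sequence just means skipping bytes until one is not
// a continuation byte.
size_t sequenceLength(ubyte b)
{
    if ((b & 0x80) == 0)    return 1; // 0xxxxxxx: plain ASCII
    if ((b & 0xE0) == 0xC0) return 2; // 110xxxxx: 1 continuation follows
    if ((b & 0xF0) == 0xE0) return 3; // 1110xxxx: 2 follow
    if ((b & 0xF8) == 0xF0) return 4; // 11110xxx: 3 follow
    return 0; // 10xxxxxx: an orphaned continuation byte, not a start
}

void main()
{
    assert(sequenceLength('H') == 1);
    assert(sequenceLength(0xF0) == 4); // start of a 4-byte sequence
    assert(sequenceLength(0x80) == 0); // mid-sequence continuation byte
}
```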
Re: AA and struct with const member
On Tuesday, 28 December 2021 at 01:45:42 UTC, Era Scarecrow wrote: On Monday, 27 December 2021 at 19:38:38 UTC, frame wrote: [...] const/immutable members are to be set/assigned at instantiation. Most likely the problem is a bug and sounds like [...] The workaround is okay, but I think we should file a bug report for this. This is very ~~stupid~~ undesirable behaviour
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 14:23:37 UTC, Adam D Ruppe wrote: [...] After reading the whole thing, I said it and I'll say it again! You guys must get paid for your support. I also helped a guy in another forum yesterday, writing a very big reply, and tbh it felt great :P (or of course when you get to a human reader, they can interpret it differently too, but obviously human language is a whole other mess lol) Yep! If machines are complicated, humans are even more complicated. Though machines are also made by humans, so... heh!
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Tuesday, 28 December 2021 at 00:57:27 UTC, Paul Backus wrote:

```d
enum instantiate(string type, string expr) = type ~ "(" ~ expr ~ ")";

pragma(msg, instantiate!("RVector!(SEXPTYPE.REALSXP)", "x"));
```

One possibility is to generate a collection of compile-time strings that denote the types and then do a comparison with the type, something like `is(T == mixin(CString))`, where `CString = "RVector!(SEXPTYPE.REALSXP)"`, to discover the correct string, which I can then use to generate the code without having to use `T.stringof` anywhere in the code at all.
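That lookup idea can be sketched like this (a hypothetical illustration; the names `candidates` and `typeString` are made up, and the real `RVector`/`SEXPTYPE` are stubbed out):

```d
// Sketch: match a type against a list of candidate type strings using
// is(T == mixin(...)), so the matching string (always valid D) can be
// used for code generation instead of T.stringof.
enum SEXPTYPE { REALSXP, INTSXP }
struct RVector(SEXPTYPE t) { }

enum candidates = ["RVector!(SEXPTYPE.REALSXP)", "RVector!(SEXPTYPE.INTSXP)"];

template typeString(T)
{
    static foreach (c; candidates)
        static if (is(T == mixin(c)))
            enum typeString = c; // declared once: exactly one candidate matches
}

static assert(typeString!(RVector!(SEXPTYPE.REALSXP))
              == "RVector!(SEXPTYPE.REALSXP)");

void main() { }
```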
Re: AA and struct with const member
On Monday, 27 December 2021 at 19:38:38 UTC, frame wrote: I feel stupid right now: One cannot assign a struct that contains a const member to an AA? Error: cannot modify struct instance ... of type ... because it contains `const` or `immutable` members This is considered a modification?

```d
struct S { const(int) a; }
S[string] test;
test["a"] = S(1);
```

What's the workaround for that?

const/immutable members are to be set/assigned at instantiation. Most likely the problem is a bug and sounds like a) the struct doesn't exist in the AA, so it creates it (with a default), or b) it tries to copy but contains a const and thus fails. Passing a pointer will do you no good, since structs are likely to be on the stack. So let's try opAssign.

```d
auto ref opAssign(S s) { this = s; return this; }
```

So we get:

```
cannot modify struct instance `this` of type `S` because it contains `const` or `immutable` members
```

Alright, let's look at the members we can work with: https://dlang.org/spec/hash-map.html I don't see an 'add' but i do see a 'require' which will add something in. So we try that.

```d
test.require("a", S(1));
```

Now we get:

```
Error: cannot modify struct instance `*p` of type `S` because it contains `const` or `immutable` members
test.d(??): Error: template instance `object.require!(string, S)` error instantiating
```

Hmmm, it really doesn't like it. Finally we can fake it. Let's make a mirror struct without the const, for the purposes of adding it.

```d
struct S { const(int) a; }
struct S2 { int a; }
S[string] test;
cast(S2[string])test = S2(1);
```

```
Error: `cast(S2[string])test` is not an lvalue and cannot be modified
```

Well, that's not going to work. Let's make it a pointer and allocate it instead.

```d
S*[string] test;
test["a"] = new S(1);
```

Success! So to summarize, either work with a pointer, or drop the const...
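For what it's worth, the mirror-struct idea can be made to work by casting through a pointer to the AA rather than casting the AA value itself. Like casting away const, this sidesteps the type system, so it is a sketch of the hack rather than a recommendation:

```d
struct S  { const(int) a; }
struct S2 { int a; } // same layout as S, but without the const

void main()
{
    S[string] test;
    // Reinterpret the AA itself through a pointer so the indexing
    // assignment sees a struct type without const members:
    (*cast(S2[string]*) &test)["a"] = S2(1);
    assert(test["a"].a == 1);
}
```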
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Tuesday, 28 December 2021 at 00:42:18 UTC, data pulverizer wrote: On Tuesday, 28 December 2021 at 00:32:03 UTC, Paul Backus wrote: In this case, the simplest solution is to have your code generator accept a string as its input, rather than a type. For example:

```d
enum instantiate(string type, string expr) = type ~ "(" ~ expr ~ ")";

pragma(msg, instantiate!("RVector!(SEXPTYPE.REALSXP)", "x"));
```

Well, the code needs to respond to parameter types `T` generated from other code. I'm allowing the user to create functions and select those they wish to access in R with UDA decorators in the D script, which I then filter for, wrapping the necessary functions and generating any type-conversion code I need at compile time to create functions callable from R. I see. So, you need access to the type as a type in order to reflect on it, but you also want it as a string in order to generate code. My guess is that you don't actually *need* to use string mixins for most of this, but I can't say for sure without seeing a more complete example.
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Tuesday, 28 December 2021 at 00:32:03 UTC, Paul Backus wrote: The result of `.stringof` is completely implementation-defined, may change arbitrarily between compiler releases, and is not even guaranteed to be valid D code in the first place. Wow, I didn't know this. In this case, the simplest solution is to have your code generator accept a string as its input, rather than a type. For example:

```d
enum instantiate(string type, string expr) = type ~ "(" ~ expr ~ ")";

pragma(msg, instantiate!("RVector!(SEXPTYPE.REALSXP)", "x"));
```

Well, the code needs to respond to parameter types `T` generated from other code. I'm allowing the user to create functions and select those they wish to access in R with UDA decorators in the D script, which I then filter for, wrapping the necessary functions and generating any type-conversion code I need at compile time to create functions callable from R.
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Tuesday, 28 December 2021 at 00:13:13 UTC, data pulverizer wrote: There are various requirements, sometimes I have to cast or type convert, so I **need** the type to paste correctly and explicitly. You almost never actually need types as strings. I'm almost certain there's a better way for you to get the same work done. Have you tried just using T directly in your mixin? You can frequently just use the local name and skip the string getting entirely.
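A tiny sketch of what "use the local name" means: inside the template, the generated string can just say `T`, which is always valid D in that scope, so the mixed-in code never depends on how the type happens to print (`make` here is a made-up name for illustration):

```d
enum MyEnum { DOUBLE, STRING, INTEGER }
struct MyType(MyEnum type) { }

auto make(T)()
{
    // No T.stringof: "T" is resolved by the compiler in this scope,
    // so this works no matter how complicated the type's name is.
    return mixin("T()");
}

void main()
{
    auto x = make!(MyType!(MyEnum.INTEGER))();
    static assert(is(typeof(x) == MyType!(MyEnum.INTEGER)));
}
```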
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Tuesday, 28 December 2021 at 00:13:13 UTC, data pulverizer wrote: The types I'm generating are a template type I've constructed for R's SEXP, so that my wrapped numeric vector (struct) type is denoted `RVector!(REALSXP)`. But `alias REALSXP = SEXPTYPE.REALSXP` where `SEXPTYPE` is an `enum`. So if I start using `T.stringof` where `T = RVector!(SEXPTYPE.REALSXP)` to generate code it starts to create chaos because `T.stringof = "RVector!SEXPTYPE.REALSXP"`, so if I'm trying to convert or instantiate a type using `T.stringof ~ "(x)"`, I'll get `RVector!SEXPTYPE.REALSXP(x)`, which gives an error, and various types like this can occur many times in a script. The new template allows me to safely paste the type and get what I want: `RVector!(SEXPTYPE.REALSXP)(x)`. The correct answer here is, "don't use `T.stringof` to generate code." The result of `.stringof` is completely implementation-defined, may change arbitrarily between compiler releases, and is not even guaranteed to be valid D code in the first place. You should not rely on it unless you have literally no other choice. In this case, the simplest solution is to have your code generator accept a string as its input, rather than a type. For example:

```d
enum instantiate(string type, string expr) = type ~ "(" ~ expr ~ ")";

pragma(msg, instantiate!("RVector!(SEXPTYPE.REALSXP)", "x"));
```
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Monday, 27 December 2021 at 23:04:40 UTC, Adam Ruppe wrote: On Monday, 27 December 2021 at 21:21:30 UTC, data pulverizer wrote: alias T = MyType!(INTEGER); What is MyType? enum code = "writeln(\"instance: \", adder(" ~ T.stringof ~ "(), " ~ U.stringof ~ "()" ~ "));"; And why is this a string mixin instead of a plain simple function? prolly need more context. Sorry, the example is a bit contrived, but basically I'm generating a whole bunch of code using string mixins. The types I'm generating are a template type I've constructed for R's SEXP, so that my wrapped numeric vector (struct) type is denoted `RVector!(REALSXP)`. But `alias REALSXP = SEXPTYPE.REALSXP` where `SEXPTYPE` is an `enum`. So if I start using `T.stringof` where `T = RVector!(SEXPTYPE.REALSXP)` to generate code it starts to create chaos because `T.stringof = "RVector!SEXPTYPE.REALSXP"`, so if I'm trying to convert or instantiate a type using `T.stringof ~ "(x)"`, I'll get `RVector!SEXPTYPE.REALSXP(x)`, which gives an error, and various types like this can occur many times in a script. The new template allows me to safely paste the type and get what I want: `RVector!(SEXPTYPE.REALSXP)(x)`. There are various requirements; sometimes I have to cast or type convert, so I **need** the type to paste correctly and explicitly. Which is what the `safe_stringof` template does for my baby example - the same methodology will work just as well for my `RVector` code.
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Monday, 27 December 2021 at 21:21:30 UTC, data pulverizer wrote: alias T = MyType!(INTEGER); What is MyType? enum code = "writeln(\"instance: \", adder(" ~ T.stringof ~ "(), " ~ U.stringof ~ "()" ~ "));"; And why is this a string mixin instead of a plain simple function? prolly need more context
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Monday, 27 December 2021 at 22:52:58 UTC, data pulverizer wrote: I think the only thing to do for now is probably for me to construct a template that creates a proper string for this type. It would look something like this:

```
enum safe_stringof(T) = T.stringof;

template safe_stringof(T: MyType!U, alias U)
{
    enum string safe_stringof = "MyType!(" ~ U.stringof ~ ")";
}
```

So this

```
alias DOUBLE = MyEnum.DOUBLE;
alias STRING = MyEnum.STRING;
alias INTEGER = MyEnum.INTEGER;

void main()
{
    alias T = MyType!(INTEGER);
    alias U = MyType!(STRING);
    enum code = "writeln(\"instance: \", adder(" ~ safe_stringof!(T) ~ "(), " ~ safe_stringof!(U) ~ "()" ~ "));";
    pragma(msg, code);
}
```

works. Now back to my very late dinner.
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Monday, 27 December 2021 at 21:31:03 UTC, Adam Ruppe wrote: if you can paste the code where you generate this I can prolly show you a much easier way to do it. stringof sucks really hard. I think the only thing to do for now is probably for me to construct a template that creates a proper string for this type.
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Monday, 27 December 2021 at 21:31:03 UTC, Adam Ruppe wrote: if you can paste the code where you generate this I can prolly show you a much easier way to do it. stringof sucks really hard. Will the above `mixin` example suffice? It expands to the code that I described.
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 07:12:24 UTC, rempas wrote: On Sunday, 26 December 2021 at 21:22:42 UTC, Adam D Ruppe wrote: write just transfers a sequence of bytes. It doesn't know nor care what they represent - that's for the receiving end to figure out. Oh, so it was as I expected :P Well, to add functionality with, say, ANSI, you entered an escape code and then stuff like offset, color, effect, etc. UTF-8 automatically has escape codes, being anything 128 or over, so as long as the terminal understands it, it should be what's handling it. https://www.robvanderwoude.com/ansi.php In the end it's all just a binary string of 1's and 0's.
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Monday, 27 December 2021 at 21:05:51 UTC, data pulverizer wrote: adder(MyType!MyEnum.INTEGER(), MyType!MyEnum.STRING()); The rule for !(args) is: if you leave the parentheses off, it only uses the next single token as the argument. So it will never include a dot; it is like you wrote `MyType!(MyEnum).INTEGER`. You might just always use the () in your generated code. When you create that mixin string, can't you just change the generator to put the () around it? Or is the stringof generating this? (Another reason why stringof is terrible and should never be used ever for anything.) `MyType!MyEnum.STRING` is generated with `T.stringof`. if you can paste the code where you generate this I can prolly show you a much easier way to do it. stringof sucks really hard.
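The single-token rule is easy to demonstrate in isolation (a contrived sketch; `Holder` is a made-up type for the demonstration):

```d
enum MyEnum { INTEGER }
struct MyType(MyEnum type) { }
struct Holder(T) { enum INTEGER = 42; }

// Without parentheses only `MyEnum` binds to `!`, and `.INTEGER` is
// looked up afterwards on the instantiated type:
static assert(Holder!MyEnum.INTEGER == 42); // same as (Holder!MyEnum).INTEGER

// With a dotted argument like MyEnum.INTEGER, the parentheses are required:
static assert(is(MyType!(MyEnum.INTEGER)));

void main() { }
```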
Re: Ambiguity issue with expanding and evaluating single template type parameter enums
On Monday, 27 December 2021 at 21:05:51 UTC, data pulverizer wrote: Hello, ... ... an equivalent mixin error would be

```
//...
alias DOUBLE = MyEnum.DOUBLE;
alias STRING = MyEnum.STRING;
alias INTEGER = MyEnum.INTEGER;

void main()
{
    alias T = MyType!(INTEGER);
    alias U = MyType!(STRING);
    enum code = "writeln(\"instance: \", adder(" ~ T.stringof ~ "(), " ~ U.stringof ~ "()" ~ "));";
    mixin(code);
}
```
Ambiguity issue with expanding and evaluating single template type parameter enums
Hello, I'm generating code using mixins and one of my mixins expands to something like this:

```
adder(MyType!MyEnum.INTEGER(), MyType!MyEnum.STRING());
```

`MyType!MyEnum.STRING` is generated with `T.stringof`. I get the error:

```
Error: template instance `MyType!(MyEnum)` does not match template declaration `MyType(MyEnum type)`
```

and if I manually amend the code to this:

```
adder(MyType!(MyEnum.INTEGER)(), MyType!(MyEnum.STRING)());
```

it runs fine. It looks like the ambiguity of UFCS and the type is messing things up. This is a simplified example. Since the code is being generated automatically in many places I can't go round adding the brackets. A simplified functional example is given below:

```
import std.stdio: writeln;

enum MyEnum { DOUBLE = 0, STRING = 1, INTEGER = 2 }

struct MyType(MyEnum type) {}

auto getValue(T: MyType!U, alias U)(T x)
{
    return U;
}

auto adder(T, U)(T x, U y)
{
    return getValue(x) + getValue(y);
}

void main()
{
    writeln("instance: ", adder(MyType!MyEnum.INTEGER(), MyType!MyEnum.STRING()));
}
```
AA and struct with const member
I feel stupid right now: one cannot assign a struct that contains a const member to an AA? Error: cannot modify struct instance ... of type ... because it contains `const` or `immutable` members This is considered a modification?

```d
struct S { const(int) a; }
S[string] test;
test["a"] = S(1);
```

What's the workaround for that?
Re: How to print unicode characters (no library)?
On Mon, Dec 27, 2021 at 04:40:19PM +, Adam D Ruppe via Digitalmars-d-learn wrote: > On Monday, 27 December 2021 at 15:26:16 UTC, H. S. Teoh wrote: > > A lot of modern Linux applications don't even work properly under > > anything non-UTF-8 > > yeah, you're supposed to check the locale but since so many people > just assume that's becoming the new de facto reality Yep, sad reality. > just like how people blindly shoot out vt100 codes without checking > TERM and that usually works too. Haha, doesn't terminal.d do that in a few places too? ;-) To be fair, though, most of the popular terminal apps are based off of extensions of vt100 codes anyway, so the basic escape sequences more-or-less work across the board. AFAIK non-vt100 codes are getting rarer and can practically be treated as legacy these days. (At least on Linux, that is. Can't say for the other *nixen.) > > I'm not a regular Windows user, but I did remember running into problems > > where sometimes command.exe doesn't handle Unicode properly, and needs > > an API call to switch it to UTF mode or something. > > That'd be because someone called the -A function instead of the -W ones. The > -W ones just work if you use them. The -A ones are there for compatibility > with Windows 95 and have quirks. This is the point behind my blog post i > linked before, people saying to make that api call don't understand the > problem and are patching over one bug with another bug instead of actually > fixing it with the correct function call. Point. T -- Just because you survived after you did it, doesn't mean it wasn't stupid!
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 15:26:16 UTC, H. S. Teoh wrote: A lot of modern Linux applications don't even work properly under anything non-UTF-8 yeah, you're supposed to check the locale but since so many people just assume that's becoming the new de facto reality just like how people blindly shoot out vt100 codes without checking TERM and that usually works too. I'm not a regular Windows user, but I did remember running into problems where sometimes command.exe doesn't handle Unicode properly, and needs an API call to switch it to UTF mode or something. That'd be because someone called the -A function instead of the -W ones. The -W ones just work if you use them. The -A ones are there for compatibility with Windows 95 and have quirks. This is the point behind my blog post i linked before, people saying to make that api call don't understand the problem and are patching over one bug with another bug instead of actually fixing it with the correct function call.
Re: How to print unicode characters (no library)?
On Mon, Dec 27, 2021 at 02:30:55PM +, Adam D Ruppe via Digitalmars-d-learn wrote: > On Monday, 27 December 2021 at 11:21:54 UTC, rempas wrote: > > So should I just use UTF-8 only for Linux? > > Most unix things do utf-8 more often than not, but technically you are > supposed to check the locale and change the terminal settings to do it > right. Technically, yes. But practically all modern Linux distros have standardized on UTF-8, and you're quite unlikely to run into non-UTF-8 environments except on legacy systems or extremely specialized applications. I don't know what's the situation on BSD, but I'd imagine it's pretty similar. A lot of modern Linux applications don't even work properly under anything non-UTF-8, so for practical purposes I'd say don't even worry about it, unless you're specifically targeting a non-UTF8 environment for a specific reason. > > But what about Windows? > > You should ALWAYS use the -W suffix functions on Windows when > available, and pass them utf-16 encoded strings. [...] I'm not a regular Windows user, but I did remember running into problems where sometimes command.exe doesn't handle Unicode properly, and needs an API call to switch it to UTF mode or something. T -- First Rule of History: History doesn't repeat itself -- historians merely repeat each other.
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 11:21:54 UTC, rempas wrote: So should I just use UTF-8 only for Linux? What about other operating systems? I suppose Unix-based OSs (maybe MacOS as well if I'm lucky) work the same. But what about Windows? Unfortunately I have to support this OS too with my library, so I should know. If you know and you can tell me of course... https://utf8everywhere.org/ - this is advice from a Windows programmer, I use it too. Windows allocates a per-thread buffer and when you call, say, WriteConsoleA, it first transcodes the string to UTF-16 in the buffer and calls WriteConsoleW; you would do something like that.
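The transcoding step Kagamin describes is straightforward in D with std.utf; on Windows the resulting UTF-16 buffer would then be handed to WriteConsoleW (the WinAPI call is omitted here so the sketch stays portable, and `toConsoleBuffer` is a made-up name):

```d
import std.utf : toUTF16;

// UTF-8 in, UTF-16 out - the same conversion WriteConsoleA performs
// internally before delegating to WriteConsoleW.
wstring toConsoleBuffer(string s)
{
    return toUTF16(s);
}

void main()
{
    assert(toConsoleBuffer("hi") == "hi"w);
    // A 4-byte UTF-8 sequence becomes a 2-unit UTF-16 surrogate pair:
    assert(toConsoleBuffer("\U0001F602").length == 2);
}
```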
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 11:21:54 UTC, rempas wrote: So should I just use UTF-8 only for Linux? Most unix things do utf-8 more often than not, but technically you are supposed to check the locale and change the terminal settings to do it right. But what about Windows? You should ALWAYS use the -W suffix functions on Windows when available, and pass them utf-16 encoded strings. There's a bunch of Windows things taking utf-8 nowadays too, but utf-16 is what they standardized on back in the 1990s, so it gives you a lot of compatibility. The Windows OS will convert to other things for you if you use utf-16 consistently. Unfortunately I have to support this OS too with my library so I should know. The Windows API is an absolute pleasure to work with next to much of the trash you're forced to deal with on Linux.
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 07:12:24 UTC, rempas wrote: Oh yeah. About that, I wasn't given a demonstration of how it works so I forgot about it. I saw that in Unicode you can combine some code points to get different results but I never saw how that happens in practice. The emoji is one example; the one you posted is two code points. Some other common ones are accented letters: they will SOMETIMES - there's exceptions - be created by the letter followed by an accent mark. Some of those complicated emojis are several points with optional changes. Like it might be "woman" followed by "skin tone 2". Some of them are "dancing" followed by "skin tone 0" followed by "male" and such. So it displays as one thing, but it is composed of 2 or more code points, and each code point might be composed of several code units, depending on the encoding. Again, think of it more as a little virtual machine building up a thing. A lot of these are actually based on combinations of old typewriters and old state-machine terminal hardware. Like the reason "a" followed by "backspace" followed by "_" - SOMETIMES, it depends on the receiving program, this isn't a unicode thing - might be an underlined a: think about typing that on a typewriter with a piece of paper. The "a" gets stamped on the paper. Backspace just moves back, but since the "a" is already on the paper, it isn't going to be erased. So when you type the _, it gets stamped on the paper along with the a. So some programs emulate that concept. The emoji thing is the same basic idea (though it doesn't use backspace): start by drawing a woman, then modify it with a skin color. Or start by drawing a person, then draw another person, then add a skin color, then make them female, and you have a family emoji. Impossible to do by stamping paper, but a little computer VM can understand this and build up the glyph. Yes, that's a great way of seeing it.
I suppose that this all happens under the hood and it is OS-specific, so we have to know how the OS we are working with works under the hood to fully understand how this happens. Well, it isn't necessarily the OS; any program can do its own thing. Of course, the OS can define something: Windows, for example, defines its things as UTF-16, or you can use a translation layer which does its own things for a great many functions. But still, applications might treat it differently. For example, the xterm terminal emulator can be configured to use utf-8 or something else. It can be configured to interpret them in a way that emulates certain old terminals, including ones that work like a printer or the state-machine things. However, do you know what we do for cross-compatibility then? Because this sounds like a HUGE mess for real-world applications. Yeah, it is a complete mess, especially on Linux. But even on Windows, where Microsoft standardized on utf-16 for text functions, there's still weird exceptions. Like writing to the console vs piping to an application can be different. If you've ever written a single character to a windows pipe and seen different results than if you wrote two, now you get an idea why: it is trying to auto-detect if it is two-byte characters or one-byte streams. I wrote a little bit about this on my public blog: http://dpldocs.info/this-week-in-d/Blog.Posted_2019_11_25.html Or view the source of my terminal.d to see some of the "fun" in decoding all this nonsense. http://arsd-official.dpldocs.info/arsd.terminal.html The module there does a lot more than just the basics, but still most of the top half of the file is all about this stuff. Mouse input might be encoded as utf characters, then you gotta change the mode and check various detection tricks. Ugh. I don't understand that. Based on your calculations, the results should have been different. Also how are the numbers fixed?
Like you said, the number of bytes of each encoding is not always the same for every character. Even if they were fixed, this would mean 2 bytes for each UTF-16 character and 4 bytes for each UTF-32 character, so the numbers still don't make sense to me. They're not characters, they're code points. Remember, multiple code points can be combined to form one character on screen. Let's look at: "Hello &lt;emoji&gt;\n"; This is actually a series of 8 code points: H, e, l, l, o, &lt;space&gt;, &lt;emoji&gt;, &lt;newline&gt;. Those code points can themselves be encoded in three different ways: dstring: encodes each code point as a single element. That's why the dstring length there is 8. Each *element* of this though is 32 bits, which you see if you cast it to ubyte[]: the length in bytes is 4x the length of the dstring, but dstring.length returns the number of units, not the number of bytes. So here one unit = one point, but remember each *point* is NOT necessarily anything you see on screen. It represents just one complete instruction to the VM. wstring: encodes each code point as one or two 16-bit elements; most code points fit in one element, but others (like the emoji) take two (a surrogate pair), which is why the wstring length can differ from the number of code points.
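The three encodings can be compared directly (using U+1F602 as a stand-in for the emoji in the example):

```d
void main()
{
    // 8 code points: H, e, l, l, o, space, emoji, newline
    string  s = "Hello \U0001F602\n";
    wstring w = "Hello \U0001F602\n"w;
    dstring d = "Hello \U0001F602\n"d;

    assert(s.length == 11); // UTF-8: the emoji takes 4 code units
    assert(w.length == 9);  // UTF-16: the emoji is a surrogate pair (2 units)
    assert(d.length == 8);  // UTF-32: one unit per code point
}
```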
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 09:29:38 UTC, Kagamin wrote: D strings are plain arrays without any text-specific logic; the element is called a code unit, which has a fixed size, and the array length specifies how many elements are in the array. This model is most adequate for memory correctness, i.e. it shows what takes how much memory and where it will fit. D doesn't impose fixed interpretations like characters or code points, because there are many of them and no single one is the correct one; you need one or another in different situations. The Linux console is one example of such a situation: it doesn't accept characters or code points, it accepts utf8 code units; using anything else is an error. So should I just use UTF-8 only for Linux? What about other operating systems? I suppose Unix-based OSs (maybe MacOS as well if I'm lucky) work the same. But what about Windows? Unfortunately I have to support this OS too with my library, so I should know. If you know and you can tell me of course...
Re: Starting and managing threads
On 12/27/21 1:33 AM, Bagomot wrote: > separate thread, without blocking the main one. I think you can use std.concurrency there. I have a chapter here: http://ddili.org/ders/d.en/concurrency.html Look for 'struct Exit' to see how the main thread signals workers to stop running. And some std.concurrency hints appear in my DConf Online 2020 presentation here: https://dconf.org/2020/online/#ali1 Ali
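A minimal sketch of that pattern (the `struct Exit` signal Ali mentions), with the directory-watching work elided:

```d
import std.concurrency;
import core.thread : Thread;
import core.time : msecs;

struct Exit { } // message type the main thread sends to stop the worker

void worker()
{
    bool done = false;
    while (!done)
    {
        // Wait briefly for a control message; when none arrives, this is
        // where the periodic work (e.g. polling fswatch) would go.
        receiveTimeout(10.msecs, (Exit _) { done = true; });
    }
    ownerTid.send("stopped"); // confirm shutdown to the main thread
}

void main()
{
    auto tid = spawn(&worker);
    Thread.sleep(50.msecs); // let the worker run for a while
    tid.send(Exit());       // ask it to stop...
    assert(receiveOnly!string() == "stopped"); // ...and wait for confirmation
}
```

Because the worker exits its loop and returns, it terminates cleanly before the main thread ends, rather than becoming a zombie.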
Re: How to print unicode characters (no library)?
On Monday, 27 December 2021 at 07:29:05 UTC, rempas wrote: How can you do that? I'm trying to print the codes for them but it doesn't work. Or you cannot choose to have this behavior and there are only some terminals that support this? Try it on https://en.wikipedia.org/wiki/Teletype_Model_33
Starting and managing threads
Hello everybody! My program uses the fswatch library to track changes in a directory. It runs on the main thread of the program. I need it to do its work in a separate thread, without blocking the main one. In addition, I need to be able to terminate the thread at the moment I want from the main thread of the program. I tried to get my head around Thread and Fiber but still didn't figure out how to properly start and manage threads. Using Thread turns it into a zombie when the main thread of the program ends. I will not even write my code here, because it is at the level of examples from the documentation. Please tell me how to start threads correctly, how to manage them, and how to end them without turning them into zombies.
Re: How to print unicode characters (no library)?
D strings are plain arrays without any text-specific logic; the element is called a code unit, which has a fixed size, and the array length specifies how many elements are in the array. This model is most adequate for memory correctness, i.e. it shows what takes how much memory and where it will fit. D doesn't impose fixed interpretations like characters or code points, because there are many of them and no single one is the correct one; you need one or another in different situations. The Linux console is one example of such a situation: it doesn't accept characters or code points, it accepts utf8 code units; using anything else is an error.