Re: "Error: `TypeInfo` cannot be used with -betterC" on a CTFE function

2024-04-14 Thread Richard (Rikki) Andrew Cattermole via Digitalmars-d-learn

On 15/04/2024 10:36 AM, Liam McGillivray wrote:
Well, it did work when I tried it (using a string variable, not a 
literal of course). It displayed as it is supposed to. But from the 
information I can find on the web it looks like strings are sometimes 
but not *always* zero-terminated. Not a great look for the language. Are 
there any rules to determine when it is and when it isn't (for string 
variables)?


String literals, which are constants that the compiler puts into the 
read-only data section of the object file, are zero-terminated because 
it doesn't cost anything to do this.


At runtime, unless you explicitly append the null terminator, no string 
contains it.


D's strings are slices: pointer + length. These are both faster (no 
need to strlen all the time) and safer (bounds-checked access) than 
null-terminated strings.
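
As a rough illustration of the pointer + length layout (the variable names are mine, purely for the example):

```d
import std.stdio : writeln;

void main()
{
    string s = "hello world";

    // The length is stored in the slice itself, so reading it is O(1).
    assert(s.length == 11);

    // Slicing creates a new view into the same memory; nothing is copied.
    string w = s[6 .. $];
    assert(w == "world");
    assert(w.ptr == s.ptr + 6);

    // A slice that ends mid-string is not followed by a terminator.
    string h = s[0 .. 5];
    assert(h == "hello");
    assert(h.ptr[h.length] == ' '); // the next byte is ' ', not '\0'

    writeln(h, " / ", w);
}
```

Passing `h.ptr` to a C function here would make it read on into " world", which is why a slice can't simply be handed over as a C string.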


Null-terminated strings are a technical debt due to legacy constraints. 
We would all be better off if C supported slices. Plenty of notable 
exploits couldn't have happened if they were used instead.


Re: "Error: `TypeInfo` cannot be used with -betterC" on a CTFE function

2024-04-14 Thread Steven Schveighoffer via Digitalmars-d-learn

On Sunday, 14 April 2024 at 22:36:18 UTC, Liam McGillivray wrote:
On Friday, 12 April 2024 at 15:24:38 UTC, Steven Schveighoffer 
wrote:

```d
void InitWindow(int width, int height, ref string title) {
    InitWindow(width, height, cast(const(char)*)title);
}
```


This is invalid: a string might not be zero-terminated. You 
can't just cast.


Well, it did work when I tried it (using a string variable, not 
a literal of course). It displayed as it is supposed to.


A cast "working" isn't enough. It could work in certain cases, 
with certain environmental conditions, etc., but fail horribly 
with memory corruption in other cases. It could even happen on 
different runs of the program. It could happen that it works 
99.999% of the time. The risk is not worth it.


But from the information I can find on the web it looks like 
strings are sometimes but not `always` zero-terminated. Not a 
great look for the language. Are there any rules to determine 
when it is and when it isn't (for string variables)?


String literals are zero-terminated. Other strings are not 
guaranteed to be. If you have a string generated at compile time, 
the chances are good it has zero termination. However, the implicit 
conversion to `const(char)*` is the clue that it is zero-terminated. 
If that doesn't happen automatically, it's not guaranteed to be 
zero-terminated.


A string generated at runtime only has zero termination if you 
add a 0. You should not cast to a pointer assuming the zero is 
going to be there.
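
A minimal sketch of adding the 0 explicitly with `std.string.toStringz` instead of casting. `cLength` is a made-up stand-in for a C function taking `const char *`, not anything from raylib:

```d
import core.stdc.string : strlen;
import std.conv : to;
import std.string : toStringz;

// Hypothetical stand-in for a C function that expects a C string.
size_t cLength(const(char)* s)
{
    return strlen(s);
}

void main()
{
    // Built at runtime, so nothing guarantees a 0 byte after the last char.
    string title = "Window " ~ 42.to!string;

    // toStringz returns a pointer to zero-terminated data, so the C side
    // always finds its terminator.
    const(char)* ctitle = title.toStringz;
    assert(cLength(ctitle) == title.length);
}
```

Note that the result of `toStringz` lives in GC memory, so this is only safe as-is when the C function does not hold on to the pointer after it returns.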


Casting is a blunt instrument, which does not validate what you 
are doing is sound. A cast says "compiler, I know what I'm doing 
here, let me do this even though it's outside the language rules".



So there are a few things to consider:

1. Is the string *transiently used*? That is, does the 
function just quickly use the string and never refer to it 
again? Given that this is raylib, the source is pretty 
readable, so you should be able to figure this out.


I suppose. But if it turns out that the string is used 
continuously (as I assume to be the case with `InitWindow` and 
`SetWindowTitle`) and it doesn't make a copy of it, I imagine 
it would be difficult to design the function overload, as it 
would need to store a copy of the string somewhere. In that 
case, the only clean solution would be to have a global array 
of strings to store everything that's been passed to such 
functions, but that doesn't feel like a very satisfying 
solution. I may take a look inside some Raylib functions if I 
get back to this task.


You can pin memory in the GC to ensure it's not collected by 
using `core.memory.GC.addRoot`, which is effectively "storing in 
a global array".
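
Something along these lines, as a sketch only. `pinCString` and `unpinCString` are names I made up for illustration; the `GC.addRoot` and `GC.removeRoot` calls are the actual `core.memory` API:

```d
import core.memory : GC;

// Hypothetical helper: makes a zero-terminated GC copy of `s` and pins it
// so the GC keeps it alive even if only C code still references it.
const(char)* pinCString(const(char)[] s)
{
    auto buf = new char[s.length + 1];
    buf[0 .. s.length] = s[];
    buf[s.length] = '\0';
    GC.addRoot(buf.ptr);
    return buf.ptr;
}

// Hypothetical helper: call once the C side no longer uses the pointer.
void unpinCString(const(char)* p)
{
    GC.removeRoot(p);
}

void main()
{
    const(char)* p = pinCString("My Window");
    // ... hand `p` to the C library here ...
    assert(p[9] == '\0');
    unpinCString(p);
}
```

The catch is that you have to know when the C library is finished with the string, otherwise the pinned memory is never released.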


2. If 1 is false, will it be saved in memory that is scannable 
by the GC? This is one of the most pernicious issues with 
using C libraries from D. In this case, you will need to 
either allocate the memory with C `malloc` or pin the GC 
memory.


You mean that the GC can destroy objects that still have 
references from the C code?


Yes. If the GC is unaware of the memory that is being used by the 
C code, it can't scan that memory for pointers. It may collect 
these strings early.
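
For the `malloc` route, a sketch might look like this (`mallocCString` is a hypothetical helper name, not an existing function):

```d
import core.stdc.stdlib : free, malloc;
import core.stdc.string : memcpy, strlen;

// Hypothetical helper: copies `s` into malloc'ed memory and adds the
// terminator. The GC never touches this memory, so the caller must free it.
char* mallocCString(const(char)[] s) @nogc nothrow
{
    auto p = cast(char*) malloc(s.length + 1);
    if (p is null)
        return null;
    if (s.length)
        memcpy(p, s.ptr, s.length);
    p[s.length] = '\0';
    return p;
}

void main()
{
    char* p = mallocCString("hello");
    assert(p !is null);
    scope (exit) free(p);
    assert(strlen(p) == 5);
}
```

Here the lifetime is entirely manual: the string stays valid until `free` is called, whatever the GC does.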




For transiently used strings, I would point you at the 
function 
[`tempCString`](https://github.com/dlang/phobos/blob/0663564600edb3cce6e0925599ebe8a6da8c20fd/std/internal/cstring.d#L77), which allocates a temporary C string using malloc or a stack buffer, and then frees it when done with it.
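
A rough usage sketch, assuming `std.internal.cstring` is importable in your setup (it is an internal Phobos module, so it may change without notice). `transientLength` is a made-up wrapper, and `strlen` stands in for a C function that only reads the string during the call:

```d
import core.stdc.string : strlen;
import std.conv : to;
import std.internal.cstring : tempCString;

// Hypothetical wrapper around a C call that uses the string transiently.
size_t transientLength(scope const(char)[] s)
{
    // Keep the result in a named variable: its buffer is released when
    // `tmp` goes out of scope, so the pointer must not outlive it.
    auto tmp = s.tempCString();
    return strlen(tmp.ptr);
}

void main()
{
    string title = "Frame " ~ 7.to!string;
    assert(transientLength(title) == title.length);
}
```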


Thank you. In a previous thread, someone told me that having to 
do many deallocations slows down the program, and the GC is 
more efficient because it deallocates many objects 
simultaneously. Is this something worth considering here, or is 
the overhead going to be tiny even when it's called a few times 
per frame?


In an *application*, I would recommend not worrying about the 
allocation performance until it becomes an issue. I'm writing a 
simple game, and never have worried about GC performance. When 
you do need to worry, you can employ strategies like 
preallocating all things that need allocation (still with the GC).


In a *general library*, you do have to worry about the 
requirements of your users. If you can allocate locally (on the 
stack), this is the most efficient option. This is what 
`tempCString` does (with a fallback to `malloc` when the string 
gets to be large).


The obvious problem in all this is to avoid accepting string 
literals (which are magic and automatically convert to const 
char *). This is currently impossible with function 
overloading, and so you need a separate function name, or put 
them in a different module.


Aren't there any compile-time conditions for this?


Unfortunately no. A `string` does not implicitly convert to 
`const(char)*` unless it is a string literal, and string literals 
bind to `string` before `const(char)*`. So you can't rely on the 
overload working.
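
To make that concrete, here is a small sketch. `which` and `onlyPtr` are throwaway names for the demonstration:

```d
// Two overloads, mirroring a D wrapper next to the raw C binding.
string which(string s)       { return "string"; }
string which(const(char)* p) { return "pointer"; }

// A function with only the pointer signature.
size_t onlyPtr(const(char)* p) { return p is null ? 0 : 1; }

void main()
{
    string s = "variable";

    // A string variable never converts to a pointer implicitly.
    static assert(!is(string : const(char)*));
    static assert(!__traits(compiles, onlyPtr(s)));

    // A literal does convert when the pointer is the only option...
    assert(onlyPtr("literal") == 1);

    // ...but with both overloads present, the literal is an exact match
    // for `string`, so the `string` overload is chosen.
    assert(which("literal") == "string");
    assert(which(s) == "string");
    assert(which(s.ptr) == "pointer");
}
```

So the `string` overload cannot tell a literal from a runtime string, which is why a separate function name or module is needed.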


-Steve


Re: "Error: `TypeInfo` cannot be used with -betterC" on a CTFE function

2024-04-14 Thread Liam McGillivray via Digitalmars-d-learn
On Friday, 12 April 2024 at 15:24:38 UTC, Steven Schveighoffer 
wrote:

```d
void InitWindow(int width, int height, ref string title) {
    InitWindow(width, height, cast(const(char)*)title);
}
```


This is invalid: a string might not be zero-terminated. You can't 
just cast.


Well, it did work when I tried it (using a string variable, not a 
literal of course). It displayed as it is supposed to. But from 
the information I can find on the web it looks like strings are 
sometimes but not `always` zero-terminated. Not a great look for 
the language. Are there any rules to determine when it is and 
when it isn't (for string variables)?



So there are a few things to consider:

1. Is the string *transiently used*? That is, does the function 
just quickly use the string and never refer to it again? Given 
that this is raylib, the source is pretty readable, so you 
should be able to figure this out.


I suppose. But if it turns out that the string is used 
continuously (as I assume to be the case with `InitWindow` and 
`SetWindowTitle`) and it doesn't make a copy of it, I imagine it 
would be difficult to design the function overload, as it would 
need to store a copy of the string somewhere. In that case, the 
only clean solution would be to have a global array of strings to 
store everything that's been passed to such functions, but that 
doesn't feel like a very satisfying solution. I may take a look 
inside some Raylib functions if I get back to this task.


2. If 1 is false, will it be saved in memory that is scannable 
by the GC? This is one of the most pernicious issues with using 
C libraries from D. In this case, you will need to either 
allocate the memory with C `malloc` or pin the GC memory.


You mean that the GC can destroy objects that still have 
references from the C code?


For transiently used strings, I would point you at the function 
[`tempCString`](https://github.com/dlang/phobos/blob/0663564600edb3cce6e0925599ebe8a6da8c20fd/std/internal/cstring.d#L77), which allocates a temporary C string using malloc or a stack buffer, and then frees it when done with it.


Thank you. In a previous thread, someone told me that having to 
do many deallocations slows down the program, and the GC is more 
efficient because it deallocates many objects simultaneously. Is 
this something worth considering here, or is the overhead going 
to be tiny even when it's called a few times per frame?


The obvious problem in all this is to avoid accepting string 
literals (which are magic and automatically convert to const 
char *). This is currently impossible with function 
overloading, and so you need a separate function name, or put 
them in a different module.


Aren't there any compile-time conditions for this?