Re: More fun with toStringz and the GC

2022-08-06 Thread Don Allen via Digitalmars-d-announce

On Saturday, 6 August 2022 at 13:40:12 UTC, Don Allen wrote:
On Saturday, 6 August 2022 at 02:14:24 UTC, Steven 
Schveighoffer wrote:

On 8/5/22 8:51 PM, Don Allen wrote:


And this, from Section 32.2 of the Language Reference Manual:

If pointers to D garbage collector allocated memory are 
passed to C functions, it's critical to ensure that the 
memory will not be collected by the garbage collector before 
the C function is done with it. This is accomplished by:


     Making a copy of the data using 
core.stdc.stdlib.malloc() and passing the copy instead.
     -->Leaving a pointer to it on the stack (as a parameter 
or automatic variable), as the garbage collector will scan 
the stack.<--
     Leaving a pointer to it in the static data segment, as 
the garbage collector will scan the static data segment.
     Registering the pointer with the garbage collector with 
the std.gc.addRoot() or std.gc.addRange() calls.


I did what the documentation says and it does not work.


I know, I felt exactly the same way in my post on it:

https://forum.dlang.org/post/sial38$7v0$1...@digitalmars.com

I even issued a PR to remove the problematic recommendation:

https://github.com/dlang/dlang.org/pull/3102

But there was pushback to the point where it wasn't worth it. 
So I closed it.


As I said in my previous post, the documentation issue really 
needs to be addressed.


I do realize now that I *assumed* that what I did was going to 
result in a stack reference to the c-string I was trying to 
keep alive.


At the risk of over-doing this, one more thing I want to say in 
the interest of clarity: the incorrect documentation led me right 
into this error: "This is accomplished by . Leaving a pointer 
to it on the stack (as a parameter or automatic variable), as the 
garbage collector will scan

the stack."

I've fixed my code using addRoot/removeRoot and so far it seems 
to work.





Re: More fun with toStringz and the GC

2022-08-06 Thread Don Allen via Digitalmars-d-announce
On Saturday, 6 August 2022 at 02:14:24 UTC, Steven Schveighoffer 
wrote:

On 8/5/22 8:51 PM, Don Allen wrote:


And this, from Section 32.2 of the Language Reference Manual:

If pointers to D garbage collector allocated memory are passed 
to C functions, it's critical to ensure that the memory will 
not be collected by the garbage collector before the C 
function is done with it. This is accomplished by:


     Making a copy of the data using core.stdc.stdlib.malloc() 
and passing the copy instead.
     -->Leaving a pointer to it on the stack (as a parameter 
or automatic variable), as the garbage collector will scan the 
stack.<--
     Leaving a pointer to it in the static data segment, as 
the garbage collector will scan the static data segment.
     Registering the pointer with the garbage collector with 
the std.gc.addRoot() or std.gc.addRange() calls.


I did what the documentation says and it does not work.


I know, I felt exactly the same way in my post on it:

https://forum.dlang.org/post/sial38$7v0$1...@digitalmars.com

I even issued a PR to remove the problematic recommendation:

https://github.com/dlang/dlang.org/pull/3102

But there was pushback to the point where it wasn't worth it. 
So I closed it.


As I said in my previous post, the documentation issue really 
needs to be addressed.


I do realize now that I *assumed* that what I did was going to 
result in a stack reference to the c-string I was trying to keep 
alive. Bad assumption, obviously. But I think the point is that 
there is a simple, reliable mechanism -- addRoot, removeRoot -- 
that works and the documentation should say that and only that. 
Walter said this in his 9/25/21 post: "Use GC.addRoot() to keep a 
reference alive. That's what it's for.
". That's all that's needed. All the rest leads people like me 
who don't think like a compiler to make the mistake I made.


Re: More fun with toStringz and the GC

2022-08-06 Thread H. S. Teoh via Digitalmars-d-announce
On Fri, Aug 05, 2022 at 10:14:24PM -0400, Steven Schveighoffer via 
Digitalmars-d-announce wrote:
> On 8/5/22 8:51 PM, Don Allen wrote:
> 
> > And this, from Section 32.2 of the Language Reference Manual:
> > 
> > If pointers to D garbage collector allocated memory are passed to C
> > functions, it's critical to ensure that the memory will not be
> > collected by the garbage collector before the C function is done
> > with it. This is accomplished by:
> > 
> >      Making a copy of the data using core.stdc.stdlib.malloc() and
> >  passing the copy instead.
> >      -->Leaving a pointer to it on the stack (as a parameter or
> > automatic variable), as the garbage collector will scan the stack.<--
> >      Leaving a pointer to it in the static data segment, as the garbage
> > collector will scan the static data segment.
> >      Registering the pointer with the garbage collector with the
> > std.gc.addRoot() or std.gc.addRange() calls.
> > 
> > I did what the documentation says and it does not work.
> 
> I know, I felt exactly the same way in my post on it:
> 
> https://forum.dlang.org/post/sial38$7v0$1...@digitalmars.com
> 
> I even issued a PR to remove the problematic recommendation:
> 
> https://github.com/dlang/dlang.org/pull/3102
> 
> But there was pushback to the point where it wasn't worth it. So I
> closed it.
[...]

IMO this PR should be revived. The one thing worse than no documentation
is misleading documentation.  This state of things should not be allowed
to continue.


T

-- 
Study gravitation, it's a field with a lot of potential.


Re: More fun with toStringz and the GC

2022-08-05 Thread Steven Schveighoffer via Digitalmars-d-announce

On 8/5/22 8:51 PM, Don Allen wrote:


And this, from Section 32.2 of the Language Reference Manual:

If pointers to D garbage collector allocated memory are passed to C 
functions, it's critical to ensure that the memory will not be collected 
by the garbage collector before the C function is done with it. This is 
accomplished by:


     Making a copy of the data using core.stdc.stdlib.malloc() and 
passing the copy instead.
     -->Leaving a pointer to it on the stack (as a parameter or 
automatic variable), as the garbage collector will scan the stack.<--
     Leaving a pointer to it in the static data segment, as the garbage 
collector will scan the static data segment.
     Registering the pointer with the garbage collector with the 
std.gc.addRoot() or std.gc.addRange() calls.


I did what the documentation says and it does not work.


I know, I felt exactly the same way in my post on it:

https://forum.dlang.org/post/sial38$7v0$1...@digitalmars.com

I even issued a PR to remove the problematic recommendation:

https://github.com/dlang/dlang.org/pull/3102

But there was pushback to the point where it wasn't worth it. So I 
closed it.


-Steve


Re: More fun with toStringz and the GC

2022-08-05 Thread Don Allen via Digitalmars-d-announce
On Friday, 5 August 2022 at 23:38:22 UTC, Steven Schveighoffer 
wrote:

On 8/5/22 7:13 PM, jfondren wrote:

On Friday, 5 August 2022 at 22:51:07 UTC, Don Allen wrote:
My theory: because gc_protect2 is never referenced, I'm 
guessing that the compiler is optimizing away the storage of 
the returned pointer, the supporting evidence being what I 
said in the previous paragraph. Anyone have a better idea?


A local variable definitely isn't enough: 
https://forum.dlang.org/thread/xchnfzvpmxgytqprb...@forum.dlang.org


This package came of it: 
https://code.dlang.org/packages/keepalive




Yes, but I will warn you, the compilers are smart buggers. I 
think someone came up with a case where this still doesn't keep 
it alive (been a while since I made that).


The only true solution is to use `GC.addRoot` on the string and 
`GC.removeRoot` when you are done.


Steve --

Thanks for this.

But this time I *did* read the documentation, specifically this:

Interfacing Garbage Collected Objects With Foreign Code

The garbage collector looks for roots in:

the static data segment
the stacks and register contents of each thread
the TLS (thread-local storage) areas of each thread
any roots added by core.memory.GC.addRoot() or 
core.memory.GC.addRange()
If the only pointer to an object is held outside of these areas, 
then the collector will miss it and free the memory.


To avoid this from happening, either

maintain a pointer to the object in an area the collector 
does scan for pointers;
add a root where a pointer to the object is stored using 
core.memory.GC.addRoot() or core.memory.GC.addRange().
reallocate and copy the object using the foreign code's 
storage allocator or using the C runtime library's malloc/free.



And this, from Section 32.2 of the Language Reference Manual:

If pointers to D garbage collector allocated memory are passed to 
C functions, it's critical to ensure that the memory will not be 
collected by the garbage collector before the C function is done 
with it. This is accomplished by:


Making a copy of the data using core.stdc.stdlib.malloc() and 
passing the copy instead.
-->Leaving a pointer to it on the stack (as a parameter or 
automatic variable), as the garbage collector will scan the 
stack.<--
Leaving a pointer to it in the static data segment, as the 
garbage collector will scan the static data segment.
Registering the pointer with the garbage collector with the 
std.gc.addRoot() or std.gc.addRange() calls.


I did what the documentation says and it does not work.

Having a better version of C and C++ with a gc and the ability to 
directly call useful C/C++ libraries is a big D selling point, as 
far as I am concerned. It was a major motivation for the creation 
of Go. But getting the interaction between the GC and foreign 
functions properly documented is essential. Right now, there are 
bits and pieces of advice in the Language Reference, the Feature 
Overview, and the toStringz documentation and none of it tells 
you what you need to know. In fact, it does the opposite, telling 
you to do something (stick a pointer on the stack) that does not 
work, which leads to the "nasty bug" spoken of in the toStringz 
doc. When you waste a lot of a user's time with poor and 
inaccurate documentation, as this did mine, you are not making 
friends. I would advise fixing this asap.


/Don


Re: More fun with toStringz and the GC

2022-08-05 Thread Steven Schveighoffer via Digitalmars-d-announce

On 8/5/22 7:13 PM, jfondren wrote:

On Friday, 5 August 2022 at 22:51:07 UTC, Don Allen wrote:
My theory: because gc_protect2 is never referenced, I'm guessing that 
the compiler is optimizing away the storage of the returned pointer, 
the supporting evidence being what I said in the previous paragraph. 
Anyone have a better idea?


A local variable definitely isn't enough: 
https://forum.dlang.org/thread/xchnfzvpmxgytqprb...@forum.dlang.org


This package came of it: https://code.dlang.org/packages/keepalive



Yes, but I will warn you, the compilers are smart buggers. I think 
someone came up with a case where this still doesn't keep it alive (been 
a while since I made that).


The only true solution is to use `GC.addRoot` on the string and 
`GC.removeRoot` when you are done.


-Steve


Re: More fun with toStringz and the GC

2022-08-05 Thread jfondren via Digitalmars-d-announce

On Friday, 5 August 2022 at 22:51:07 UTC, Don Allen wrote:
My theory: because gc_protect2 is never referenced, I'm 
guessing that the compiler is optimizing away the storage of 
the returned pointer, the supporting evidence being what I said 
in the previous paragraph. Anyone have a better idea?


A local variable definitely isn't enough: 
https://forum.dlang.org/thread/xchnfzvpmxgytqprb...@forum.dlang.org


This package came of it: https://code.dlang.org/packages/keepalive



More fun with toStringz and the GC

2022-08-05 Thread Don Allen via Digitalmars-d-announce
Remember all the fun we had last year when I failed to heed the 
warning in the toStringz documentation about retaining a 
reference to a char * passed into C? It took a long time to find 
that one, with a lot of help from Steve Schveighoffer and others.


Well, I've got another one. Consider this:


// Get number of children of the parent account
auto gc_protect = bind_text(n_children_stmt, 1, 
parent.guid);
parent.children.length = one_row!(int)(n_children_stmt, 
&get_int);


auto gc_protect2 = bind_text(account_child_stmt, 1, 
parent.guid);
for (int i = 0; next_row_available_p(account_child_stmt, 
&sqlite3_reset); i++) {

parent.children[i] = new Account;
parent.children[i].name = 
fromStringz(sqlite3_column_text(account_child_stmt, 0)).idup;
parent.children[i].guid = 
fromStringz(sqlite3_column_text(account_child_stmt, 1)).idup;
parent.children[i].flags = 
sqlite3_column_int(account_child_stmt, 2);
parent.children[i].value = 
get_account_value(parent.children[i]);

}

bind_text takes a D string, turns it into a C string with 
toStringz, uses that to call sqlite3_bind_text and returns the C 
string, which I store as you can see with the intention of 
protecting it from the gc. The code as written above does not 
work. At some point, I get an index-out-of-bounds error, because 
the loop is seeing too many children. If I turn off the GC, the 
code works correctly and the application completes normally.


With the GC on, if I put a debugging writeln inside the loop, 
right after the 'for', that prints, among other things, the value 
of gc_protect2 (I wanted to convince myself that the GC wasn't 
moving what it points to; yes, I know the documentation says the 
current GC won't do that), the problem goes away. A Heisenbug!


My theory: because gc_protect2 is never referenced, I'm guessing 
that the compiler is optimizing away the storage of the returned 
pointer, the supporting evidence being what I said in the 
previous paragraph. Anyone have a better idea?


By the way, I get the same error compiling this with dmd or ldc.

/Don Allen