Re: AA vs __gshared

2023-08-15 Thread IchorDev via Digitalmars-d-learn
On Tuesday, 15 August 2023 at 17:36:13 UTC, Steven Schveighoffer 
wrote:

On 8/12/23 5:55 AM, IchorDev wrote:
On Thursday, 10 August 2023 at 15:20:28 UTC, Steven 
Schveighoffer wrote:

That shouldn't matter.


Well, it does here. The AA is mutated during the loop, so 
perhaps this is an optimisation quirk where it works with 
`for` but segfaults in `foreach`? I've pretty thoroughly 
abused the `for` version and I haven't gotten it to segfault 
yet.


oh yeah. That is not allowed. Any mutation of the AA during 
iteration can invalidate existing foreach or ranges over the AA.


And yet this apparently doesn’t apply to an equivalent `for` 
somehow. Perhaps `foreach` shouldn’t have this arbitrary problem 
in the first place…


In fact, that statement is way too broad. Invalidation of 
iteration should be based on the type's requirements.


We really should put a note in the AA spec page.


Probably yeah.


https://dlang.org/phobos/core_lifetime.html#.emplace

Any attribute requirements would be inferred based on the 
attributes of your constructor, because emplace is a template.


-Steve


Well the docs don’t say that at all, and the code is an 
unreadable mess. I dunno how anymore is meant to figure that out?


Re: AA vs __gshared

2023-08-15 Thread Steven Schveighoffer via Digitalmars-d-learn

On 8/12/23 5:55 AM, IchorDev wrote:

On Thursday, 10 August 2023 at 15:20:28 UTC, Steven Schveighoffer wrote:

That shouldn't matter.


Well, it does here. The AA is mutated during the loop, so perhaps this 
is an optimisation quirk where it works with `for` but segfaults in 
`foreach`? I've pretty thoroughly abused the `for` version and I haven't 
gotten it to segfault yet.


oh yeah. That is not allowed. Any mutation of the AA during iteration 
can invalidate existing foreach or ranges over the AA.


Basically, a rehash can cause the buckets to jumble up, and in that 
case, the "current index" can be changed to point at a null bucket.


More here: https://dlang.org/spec/statement.html#foreach_restrictions

In fact, that statement is way too broad. Invalidation of iteration 
should be based on the type's requirements.


We really should put a note in the AA spec page.

I also highly recommend using `emplace` to handle all the sticky 
issues with lifetime/construction.


Have not run into the aforementioned sticky issues yet, but I can't even 
find `emplace`'s docs anywhere now.


https://dlang.org/phobos/core_lifetime.html#.emplace

I recall it being incompatible with 
classes that have @nogc/nothrow constructors though, which made it 
pretty useless to me, and it wouldn't work with BetterC, which was a 
requirement for the allocation wrapper I was writing at the time.


It probably won't work with betterC, but that's probably just because of 
linker errors.


Any attribute requirements would be inferred based on the attributes of 
your constructor, because emplace is a template.


-Steve


Re: AA vs __gshared

2023-07-28 Thread Steven Schveighoffer via Digitalmars-d-learn

On 7/28/23 11:15 AM, IchorDev wrote:

On Friday, 28 July 2023 at 11:15:31 UTC, Steven Schveighoffer wrote:
All `__gshared` does is give you storage that is accessible from all 
threads,


"All __gshared does is give you [a live bomb, ready to go off at any 
moment]"

!!


It seems like it's not __gshared at all, but misusing malloc with GC 
pointers.




On Friday, 28 July 2023 at 14:10:16 UTC, Kagamin wrote:
Your error is using allocating the object with malloc. Since gc 
doesn't see your AA, the AA is freed and you get UAF.


Friend, I think you nailed it. After adding this I haven't been able to 
reproduce the segfault again:

```d
this(long n){
 (){
     cache = new typeof(cache);
     assert(cache);
     import core.memory;
     core.memory.GC.addRoot(cast(void*)cache);
 }();
 // ...
```
It did always happen at random, so perhaps I haven't spent enough time 
testing it yet, but I've gone far longer without it segfaulting than 
ever before.


This is the wrong approach, it's the allocating call that should add the 
root (actually a range).


For instance, the mutex is not added, that might be collected. Or if you 
add more GC-pointing things into the class, that could be a problem.


What I'd do is:

```d
T alloc(T, A...)(auto ref A args){
enum classSize = __traits(classInstanceSize, T);
void* mem = core.stdc.stdlib.malloc(classSize);
assert(mem !is null, "Out of memory");
core.memory.GC.addRange(mem[0 .. classSize]);
scope(failure) {
   core.memory.GC.removeRange(mem[0 .. classSize]);
   core.stdc.stdlib.free(mem);
}
T inst = cast(T)mem;
inst.emplace(__traits(parameters));
return inst;
}
```

And of course, a `dealloc` that removes the range should also be added.

-Steve


Re: AA vs __gshared

2023-07-28 Thread IchorDev via Digitalmars-d-learn
On Friday, 28 July 2023 at 11:15:31 UTC, Steven Schveighoffer 
wrote:
All `__gshared` does is give you storage that is accessible 
from all threads,


"All __gshared does is give you [a live bomb, ready to go off at 
any moment]"

!!

On Friday, 28 July 2023 at 14:10:16 UTC, Kagamin wrote:
Your error is using allocating the object with malloc. Since gc 
doesn't see your AA, the AA is freed and you get UAF.


Friend, I think you nailed it. After adding this I haven't been 
able to reproduce the segfault again:

```d
this(long n){
(){
cache = new typeof(cache);
assert(cache);
import core.memory;
core.memory.GC.addRoot(cast(void*)cache);
}();
// ...
```
It did always happen at random, so perhaps I haven't spent enough 
time testing it yet, but I've gone far longer without it 
segfaulting than ever before.


Re: AA vs __gshared

2023-07-28 Thread Kagamin via Digitalmars-d-learn
Your error is using allocating the object with malloc. Since gc 
doesn't see your AA, the AA is freed and you get UAF.


Re: AA vs __gshared

2023-07-28 Thread Steven Schveighoffer via Digitalmars-d-learn

On 7/28/23 4:39 AM, IchorDev wrote:


Issue is, this code I posted actually runs just fine, unlike the real code.
My actual code does this HIGHLY SUSPICIOUS thing when printing their 
length each time before using them:

```
766
766
765
766
767
768
768
768
768
768
768
768
768
768
768
768
768
768
768
128000
Error Program exited with code -11
(backtrace:)
#0  0x55670c46 in rt.aaA.Impl.findSlotLookup(ulong, scope 
const(void*), scope const(TypeInfo)) inout ()

#1  0x55661592 in _aaInX ()
```


My suspicion would be a race condition in your real code, and no race 
condition in this toy code. Second suspicion would be memory corruption 
(possibly caused by a race condition).


Therefore I must logically conclude that DRuntime's AAs are cursed! 
Unless this is a well-known issue or something...


AAs have worked forever. I've never had problems with them. Not saying 
there's not a bug here, maybe there is. But I would be surprised.


Thinking back, I've actually had them cause segfaults in non-threaded 
code, maybe `__gshared` was never the cause at all.


All `__gshared` does is give you storage that is accessible from all 
threads, but is not typed as `shared`. It doesn't change how the data is 
used.


-Steve


Re: AA vs __gshared

2023-07-28 Thread IchorDev via Digitalmars-d-learn

On Thursday, 27 July 2023 at 21:31:02 UTC, Jonathan M Davis wrote:
What should normally be happening is that you use shared, and 
then when you've protected the object so that you know that it 
can only be accessed on the current thread by the section of 
code that you're in (e.g. by locking a mutex), you temporarily 
cast away shared to operate on the object via a thread-local 
reference. Then, before exiting that section of code and 
removing the protections that are preventing other threads from 
accessing the object (e.g. by unlocking the mutex), you make 
sure that you've gotten rid of all of the thread-local 
references to the object so that only the shared reference 
exists. That way, you don't accidentally mutate the object 
while it's not protected from access by other threads.


Doing this doesn't help, unfortunately.
This is an abstract version what the problematic code looks like 
now:

```d
import std.stdio: writeln;
import core.sync.mutex;
import core.thread;
import core.time;
static import core.stdc.stdlib;

struct Pos{
long x,y;
}

shared Obj obj;

final class Obj{
CacheData[Pos] cache;
CacheData* getCache(Pos key){
if(auto item = key in cache){
if(++item.useCount >= 8){
cache.remove(key);
//I thought this ^ might cause a use-after-free issue, but 
apparently not.

}
return item;
}
return null;
}
struct CacheData{
double[1000] data;
uint useCount = 1;
}

double[1000] doStuff(Pos pos){
immutable data = (){
if(auto data = getCache(pos)){
return *data;
}else{
double[1000] data;
//initialisses it with something, this is 
arbirray though:
foreach(ref item; data){
import std.random;
item = uniform01();
}
cache[pos] = CacheData(data);
return CacheData(data);
}
}();

//do stuff with data

return data.data;
}   
}

__gshared OtherObj otherObj;

final class OtherObj{
shared Mutex mutex;
__gshared ubyte[2^^18] data;

this(long n){
obj = cast(shared Obj)alloc!Obj(n);
mutex = new shared Mutex;
}

void callObj(Pos pos){
double[1000] data;

{
auto objRef = cast(Obj)obj;
data = objRef.doStuff(pos);
}

//do things with returned value...
}
}

void thread(){
bool run = true;
Pos pos;
while(run){
otherObj.mutex.lock();
foreach(i; 0..100){
otherObj.callObj(pos);
//this is pretty arbirary:
import std.random;
if(uniform01() > 0.9){
auto v = uniform01();
if(v < 0.25) pos.x--;
else if(v < 0.5) pos.y++;
else if(v < 0.75) pos.y--;
else pos.x++;
}
}
otherObj.mutex.unlock();

Thread.sleep(1.seconds / 20);
}
}

void main(){
otherObj = alloc!OtherObj(-7);
assert(otherObj !is null);

auto otherThread = new Thread();
otherThread.start();

bool run = true;
while(run){
if(!otherThread.isRunning()) otherThread.join();
otherObj.mutex.lock();
foreach(val; otherObj.data){
//do stuff
}
otherObj.mutex.unlock();
Thread.sleep(1.seconds / 80);
}
}

T alloc(T, A...)(auto ref A args){
enum classSize = __traits(classInstanceSize, T);
void* mem = core.stdc.stdlib.malloc(classSize);
scope(failure) core.stdc.stdlib.free(mem);
assert(mem !is null, "Out of memory");
mem[0..classSize] = __traits(initSymbol, T);
T inst = cast(T)mem;
static if(__traits(hasMember, T, "__ctor")){

inst.__ctor(__traits(parameters));
}
return inst;
}
```
Issue is, this code I posted actually runs just fine, unlike the 
real code.
My actual code does this HIGHLY SUSPICIOUS thing when printing 
their length each time 

Re: AA vs __gshared

2023-07-28 Thread IchorDev via Digitalmars-d-learn

On Friday, 28 July 2023 at 04:13:13 UTC, Kagamin wrote:
The difference between them is purely formal if you're not on 
an old gdc, where shared was synchronized like C# volatile.


I'm not sure that's correct. Also I always use the latest 
compiler versions where possible.



start many threads and access the locked AA concurrently.


It's only being accessed from one thread, so such a test wouldn't 
be an accurate reproduction of the issue I'm having. Also I'm 
only using 2 threads total.


Re: AA vs __gshared

2023-07-27 Thread Kagamin via Digitalmars-d-learn

On Friday, 28 July 2023 at 03:54:53 UTC, IchorDev wrote:
I was told that using `__gshared` is quite a bit faster at 
runtime than using `shared`, but I also don't really know 
anything concrete about `shared` because the spec is so 
incredibly vague about it.


The difference between them is purely formal if you're not on an 
old gdc, where shared was synchronized like C# volatile. If the 
crashes are frequent, can you reproduce a crash with a minimal 
amount of code, start many threads and access the locked AA 
concurrently.


Re: AA vs __gshared

2023-07-27 Thread IchorDev via Digitalmars-d-learn

On Thursday, 27 July 2023 at 21:31:02 UTC, Jonathan M Davis wrote:


Now, as to what's happening in your code that's causing 
segfaults, the most likely culprit would be that you're 
accessing the AA without actually having done anything to 
prevent other threads from accessing it at the same time (or 
your protections were inadequate).


I use a shared Mutex from `core.sync.mutex`.
The AA itself and the `final class` that it's a member of is only 
ever accessed by one thread for now.
The code inside that `final class` that accesses the AA is called 
by a function from another `final class`, and that other `final 
class` is used by multiple threads, but it's fully guarded by 
Mutex locks—I haven't had any issues with the millions of 
non-atomic reads and writes on its data I've performed from the 
two threads. Both objects are "static" (declared at module scope) 
if that matters at all.


if the AA were shared, the only sections of code where you 
would have to worry about thread-local references escaping 
would be in the sections of code where you've cast away shared 
after locking the relevant mutex. So, similar to what happens 
with @safe and @trusted, using shared allows you to limit the 
code that you have to examine to find the problem.


I was told that using `__gshared` is quite a bit faster at 
runtime than using `shared`, but I also don't really know 
anything concrete about `shared` because the spec is so 
incredibly vague about it.


Re: AA vs __gshared

2023-07-27 Thread Jonathan M Davis via Digitalmars-d-learn
On Thursday, July 27, 2023 9:57:51 AM MDT IchorDev via Digitalmars-d-learn 
wrote:
> I've been getting a lot of segfaults from using associative
> arrays recently. The faults happen seemingly at random, and from
> pretty mundane stuff like `if(auto x = y in z)` that run very
> often:
> ```
> Segmentation fault.
> #0  0x55670f4a in rt.aaA.Impl.findSlotLookup(ulong, scope
> const(void*), scope const(TypeInfo)) inout ()
> #1  0x55661662 in _aaInX ()
> ```
>
> I suspect that this is because they've all been placed inside
> `__ghared` structs. Are DRuntime's AAs simply incompatible with
> `__gshared`? Do they need to be marked as `shared` to prevent
> these shenanigans?

Strictly speaking, __gshared is really only intended for stuff like C
globals (which can't be shared due to name-mangling issues). Using it on
anything else can at least potentially cause problems due to the fact that
the compiler will assume that the variable is thread-local. So, I would
strongly advise against using __gshared in a case like this. In practice,
you can often get away with it, because the compiler doesn't do much in the
way of optimizing stuff based on objects being thread-local right now, but
it's definitely risking problems with the type system if you used __gshared
when you're not trying to do something like bind to a C global.

What should normally be happening is that you use shared, and then when
you've protected the object so that you know that it can only be accessed on
the current thread by the section of code that you're in (e.g. by locking a
mutex), you temporarily cast away shared to operate on the object via a
thread-local reference. Then, before exiting that section of code and
removing the protections that are preventing other threads from accessing
the object (e.g. by unlocking the mutex), you make sure that you've gotten
rid of all of the thread-local references to the object so that only the
shared reference exists. That way, you don't accidentally mutate the object
while it's not protected from access by other threads.

Now, as to what's happening in your code that's causing segfaults, the most
likely culprit would be that you're accessing the AA without actually having
done anything to prevent other threads from accessing it at the same time
(or your protections were inadequate). And because the object is being
treated as thread-local by the compiler, it would be easy to have
accidentally let a reference to it leak somewhere that wasn't being
protected by whatever mutex you're using, whereas if the AA were shared, the
only sections of code where you would have to worry about thread-local
references escaping would be in the sections of code where you've cast away
shared after locking the relevant mutex. So, similar to what happens with
@safe and @trusted, using shared allows you to limit the code that you have
to examine to find the problem.

- Jonathan M Davis





Re: AA vs __gshared

2023-07-27 Thread Dennis via Digitalmars-d-learn

On Thursday, 27 July 2023 at 15:57:51 UTC, IchorDev wrote:
The faults happen seemingly at random, and from pretty mundane 
stuff like `if(auto x = y in z)` that run very often:


Are you accessing the AA from multiple threads?


AA vs __gshared

2023-07-27 Thread IchorDev via Digitalmars-d-learn
I've been getting a lot of segfaults from using associative 
arrays recently. The faults happen seemingly at random, and from 
pretty mundane stuff like `if(auto x = y in z)` that run very 
often:

```
Segmentation fault.
#0  0x55670f4a in rt.aaA.Impl.findSlotLookup(ulong, scope 
const(void*), scope const(TypeInfo)) inout ()

#1  0x55661662 in _aaInX ()
```

I suspect that this is because they've all been placed inside 
`__ghared` structs. Are DRuntime's AAs simply incompatible with 
`__gshared`? Do they need to be marked as `shared` to prevent 
these shenanigans?