Re: Possible bug in associative array implementation (and/or @safe checking)

Steven Schveighoffer via Digitalmars-d-learn Fri, 17 Aug 2018 06:15:53 -0700

On 8/16/18 4:45 PM, Aaron D. Trout wrote:

On Thursday, 16 August 2018 at 18:56:45 UTC, Steven Schveighoffer wrote:
On 8/16/18 2:32 PM, Aaron D. Trout wrote:
[...]
On Thursday, 16 August 2018 at 17:20:23 UTC, Steven Schveighoffer wrote:
Yes, this is the effect I would expect.
D has traditionally simply allowed slicing stack data without question (even in @safe code), but that will change when dip1000 is fully realized. It will be allowed, but only when assigning to scope variables.
Thanks for the quick and knowledgeable reply! I think I understand what's going on, but I'm surprised it is allowed in @safe code since the compiler doesn't allow the following, even in non-@safe code:
int[] badSlice()
{
     int[2] buffer;
     return buffer[];
}
It's because it's on the same line. This is a crude "safe" feature that is easily duped.
This is allowed to compile:

int[2] buffer;
auto buf = buffer[];
return buf;

But add -dip1000 to the dmd options and that fails.
I would warn you that I think dip1000 is too crude to start trying to apply it to your project, and may have linker errors with Phobos.
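A minimal sketch of the "only when assigning to scope variables" rule, kept separate from the thread's own snippets. The function names `sumLocal` and `copyOut` are hypothetical, and the exact diagnostics depend on the compiler version and on whether -dip1000 is passed. The slice of the stack buffer stays inside the function; when the data has to leave, it is copied to the heap first.

```d
// Sketch: keep a slice of a stack buffer local instead of returning it.
@safe int sumLocal()
{
    int[2] buffer = [3, 4];
    // `scope` marks the slice as one that must not outlive this function.
    scope int[] buf = buffer[];
    int total = 0;
    foreach (x; buf)
        total += x;
    return total; // a plain value leaves the function, the slice does not
}

// If the caller needs the data, copy it to the GC heap before returning.
@safe int[] copyOut()
{
    int[2] buffer = [3, 4];
    return buffer[].dup; // new heap array, independent of the stack frame
}

void main()
{
    assert(sumLocal() == 7);
    assert(copyOut() == [3, 4]);
}
```

The copy in `copyOut` costs an allocation, which is the trade-off discussed in the rest of the thread.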
I guess the compiler just isn't (yet!) able to catch that the associative array is storing a slice of expired stack. I'm surprised that the built-in AA implementation *allows* using slices as keys in @safe code without copying the underlying data to the heap first. This is clearly dangerous, but perhaps heap-copying slices defensively would result in an unacceptable performance hit.
I wouldn't put too much stock in having safety in the AA. The AA is a very very old piece of the compiler, that pre-dates safety checks, and still is a bit of a kludge in terms of type and memory safety. If you do find any obvious bugs, it's good to report them.
This issue came up while trying to eliminate unnecessary allocation in my code. In my case, I could set a maximum key length at compile time and switch my key type to a struct wrapping a static array buffer.
In hindsight, it was silly for me to think I could eliminate separately allocating the keys when the key type was a variable length array, since the AA must store the keys. That said, a suitable admonition from the compiler here would have been very educational. I look forward to seeing the full inclusion of DIP1000!
In this case, actually, the AA does NOT store the key data, but just the reference to the keys. An array slice is a pointer and length, and the data is stored elsewhere. The static version, however, does store all the key data inside the AA.
That being said, you can potentially avoid more allocation with the keys with various tricks, such as pre-allocating all the keys and then using the reference.
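A small sketch of that difference, under assumed names: `FixedKey` is a hypothetical stand-in for the "struct wrapping a static array buffer" mentioned above, and both helper functions are made up for illustration. The first checks that a slice key inside the AA still points at the original array; the second checks that a struct key carries its own copy of the data.

```d
// Hypothetical key type: a fixed-capacity buffer stored by value.
struct FixedKey
{
    int[4] data;   // key data lives inside the struct itself
    size_t length; // number of elements in use
}

// A slice key stored in the AA still points into `backing`.
bool sliceKeySharesStorage()
{
    int[] backing = [1, 2, 3];
    const(int)[] key = backing[0 .. 2]; // pointer + length, no copy
    int[const(int)[]] aa;
    aa[key] = 10;
    return aa.byKey.front.ptr is backing.ptr;
}

// A struct key is copied into the AA, data included.
bool structKeyOwnsStorage()
{
    int[] backing = [1, 2, 3];
    FixedKey key;
    key.data[0 .. 2] = backing[0 .. 2]; // element-wise copy into the struct
    key.length = 2;
    int[FixedKey] aa;
    aa[key] = 10;
    return aa.byKey.front.data.ptr !is backing.ptr && aa[key] == 10;
}

void main()
{
    assert(sliceKeySharesStorage());
    assert(structKeyOwnsStorage());
}
```

Because the slice key aliases `backing`, the array it points into has to stay alive and unchanged for as long as the entry is in the AA.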
In other words, eagerly stick the data into an array of arrays:
auto sets = setA.map!(j => setB.filter!(i => i % j == 0).array).array;
and then not worry about duping them. But it all depends on your use case.
Thanks again for the quick reply! I have a pretty firm grasp on what a slice is (pointer + offset).
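A runnable sketch of the eager approach, with sample inputs invented for illustration (the real `setA` and `setB` are not shown in the thread) and a hypothetical wrapper `buildSets`. Each inner array is allocated once on the heap and then used directly as an AA key, with no further `dup`.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : iota;

// For every j in setA, eagerly collect the multiples of j found in setB.
int[][] buildSets(int[] setA, int[] setB)
{
    return setA.map!(j => setB.filter!(i => i % j == 0).array).array;
}

void main()
{
    int[] setA = [2, 3];            // sample data, assumed
    int[] setB = iota(1, 13).array; // 1 through 12
    auto sets = buildSets(setA, setB);
    assert(sets == [[2, 4, 6, 8, 10, 12], [3, 6, 9, 12]]);

    // The heap-allocated inner arrays serve as keys without another copy.
    int[const(int)[]] sizes;
    foreach (s; sets)
        sizes[s] = cast(int) s.length;
    assert(sizes.length == 2);
    assert(sizes[sets[0]] == 6);
}
```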


pointer + length, but maybe that's what you meant.

What I had meant by the comment "the AA must store the keys" was that I had somehow gotten the (of course totally mistaken!) idea that the AA only ever needed to *examine* the key rather than actually storing it.

Right, the hash only gets you to a bucket, you still need the actual value to compare for equality.

If that were the case, a slice of about-to-be-expired stack would be perfectly fair game as a key. Am I correct that doing this *would* be an OK way to avoid unnecessary allocation if we knew the key already existed (as a heap allocated slice) in the AA and we simply wanted to modify the associated value?

Yes, definitely! There have been a few new functions added to AAs recently to help with only allocating *values* when not present, but not a way to do the same with keys.


What you *can* do (but this involves 2 lookups) is:

int[2] buf = ...;

if (auto valptr = buf[] in aa)
{
   // use *valptr to get the value
}
else
{
   aa[buf.idup] = 0; // initial value
}

I don't think the storage of the key was considered when adding the new functions (`require` and `update`).
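A self-contained version of the two-lookup pattern, as a sketch only. The alias `Counts`, the helper `bump`, and the counting use case are assumptions added for illustration. The key is copied with `idup` only when it is missing, so looking up an existing key through a stack slice allocates nothing.

```d
alias Counts = int[immutable(int)[]];

// Increment the count for `key`, copying the key only on first insertion.
void bump(ref Counts aa, const(int)[] key)
{
    if (auto valptr = key in aa)
        ++*valptr;        // key already stored: no allocation
    else
        aa[key.idup] = 1; // first sighting: heap-copy the key data
}

// Drives `bump` with slices of a reused stack buffer.
Counts demo()
{
    Counts aa;
    int[2] buf = [1, 2];
    aa.bump(buf[]); // inserts a heap copy of [1, 2]
    aa.bump(buf[]); // finds the existing key through the stack slice
    buf = [3, 4];   // reusing the buffer does not disturb the stored key
    aa.bump(buf[]);
    return aa;
}

void main()
{
    auto aa = demo();
    assert(aa.length == 2);
    assert(aa[[1, 2]] == 2);
    assert(aa[[3, 4]] == 1);
}
```

The cost is the second hash lookup on insertion, which is the trade-off noted above.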

Thanks also for the advice about -dip1000 and the state of the built-in AA implementation. My code base has been changing to include more AA-heavy data structures, so I think that in the near future I will need to do some refactoring to make changing AA implementation easier.

I maybe said it more strongly than needed; AAs are generally safe, it's just that I'm not surprised if there are holes. It's a type that the compiler generally ignores a lot of rules for, and not everything is covered. However, in this case, it was the slicing that was unsafe, the AA had nothing to do with it.

Also, one last question: should this issue be reported as a new bug? My understanding was that @safe code should not allow obtaining references to expired stack memory, but perhaps this is already a known problem? I'm happy to file a new bug report if that would be helpful!


No, it's an old bug:

https://issues.dlang.org/show_bug.cgi?id=8838

Closed as fixed since dip1000 fixes it.

What you could do is try to get your code to work with dip1000 and then if you can't, file a bug against *that*. But this may be something that isn't going to be easy to do, or may take more time than it's worth.


-Steve
