Re: Passing dynamic arrays

Steven Schveighoffer Wed, 10 Nov 2010 07:05:39 -0800

On Tue, 09 Nov 2010 15:13:55 -0500, Pillsy <pillsb...@gmail.com> wrote:

Steven Schveighoffer Wrote:
On Tue, 09 Nov 2010 08:14:40 -0500, Pillsy <pillsb...@gmail.com> wrote:
[...]
> Ah! This is a lot of what was confusing me about arrays; I still thought
> they had this behavior. The fact that they don't makes me a good deal
> more comfortable with them, though I still don't like the
> non-deterministic way that they may copy their elements or they may
> share structure after you append stuff to them.
As I said before, this rarely affects code.  The common cases I've seen:
1. You append to an array and return it.
2. You modify data in the array.
3. You use a passed in array as a buffer, which means you overwrite the
array, and then start appending when it runs out of space.
I don't ever remember seeing:
You append to an array, then go back and modify the first few bytes of the
array.
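
For concreteness, the uncommon case described above (append, then write to the front) can be sketched as follows. The struct and function names are made up for illustration. Whether the append happens in place is up to the runtime, so the sketch only checks that the two possible outcomes are consistent and does not claim which one occurs.

```d
import std.stdio;

// Hypothetical record of what happened after the append.
struct Outcome
{
    bool sameBlock;     // appended slice still starts at the original address
    bool writeVisible;  // write through the appended slice shows up in the original
}

Outcome appendThenModifyFront()
{
    int[] a = [1, 2, 3];
    int[] b = a;        // b refers to the same elements as a

    b ~= 4;             // may grow in place or move to a new block
    b[0] = 99;          // go back and modify the front

    return Outcome(a.ptr is b.ptr, a[0] == 99);
}

void main()
{
    auto o = appendThenModifyFront();
    // The write is visible through a exactly when no reallocation happened.
    assert(o.sameBlock == o.writeVisible);
    writeln(o.sameBlock ? "appended in place: a sees the write"
                        : "reallocated: a is unaffected");
}
```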
I've certainly encountered situations in at least one other language where standard library functions will return mutable arrays which may or may not share structure with their inputs. This has been such a frequent source of pain when using that language that I tend to react very negatively to the possibility in any context.

Care to name names? I want to understand this dislike of D arrays, because out of all the languages I've ever used, D arrays are by far the easiest and most intuitive to use. I don't expect to be convinced, but at least we can have some debate on this, and maybe we can avoid mistakes made by other languages.

Let's assume this is a very common thing and absolutely needs to be
addressed.  What would you like the behavior to be?
Using a different, library type for a buffer you can append to. I think of "a buffer or abstract list you can cheaply append to" as a different sort of type from a fixed size buffer anyway, since it so often is a different type. Arrays/slices are a very basic type in D, and I'm generally thinking that giving your basic types simpler, easier to understand semantics is worth paying a modest cost.

There was a time when the T[new] idea was expected to be part of the language. Both Andrei and Walter were behind it, and seldom does something not make it into the language when that happens.

It turns out that, after all the academic and theoretical discussions were finished and it came time to implement, it was a clunky and confusing feature. Andrei said that for TDPL he had a whole table dedicated to what type to use in which cases (T[] or T[new]) and he didn't even know how to fill out the table.

The beauty of D's arrays is that the slice and the array are both the same type, so you only need to define one function to handle both, and appending "just works". I feel like this is simply a case of 'not well enough understood.'
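
As a rough illustration of the "one function handles both" point, here is a minimal sketch. The function name `total` is hypothetical and not from any library; it accepts a whole dynamic array, a slice of one, or a slice of a fixed-size array through the same parameter type.

```d
import std.stdio;

// Hypothetical helper: sums any slice of ints, wherever the data lives.
int total(const(int)[] xs)
{
    int sum = 0;
    foreach (x; xs)
        sum += x;
    return sum;
}

void main()
{
    int[] whole = [1, 2, 3, 4];
    int[4] fixed = [1, 2, 3, 4];

    assert(total(whole) == 10);         // a full dynamic array
    assert(total(whole[1 .. 3]) == 5);  // a slice of it
    assert(total(fixed[]) == 10);       // a slice of a fixed-size array

    // Appending works on a slice too, with no special type needed.
    int[] tail = whole[2 .. $];
    tail ~= 5;
    assert(tail == [3, 4, 5]);

    writeln("ok");
}
```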


BTW, you can allocate a fixed buffer by doing:

T[BUFSIZE] buffer;

This cannot be appended to. It is still difficult to allocate one of these on the heap, which is a language shortcoming, but it can be fixed.
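
A small sketch of what that looks like in practice, assuming `T` is `int` and `BUFSIZE` is 4 (both chosen only for the example, as is the helper name). The fixed buffer itself rejects `~=` at compile time, while a slice of it can still be appended to, at which point the data is copied into a new block and the buffer keeps its size.

```d
import std.stdio;

enum BUFSIZE = 4; // illustrative size, not from the original post

// Hypothetical helper: slice the fixed buffer, then append to the slice.
int[] appendToSliceOf(ref int[BUFSIZE] buffer, int value)
{
    int[] s = buffer[]; // slice refers to the buffer's own storage
    s ~= value;         // no room to grow there, so the data is copied
    return s;
}

void main()
{
    int[BUFSIZE] buffer = [1, 2, 3, 4];

    // Appending directly to the fixed-size buffer does not compile.
    static assert(!__traits(compiles, buffer ~= 5));

    int[] grown = appendToSliceOf(buffer, 5);
    assert(grown == [1, 2, 3, 4, 5]);
    assert(grown.ptr !is buffer.ptr);  // grown lives in a new block
    assert(buffer.length == BUFSIZE);  // the buffer itself is unchanged

    writeln("ok");
}
```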

[...]
IMO, the benefits of just being able to append to an array any time you
want without having to set up some special type far outweigh this little
quirk that almost nobody encounters.  You can append to *any* array, no
matter where the data is located, or whether the data is a slice, and it
just works.  I can't see how anyone would prefer another solution!
There's a difference between appending and appending in place. The problem with not appending in place (and arrays not having the possibility of a reserve that's larger than the actual amount, of course) is one of efficiency. Having
auto s = "foo";
s ~= "bar";
result in a new array being allocated that is of length 6 and contains "foobar", and assigning that array to `s`, is obviously useful and desirable behavior. If the expansion can happen in place, that's a perfectly reasonable performance optimization to have in the case of strings or other immutable arrays. Indeed, one of the reasons that functional programming and GC go together like peanut butter and jelly is that together they let you get all sorts of wins in terms of efficiency from shared structure.
However, I've found working with languages that mix a lot of imperative and functional constructs (Lisp is one, but not the only one) that if you're going to do this, it's really very important that there not be any doubt about when mutable state is shared and when it isn't. D is trying to be that same kind of multi-paradigm language. This means that, for mutable arrays, having
int[] x = [1, 2, 3];
x ~= [4, 5, 6];


To leave no doubt about whether this reallocates or not try:

bool willReallocate = x.length + 3 > x.capacity;
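
A minimal sketch of that check in context, with a hypothetical helper and struct. It relies only on the built-in `capacity` property and `reserve` function. The sketch asserts one direction only: when the check says no reallocation is needed, the array does not move. It makes no claim about what the runtime does in the other case.

```d
import std.stdio;

// Hypothetical record of the prediction and what actually happened.
struct Check
{
    bool predictedRealloc; // result of the length/capacity test
    bool moved;            // whether the data ended up at a new address
}

Check appendAndCheck(int[] x, int[] extra)
{
    bool willReallocate = x.length + extra.length > x.capacity;
    auto before = x.ptr;
    x ~= extra;
    return Check(willReallocate, x.ptr !is before);
}

void main()
{
    int[] x = [1, 2, 3];
    x.reserve(16); // ask for room up front; capacity is now at least 16

    auto c = appendAndCheck(x, [4, 5, 6]);
    assert(!c.predictedRealloc); // enough capacity was reserved
    assert(!c.moved);            // so the append happened in place

    writeln("ok");
}
```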

But I still don't understand this concept. If you find out it's not going to reallocate, what are you going to do? I mean, you have three cases here:

1. You *don't* want it to reallocate -- well, you can't enforce this, but you can use ref to ensure the original is always affected

2. You *want* it to reallocate -- use dup or ~
3. You don't care -- just use the array directly

I don't see how these three options aren't enough.
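
The three options can be sketched roughly like this. The function names are invented for the example; only `ref`, `dup`, and `~` come from the discussion above.

```d
import std.stdio;

// Case 1: take the slice by ref so the caller's array always sees the append,
// whether or not the data had to move.
void appendThrough(ref int[] arr, int value)
{
    arr ~= value;
}

// Case 2: force a fresh array; the binary ~ operator always allocates a copy.
int[] appendCopy(int[] arr, int value)
{
    return arr ~ value;
}

void main()
{
    int[] a = [1, 2, 3];

    appendThrough(a, 4);
    assert(a == [1, 2, 3, 4]);

    int[] b = appendCopy(a, 5);
    assert(b == [1, 2, 3, 4, 5]);
    assert(b.ptr !is a.ptr);  // b never shares with a
    b[0] = 99;
    assert(a[0] == 1);        // so writes to b leave a alone

    int[] c = a.dup;          // dup is the other way to get a private copy
    assert(c == a && c.ptr !is a.ptr);

    // Case 3: don't care, just append directly.
    a ~= 6;
    assert(a == [1, 2, 3, 4, 6]);

    writeln("ok");
}
```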

maybe reallocate and maybe not seems like it's only really there to protect people from doing inefficient things by accident when they append onto the back of an array repeatedly (or to make that admittedly common case more convenient). This really doesn't strike me as worth the trouble. Like I said elsewhere, the uncertainty gives me the screaming willies.

I hear you, but at the same time, we are talking about common and uncommon cases here. D (at least in my mind) tries to be a practical language -- make the common things easy as long as they are safe. And the cases where D's arrays may surprise you are pretty uncommon IMO.


-Steve
