Re: why allocators are not discussed here

2013-06-28 Thread Marco Leise
On Thu, 27 Jun 2013 01:59:00 +0200,
Adam D. Ruppe destructiona...@gmail.com wrote:

> void fillBuffer(lent char[] buffer) {}
>
> would be disallowed and that is something I would definitely want.

Isn't that what scope is for?

-- 
Marco



Re: why allocators are not discussed here

2013-06-28 Thread Dicebot

On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:

On Thu, 27 Jun 2013 01:59:00 +0200,
Adam D. Ruppe destructiona...@gmail.com wrote:


void fillBuffer(lent char[] buffer) {}

would be disallowed and that is something I would definitely 
want.


Isn't that what scope is for?


Reading dlang.org would make you think so, but the official position is 
that 'scope' does not exist, so it is hard to say what it is 
really for.


Re: why allocators are not discussed here

2013-06-28 Thread deadalnix

On Thursday, 27 June 2013 at 22:50:47 UTC, John Colvin wrote:

Old but perhaps relevant?

http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

(It's an academic article about memory allocation from 2002)


Interesting paper. Still, concurrency isn't really addressed, 
which is a problem if the design is to be future-proof.


Re: why allocators are not discussed here

2013-06-28 Thread Adam D. Ruppe

On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:

Isn't that what scope is for?


I don't really know. In practice, it does something else (usually 
nothing, but it suppresses heap closure allocation on delegates). 
The DIPs relating to it all talk about returning refs from 
functions and I'm not sure if they relate to the built-ins or 
not. I don't think it would quite work for what I have in mind.


Re: why allocators are not discussed here

2013-06-28 Thread Dicebot

On Friday, 28 June 2013 at 11:55:46 UTC, Adam D. Ruppe wrote:

On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:

Isn't that what scope is for?


I don't really know. In practice, it does something else 
(usually nothing, but suppresses heap closure allocation on 
delegates). The DIPs relating to it all talk about returning 
refs from functions and I'm not sure if they relate to the 
built ins or not- I don't think it would quite work for what I 
have in mind.


It is a no-op keyword in the current implementation for everything but 
delegates. The DIP speculation was based on 
http://dlang.org/attribute.html#scope and the Parameter Storage 
Classes section of http://dlang.org/function.html, but that info is 
obviously outdated.


Re: why allocators are not discussed here

2013-06-28 Thread Brian Rogoff

On Friday, 28 June 2013 at 10:57:45 UTC, deadalnix wrote:

On Thursday, 27 June 2013 at 22:50:47 UTC, John Colvin wrote:

Old but perhaps relevant?

http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

(It's an academic article about memory allocation from 2002)


Interesting paper. Still, concurrency isn't really addressed, 
which is a problem if the design is to be future-proof.


http://en.wikipedia.org/wiki/Hoard_memory_allocator



Re: why allocators are not discussed here

2013-06-28 Thread Jonathan M Davis
On Friday, June 28, 2013 13:55:45 Adam D. Ruppe wrote:
> On Friday, 28 June 2013 at 07:07:39 UTC, Marco Leise wrote:
> > Isn't that what scope is for?
>
> I don't really know. In practice, it does something else (usually
> nothing, but suppresses heap closure allocation on delegates).
> The DIPs relating to it all talk about returning refs from
> functions and I'm not sure if they relate to the built ins or
> not- I don't think it would quite work for what I have in mind.

Per the spec, all scope is supposed to do is prevent references in a parameter 
from being escaped. To be specific, it says

---
references in the parameter cannot be escaped (e.g. assigned to a 
global variable)
---

So, in theory, if you had something like

auto foo(scope int[] i) {...}

it would prevent i or anything referring to it from being returned or assigned 
to any variable which will outlive the function call. However, scope currently 
does _nothing_ for anything other than delegates - which is why I think that 
using the in attribute (which is defined as const scope) is such an incredibly 
bad idea. Using either in or scope on anything other than delegates could 
result in all kinds of code breakage if/when scope is ever implemented for 
types other than delegates.

For delegates, it has the advantage of telling the compiler that it doesn't 
need to allocate a closure (since the delegate won't be used past the point 
where its calling scope exists, as could occur if the delegate escaped the 
function it was passed to), but I'm not sure that even that works 100% 
correctly right now.
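
A minimal sketch of the delegate case, assuming a current D compiler that has @nogc (an attribute that postdates this thread). The names each and sumOf are made up for illustration. Because dg is a scope parameter, the captured variable sum can stay on the stack instead of going into a heap closure, which is what allows the caller to be marked @nogc.

```d
// Hypothetical helper: calls dg for every element. `scope` promises that
// dg is not stored anywhere that outlives this call.
void each(const int[] xs, scope void delegate(int) @nogc dg) @nogc
{
    foreach (x; xs)
        dg(x);
}

// The lambda captures `sum`. Since it is only passed to a scope parameter,
// no heap closure is needed, so this function can be marked @nogc.
int sumOf(const int[] xs) @nogc
{
    int sum = 0;
    each(xs, (int x) { sum += x; });
    return sum;
}

void main()
{
    int[3] data = [1, 2, 3];
    assert(sumOf(data[]) == 6);
}
```

Dropping scope from the parameter of each should make the compiler reject sumOf under @nogc, since the lambda would then need a GC-allocated closure.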

We really should sort out exactly what we're going to do with scope one of 
these days soon.

But the stuff that some of the DIPs do with scope (e.g. returning with scope - 
which is completely against the spec at this point) is only a suggestion and 
not at all how it currently works.

- Jonathan M Davis


Re: why allocators are not discussed here

2013-06-28 Thread Adam D. Ruppe

On Friday, 28 June 2013 at 17:43:21 UTC, Jonathan M Davis wrote:
it would prevent i or anything referring to it from being 
returned or assigned to any variable which will outlive the 
function call. However,


That's fairly close to what I'd want. But there are two cases I'm 
not sure it would cover:


1:

struct Unique(T) {
   scope T borrow();
}

If the unique pointer decides to let its reference slip, it 
wouldn't want it going somewhere else and escaping, since that 
breaks the uniqueness guarantee.


This is important for a few cases. Here's one:

  int* foo;
  {
     Unique!(int*) bar;
     foo = bar.borrow;

     // this should be ok, because this never exists outside
     // the same scope as the Unique
     int* ok = bar.borrow;
  }

  // foo now talks to a freed *bar, so that shouldn't be allowed

Similarly, reassigning bar could cause trouble. We might just 
disallow such reassignments, though maybe it could work if the 
reference only ever moves down in scope. I'd have to think 
about that.



(I'm thinking my borrowed thing might have to be a type 
constructor rather than a storage class. Otherwise, you could get 
around it by:


int* bar(scope int* foo) {
  int* b = foo;
  return b;
}

Unless the compiler is very smart about following where it goes.)


But if scope works on the return value too, it might be ok.


maybe 2:

void bar(scope int* foo, int** bar) {
*bar = foo;
}


Actually, I'm reasonably sure the spec's wording for scope would work 
for this one. But we'd need to be sure - this is one case where 
pure wouldn't help (pure generally would help, since it disallows 
assignments to the outside world, but there are enough holes that 
you could leak a reference).



To be memory safe, these would all have to be guaranteed.


Re: why allocators are not discussed here

2013-06-28 Thread Jonathan M Davis
On Friday, June 28, 2013 19:56:44 Adam D. Ruppe wrote:
> struct Unique(T) {
>     scope T borrow();
> }

Per the current spec, this would not be a valid use of scope, as scope is 
specifically a parameter storage class and can only be used on function 
parameters (just like in, out, ref, and lazy). scope seems to be specifically 
intended for guaranteeing that an argument passed to a function does not 
escape that function.

- Jonathan M Davis


Re: why allocators are not discussed here

2013-06-28 Thread Brad Anderson

On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
Bloomberg released an STL alternative called BSL which contains 
an alternate allocator model. In a nutshell object supporting 
custom allocators can optionally take an allocator pointer as 
an argument. Containers will save the pointer and use it for 
all their allocations. It seems simple enough and does not 
embed the allocator in the type.


https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model


There is also EASTL's (Electronic Arts' version of the STL for 
gamedev) take on allocators.


http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html#eastl_allocator


Re: why allocators are not discussed here

2013-06-27 Thread BLM768

On Wednesday, 26 June 2013 at 23:59:01 UTC, Adam D. Ruppe wrote:

On Wednesday, 26 June 2013 at 23:02:47 UTC, H. S. Teoh wrote:

Maybe a type distinction akin to C++'s auto_ptr might help?


It might not be so bad if we modified D to add a lent storage 
class, or something, similar to some discussions about scope in 
the past.


These would be values you may work with, but never keep; 
assigning them to anything is not allowed and you may only pass 
them to a function or return them from a function if that is 
also marked lent. Any regular reference would be implicitly 
usable as lent.


Something along those lines would probably be a good solution.

It seems that we're working with three types of objects:

1. Objects that are owned by a scope (can be stack-allocated)
2. Objects that are owned by another object (C/C++-like 
memory management)
3. Objects that have no single owner (GC memory management)

The first two would probably operate under semantics like lent 
or scope, although I'd like to propose an extension to the 
rules: it should be possible to store a weak reference to these 
types (or at least to #2) once we have weak reference support.


The third type seems to be pretty much solved, seeing as we have 
a (mostly) working GC.


Something like this might be a nice way to implement it:

class Thing {}

void doSomething(scope Thing t); //Takes #1, #2, or #3 by reference

void doSomethingElse(owned Thing t); //Takes only #2 or #3

void main() {
  scope Thing t1; //stack-allocated
  doSomething(t1);
  owned Thing t2 = new Thing; //heap-allocated but freed at end of scope
  doSomething(t2);
}


Re: why allocators are not discussed here

2013-06-27 Thread John Colvin

On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
I know Andrei mentioned he was going to work on Allocators a 
year ago. At DConf 2013 he described the problems he needs to 
solve with Allocators. But I wonder if I am missing the 
discussion around that - I tried searching this forum and found a 
few threads, but none was actually a brainstorm on Allocators 
design.


Please point me in the right direction
or
is there a reason it is not discussed
or
should we open the discussion?


The easiest approach for Allocators design I can imagine would 
be to let the user specify which Allocator operator new should get 
the memory from (introducing a new keyword, allocator). This 
gives total control, but assumes the user knows what they are doing.


Example:

CustomAllocator ca;
allocator(ca) {
  auto a = new A; // operator new will use ScopeAllocator::malloc()
  auto b = new B;

  free(a); // that should call ScopeAllocator::free()
  // if free() is missing for allocated area, it is a user
  // responsibility to make sure custom Allocator can handle that
}

By default the allocator is druntime's GC, and free(a) does 
nothing for it.

If some library defines its own allocator (e.g. a specialized 
container), there should be the ability to:

1. override allocator
2. get access to the allocator used

I understand that I spent 5 mins thinking about the way 
Allocators may look.
My point is - if somebody is working on it, can you please 
share your ideas?


Old but perhaps relevant?

http://www.linkedin.com/news?viewArticle=&articleID=-1&gid=86782&type=member&item=253295471&articleURL=http%3A%2F%2Fwww%2Eallendowney%2Ecom%2Fss08%2Fhandouts%2Fberger02reconsidering%2Epdf&urlhash=96TJ&goback=%2Egmr_86782%2Egde_86782_member_253295471

(It's an academic article about memory allocation from 2002)


Re: why allocators are not discussed here

2013-06-26 Thread Robert Schadek
On 06/26/2013 12:50 AM, Adam D. Ruppe wrote:
> On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
> > (introducing a new keyword allocator)
>
> It would be easier to just pass an allocator object that provides the
> necessary methods and don't use new at all. (I kinda wish new wasn't
> in the language. It'd make this a little more consistent.)


I did think about this as well, but then I came up with something that
IMHO is even simpler.

Imagine we have two delegates:

void* delegate(size_t);  // this one allocs
void delegate(void*);    // this one frees

You pass both to a function that constructs your object. The first is
used for allocating the memory; the second gets attached to the TypeInfo
and is used by the GC to free the object. This would be completely
transparent to the user.

The use in a container is similar. Just use the alloc delegate to
construct the objects and attach the free delegate to the TypeInfo. You
could even mix allocator strategies in the middle of the lifetime of the
container.
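
A rough sketch of the two-delegate idea, under clearly stated assumptions: make and dispose are hypothetical helper names, and since druntime offers no way to attach a free delegate to TypeInfo, the release step is done by hand here rather than by the GC. The backing store is plain C malloc/free, purely as an example.

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

alias AllocDg = void* delegate(size_t); // this one allocs
alias FreeDg = void delegate(void*);    // this one frees

class Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

// Hypothetical helper: gets raw memory from `alloc` and constructs a T in it.
T make(T, Args...)(AllocDg alloc, Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* mem = alloc(size);
    assert(mem !is null, "allocation failed");
    return emplace!T(mem[0 .. size], args);
}

// Hypothetical helper: runs the destructor, then hands the memory back.
// In the proposal the GC would call the free delegate; here it is manual.
void dispose(T)(FreeDg release, T obj) if (is(T == class))
{
    void* mem = cast(void*) obj;
    destroy(obj);
    release(mem);
}

void main()
{
    size_t live; // number of outstanding blocks
    AllocDg alloc = (size_t n) { ++live; return malloc(n); };
    FreeDg release = (void* p) { --live; free(p); };

    auto p = make!Point(alloc, 3, 4);
    assert(p.x == 3 && p.y == 4 && live == 1);
    dispose(release, p);
    assert(live == 0);
}
```

Note that the two delegates in main capture live, so they themselves get a GC-allocated closure; only the Point lives in malloc'ed memory.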



Re: why allocators are not discussed here

2013-06-26 Thread Jacob Carlborg

On 2013-06-26 01:16, Adam D. Ruppe wrote:


You'd want it to be RAII or delegate based, so the scope is clear.

with_allocator(my_alloc, {
  do whatever here
});


or

{
ChangeAllocator!my_alloc dummy;

do whatever here
} // dummy's destructor ends the allocator scope


I think the former is a bit nicer, since the dummy variable is a bit
silly. We'd hope that delegate can be inlined.


It won't be inlined. You would need to make it a template parameter to 
have it inlined.
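
For illustration, here is a sketch of both forms with the block passed as a template alias parameter. Everything here is an assumption made for the example: currentAlloc is a made-up per-thread hook (druntime has nothing like it), and ChangeAllocator takes the allocator at run time instead of as a template argument.

```d
import core.memory : GC;

alias AllocFn = void* function(size_t);

size_t countedCalls;             // thread-local, like all module variables
void* gcAlloc(size_t n) { return GC.malloc(n); }
void* countingAlloc(size_t n) { ++countedCalls; return GC.malloc(n); }

AllocFn currentAlloc = &gcAlloc; // hypothetical per-thread hook

// RAII form: the destructor ends the allocator scope.
struct ChangeAllocator
{
    private AllocFn saved;
    this(AllocFn fn) { saved = currentAlloc; currentAlloc = fn; }
    ~this() { if (saved !is null) currentAlloc = saved; }
    @disable this(this);
}

// Template form: `work` is a compile-time alias parameter, so the call
// below is a direct call that the compiler is free to inline.
void withAllocator(alias work)(AllocFn fn)
{
    auto guard = ChangeAllocator(fn);
    work();
}

void main()
{
    withAllocator!(() { currentAlloc(32); })(&countingAlloc);
    assert(countedCalls == 1);
    assert(currentAlloc is &gcAlloc); // restored by the guard's destructor
}
```

The cost is the one mentioned elsewhere in the thread: the block has to be written before the allocator argument.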


--
/Jacob Carlborg


Re: why allocators are not discussed here

2013-06-26 Thread Jason House
Bloomberg released an STL alternative called BSL which contains 
an alternate allocator model. In a nutshell, objects supporting 
custom allocators can optionally take an allocator pointer as an 
argument. Containers will save the pointer and use it for all 
their allocations. It seems simple enough and does not embed the 
allocator in the type.


https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model
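
To make the shape of that model concrete, here is a small D sketch. It is not BSL's actual API; Allocator, Mallocator and IntStack are names invented for this example. The point it tries to show is that the container stores an allocator reference passed at construction, so the allocator is not part of the container's type.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical run-time allocator interface.
interface Allocator
{
    void[] allocate(size_t bytes);
    void deallocate(void[] block);
}

final class Mallocator : Allocator
{
    size_t live; // outstanding blocks, kept only for the demonstration

    void[] allocate(size_t bytes)
    {
        void* p = malloc(bytes);
        if (p is null) return null;
        ++live;
        return p[0 .. bytes];
    }

    void deallocate(void[] block)
    {
        if (block.ptr is null) return;
        free(block.ptr);
        --live;
    }
}

struct IntStack
{
    private Allocator alloc; // saved and used for every allocation
    private int[] store;
    private size_t len;

    this(Allocator a) { alloc = a; }

    void push(int v)
    {
        if (len == store.length) grow();
        store[len++] = v;
    }

    int pop() { assert(len > 0); return store[--len]; }

    void release() { alloc.deallocate(store); store = null; len = 0; }

    private void grow()
    {
        size_t cap = store.length ? store.length * 2 : 4;
        auto fresh = cast(int[]) alloc.allocate(cap * int.sizeof);
        assert(fresh.length == cap, "allocation failed");
        fresh[0 .. len] = store[0 .. len];
        alloc.deallocate(store);
        store = fresh;
    }
}

void main()
{
    auto heap = new Mallocator;  // the allocator object itself is GC-allocated
    auto stack = IntStack(heap); // same IntStack type for any Allocator
    foreach (i; 0 .. 10) stack.push(i);
    assert(stack.pop() == 9);
    assert(heap.live == 1);
    stack.release();
    assert(heap.live == 0);
}
```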



Re: why allocators are not discussed here

2013-06-26 Thread cybervadim

On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
Bloomberg released an STL alternative called BSL which contains 
an alternate allocator model. In a nutshell object supporting 
custom allocators can optionally take an allocator pointer as 
an argument. Containers will save the pointer and use it for 
all their allocations. It seems simple enough and does not 
embed the allocator in the type.


https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model


I think the problem with such an approach is that you have to 
maniacally add support for custom allocators to every class if you 
want them to be on a custom allocator.
If we were simply able to say "all memory allocated in this area {} 
should use my custom allocator", that would simplify the code, with 
no need to change the std lib.
The next step is to notify the allocator when the memory should be 
released. But for a stack-based allocator that is not required.
Moreover, if we introduce access to different GCs (e.g. 
mark-n-sweep, semi-copy, ref counted), we should be able to say 
"this {} piece of code is my temporary, so use the semi-copy GC; the 
other code is long lived and creates few objects, so use ref 
counting". That is, it is all runtime support, with no need to change 
the library code.


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 14:03, Robert Schadek wrote:

On 06/26/2013 12:50 AM, Adam D. Ruppe wrote:

On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:

(introducing a new keyword allocator)


It would be easier to just pass an allocator object that provides the
necessary methods and don't use new at all. (I kinda wish new wasn't
in the language. It'd make this a little more consistent.)



I did think about this as well, but than I came up with something that
IMHO is even simpler.

Imagine we have two delegates:

void* delegate(size_t);  // this one allocs
void delegate(void*);// this one frees

you pass both to a function that constructs you object. The first is
used for allocation the
memory, the second gets attached to the TypeInfo and is used by the gc
to free
the object.


Then it's just GC but with an extra complication.


This would be completely transparent to the user.

The use in a container is similar. Just use the alloc delegate to
construct the objects and
attach the free delegate to the typeinfo. You could even mix allocator
strategies in the middle
of the lifetime of the container.




--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 02:22, cybervadim wrote:

I know Andrey mentioned he was going to work on Allocators a year ago.
In DConf 2013 he described the problems he needs to solve with
Allocators. But I wonder if I am missing the discussion around that - I
tried searching this forum, found a few threads that was not actually a
brain storm for Allocators design.

Please point me in the right direction
or
is there a reason it is not discussed
or
should we open the discussion?


The easiest approach for Allocators design I can imagine would be to let
user specify which Allocator operator new should get the memory from
(introducing a new keyword allocator). This gives a total control, but
assumes user knows what he is doing.

Example:

CustomAllocator ca;
allocator(ca) {
   auto a = new A; // operator new will use ScopeAllocator::malloc()
   auto b = new B;

   free(a); // that should call ScopeAllocator::free()
   // if free() is missing for allocated area, it is a user
responsibility to make sure custom Allocator can handle that
}


Awful. What has that extra syntax bought you, except that new is now 
unsafe by design?
Other questions: how does this allocation scope propagate into 
functions, and what is the mechanism for passing it up and down the 
call stack?


Last but not least, I fail to see how scoped allocators alone (as 
presented) solve even half of the problem.


--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread H. S. Teoh
On Wed, Jun 26, 2013 at 04:10:49PM +0200, cybervadim wrote:
> On Wednesday, 26 June 2013 at 13:16:25 UTC, Jason House wrote:
> > Bloomberg released an STL alternative called BSL which contains an
> > alternate allocator model. In a nutshell object supporting custom
> > allocators can optionally take an allocator pointer as an
> > argument. Containers will save the pointer and use it for all
> > their allocations. It seems simple enough and does not embed the
> > allocator in the type.
> >
> > https://github.com/bloomberg/bsl/wiki/BDE-Allocator-model
>
> I think the problem with such approach is that you have to
> maniacally add support for custom allocator to every class if you
> want them to be on a custom allocator.

Yeah, that's a major inconvenience with the C++ allocator model. There's
no way to say "switch to allocator A within this block of code"; if
you're given a binary-only library that doesn't support allocators,
you're out of luck. And even if you have the source code, you have to
manually modify every single line of code that performs allocation to
take an additional parameter -- not a very feasible approach.


> If we simply able to say - all memory allocated in this area {}
> should use my custom allocator, that would simplify the code and no
> need to change std lib.
> The next step is to notify allocator when the memory should be
> released. But for the stack based allocator that is not required.
> More over, if we introduce access to different GCs (e.g.
> mark-n-sweep, semi-copy, ref counted), we should be able to say this
> {} piece of code is my temporary, so use semi-copy GC, the other
> code is long lived and not much objects created, so use ref counted.
> That is, it is all runtime support and no need changing the library
> code.

Yeah, I think the best approach would be one that doesn't require
changing a whole mass of code to support. Also, one that doesn't require
language changes would be far more likely to be accepted, as the core D
devs are leery of adding yet more complications to the language.

That's why I proposed that gc_alloc and gc_free be made into
thread-global function pointers, that can be swapped with a custom
allocator's version. This doesn't have to be visible to user code; it
can just be an implementation detail in std.allocator, for example. It
allows us to implement custom allocators across a block of code that
doesn't know (and doesn't need to know) what allocator will be used.
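
A sketch of what I mean, with made-up names: allocHook and freeHook stand in for the proposed swappable pointers (druntime does not expose its allocation entry points this way today), and withAllocator is a hypothetical wrapper. Module-level variables in D are thread-local by default, which gives the thread-global behaviour for free, and scope(exit) restores the previous pair even if the block throws.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;

alias AllocFn = void* function(size_t);
alias FreeFn = void function(void*);

void* gcAllocImpl(size_t n) { return GC.malloc(n); }
void gcFreeImpl(void* p) { GC.free(p); }
void* cAllocImpl(size_t n) { return malloc(n); }
void cFreeImpl(void* p) { free(p); }

// Stand-ins for the proposed hooks (hypothetical names), one pair per thread.
AllocFn allocHook = &gcAllocImpl;
FreeFn freeHook = &gcFreeImpl;

// Swaps the hooks for the duration of `work`, then restores the old pair.
void withAllocator(AllocFn a, FreeFn f, scope void delegate() work)
{
    auto savedAlloc = allocHook;
    auto savedFree = freeHook;
    allocHook = a;
    freeHook = f;
    scope (exit)
    {
        allocHook = savedAlloc;
        freeHook = savedFree;
    }
    work();
}

void main()
{
    assert(allocHook is &gcAllocImpl);
    withAllocator(&cAllocImpl, &cFreeImpl, () {
        // Code in here does not know which allocator it is using.
        assert(allocHook is &cAllocImpl);
        void* p = allocHook(64);
        freeHook(p);
    });
    assert(allocHook is &gcAllocImpl);
    assert(freeHook is &gcFreeImpl);
}
```

Because each call saves and restores the previous pair, nested calls behave like a stack of allocators.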


T

-- 
Fact is stranger than fiction.


Re: why allocators are not discussed here

2013-06-26 Thread cybervadim
On Wednesday, 26 June 2013 at 14:17:03 UTC, Dmitry Olshansky 
wrote:
Awful. What that extra syntax had brought you? Except that now 
new is unsafe by design?
Other questions involve how does this allocation scope goes 
inside of functions, what is the mechanism of passing it up and 
down of call-stack.


Last but not least I fail to see how scoped allocators alone 
(as presented) solve even half of the problem.


The extra syntax lets me avoid touching the existing code.
Imagine you have stateless event processing. That is, an event 
comes in, you do some calculation, prepare the answer and send it 
back. It will look like:

void onEvent(Event event)
{
   process();
}

Because it is stateless, you know all the memory allocated during 
processing will not be required afterwards. So the syntax I 
suggested requires very little change to the code. process() may be 
implemented using the std lib, doing several news and some resizing.


With new syntax:


void onEvent(Event event)
{
   ScopedAllocator alloc;
   allocator(alloc) {
      process();
   }
}

So now you do not use GC for all that is created inside the 
process().
ScopedAllocator is a simple stack that will free all memory in 
one go.


It is up to the runtime implementation to make sure all memory 
that is allocated inside allocator{} scope is actually allocated 
using ScopedAllocator and not GC.
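
To show what I mean by "a simple stack that frees all memory in one go", here is a minimal sketch. ScopedRegion is a made-up name, it hands out slices of a caller-supplied buffer, and it does not hook operator new in any way; that part would still need the runtime support described above.

```d
// Hypothetical bump allocator over a fixed buffer.
struct ScopedRegion
{
    private ubyte[] store; // backing memory, supplied by the caller
    private size_t used;   // bytes handed out so far

    this(ubyte[] backing) { store = backing; }

    // Returns n bytes starting at a 16-byte-aligned offset into the
    // buffer, or null if the buffer is exhausted.
    void[] allocate(size_t n)
    {
        enum size_t alignTo = 16;
        immutable start = (used + alignTo - 1) & ~(alignTo - 1);
        if (start > store.length || n > store.length - start)
            return null;
        used = start + n;
        return store[start .. used];
    }

    // Frees everything at once; individual blocks are never freed.
    void releaseAll() { used = 0; }
}

void main()
{
    ubyte[1024] stackBuf;
    auto region = ScopedRegion(stackBuf[]);

    auto a = region.allocate(100);
    auto b = region.allocate(200);
    assert(a.length == 100 && b.length == 200);
    assert(region.allocate(4096) is null); // does not fit

    region.releaseAll();
    assert(region.allocate(1024).length == 1024); // whole buffer again
}
```

Any slice obtained before releaseAll() must not be used afterwards, and nothing in this sketch enforces that.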


Does it make sense?


Re: why allocators are not discussed here

2013-06-26 Thread Robert Schadek

> > Imagine we have two delegates:
> >
> > void* delegate(size_t);  // this one allocs
> > void delegate(void*);    // this one frees
> >
> > you pass both to a function that constructs your object. The first is
> > used for allocating the memory, the second gets attached to the
> > TypeInfo and is used by the gc to free the object.
>
> Then it's just GC but with an extra complication.

IMHO, not really, as the place you get the memory from is not managed by
the GC, or at least not directly. The GC algorithm would see that there
is a free delegate attached to the object and would use this to free the
memory.

The same should hold true for calling GC.free.

Or are you talking about ref counting and such?


Re: why allocators are not discussed here

2013-06-26 Thread cybervadim

On Wednesday, 26 June 2013 at 14:26:03 UTC, H. S. Teoh wrote:
Yeah, I think the best approach would be one that doesn't require
changing a whole mass of code to support. Also, one that doesn't require
language changes would be far more likely to be accepted, as the core D
devs are leery of adding yet more complications to the language.

That's why I proposed that gc_alloc and gc_free be made into
thread-global function pointers, that can be swapped with a custom
allocator's version. This doesn't have to be visible to user code; it
can just be an implementation detail in std.allocator, for example. It
allows us to implement custom allocators across a block of code that
doesn't know (and doesn't need to know) what allocator will be used.




Yes, being able to change gc_alloc and gc_free would do the job. If 
the runtime remembered a stack of gc_alloc/gc_free functions, like 
pushd and popd, that would simplify its usage.

I think this is a very nice and simple solution to the problem.



Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 18:27, cybervadim wrote:

On Wednesday, 26 June 2013 at 14:17:03 UTC, Dmitry Olshansky wrote:

Awful. What that extra syntax had brought you? Except that now new is
unsafe by design?
Other questions involve how does this allocation scope goes inside of
functions, what is the mechanism of passing it up and down of call-stack.

Last but not least I fail to see how scoped allocators alone (as
presented) solve even half of the problem.


Extra syntax allows me not touching the existing code.
Imagine you have a stateless event processing. That is event comes, you
do some calculation, prepare the answer and send it back. It will look
like:

void onEvent(Event event)
{
process();
}

Because it is stateless, you know all the memory allocated during
processing will not be required afterwards.


Here is the chief problem: the assumption that is required to make it 
magically work.


Now what I see is:

T arr[];//TLS

//somewhere down the line
arr = ... ;
else{
    ...
    allocator(myAlloc){
        arr = array(filter!);
    }
    ...
}
return arr;

I consider an unsafe magic wand that may transmogrify some code to 
switch allocation strategy naive and dangerous.


Who ever told you that process() returns before allocating a few gigs 
of RAM (and counting on GC collection)? Right, nobody. Maybe it's an 
event loop that may run forever.


What is being missed is that code to date assumes new == GC and works 
_like that_.



So the syntax I suggested
requires a very little change in code. process() may be implemented
using std lib, doing several news and resizing.

With new syntax:


void onEvent(Event event)
{
ScopedAllocator alloc;
allocator(alloc) {
  process();
}
}

So now you do not use GC for all that is created inside the process().
ScopedAllocator is a simple stack that will free all memory in one go.

It is up to the runtime implementation to make sure all memory that is
allocated inside allocator{} scope is actually allocated using
ScopedAllocator and not GC.

Does it make sense?


Yes, but it's horribly broken.

--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 03:16, Adam D. Ruppe wrote:

On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:

And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values and
use scope guards to revert them back to the values they were before.


Yea, I was thinking this might be a way to go. You'd have a global
(well, thread-local) allocator instance that can be set and reset
through stack calls.

You'd want it to be RAII or delegate based, so the scope is clear.

with_allocator(my_alloc, {
  do whatever here
});


or

{
ChangeAllocator!my_alloc dummy;

do whatever here
} // dummy's destructor ends the allocator scope



Both suffer from
a) being totally unsafe and in fact bug-prone, since all references 
obtained in there are now dangling (and there is no indication of where 
they came from)
b) imagine you need to use an allocator for a stateful object, say a 
forward range of some other ranges (e.g. std.regex), both scoped/stacked 
to allocate its internal stuff. The 2nd one may handle it but not the 1st.
c) transfer of objects allocated differently up the call graph (scope 
graph?) is pretty much neglected, I see.


I'm kind of wondering how our knowledgeable community has come to this.
(must have been starving w/o allocators way too long)


{
malloced_string str;
auto got = to!string(10, str);
} // str is out of scope, so it gets free()'d. unsafe though: if you
stored a copy of got somewhere, it is now a pointer to freed memory. I'd
kinda like language support of some sort to help mitigate that though,
like being a borrowed pointer that isn't allowed to be stored, but
that's another discussion.

In contrast 'container as an output range' works both safely and would 
be still customizable.
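
A small sketch of the "output range" route, assuming Phobos' std.format.formattedWrite as the producer. FixedSink is a made-up type for this example: it writes into a caller-owned buffer, so the caller decides where the memory lives and nothing in the producer has to know about allocators.

```d
import std.format : formattedWrite;

// Hypothetical output range over a caller-owned buffer.
struct FixedSink
{
    char[] buf; // storage supplied by the caller
    size_t len; // characters written so far

    void put(char c)
    {
        assert(len < buf.length, "sink is full");
        buf[len++] = c;
    }

    void put(const(char)[] s)
    {
        foreach (c; s)
            put(c);
    }

    // The part of the buffer that has been written.
    const(char)[] data() const { return buf[0 .. len]; }
}

void main()
{
    char[20] storage; // lives on the stack
    auto sink = FixedSink(storage[]);
    formattedWrite(sink, "%d", 10);
    assert(sink.data == "10");
}
```

Swapping FixedSink for std.array.Appender would give the GC-backed behaviour with the same producer code.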


IMHO the only place for allocators is in containers; other kinds of code 
may just ignore allocators completely.


std.algorithm and friends should imho be customized on 2 things only:

a) containers to use (instead of array)
b) optionally a memory source (or allocator) if the container is 
temporary (scoped), to tie its lifetime to something.


Want temporary stuff? Use temporary arrays, hashmaps and whatnot, i.e. 
types tailored for a particular use case (e.g. with a temporary/scoped 
allocator in mind).
These would all be unsafe though. An alternative is ref-counting pointers 
to an allocator. With the word on the street about ARC, it could be a nice 
direction to pursue.


Allocators (as Andrei points out in his video) come in many kinds:
a) persistence: infinite, manual, scoped
b) size: unlimited vs fixed
c) block-size: any, fixed, or *any* up to some maximum size

Most of these ARE NOT interchangeable!
Some are composable; however, I'd argue that allocators themselves are not 
composable but have some reusable parts that in turn are composable.


Code would still have to cater for specific flavors of allocators, so 
we'd better reduce this problem to the selection of containers.


--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 05:24, Adam D. Ruppe wrote:

I was just quickly skimming some criticism of C++ allocators, since my
thought here is similar to what they do. On one hand, maybe D can do it
right by tweaking C++'s design rather than discarding it.



Criticisms are:

A) They were defined to not have any state (as noted in the standard)
B) They are parametrized on type (T), yet a container that is 
parametrized on it may need to allocate something else completely (a 
node with T).
C) Containers are parametrized on allocators, so say 2 lists with 
different allocators are incompatible in the sense that e.g. you can't 
splice pieces of them together.


From the above, IMHO, we can deduce that allocators
a) Should support state, but we have to make sure we don't pay storage 
space for stateless ones (global ones, e.g. mallocator).

b) Should preferably be typeless and let containers define what they 
allocate
c) Hardly solvable unless we require a way to reassign objects between 
allocators (at least of similar kinds)
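
Points a) and b) can be sketched in D roughly as follows. All names are invented for the example, and the stateless flag is just one possible convention. Both allocators deal in untyped void[] blocks, and the container only stores a pointer when the allocator actually has state.

```d
import core.stdc.stdlib : free, malloc;

// Stateless: everything is static, so no instance needs to be stored.
struct Mallocator
{
    enum bool stateless = true;

    static void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }

    static void deallocate(void[] b) { free(b.ptr); }
}

// Stateful: keeps a count of live blocks.
struct CountingAllocator
{
    enum bool stateless = false;
    size_t live;

    void[] allocate(size_t n) { ++live; return Mallocator.allocate(n); }
    void deallocate(void[] b) { --live; Mallocator.deallocate(b); }
}

// Typeless use: the container decides what it allocates (raw bytes here).
struct Buffer(A)
{
    static if (A.stateless)
        alias alloc = A; // no field at all
    else
        A* alloc;        // one pointer of overhead

    void[] data;

    void reserve(size_t n) { release(); data = alloc.allocate(n); }

    void release()
    {
        if (data.ptr !is null)
        {
            alloc.deallocate(data);
            data = null;
        }
    }
}

// The stateless version pays no storage for its allocator.
static assert(Buffer!Mallocator.sizeof == (void[]).sizeof);
static assert(Buffer!CountingAllocator.sizeof
        == (void[]).sizeof + (void*).sizeof);

void main()
{
    Buffer!Mallocator a;
    a.reserve(64);
    assert(a.data.length == 64);
    a.release();

    CountingAllocator counter;
    auto b = Buffer!CountingAllocator(&counter);
    b.reserve(64);
    assert(counter.live == 1);
    b.release();
    assert(counter.live == 0);
}
```

Point c) is untouched by this: Buffer!Mallocator and Buffer!CountingAllocator are still distinct types.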




Anyway, bottom line is I don't think that criticism necessarily applies
to D. But there's surely many others and I'm more or less a n00b re
c++'s allocators so idk yet.



--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread H. S. Teoh
On Wed, Jun 26, 2013 at 01:16:31AM +0200, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their values
 and use scope guards to revert them back to the values they were
 before.
 
 Yea, I was thinking this might be a way to go. You'd have a global
 (well, thread-local) allocator instance that can be set and reset
 through stack calls.
 
 You'd want it to be RAII or delegate based, so the scope is clear.
 
 with_allocator(my_alloc, {
  do whatever here
 });
 
 
 or
 
 {
ChangeAllocator!my_alloc dummy;
 
do whatever here
 } // dummy's destructor ends the allocator scope
 
 
 I think the former is a bit nicer, since the dummy variable is a bit
 silly. We'd hope that delegate can be inlined.

Actually, D's frontend leaves something to be desired when it comes to
inlining delegates. It *is* done sometimes, but not as often as one may
like. For example, opApply generally doesn't inline its delegate, even
when it's just a thin wrapper around a foreach loop.

But yeah, I think the former has nicer syntax. Maybe we can help the
compiler with inlining by making the delegate a compile-time parameter?
But it forces a switch of parameter order, which is Not Nice (hurts
readability 'cos the allocator argument comes after the block instead of
before).


 But, the template still has a big advantage: you can change the
 type. And I think that is potentially enormously useful.

True. It can use different types for different allocators that do (or
don't) do cleanups at the end of the scope, depending on what the
allocator needs to do.


 Another question is how to tie into output ranges. Take std.conv.to.
 
 auto s = to!string(10); // currently, this hits the gc
 
 What if I want it to go on a stack buffer? One option would be to
 rewrite it to use an output range, and then call it like:
 
 char[20] buffer;
 auto s = to!string(10, buffer); // it returns the slice of the
 buffer it actually used
 
 (and we can do overloads so to!string(10, radix) still works, as
 well as to!string(10, radix, buffer). Hassle, I know...)

I think supporting the multi-argument version of to!string() is a good
thing, but what to do with library code that calls to!string()? It'd be
nice if we could somehow redirect those GC calls without having to comb
through the entire Phobos codebase for stray calls to to!string().


[...]
 The fun part is the output range works for that, and could also work
 for something like this:
 
 struct malloced_string {
 char* ptr;
 size_t length;
 size_t capacity;
 void put(char c) {
 if(length >= capacity)
ptr = realloc(ptr, capacity*2);
 ptr[length++] = c;
 }
 
 char[] slice() { return ptr[0 .. length]; }
 alias slice this;
 mixin RefCounted!this; // pretend this works
 }
 
 
 {
malloced_string str;
auto got = to!string(10, str);
 } // str is out of scope, so it gets free()'d. unsafe though: if you
 stored a copy of got somewhere, it is now a pointer to freed memory.
 I'd kinda like language support of some sort to help mitigate that
 though, like being a borrowed pointer that isn't allowed to be
 stored, but that's another discussion.

Nice!


 And that should work. So then what we might do is provide these
 little output range wrappers for various allocators, and use them on
 many functions.
 
 So we'd write:
 
 import std.allocators;
 import std.range;
 
 // mallocator is provided in std.allocators and offers the goods
 OutputRange!(char, mallocator) str;
 
 auto got = to!string(10, str);

I like this. However, it still doesn't address how to override the
default allocator in, say, Phobos functions.


 What's nice here is the output range is useful for more than just
 allocators. You could also to!string(10, my_file) or a delegate,
 blah blah blah. So it isn't too much of a burden, it is something
 you might naturally use anyway.

Now *that* is a very nice idea. I like having a way of bypassing using a
string buffer, and just writing the output directly to where it's
intended to go. I think to() with an output range parameter definitely
should be implemented. It doesn't address all of the issues, but it's a
very big first step IMO.
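
To make the output-range idea concrete, here is a minimal sketch. It does not use the real std.conv.to; StackSink and writeInt are hypothetical names used only for illustration. The conversion routine knows nothing about where the characters end up; it only calls put() on whatever sink it is given, so a fixed stack buffer works without touching the GC.

```d
// A sink backed by a fixed-size buffer; no heap allocation involved.
struct StackSink(size_t N)
{
    char[N] buf;
    size_t len;

    void put(char c)
    {
        assert(len < N, "StackSink overflow");
        buf[len++] = c;
    }

    // The slice of the buffer that was actually used.
    const(char)[] slice() const { return buf[0 .. len]; }
}

// Writes the decimal form of value into any sink that has put(char).
void writeInt(Sink)(ref Sink sink, long value)
{
    char[20] tmp;            // enough for the digits of any 64-bit value
    size_t i = tmp.length;
    immutable neg = value < 0;
    // Negate via (value + 1) so that long.min does not overflow.
    ulong v = neg ? cast(ulong)(-(value + 1)) + 1 : cast(ulong) value;
    do
    {
        tmp[--i] = cast(char)('0' + v % 10);
        v /= 10;
    } while (v != 0);

    if (neg) sink.put('-');
    foreach (c; tmp[i .. $]) sink.put(c);
}
```

The same writeInt would work unchanged with a sink that forwards to a file or to a malloc-backed buffer, which is the point being made in the quoted text.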


 Also, we may have the problem of the wrong allocator
 being used to free the object.
 
 Another reason why encoding the allocator into the type is so nice.
 For the minimal D I've been playing with, the idea I'm running with
 is all allocated memory has some kind of special type, and then
 naked pointers are always assumed to be borrowed, so you should
 never store or free them.

Interesting idea. So basically you can tell which allocator was used to
allocate an object just by looking at its type? That's not a bad idea,
actually.


 auto foo = HeapArray!char(capacity);
 
 void bar(char[] lol){}
 
 bar(foo); // allowed, foo has an alias this on slice


Re: why allocators are not discussed here

2013-06-26 Thread H. S. Teoh
On Wed, Jun 26, 2013 at 04:31:40PM +0200, cybervadim wrote:
 On Wednesday, 26 June 2013 at 14:26:03 UTC, H. S. Teoh wrote:
 Yeah, I think the best approach would be one that doesn't require
 changing a whole mass of code to support. Also, one that doesn't
 require language changes would be far more likely to be accepted, as
 the core D devs are leery of adding yet more complications to the
 language.
 
 That's why I proposed that gc_alloc and gc_free be made into
 thread-global function pointers, that can be swapped with a custom
 allocator's version. This doesn't have to be visible to user code; it
 can just be an implementation detail in std.allocator, for example.
 It allows us to implement custom allocators across a block of code
 that doesn't know (and doesn't need to know) what allocator will be
 used.
 
 
 Yes, being able to change gc_alloc, gc_free would do the work. If
 runtime  remembers the stack of gc_alloc/gc_free functions like pushd,
 popd, that would simplify its usage.  I think this is a very nice and
 simple solution to the problem.

Adam's idea does this: tie each replacement of gc_alloc/gc_free to some
stack-based object, that automatically cleans up in the dtor. So
something along these lines:

struct CustomAlloc(A) {
    void* function(size_t size) old_alloc;
    void  function(void* ptr)   old_free;

    this(A alloc) {
        old_alloc = gc_alloc;
        old_free  = gc_free;

        gc_alloc = A.alloc;
        gc_free  = A.free;
    }

    ~this() {
        gc_alloc = old_alloc;
        gc_free  = old_free;

        // Cleans up, e.g., region allocator deletes the
        // region
        A.cleanup();
    }
}

class C {}

void main() {
    auto c = new C();   // allocates using default allocator (GC)
    {
        CustomAlloc!MyAllocator _;

        // Everything from here on until end of block
        // uses MyAllocator

        auto d = new C();   // allocates using MyAllocator

        {
            CustomAlloc!AnotherAllocator _;
            auto e = new C(); // allocates using AnotherAllocator

            // End of scope: auto cleanup, gc_alloc and
            // gc_free reverts back to MyAllocator
        }

        auto f = new C();   // allocates using MyAllocator

        // End of scope: auto cleanup, gc_alloc and
        // gc_free reverts back to default values
    }
    auto g = new C();   // allocates using default allocator
}


So you effectively have an allocator stack, and user code never has to
directly manipulate gc_alloc/gc_free (which would be dangerous).
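
The save-and-restore mechanics can be tried out today without touching druntime. The following is only a sketch under stated assumptions: currentAlloc/currentFree are hypothetical stand-ins for the proposed gc_alloc/gc_free hooks (which do not exist in druntime), and `new` is not redirected. It shows just the stack discipline that the RAII guard provides.

```d
import core.stdc.stdlib : malloc, free;

alias AllocFn = void* function(size_t);
alias FreeFn  = void function(void*);

void* defaultAlloc(size_t n) { return malloc(n); }
void  defaultFree(void* p)   { free(p); }

// Module-level variables are thread-local by default in D,
// so each thread gets its own pair of hooks.
AllocFn currentAlloc = &defaultAlloc;
FreeFn  currentFree  = &defaultFree;

// RAII guard: installs new hooks, restores the previous ones on scope exit.
struct ScopedAllocator
{
    private AllocFn oldAlloc;
    private FreeFn  oldFree;

    @disable this();       // must be constructed with explicit hooks
    @disable this(this);   // copying would restore the hooks twice

    this(AllocFn a, FreeFn f)
    {
        oldAlloc = currentAlloc;
        oldFree  = currentFree;
        currentAlloc = a;
        currentFree  = f;
    }

    ~this()
    {
        currentAlloc = oldAlloc;
        currentFree  = oldFree;
    }
}

// A second allocator that counts calls, to make the switch observable.
int countedCalls;
void* countingAlloc(size_t n) { ++countedCalls; return malloc(n); }
void  countingFree(void* p)   { free(p); }
```

A block would then open with `auto guard = ScopedAllocator(&countingAlloc, &countingFree);`. Nothing in this sketch stops a pointer obtained inside the block from outliving it, which is exactly the safety objection raised elsewhere in the thread.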


T

-- 
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen


Re: why allocators are not discussed here

2013-06-26 Thread Dicebot
Some type system help is required to guarantee that references to 
such scope-allocated data won't escape.


Re: why allocators are not discussed here

2013-06-26 Thread H. S. Teoh
On Wed, Jun 26, 2013 at 06:51:54PM +0400, Dmitry Olshansky wrote:
 26-Jun-2013 03:16, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:
 And maybe (b) can be implemented by making gc_alloc / gc_free
 overridable function pointers? Then we can override their values and
 use scope guards to revert them back to the values they were before.
 
 Yea, I was thinking this might be a way to go. You'd have a global
 (well, thread-local) allocator instance that can be set and reset
 through stack calls.
 
 You'd want it to be RAII or delegate based, so the scope is clear.
 
 with_allocator(my_alloc, {
   do whatever here
 });
 
 
 or
 
 {
 ChangeAllocator!my_alloc dummy;
 
 do whatever here
 } // dummy's destructor ends the allocator scope
 
 
 Both suffer from
 a) being totally unsafe and in fact bug prone since all references
 obtained in there are now dangling (and there is no indication where
 they came from)

How is this different from using malloc() and free() manually? You have
no indication of where a void* came from either, and the danger of
dangling references is very real, as any C/C++ coder knows. And I assume
that *some* people will want to be defining custom allocators that wrap
around malloc/free (e.g. the game engine guys who want total control).


 b) imagine you need to use an allocator for a stateful object. Say
 forward range of some other ranges (e.g. std.regex) both
 scoped/stacked to allocate its internal stuff. 2nd one may handle it
 but not the 1st one.

Yeah this is a complicated area. A container basically needs to know how
to allocate its elements. So somehow that information has to be
somewhere.


 c) transfer of objects allocated differently up the call graph
 (scope graph?), is pretty much neglected I see.

They're incompatible. You can't safely make a linked list that contains
both GC-allocated nodes and malloc() nodes. That's just a bomb waiting
to explode in your face. So in that sense, Adam's idea of using a
different type for differently-allocated objects makes sense. A
container has to declare what kind of allocation its members are using;
any other way is asking for trouble.


 I kind of wondering how our knowledgeable community has come to this.
 (must have been starving w/o allocators way too long)

We're just trying to provoke Andrei into responding. ;-)


[...]
 IMHO the only place for allocators is in containers other kinds of
 code may just ignore allocators completely.

But some people clamoring for allocators are doing so because they're
bothered by Phobos using ~ for string concatenation, which implicitly
uses the GC. I don't think we can just ignore that.


 std.algorithm and friends should imho be customized on 2 things only:
 
 a) containers to use (instead of array)
 b) optionally a memory source (or allocator) if the container is
 temporary (scoped), to tie its lifetime to something.
 
 Want temporary stuff? Use temporary arrays, hashmaps and whatnot
 i.e. types tailored for a particular use case (e.g. with a
 temporary/scoped allocator in mind).
 These would all be unsafe though. Alternative is ref-counting
 pointers to an allocator. With word on street about ARC it could be
 nice direction to pursue.

Ref-counting is not fool-proof, though. There's always cycles to mess
things up.


 Allocators (as Andrei points out in his video) have many kinds:
 a) persistence: infinite, manual, scoped
 b) size: unlimited vs fixed
 c) block-size: any, fixed, or *any* up to some maximum size
 
 Most of these ARE NOT interchangeable!
 Yet some are composable however I'd argue that allocators are not
 composable but have some reusable parts that in turn are composable.

I was listening to Andrei's talk this morning, but I didn't quite
understand what he means by composable allocators. Is he talking about
nesting, say, a GC inside a region allocated by a region allocator?


 Code would still have to cater for specific flavors of allocators,
 so we'd better reduce this problem to the selection of containers.
[...]

Hmm. Sounds like we have two conflicting things going on here:

1) En masse replacement of gc_alloc/gc_free in a certain block of code
(which may be the entire program), e.g., for the avoidance of GC in game
engines, etc.. Basically, the code is allocator-agnostic, but at some
higher level we want to control which allocator is being used.

2) Specific customization of containers, etc., as to which allocator(s)
should be used, with (hopefully) some kind of support from the type
system to prevent mistakes like dangling pointers, escaping references,
etc.. Here, the code is NOT allocator-agnostic; it has to be written
with the specific allocation model in mind. You can't just replace the
allocator with another one without introducing bugs or problems.

These two may interact in complex ways... e.g., you might want to use
malloc to allocate a pool, then use a custom gc_alloc/gc_free to
allocate from this pool in order to support language built-ins like ~
and ~= without needing to rewrite every function that uses strings.

Re: why allocators are not discussed here

2013-06-26 Thread Dicebot
By the way, while this topic gets some attention, I want to 
note that there are actually two orthogonal entities that 
arise when speaking about configurable allocation: allocators 
themselves and global allocation policies. I think a good design should 
address both of those.


For example, swapping the global allocator for a custom one has limited 
usability: you are limited anyway by the language design, which 
makes only GC or ref-counting viable general options. However, 
some way to prohibit automatic allocations at runtime while still 
allowing manual ones may be useful, and it does not matter what 
allocator is actually used to get that memory. Once such an API is 
designed, tighter classification and control may be added with 
time.


Re: why allocators are not discussed here

2013-06-26 Thread Brian Rogoff

On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:
I was listening to Andrei's talk this morning, but I didn't quite
understand what he means by composable allocators. Is he talking about
nesting, say, a GC inside a region allocated by a region allocator?


Maybe he was talking about a freelist allocator over a reap, as
described by the HeapLayers project http://heaplayers.org/ in the
paper from 2001 titled 'Composing High-Performance Memory
Allocators'. I'm pretty sure that web site was referenced in the
talk. A few publications there are from Andrei.

I agree that D should support programming without a GC, with
different GCs than the default one, and custom allocators, and
that features which demand a GC will be troublesome.

-- Brian


Re: why allocators are not discussed here

2013-06-26 Thread Adam D. Ruppe

On Wednesday, 26 June 2013 at 16:40:20 UTC, H. S. Teoh wrote:
I think supporting the multi-argument version of to!string() is 
a good thing, but what to do with library code that calls 
to!string()? It'd be nice if we could somehow redirect those GC 
calls without having to comb through the entire Phobos codebase 
for stray calls to to!string().



Let's consider what kinds of allocations we have. We can break 
them up into two broad groups: internal and visible.


Internal allocations, in theory, don't matter. These can be on 
the stack, the gc heap, malloc/free, whatever. The function 
itself is responsible for their entire lifetime.


Changing these either optimizes, in the case of reusing a region, 
or leaks, if you switch it to manual and the function doesn't know 
it.


Visible allocations are important because the caller is 
responsible for freeing them. Here, I really think we want the 
type system's help: either it should return something that we 
know we're responsible for, or take a buffer/output range from us 
to receive the data in the first place.


Either way, the function signature should reflect what's going on 
with visible allocations. It'd possibly return a wrapped type and 
it'd take an output range/buffer/allocator.




With internals though, the only reason I can see why you'd want 
to change them outside the function is to give them a region of 
some sort to work with, especially since you don't know for sure 
what it is doing - these are all local variables to the 
function/call stack. And here, I don't think we want to change 
the allocator wholesale.


At most, we'd want to give it hints that what we're doing are 
short lived. (Or, better yet, have it figure this out on its own, 
like a generational gc.)




So I think this is more about tweaking the gc than replacing it, 
at most adding a couple new functions to it:


GC.hint_short_lived // returns a helper struct with a static refcount:


struct TempGcAllocator {
    static int tempCount = 0;
    static void* localRegion;
    this() { tempCount++; } // pretend this works
    ~this() {
        tempCount--;
        if(tempCount == 0)
            gc.tryToCollect(localRegion);
    }

    T create(T, Args...)(Args args) { return GC.new_short_lived T(args); }
}


and gc.tryToCollect() does a quick scan for anything pointing into the 
local region. If nothing points in there, it frees the whole 
thing. If something does, in the name of memory safety, it just 
reintegrates that local region into the regular memory and gc's 
its components normally.




The reason the count is static is that you don't have to pass 
this thing down the call stack. Any function that wants to adapt 
to this generational hint system just calls hint_short_lived. If 
you're a leaf function, that's ok, the static count means you'll 
inherit the region from the function above you.


You would NOT use this in main(), as that defeats the purpose.



I think to() with an output range parameter definitely
should be implemented.


No doubt about it, we should aim for most phobos functions not to 
allocate at all, if given an output range they can use.



Interesting idea. So basically you can tell which allocator was 
used to allocate an object just by looking at its type?


Right, then you'll know if you have to free() it. (Or it can free 
itself with its destructor.)



This is a bit inconvenient. So your member variables will have 
to know what allocation type is being used. Not the end of the 
world, of course, but not as pretty as one would like.


Yeah, you'd need to know if you own them or not too (are you 
responsible for freeing that string you just got passed? If no, 
are you sure it won't be freed while you're still using it?), but 
I just think that's a part of memory management you can't 
sidestep.


There's two easy answers: 1) always make a private copy of 
anything you store (and perhaps write to) or 2) use a gc and 
trust it to always be the owner.


In any other case, I think you *have* to think about it, and the 
type telling you can help you make that decision.



and allows you to mix differently-allocated objects without 
having to


Important to remember though that you are borrowing these 
references, not taking ownership.


I think the rule "all pointers/slices are borrowed" is fairly 
workable though. With the gc, that's ok, you don't own anything. 
The garbage collector is responsible for it all, so store away. 
(Though if it is mutable, you might want to idup it so you don't 
get overwritten by someone else. But that's a separate question 
from allocation method and already encoded in D's type 
system).


So never free() a naked pointer unless you know what you're 
doing, like when interfacing with a C library; prefer to only free a 
ManuallyAllocated!(pointer).


Hell, a C library binding could change the type too; it'd still be 
binary compatible. RefCounted!T wouldn't be, but 
ManuallyAllocated!T would just be a wrapper around T*.


I think I'm starting to ramble!
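
A minimal sketch of what such a wrapper might look like. ManuallyAllocated is the name used in the post above, but this body is an assumption made for illustration, not existing library code. The wrapper holds exactly one T*, so it stays layout-compatible with a raw pointer, and it decays to a naked (borrowed) pointer through alias this. It assumes plain-data T.

```d
import core.stdc.stdlib : malloc, free;

// Owning handle: the type itself says "you are responsible for freeing this".
struct ManuallyAllocated(T)
{
    T* ptr;

    static ManuallyAllocated create(T value)
    {
        auto p = cast(T*) malloc(T.sizeof);
        assert(p !is null, "out of memory");
        *p = value;              // fine for plain data; real code would emplace
        return ManuallyAllocated(p);
    }

    // Only the owner frees; borrowers never see this method.
    void release()
    {
        free(ptr);
        ptr = null;
    }

    alias ptr this;   // implicit conversion to a naked T* for borrowing
}

// Same size as a raw pointer, so a C binding could use it in place of T*.
static assert(ManuallyAllocated!(int).sizeof == (int*).sizeof);

// A borrower: takes a naked pointer, must neither store nor free it.
int readBorrowed(int* p) { return *p; }
```

The convention is still only a convention here: nothing prevents readBorrowed from squirreling the pointer away, which is the gap a "borrowed" or scope qualifier would have to close.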


Re: why allocators are not discussed here

2013-06-26 Thread Adam D. Ruppe

On Wednesday, 26 June 2013 at 17:25:24 UTC, H. S. Teoh wrote:

malloc to allocate a pool, then use a custom gc_alloc/gc_free to
allocate from this pool in order to support language built-ins 
like ~ and ~= without needing to rewrite every function that 
uses strings.


Blargh, I forgot about operator ~ on built ins. For custom types 
it is easy enough to manage, just overload it. You can even do ~= 
on types that aren't allowed to allocate, if they have a certain 
capacity set up ahead of time (like a stack buffer)


But for built ins, blargh, I don't even think we can hint on them 
to the gc. Maybe we should just go ahead and make the gc 
generational. (If you aren't using gc, I say leave binary ~ 
unimplemented in all cases. Use ~= on a temporary instead 
whenever you would do that. It is easier to follow the lifetime 
if you explicitly declare your temporary.)
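
As a sketch of the "~= on a type that isn't allowed to allocate" idea: StackString below is a hypothetical name, and the overflow policy (an assert) is just one possible choice. Capacity is fixed up front, so appending never reaches the GC.

```d
// Fixed-capacity string buffer living wherever the struct lives (e.g. the stack).
struct StackString(size_t capacity)
{
    private char[capacity] data;
    private size_t len;

    // Supports `s ~= "text"`; never allocates, fails loudly when full.
    void opOpAssign(string op : "~")(const(char)[] s)
    {
        assert(s.length <= capacity - len, "StackString overflow");
        data[len .. len + s.length] = s[];
        len += s.length;
    }

    // Borrowed view of the used part; valid only while the buffer is alive.
    const(char)[] slice() const { return data[0 .. len]; }
}
```

Binary ~ is deliberately left out, in line with the suggestion above: it would have to produce a new value with its own storage, and that is where lifetimes get hard to follow.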


Re: why allocators are not discussed here

2013-06-26 Thread cybervadim
On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky 
wrote:
Here is a chief problem - the assumption that is required to 
make it magically work.


Now what I see is:

T arr[];//TLS

//somewhere down the line
arr = ... ;
else{
...
allocator(myAlloc){
arr = array(filter!);
}
...
}
return arr;

Having an unsafe magic wand that may transmogrify some code to 
switch allocation strategy I consider naive and dangerous.


Who ever told you process does return before allocating a few 
Gigs of RAM (and hoping on GC collection)? Right, nobody. Maybe 
it's an event loop that may run forever.


What is missing is that code up to date assumes new == GC and 
works _like that_.


Not magic, but a tool which is quite powerful, and thus it may 
shoot you in the foot.
This is unsafe, but if you want it safe, don't use allocators; 
stay with the GC.
In the example above, the first arr gets freed by the GC, and the second 
arr may point to nothing if myAlloc was implemented to free it 
beforehand. Or you may get a proper arr reference if myAlloc used 
malloc and didn't free it. The fact that you may write bad code 
does not make the language (or concept) bad.




Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 23:04, cybervadim wrote:

On Wednesday, 26 June 2013 at 14:59:41 UTC, Dmitry Olshansky wrote:



Having an unsafe magic wand that may transmogrify some code to switch
allocation strategy I consider naive and dangerous.

Who ever told you process does return before allocating a few Gigs of
RAM (and hoping on GC collection)? Right, nobody. Maybe it's an event
loop that may run forever.

What is missing is that code up to date assumes new == GC and works
_like that_.


Not magic, but the tool which is quite powerful and thus it may shoot
your leg.


I know what kind of thing you are talking about. It ain't powerful; 
it's just a hack that doesn't quite do what is advertised.



This is unsafe, but if you want it safe, don't use allocators, stay with
GC.


BTW, you were talking about changing the allocation of code you didn't write.
There is not even a single fact that makes the thing safe. It's all 
working by chance, or because the thing was designed to work with a scoped 
allocator to begin with.


I believe the 2nd case (designed to use scoped allocation) is the one where:
a) The behavior is guaranteed (determinism vs GC etc.)
b) Safety is assured by the designer, not by pure luck (or a reasonable 
assumption that may not hold)


In the example above, you get first arr freed by GC, second arr may
point to nothing if myAlloc was implemented to free it before. Or you
may get a proper arr reference if myAlloc used malloc and didn't free
it.


Yeah, I know, hence I showed it. BTW, forget about malloc; I'm not talking 
about explicit malloc being an alternative to your scheme.


 The fact that you may write bad code does not make the language (or
 concept) bad.

It does, because it makes unreliable and bug-prone usage easy.

--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 21:35, Dicebot wrote:

By the way, while this topic gets some attention, I want to make a
notice that there are actually two orthogonal entities that arise when
speaking about configurable allocation - allocators itself and global
allocation policies. I think good design should address both of those.



Sadly I believe that global allocators would still have to be compatible 
with GC (to not break code in hard to track ways) thus basically being a 
GC. Hence we can easily stop talking about them ;)




--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

26-Jun-2013 21:23, H. S. Teoh wrote:


Both suffer from
a) being totally unsafe and in fact bug prone since all references
obtained in there are now dangling (and there is no indication where
they came from)


How is this different from using malloc() and free() manually? You have
no indication of where a void* came from either, and the danger of
dangling references is very real, as any C/C++ coder knows. And I assume
that *some* people will want to be defining custom allocators that wrap
around malloc/free (e.g. the game engine guys who want total control).


Why the heck do you people think I propose to use malloc directly as an 
alternative to whatever hackish allocator stack is proposed?


Use the darn container. For starters, I'd make the allocation strategy a 
parameter of each container. At least they do OWN memory.


Then refactor out common pieces into a framework of allocation helpers. 
In the end, I'd personally separate concerns into 3 entities:


1. Memory area objects - think of them as allocators but without the 
circuitry to do the allocation, e.g. a chunk of memory returned by 
malloc/alloca can be wrapped into a memory area object.


2. Allocators (Policies) - a potentially nested combination of such 
circuitry that makes use of memory areas: free-lists, pools, stacks, 
etc. Safe ones have ref-counting on memory areas, unsafe ones don't. 
(Though safety largely depends on the way you got that chunk of memory.)


3. Containers/Wrappers - as above, objects that handle the life-cycle of 
objects and make use of allocators. In fact, allocators are part of 
the type, but memory area objects are not.
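
One possible reading of this three-way split, as a rough sketch. MemoryArea, BumpPolicy and FixedArray are hypothetical names, and the ref-counting of memory areas mentioned in point 2 is left out, so this is the "unsafe" flavor. The policy is a compile-time parameter of the container, while the memory area is handed over at run time.

```d
// 1. Memory area: just a chunk of bytes, with no allocation logic of its own.
struct MemoryArea
{
    void[] bytes;
}

// 2. Policy: bump ("stack") allocation over whatever area it is given.
struct BumpPolicy
{
    MemoryArea* area;
    size_t used;

    void[] allocate(size_t n)
    {
        if (n > area.bytes.length - used) return null;
        auto block = area.bytes[used .. used + n];
        used += n;
        return block;
    }
}

// 3. Container: the policy is part of the type, the area is not.
struct FixedArray(T, Policy)
{
    Policy policy;
    T[] items;      // storage obtained from the policy
    size_t count;   // number of slots in use

    this(MemoryArea* area, size_t capacity)
    {
        policy = Policy(area);
        // Assumes the area is suitably aligned for T.
        items = cast(T[]) policy.allocate(T.sizeof * capacity);
    }

    bool put(T value)
    {
        if (count == items.length) return false;
        items[count++] = value;
        return true;
    }

    T[] slice() { return items[0 .. count]; }
}
```

Two FixedArray!(int, BumpPolicy) values backed by different areas have the same type, so they remain interchangeable at the type level; only a change of policy produces a different type.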





b) imagine you need to use an allocator for a stateful object. Say
forward range of some other ranges (e.g. std.regex) both
scoped/stacked to allocate its internal stuff. 2nd one may handle it
but not the 1st one.


Yeah this is a complicated area. A container basically needs to know how
to allocate its elements. So somehow that information has to be
somewhere.



c) transfer of objects allocated differently up the call graph
(scope graph?), is pretty much neglected I see.


They're incompatible. You can't safely make a linked list that contains
both GC-allocated nodes and malloc() nodes.


What I mean is that if types are the same as built-ins it would be a 
horrible mistake. If not then we are talking about containers anyway.
And if these have a ref-counted pointer to their allocator then the 
whole thing is safe albeit at the cost of performance.


Sadly, alias this to some built-in (e.g. a slice) allows squirreling away 
the underlying reference too easily.


As such I don't believe in any of the 2 *lies*:
a) built-ins can be refurbished to use custom allocators
b) we can add opSlice/alias this or whatever to our custom type to get 
access to the underlying built-ins safely and transparently


Both are just nuclear bombs waiting for a good time to explode.

That's just a bomb waiting to explode in your face. So in that sense,
Adam's idea of using a different type for differently-allocated objects
makes sense.


Yes, but one should be careful here as not to have exponential explosion 
in the code size. So some allocators have to be compatible and if there 
is a way to transfer ownership it'd be bonus points (and a large pot of 
these mind you).



A container has to declare what kind of allocation its members are using;
any other way is asking for trouble.


Hence my thoughts to move this piece of circuitry to containers 
proper. The whole idea that by swapping malloc with myMalloc you can 
translate to a wildly different allocation scheme doesn't quite hold.


I think it may be interesting to try and put a wall in different place 
namely in between allocation strategy and memory areas it works on.




I kind of wondering how our knowledgeable community has come to this.
(must have been starving w/o allocators way too long)


We're just trying to provoke Andrei into responding. ;-)


Cool, then keep it coming, but ... safety and other holes have to be taken 
care of.



[...]

IMHO the only place for allocators is in containers other kinds of
code may just ignore allocators completely.


But some people clamoring for allocators are doing so because they're
bothered by Phobos using ~ for string concatenation, which implicitly
uses the GC. I don't think we can just ignore that.


~= would work with any sensible array-like container.
~ is sadly only a convenience for scripts and/or apps that are not 
performance (determinism) critical.




std.algorithm and friends should imho be customized on 2 things only:

a) containers to use (instead of array)
b) optionally a memory source (or allocator) if the container is
temporary (scoped), to tie its lifetime to something.

Want temporary stuff? Use temporary arrays, hashmaps and whatnot
i.e. types tailored for a particular use case (e.g. with a
temporary/scoped allocator in mind).
These would all be unsafe though. Alternative is ref-counting
pointers to an allocator. With word on street about ARC it 

Re: why allocators are not discussed here

2013-06-26 Thread Marco Leise
On Wed, 26 Jun 2013 16:30:50 +0200,
Robert Schadek realbur...@gmx.de wrote:

 
  Imagine we have two delegates:
 
  void* delegate(size_t);  // this one allocs
  void delegate(void*);// this one frees
 
  you pass both to a function that constructs your object. The first is
  used for allocating the memory; the second gets attached to the
  TypeInfo and is used by the gc to free the object.
 

Does it mean 16 extra bytes for every allocation ?

-- 
Marco



Re: why allocators are not discussed here

2013-06-26 Thread Robert Schadek
On 06/26/2013 10:06 PM, Marco Leise wrote:
 Does it mean 16 extra bytes for every allocation ?

Yes, or wrap it, and you have 4 or 8 bytes, but yes, you would have to
save it somewhere.
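
The numbers in this exchange can be checked at compile time. The struct names below are hypothetical and only model the two options: a delegate is a (context pointer, function pointer) pair, so storing the freeing delegate inline costs two pointers per allocation, while wrapping both delegates in a shared record costs one pointer per allocation.

```d
alias AllocDg = void* delegate(size_t);  // this one allocs
alias FreeDg  = void delegate(void*);    // this one frees

// A delegate is two pointers wide: 16 bytes on 64-bit, 8 on 32-bit.
static assert(FreeDg.sizeof == 2 * (void*).sizeof);

// Option 1: keep the freeing delegate with every allocation.
struct InlineHeader
{
    FreeDg release;
}

// Option 2: keep both delegates once, and point at them from each allocation.
struct SharedRecord
{
    AllocDg alloc;
    FreeDg  release;
}

struct PointerHeader
{
    SharedRecord* owner;
}

static assert(InlineHeader.sizeof == 2 * (void*).sizeof);
static assert(PointerHeader.sizeof == (void*).sizeof);   // the "4 or 8 bytes"
```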


Re: why allocators are not discussed here

2013-06-26 Thread Dicebot
On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky 
wrote:
Sadly I believe that global allocators would still have to be 
compatible with GC (to not break code in hard to track ways) 
thus basically being a GC. Hence we can easily stop talking 
about them ;)


Nice way to say we don't really need those embedded, kernel and 
gamedev guys. GC, as a safe and obvious approach, should be the 
default, but druntime needs to provide means for tight and 
dangerous control upon explicit request.


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

27-Jun-2013 00:53, Dicebot wrote:

On Wednesday, 26 June 2013 at 19:40:54 UTC, Dmitry Olshansky wrote:

Sadly I believe that global allocators would still have to be
compatible with GC (to not break code in hard to track ways) thus
basically being a GC. Hence we can easily stop talking about them ;)


Nice way to say we don't really need those embedded, kernel and gamedev
guys. GC, as a safe and obvious approach, should be the default, but
druntime needs to provide means for tight and dangerous control upon
explicit request.


Just don't use certain built-ins. Stub them out in run-time if you like. 
The only problematic point I see is closures allocated on heap.


Frankly I see embedded, kernel and gamedev guys using ref-counting and 
custom data structures all the time. They all want that level of control 
and determinism anyway or are so resource constrained that GC is too 
much code space or run-time overhead anyway.


Needless to say that custom run-time for the first 2 categories is 
required anyway so just hack the druntime. It would be nice to have 
hooks readily available (and documented?) to do so but hardly beyond that.


--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread Dicebot
On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky 
wrote:
Needless to say that custom run-time for the first 2 categories 
is required anyway so just hack the druntime. It would be nice 
to have hooks readily available (and documented?) to do so but 
hardly beyond that.


It is an API issue. Hacking druntime is, unfortunately, 
inevitable, but keeping the ability to swap those two with no code 
changes simplifies the development process and makes it less tempting 
to forget about this use case when doing std lib / runtime stuff; 
it has been a second-class citizen for a rather long time.


Re: why allocators are not discussed here

2013-06-26 Thread Adam D. Ruppe
On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky 
wrote:
Just don't use certain built-ins. Stub them out in run-time if 
you like. The only problematic point I see is closures 
allocated on heap.


Actually, I was kinda sorta able to solve this in my minimal d.

// this would be used for automatic heap closures, but there's no
// way to free it...

///*
extern(C)
void* _d_allocmemory(size_t bytes) {
    auto ptr = manual_malloc(bytes);
    debug(allocations) {
        char[16] buffer;
        write("warning: automatic memory allocation ",
            intToString(cast(size_t) ptr, buffer));
    }
    return ptr;
}

struct HeapClosure(T) if(is(T == delegate)) {
    mixin SimpleRefCounting!(T, q{
        char[16] buffer;
        write("\nfreeing closure ",
            intToString(cast(size_t) payload.ptr, buffer), "\n");
        manual_free(payload.ptr);
    });
}

HeapClosure!T makeHeapClosure(T)(T t) { // if(__traits(isNested, T)) {
    return HeapClosure!T(t);
}

void closureTest2(HeapClosure!(void delegate()) test) {
    write("\nptr is ", cast(size_t) test.ptr, "\n");
    test();

    auto b = test;
}

void closureTest() {
    string a = "whoa";
    scope(exit) write("\n\nexit\n\n");
    //throw new Exception("test");
    closureTest2( makeHeapClosure({ write(a); }) );
}




It worked in my toy tests. The trick would be though to never 
store or use a non-scope builtin delegate. Using RTInfo, I 
believe I can statically verify you don't do this in the whole 
program,  but haven't actually tried yet.



I also left built in append unimplemented, but did custom types 
with ~= that are pretty convenient. Binary ~ is a loss though, 
too easy to lose pointers with that.


Re: why allocators are not discussed here

2013-06-26 Thread Dmitry Olshansky

27-Jun-2013 01:05, Adam D. Ruppe wrote:

On Wednesday, 26 June 2013 at 21:00:54 UTC, Dmitry Olshansky wrote:

Just don't use certain built-ins. Stub them out in run-time if you
like. The only problematic point I see is closures allocated on heap.


Actually, I was kinda sorta able to solve this in my minimal d.

// this would be used for automatic heap closures, but there's no way to
free it...


[snip a cool hack]

Yeah, I suspected something like this might work. Basically defining 
your own ref-counted closure type and forgoing the delegate keyword in your 
codebase (except in the file that defines the heap closure). That still 
leaves chasing code like auto dg = (...){ ... } though.


Maybe having it as a template Closure!(ret-type, arg types...)
and an instantiator function called simply closure could be more
aesthetically pleasing (this is IMHO).


It worked in my toy tests. The trick would be though to never store or
use a non-scope builtin delegate. Using RTInfo, I believe I can
statically verify you don't do this in the whole program,  but haven't
actually tried yet.


I also left built in append unimplemented, but did custom types with ~=
that are pretty convenient. Binary ~ is a loss though, too easy to lose
pointers with that.



--
Dmitry Olshansky


Re: why allocators are not discussed here

2013-06-26 Thread Adam D. Ruppe
So to try some ideas, I started implementing a simple container 
with replaceable allocators: a singly linked list.


All was going kinda well until I realized the forward range it 
offers to iterate its contents makes it possible to escape a 
reference to a freed node.


auto range = list.range;
auto range2 = range;
range.removeFront();

range2 now refers to a freed node. Maybe the nodes could be 
refcounted, though a downside there is even the range won't be 
sharable, it would be a different type based on allocation 
method. (I was hoping to make the range be a sharable component, 
even as the list itself changed type with allocators.)


I guess we could @disable copy construction, and make it an 
input range instead of a forward one, but that takes some of the 
legitimate usefulness away.


Interestingly though, opApply would be ok here, since all it 
would expose is the payload.
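
A rough sketch of the opApply idea, assuming a malloc-backed list. MallocList, insertFront and sumOf are made-up names for illustration, and the payload is assumed to be plain data with no destructor. Nodes never leave the container, so a caller has nothing to hold on to after removeFront():

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical container, not from any library: a malloc-backed list
// that only offers opApply, so callers never see a node pointer.
struct MallocList(T)
{
    private static struct Node { T payload; Node* next; }
    private Node* head;

    @disable this(this); // single owner; a copy would double-free

    void insertFront(T value)
    {
        auto n = cast(Node*) malloc(Node.sizeof);
        assert(n !is null, "out of memory");
        n.payload = value; // assumes T is plain data
        n.next = head;
        head = n;
    }

    void removeFront()
    {
        assert(head !is null);
        auto n = head;
        head = n.next;
        free(n);
    }

    // Iteration hands out the payload by value; nodes stay private.
    int opApply(scope int delegate(T) dg)
    {
        for (auto n = head; n !is null; n = n.next)
            if (auto r = dg(n.payload))
                return r;
        return 0;
    }

    ~this()
    {
        while (head !is null)
            removeFront();
    }
}

// foreach lowers to a call to opApply, so no node reference escapes.
int sumOf(ref MallocList!int list)
{
    int total = 0;
    foreach (x; list)
        total += x;
    return total;
}
```

This sidesteps the dangling-range problem only by giving up ranges entirely, and it says nothing yet about payloads that are themselves references.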


(though if the payload is a reference type, does the container 
take ownership of it? How do we indicate that? Perhaps more 
interestingly, how do we indicate the /lack/ of ownership at the 
transfer point?)




This is all fairly easy if we just decide we're going to do this 
with GC or we're going to do this C style and do the whole 
program like that, libraries and all. But trying to mix and match 
just gets more complicated the more I think about it :( It makes 
the question of allocators look trivial.


Re: why allocators are not discussed here

2013-06-26 Thread H. S. Teoh
On Thu, Jun 27, 2013 at 12:43:54AM +0200, Adam D. Ruppe wrote:
 So to try some ideas, I started implementing a simple container with
 replaceable allocators: a singly linked list.
 
 All was going kinda well until I realized the forward range it
 offers to iterate its contents makes it possible to escape a
 reference to a freed node.
[...]
 (though if the payload is a reference type, does the container take
 ownership of it? How do we indicate that? Perhaps more interestingly,
 how do we indicate the /lack/ of ownership at the transfer point?)

Maybe a type distinction akin to C++'s auto_ptr might help? Say we
introduce OwnedRef!T vs. plain old T*. So something returning OwnedRef!T
will need to assume ownership of the object, whereas something returning
T* would just be returning a reference, but the container continues to
hold ownership over the object.
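
As a minimal sketch of that distinction (assuming malloc-backed storage and plain-data payloads; makeOwned, borrow and release are hypothetical names, not an existing Phobos API), the owning type refuses to be copied and frees on destruction, while a plain T* is only ever borrowed:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical owning handle: whoever holds it must clean up.
struct OwnedRef(T)
{
    private T* ptr;

    @disable this(this);            // ownership cannot be duplicated

    ~this() { if (ptr !is null) free(ptr); }

    // Hands out a plain pointer; the caller must not keep or free it.
    T* borrow() { return ptr; }

    // Transfers ownership to the returned handle and empties this one.
    OwnedRef release()
    {
        auto r = OwnedRef(ptr);
        ptr = null;
        return r;
    }
}

// Assumed helper: allocates with malloc and copies a plain value in.
OwnedRef!T makeOwned(T)(T value)
{
    auto p = cast(T*) malloc(T.sizeof);
    assert(p !is null, "out of memory");
    *p = value;
    return OwnedRef!T(p);
}
```

Nothing here stops a caller from stashing the pointer returned by borrow(); the type only documents who is responsible for the free.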


 This is all fairly easy if we just decide we're going to do this
 with GC or we're going to do this C style and do the whole
 program like that, libraries and all. But trying to mix and match
 just gets more complicated the more I think about it :( It makes the
 question of allocators look trivial.

Heh. Yeah, I'm starting to wonder if it even makes sense to try to
mix-n-match GC-based and non-GC-based allocators. It seems that maybe we
just have to settle for the fact of life that a GC-based object is
fundamentally incompatible with a pool-allocated object, and both are
also fundamentally incompatible with malloc-allocated objects, 'cos you
need the code to be aware in each instance of what needs to be done to
cleanup, etc..


T

-- 
GEEK = Gatherer of Extremely Enlightening Knowledge


Re: why allocators are not discussed here

2013-06-26 Thread Adam D. Ruppe

On Wednesday, 26 June 2013 at 23:02:47 UTC, H. S. Teoh wrote:

Maybe a type distinction akin to C++'s auto_ptr might help?


Yeah, that's what I'm thinking, but I don't really like it. 
Perhaps I'm trying too hard to cover everything, and should be 
happier with just doing what C++ does. Full memory safety is 
prolly out the window anyway.


In std.typecons, there's a Unique!T, but it doesn't look 
complete. A lot of the code is commented out, maybe it was 
started back in the days of bug city.


Yeah, I'm starting to wonder if it even makes sense to try to 
mix-n-match GC-based and non-GC-based allocators.


It might not be so bad if we modified D to add a lent storage 
class, or something, similar to some discussions about scope in 
the past.


These would be values you may work with, but never keep; 
assigning them to anything is not allowed and you may only pass 
them to a function or return them from a function if that is also 
marked lent. Any regular reference would be implicitly usable as 
lent.


int* ptr;

void bar(int* a) {
  foo(a); // ok
}

int* foo(lent int* a) {
   bar(a); // error, cannot call bar with lent pointer
   ptr = a; // error, cannot assign lent value to non-lent field
   foo2(a); // ok
   foo(foo2(a)); // ok
   return a; // error, cannot return a lent value
}

lent int* foo2(lent int* a) {
   return a; // ok
}

foo(ptr); // ok (if foo actually compiled)

And finally, if you take the address of a lent reference, that 
itself is lent; &(lent int*) == lent int**.



Then, if possible, it would be cool if:

lent int* a;
{
  int* b;
  a = b;
}


That was an error, because a outlived b. But since you can't 
store a anywhere, the only time this would happen would be 
something like here. And hell maybe we could hammer around that 
by making lent variables head const and say they must be 
initialized at declaration, so lent int* a; is illegal as well 
as a = b;. But we wouldn't want it transitively const, because 
then:


void fillBuffer(lent char[] buffer) {}

would be disallowed and that is something I would definitely want.



Part of me thinks pure might help with this too but eh maybe 
not because even a pure function could in theory escape a 
reference via its other parameters.




But with this kind of thing, we could do a nicer pointer type 
that does:


lent T getThis() { return _this; }
alias getThis this;

and thus implicitly convert our inner pointer to something we can 
use on the outside world with some confidence that they won't 
sneak away any references to it. If combined with @disabling the 
address of operator on the container itself, we could really lock 
down ownership.


Re: why allocators are not discussed here

2013-06-25 Thread H. S. Teoh
On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:
 I know Andrei mentioned he was going to work on Allocators a year
 ago. In DConf 2013 he described the problems he needs to solve with
 Allocators. But I wonder if I am missing the discussion around that
 - I tried searching this forum, found a few threads that were not
 actually a brainstorm for Allocators design.
 
 Please point me in the right direction
 or
 is there a reason it is not discussed
 or
 should we open the discussion?

That would be nice to get things going. :)

Ever since I found D and subscribed to this mailing list, I've been
hearing rumors of allocators, but they seem to be rather lacking in the
department of concrete evidence. They're like the Big Foot or Swamp Ape
of D. Maybe it's time we got out into the field and produced some real
evidence of these mythical beasts. :-P


 The easiest approach for Allocators design I can imagine would be to
 let user specify which Allocator operator new should get the memory
 from (introducing a new keyword allocator). This gives a total
 control, but assumes user knows what he is doing.
 
 Example:
 
 CustomAllocator ca;
 allocator(ca) {
   auto a = new A; // operator new will use CustomAllocator::malloc()
   auto b = new B;
 
   free(a); // that should call CustomAllocator::free()
   // if free() is missing for allocated area, it is a user
   // responsibility to make sure custom Allocator can handle that
 }
 
 By default allocator is the druntime using GC, free(a) does nothing
 for it.

I believe the current direction is to avoid needing new language
features / syntax. So the above probably won't happen.


 if some library defines its allocator (e.g. specialized container),
 there should be ability to:
 1. override allocator
 2. get access to the allocator used
 
 I understand that I spent 5 mins thinking about the way Allocators
 may look.
 My point is - if somebody is working on it, can you please share
 your ideas?

Well, thanks for getting the ball rolling. Maybe Andrei can pipe up
about any experimental designs he's currently considering.

But barring that, I'm thinking about how allocators would be used in
user code. I think it's pretty much a given that the C++ way of sticking
it to the end of template arguments doesn't really fly: it's just too
much of a hassle to keep having to worry about passing allocators around
template arguments, that people just don't bother. So coming back to
square one, how would allocators be used?

1) Usually, the user would just be content with the GC, and not ever
have to worry about allocators. So this means that whatever allocator
design we adopt, it should be practically invisible to ordinary users
unless they're specifically looking to change how memory is allocated.

2) Furthermore, it's unlikely that in the same piece of code, you'd want
to use 3 or 4 different allocators for different objects; while such
cases may exist, it seems to me to be more likely that you want either
(a) a very specific object (say a class instance or container) to use a
particular allocator, or (b) you want to transitively block off an
entire section of code (which may be the entire program in some cases)
to use a particular allocator.

As a first stab at it, I'd say (a) can be implemented by a static class
member reference to an allocator, that can be set from user code.

And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values and use
scope guards to revert them back to the values they were before. This
allows us to use the runtime stack to manage which allocator is
currently active. This lets *all* memory allocations be rerouted through
the custom allocator without needing to hand-edit every call to new down
the call graph.

This is just a very crude first stab at the problem, though. In
particular, (a) isn't very satisfactory. And also the interaction of
allocated objects with the call stack: if any custom-allocated objects
in (b) survive past the containing function which sets/resets the
function pointers, there could be problems: if a member function of such
an object needs to allocate memory, it will pick up the ambient
allocator instead of the custom allocator in effect when the object was
first created. Also, we may have the problem of the wrong allocator
being used to free the object.
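
To make (b) concrete, here is a minimal sketch under stated assumptions: the hook names (currentAlloc, currentFree, withAllocator) are invented for illustration, and druntime's real gc_* entry points are not replaceable this way. Module-level variables in D are thread-local, so each thread gets its own ambient allocator:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical hooks standing in for overridable gc_alloc / gc_free.
alias AllocFn = void* function(size_t);
alias FreeFn  = void function(void*);

void* defaultAlloc(size_t n) { return malloc(n); }
void  defaultFree(void* p)   { free(p); }

// Thread-local by default in D.
AllocFn currentAlloc = &defaultAlloc;
FreeFn  currentFree  = &defaultFree;

// A counting allocator standing in for a "custom" one.
size_t customAllocs;
void* countingAlloc(size_t n) { ++customAllocs; return malloc(n); }
void  countingFree(void* p)   { free(p); }

// Installs the given allocator, runs `work`, and restores the
// previous allocator even if `work` throws.
void withAllocator(AllocFn a, FreeFn f, scope void delegate() work)
{
    auto oldAlloc = currentAlloc;
    auto oldFree  = currentFree;
    currentAlloc = a;
    currentFree  = f;
    scope(exit) { currentAlloc = oldAlloc; currentFree = oldFree; }
    work();
}

// Somewhere deep in the call graph: no allocator parameter needed.
void leafFunction()
{
    auto p = currentAlloc(64);
    currentFree(p);
}
```

The sketch has exactly the weakness described above: memory obtained inside withAllocator and freed after it returns would go to the wrong free function.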

Does anyone have better ideas?


T

-- 
All problems are easy in retrospect.


Re: why allocators are not discussed here

2013-06-25 Thread Adam D. Ruppe

On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:

(introducing a new keyword allocator)


It would be easier to just pass an allocator object that provides 
the necessary methods and don't use new at all. (I kinda wish new 
wasn't in the language. It'd make this a little more consistent.)


The allocator's create function could also return wrapped types, 
like RefCounted!T or NotNull!T depending on what it does.


Though the devil is in the details here and I don't think I can 
say more without trying to actually do it.


Re: why allocators are not discussed here

2013-06-25 Thread H. S. Teoh
On Wed, Jun 26, 2013 at 12:50:36AM +0200, Adam D. Ruppe wrote:
 On Tuesday, 25 June 2013 at 22:22:09 UTC, cybervadim wrote:
 (introducing a new keyword allocator)
 
 It would be easier to just pass an allocator object that provides
 the necessary methods and don't use new at all. (I kinda wish new
 wasn't in the language. It'd make this a little more consistent.)

It's not too late to introduce a default allocator object that maps to
built-in GC primitives. Maybe something like:

struct DefaultAllocator
{
    T* alloc(T, A...)(A args) {
        return new T(args);
    }
    void free(T)(T* ptr) {
        // no-op: the GC reclaims the memory
    }
}

We can then change Phobos to always use allocator.alloc and
allocator.free, which it gets from user code somehow, and in the default
case it would do the Right Thing.


 The allocator's create function could also return wrapped types,
 like RefCounted!T or NotNull!T depending on what it does.

So maybe something like:

struct RefCountedAllocator
{
    RefCounted!T alloc(T, A...)(A args) {
        return allocRefCounted(args);
    }
    void free(T)(RefCounted!T rc) {
        dotDotDotMagic(rc);
    }
}

etc..


 Though the devil is in the details here and I don't think I can say
 more without trying to actually do it.

The main issue I see is how *not* to get stuck in C++'s situation where
you have to specify allocator objects everywhere, which is highly
inconvenient and liable for people to avoid using, which defeats the
purpose of having allocators. It would be nice, IMO, if we can somehow
let the user specify a custom allocator for, say, the whole of Phobos,
so that people who care about this sorta thing can just replace the GC
wholesale and then use Phobos to their hearts' content without having to
manually specify allocator objects everywhere and risk forgetting a
single case that eventually leads to memory leakage.
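
One possible shape for "gets it from user code somehow", sketched with a defaulted template parameter so ordinary callers never mention an allocator. Point, GCAllocator, MallocAllocator and makePoint are all illustrative names, not existing Phobos symbols:

```d
import core.stdc.stdlib : malloc, cfree = free;
import std.conv : emplace;

struct Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

// Default: forwards to the built-in GC.
struct GCAllocator
{
    T* alloc(T, A...)(A args) { return new T(args); }
    void free(T)(T* ptr) { /* no-op: the GC reclaims it */ }
}

// Opt-in: manual memory through the C heap.
struct MallocAllocator
{
    T* alloc(T, A...)(A args)
    {
        auto p = cast(T*) malloc(T.sizeof);
        assert(p !is null, "out of memory");
        return emplace(p, args); // construct in place
    }
    void free(T)(T* ptr)
    {
        destroy(*ptr);
        cfree(ptr);
    }
}

// Hypothetical library routine: the allocator is a defaulted
// parameter, so makePoint(1, 2) silently uses the GC.
Point* makePoint(Alloc = GCAllocator)(int x, int y, Alloc a = Alloc.init)
{
    return a.alloc!Point(x, y);
}
```

Usage would be makePoint(1, 2) for the default and makePoint(3, 4, MallocAllocator()) for the opt-in case. This still passes the allocator per call, so it does not solve the "whole of Phobos" wish, only the invisibility of the default.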


T

-- 
Computers shouldn't beep through the keyhole.


Re: why allocators are not discussed here

2013-06-25 Thread cybervadim

On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:

On Wed, Jun 26, 2013 at 12:22:04AM +0200, cybervadim wrote:

That would be nice to get things going. :)

Ever since I found D and subscribed to this mailing list, I've been
hearing rumors of allocators, but they seem to be rather lacking in the
department of concrete evidence. They're like the Big Foot or Swamp Ape
of D. Maybe it's time we got out into the field and produced some real
evidence of these mythical beasts. :-P

Well, thanks for getting the ball rolling. Maybe Andrei can pipe up
about any experimental designs he's currently considering.

But barring that, I'm thinking about how allocators would be used in
user code. I think it's pretty much a given that the C++ way of sticking
it to the end of template arguments doesn't really fly: it's just too
much of a hassle to keep having to worry about passing allocators around
template arguments, that people just don't bother. So coming back to
square one, how would allocators be used?

1) Usually, the user would just be content with the GC, and not ever
have to worry about allocators. So this means that whatever allocator
design we adopt, it should be practically invisible to ordinary users
unless they're specifically looking to change how memory is allocated.

2) Furthermore, it's unlikely that in the same piece of code, you'd want
to use 3 or 4 different allocators for different objects; while such
cases may exist, it seems to me to be more likely that you want either
(a) a very specific object (say a class instance or container) to use a
particular allocator, or (b) you want to transitively block off an
entire section of code (which may be the entire program in some cases)
to use a particular allocator.

As a first stab at it, I'd say (a) can be implemented by a static class
member reference to an allocator, that can be set from user code.

And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their values and use
scope guards to revert them back to the values they were before. This
allows us to use the runtime stack to manage which allocator is
currently active. This lets *all* memory allocations be rerouted through
the custom allocator without needing to hand-edit every call to new down
the call graph.

This is just a very crude first stab at the problem, though. In
particular, (a) isn't very satisfactory. And also the interaction of
allocated objects with the call stack: if any custom-allocated objects
in (b) survive past the containing function which sets/resets the
function pointers, there could be problems: if a member function of such
an object needs to allocate memory, it will pick up the ambient
allocator instead of the custom allocator in effect when the object was
first created. Also, we may have the problem of the wrong allocator
being used to free the object.

Does anyone have better ideas?


T


From my experience all objects may be divided into 2 categories:
1. temporaries. Programs usually have some kind of event loop. 
During one iteration of this loop some temporary objects are 
created and then discarded. This is the ideal case for a stack (or ranged 
or area) allocator, where you define the allocator at the beginning 
of the loop cycle, use it for all temporaries, then free all the 
memory in one go at the end of the iteration.
2. containers. The program receives an event from the outside and 
puts some data into a container OR updates the data if the record 
already exists.
The important thing here is - when updating the data in a 
container, you may want to resize the existing area.


If you are working with a temporary which should be placed into a 
container, a copy can be made (with the corresponding memory 
allocation from the container's allocator).


Not sure if there is anything better than stack/area allocator 
for the first class. For the second class user should be able to 
choose default GC or more precise memory handling (e.g. explicit 
malloc/free for resizing).


Anything I am missing in this categorization?

So even if we get allocators that only let us deal with temporaries, 
that will be a huge benefit.
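
For the temporaries case, a minimal sketch of what such an allocator could look like, assuming a simple bump-pointer design over a fixed block. Region, allocate, reset and runLoop are hypothetical names, and the sizes are arbitrary:

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical bump-pointer region over a caller-supplied block.
struct Region
{
    private ubyte[] buffer;
    private size_t used;

    this(ubyte[] backing) { buffer = backing; }

    // Returns null when the region is exhausted.
    void[] allocate(size_t n)
    {
        enum size_t alignment = 16;
        // round the current offset up to the next multiple of 16
        auto start = (used + alignment - 1) & ~(alignment - 1);
        if (start > buffer.length || n > buffer.length - start)
            return null;
        used = start + n;
        return buffer[start .. used];
    }

    // Frees every temporary in one go.
    void reset() { used = 0; }

    size_t bytesUsed() const { return used; }
}

// Simulated event loop: returns the peak number of bytes in use.
size_t runLoop(size_t iterations)
{
    auto storage = cast(ubyte*) malloc(1024);
    assert(storage !is null, "out of memory");
    scope(exit) free(storage);

    auto region = Region(storage[0 .. 1024]);
    size_t peak = 0;
    foreach (i; 0 .. iterations)
    {
        scope(exit) region.reset(); // end of iteration: drop everything
        auto a = cast(int[]) region.allocate(4 * int.sizeof);
        auto b = cast(char[]) region.allocate(100);
        a[] = cast(int) i;
        b[] = 'x';
        if (region.bytesUsed > peak)
            peak = region.bytesUsed;
    }
    return peak;
}
```

Because the region is reset every iteration, memory use stays flat no matter how many events are processed; anything that must outlive the iteration has to be copied out first.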


Re: why allocators are not discussed here

2013-06-25 Thread Adam D. Ruppe

On Tuesday, 25 June 2013 at 22:50:55 UTC, H. S. Teoh wrote:

And maybe (b) can be implemented by making gc_alloc / gc_free
overridable function pointers? Then we can override their 
values and use scope guards to revert them back to the values 
they were before.


Yea, I was thinking this might be a way to go. You'd have a 
global (well, thread-local) allocator instance that can be set 
and reset through stack calls.


You'd want it to be RAII or delegate based, so the scope is clear.

with_allocator(my_alloc, {
 do whatever here
});


or

{
   ChangeAllocator!my_alloc dummy;

   do whatever here
} // dummy's destructor ends the allocator scope


I think the former is a bit nicer, since the dummy variable is a 
bit silly. We'd hope that delegate can be inlined.




But, the template still has a big advantage: you can change the 
type. And I think that is potentially enormously useful.




Another question is how to tie into output ranges. Take 
std.conv.to.


auto s = to!string(10); // currently, this hits the gc

What if I want it to go on a stack buffer? One option would be to 
rewrite it to use an output range, and then call it like:


char[20] buffer;
// it returns the slice of the buffer it actually used
auto s = to!string(10, buffer);


(and we can do overloads so to!string(10, radix) still works, as 
well as to!string(10, radix, buffer). Hassle, I know...)


Naturally, the default argument is to use the 'global' allocator, 
whatever that is, which does nothing special.




The fun part is the output range works for that, and could also 
work for something like this:


struct malloced_string {
    char* ptr;
    size_t length;
    size_t capacity;
    void put(char c) {
        if(length >= capacity) {
            // grow geometrically; start at 16 so an empty string can grow
            capacity = capacity ? capacity * 2 : 16;
            ptr = cast(char*) realloc(ptr, capacity);
        }
        ptr[length++] = c;
    }

    char[] slice() { return ptr[0 .. length]; }
    alias slice this;
    mixin RefCounted!this; // pretend this works
}


{
   malloced_string str;
   auto got = to!string(10, str);
} // str is out of scope, so it gets free()'d. unsafe though: if
  // you stored a copy of got somewhere, it is now a pointer to freed
  // memory. I'd kinda like language support of some sort to help
  // mitigate that though, like being a borrowed pointer that isn't
  // allowed to be stored, but that's another discussion.



And that should work. So then what we might do is provide these 
little output range wrappers for various allocators, and use them 
on many functions.


So we'd write:

import std.allocators;
import std.range;

// mallocator is provided in std.allocators and offers the goods
OutputRange!(char, mallocator) str;

auto got = to!string(10, str);



What's nice here is the output range is useful for more than just 
allocators. You could also to!string(10, my_file) or a delegate, 
blah blah blah. So it isn't too much of a burden, it is something 
you might naturally use anyway.
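
A small sketch of the buffer variant, with the caveat that it does not touch the real std.conv.to: writeDecimal is a stand-in for an allocator-aware to!string, and BufferSink and toStringInto are made-up names. The conversion only talks to a sink with put(char), so where the characters end up is the caller's choice:

```d
// Hypothetical fixed-capacity sink over caller-owned memory.
struct BufferSink
{
    char[] buffer;
    size_t length;

    void put(char c)
    {
        assert(length < buffer.length, "buffer too small");
        buffer[length++] = c;
    }

    char[] slice() { return buffer[0 .. length]; }
}

// Stand-in for to!string: writes decimal digits to any sink that
// has put(char) and never allocates on its own.
void writeDecimal(Sink)(ref Sink sink, uint value)
{
    char[10] digits;            // uint.max has 10 decimal digits
    size_t n = 0;
    do
    {
        digits[n++] = cast(char)('0' + value % 10);
        value /= 10;
    } while (value != 0);
    while (n > 0)               // digits were produced in reverse
        sink.put(digits[--n]);
}

// Returns the slice of `buffer` that was actually used.
char[] toStringInto(uint value, char[] buffer)
{
    auto sink = BufferSink(buffer);
    writeDecimal(sink, value);
    return sink.slice();
}
```

With a char[20] on the stack, toStringInto(10, buffer[]) yields a slice of that buffer and the GC is never involved; swapping BufferSink for a malloc-backed or file-backed sink would need no change to writeDecimal.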



Also, we may have the problem of the wrong allocator
being used to free the object.


Another reason why encoding the allocator into the type is so 
nice. For the minimal D I've been playing with, the idea I'm 
running with is all allocated memory has some kind of special 
type, and then naked pointers are always assumed to be borrowed, 
so you should never store or free them.


auto foo = HeapArray!char(capacity);

void bar(char[] lol){}

bar(foo); // allowed, foo has an alias this on slice

// but

struct A {
   char[] lol; // not allowed, because you don't know when lol is
               // going to be freed
}


foo frees itself with refcounting.


Re: why allocators are not discussed here

2013-06-25 Thread bearophile

cybervadim:


From my experience all objects may be divided into 2 categories:
1. temporaries. Programs usually have some kind of event loop. 
During one iteration of this loop some temporary objects are 
created and then discarded. This is the ideal case for a stack (or ranged 
or area) allocator, where you define the allocator at the beginning 
of the loop cycle, use it for all temporaries, then free all 
the memory in one go at the end of the iteration.
2. containers. The program receives an event from the outside and 
puts some data into a container OR updates the data if the record 
already exists.
The important thing here is - when updating the data in a 
container, you may want to resize the existing area.


Many garbage collectors use the same idea (and manage it 
automatically), with two or three different generations:


http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29

Bye,
bearophile


Re: why allocators are not discussed here

2013-06-25 Thread cybervadim
Many garbage collectors use the same idea (and manage it 
automatically), with two or three different generations:


http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Generational_GC_.28ephemeral_GC.29

Bye,
bearophile


The problem with GC is that it doesn't know which is temporary 
and which is not, so it has to traverse tree to determine that. 
Allocators in my opinion should let user specify explicitly the 
temporaries.


Re: why allocators are not discussed here

2013-06-25 Thread Adam D. Ruppe
I was just quickly skimming some criticism of C++ allocators, 
since my thought here is similar to what they do. On one hand, 
maybe D can do it right by tweaking C++'s design rather than 
discarding it.


On the other hand, with all the C++ I've done, I have never 
actually used STL allocators, which could say something about me 
or could say something about them.



One thing I saw said making the differently allocated object a 
different type sucks. ...but must it? The complaint there was so 
much for just doing a function that takes a std::string. But, 
the way I'd want to do it in D is the function would take a 
char[] instead, and our special allocated type provides that via 
opSlice and/or alias this.


So you'd only have to worry about the different type if you 
intend to take ownership of the container yourself. Which we 
already kinda think about in D: if you store a char[], someone 
else could overwrite it, so we prefer to store an 
immutable(char)[] aka string. If you're given a char[] and want 
to store it, you might idup. So I don't think doing a private 
copy with some other allocation scheme is any more of a hassle.


(BTW immutable objects IMO should *always* be garbage collected, 
because part of immutability is infinite lifetime. So we might 
want to be careful with implicit conversions to immutable based 
on allocation method, which I believe we can protect through 
member functions.)



Anyway, bottom line is I don't think that criticism necessarily 
applies to D. But there's surely many others and I'm more or less 
a n00b re c++'s allocators so idk yet.