Re: LRU cache for ~=
On 2009-10-20 03:41:59 +0200, Walter Bright newshou...@digitalmars.com said: Denis Koroskin wrote: Safe as in SafeD (i.e. no memory corruption) :) Right. The problems with other definitions of safe is they are too ill-defined. "No (or as little as possible) undefined behaviour" comes to mind. Initialization of values in D follows that. Slice appending does not (and will remain so also with the LRU cache). The LRU cache improves the current status quo, but is still a kludge: undefined behaviour in some corner cases, undefined optimization behaviour... It improves things, but cannot be trusted in general, thus a clean library type is still needed imho. Fawzi
Re: LRU cache for ~=
Andrei Alexandrescu wrote: Rainer Deyke wrote: Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. What's yours? SafeD is easy to learn and it keeps the programmers away from undefined behaviors. -- safed.html. The behaviour you quoted is undefined behaviour, therefore it's not safe according to the only SafeD definition in the spec.
Re: LRU cache for ~=
On Mon, 19 Oct 2009 14:51:32 -0400, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? This is a very good idea. Incidentally, you only need the upper bound location, the beginning location is irrelevant, since you don't grow down. What do you do in the case where the memory was recycled? Does a GC collection cycle clean out the cache as well? This is better than my two previous ideas. The only drawback I see is if you have many threads doing appending, or you are appending more than 8 arrays at once in a round-robin fashion, you would lose all the benefit (although it shouldn't affect correctness). At that point however, you'd have to ask yourself why you aren't using a specialized appender type or function. In response to other's queries about how many LRUs to use, you'd probably want one per heap, and you'd want to lock/not lock based on whether the heap is thread local or not. -Steve
Re: LRU cache for ~=
On Mon, 19 Oct 2009 22:37:26 -0400, dsimcha dsim...@yahoo.com wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: Started playing w/ the implementation a little and I see a problem. What about the garbage collector? There are two possibilities: [snip] The only possible solutions I see would be to have the GC know everything about the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA to implement, couples things that shouldn't be tightly coupled), or clear the cache every time GC is run (probably would make appending so slow as to defeat the purpose of having the cache). I think GC.collect may simply evict the entire cache. The collection cycle costs so much, the marginal cost of losing cached information is lost in the noise. Andrei But then you have to copy the whole array again, likely triggering another GC if the array is large. Then things really get ugly as, for all practical purposes, you've completely done away with the cache. This happens whether or not a cache is in use. Andrei But the array isn't guaranteed to get reallocated immediately after *every* GC run. If you're appending to a huge array, the GC will likely run several times while you're appending, leading to several unnecessary reallocations. I don't think I understand this. 1. Request for an append comes that runs out of memory 2. GC runs and clears memory 3. Array is reallocated and the capacity cached. No? This is entirely correct. Each of those unnecessary reallocations will increase the memory footprint of your program, possibly triggering another GC run and wiping out your cache again in short order, until, for sufficiently large arrays, a ~= b; is almost equivalent to a = a ~ b; I don't understand how the cache makes that all worse. Andrei The cache doesn't make anything *worse* than with no cache. The only point I'm trying to make is that, for large arrays, if the GC clears the cache every time it runs, things would start to get *almost as bad as* having no cache because the copy operations become expensive and the GC may run frequently. The cache can't be cleared every time, or else you might as well only keep one LRU entry: int[] twos, threes; for(int i = 1; i < 1_000_000; i++) { twos ~= i * 2; threes ~= i * 3; } At some point, twos or threes needs an allocation triggering a collection, and that clears the cache, making the other array need an allocation, clearing the cache, etc. I'd think you only want to clear the entries affected by the collection. -Steve
Re: LRU cache for ~=
Steven Schveighoffer wrote: On Mon, 19 Oct 2009 14:51:32 -0400, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? This is a very good idea. Incidentally, you only need the upper bound location, the beginning location is irrelevant, since you don't grow down. Awesome, didn't think of that. So now more cases are caught: auto a = new int[100]; a ~= 42; a = a[50 .. $]; a ~= 52; That wouldn't have worked with my original suggestion, but it does work safely with yours. What do you do in the case where the memory was recycled? Does a GC collection cycle clean out the cache as well? As you saw, there was some discussion about that as well. This is better than my two previous ideas. The only drawback I see is if you have many threads doing appending, or you are appending more than 8 arrays at once in a round-robin fashion, you would lose all the benefit (although it shouldn't affect correctness). At that point however, you'd have to ask yourself why you aren't using a specialized appender type or function. Yah. As I suspect a lot of code is actually doing round-robin naturally, I'm considering using a random eviction strategy. That way performance will degrade smoother. A more advanced algorithm would use introspection to choose dynamically between LRU and random. Andrei
Re: LRU cache for ~=
On Tue, 20 Oct 2009 10:14:52 -0400, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: Steven Schveighoffer wrote: On Mon, 19 Oct 2009 14:51:32 -0400, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? This is a very good idea. Incidentally, you only need the upper bound location, the beginning location is irrelevant, since you don't grow down. Awesome, didn't think of that. So now more cases are caught: auto a = new int[100]; a ~= 42; a = a[50 .. $]; a ~= 52; That wouldn't have worked with my original suggestion, but it does work safely with yours. What do you do in the case where the memory was recycled? Does a GC collection cycle clean out the cache as well? As you saw, there was some discussion about that as well. This is better than my two previous ideas. The only drawback I see is if you have many threads doing appending, or you are appending more than 8 arrays at once in a round-robin fashion, you would lose all the benefit (although it shouldn't affect correctness). At that point however, you'd have to ask yourself why you aren't using a specialized appender type or function. Yah. As I suspect a lot of code is actually doing round-robin naturally, I'm considering using a random eviction strategy. That way performance will degrade smoother. A more advanced algorithm would use introspection to choose dynamically between LRU and random. Andrei So you want to synchronize the ~= function? I thought the LRU would be thread local and therefore independent of these issues, as well as being faster. And if the LRU isn't thread-local, then why not make it part of the GC? It would both be more general and much simpler/cleaner to implement.
Re: LRU cache for ~=
On Tue, 20 Oct 2009 10:05:42 -0400, Steven Schveighoffer schvei...@yahoo.com wrote: On Mon, 19 Oct 2009 22:37:26 -0400, dsimcha dsim...@yahoo.com wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: Started playing w/ the implementation a little and I see a problem. What about the garbage collector? There are two possibilities: [snip] The only possible solutions I see would be to have the GC know everything about the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA to implement, couples things that shouldn't be tightly coupled), or clear the cache every time GC is run (probably would make appending so slow as to defeat the purpose of having the cache). I think GC.collect may simply evict the entire cache. The collection cycle costs so much, the marginal cost of losing cached information is lost in the noise. Andrei But then you have to copy the whole array again, likely triggering another GC if the array is large. Then things really get ugly as, for all practical purposes, you've completely done away with the cache. This happens whether or not a cache is in use. Andrei But the array isn't guaranteed to get reallocated immediately after *every* GC run. If you're appending to a huge array, the GC will likely run several times while you're appending, leading to several unnecessary reallocations. I don't think I understand this. 1. Request for an append comes that runs out of memory 2. GC runs and clears memory 3. Array is reallocated and the capacity cached. No? This is entirely correct. Each of those unnecessary reallocations will increase the memory footprint of your program, possibly triggering another GC run and wiping out your cache again in short order, until, for sufficiently large arrays, a ~= b; is almost equivalent to a = a ~ b; I don't understand how the cache makes that all worse. Andrei The cache doesn't make anything *worse* than with no cache. The only point I'm trying to make is that, for large arrays, if the GC clears the cache every time it runs, things would start to get *almost as bad as* having no cache because the copy operations become expensive and the GC may run frequently. The cache can't be cleared every time, or else you might as well only keep one LRU entry: int[] twos, threes; for(int i = 1; i 1; i++) { twos ~= i * 2; threes ~= i * 3; } At some point, twos or threes needs an allocation triggering a collection, and that clears the cache, making the other array need an allocation, clearing the cache, etc. I'd think you only want to clear the entries affected by the collection. -Steve If it was free and simple to only clear the affected entries, sure. But doing so requires (very heavy?) modification of the GC in order to track and check changes. It also reduces collection performance. I think, that if GC allocations added entries to the LRU, and therefore the information in the LRU is never stale, you could avoid clearing the LRU. But this requires the LRU to be part of the GC.
Re: LRU cache for ~=
On Tue, 20 Oct 2009 10:37:40 -0400, Robert Jacques sandf...@jhu.edu wrote: So you want to synchronize the ~= function? I thought the LRU would be thread local and therefore independent of these issues, as well as being faster. And if the LRU isn't thread-local, then why not make it part of the GC? It would both be more general and much simpler/cleaner to implement. quoting myself earlier: On Tue, 20 Oct 2009 09:58:01 -0400, Steven Schveighoffer schvei...@yahoo.com wrote: In response to other's queries about how many LRUs to use, you'd probably want one per heap, and you'd want to lock/not lock based on whether the heap is thread local or not. You need a locked operation in the case where the heap is shared, otherwise, you lose safety. At the moment all we *have* is a shared heap. So ~= is a synchronized operation until thread-local heaps are available. I think the only logical place for the LRU is the GC, it makes no sense to have a a shared LRU for an unshared GC or vice versa. -Steve
Re: LRU cache for ~=
On Tue, 20 Oct 2009 10:14:52 -0400, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: Steven Schveighoffer wrote: This is a very good idea. Incidentally, you only need the upper bound location, the beginning location is irrelevant, since you don't grow down. Awesome, didn't think of that. So now more cases are caught: auto a = new int[100]; a ~= 42; a = a[50 .. $]; a ~= 52; That wouldn't have worked with my original suggestion, but it does work safely with yours. It was one of the coolest parts of my original proposal :) http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=63146 But using a cache solves a lot of the problems I didn't. What do you do in the case where the memory was recycled? Does a GC collection cycle clean out the cache as well? As you saw, there was some discussion about that as well. Yeah, I'm reading in thread order :) Still got 91 unread messages, so maybe I'll read all before replying again... -Steve
Re: LRU cache for ~=
On Tue, 20 Oct 2009 10:48:31 -0400, Robert Jacques sandf...@jhu.edu wrote: On Tue, 20 Oct 2009 10:05:42 -0400, Steven Schveighoffer schvei...@yahoo.com wrote: I'd think you only want to clear the entries affected by the collection. If it was free and simple to only clear the affected entries, sure. But doing so requires (very heavy?) modification of the GC in order to track and check changes. Why? All you have to do is check whether a block is referenced in the LRU while freeing the block. I don't even think it would be that performance critical. Using my vastly novice assumptions about how the GC collection cycle works: step 1, mark all blocks that are not referenced by any roots. step 2, check which blocks are referenced by the LRU, if they are, then remove them from the LRU. step 3, recycle free blocks. But this requires the LRU to be part of the GC. I think we're already in that boat. If the LRU isn't attached to the GC, then ~= becomes a locking operation even if the GC is thread-local, which makes no sense. -Steve
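A rough sketch of that eviction step, assuming the cache sits next to the collector; CacheEntry and blockIsUnmarked are illustrative names only, not the actual GC internals:

struct CacheEntry { void* blockStart; size_t capacity; bool valid; }

// Between marking and recycling, drop any cached entry whose block did not
// survive the mark phase; the surviving entries stay valid across the collection.
void evictDeadEntries(ref CacheEntry[8] cache, bool delegate(void*) blockIsUnmarked)
{
    foreach (ref e; cache)
        if (e.valid && blockIsUnmarked(e.blockStart))
            e.valid = false;
}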
Re: LRU cache for ~=
On Tue, 20 Oct 2009 11:24:21 -0400, Steven Schveighoffer schvei...@yahoo.com wrote: On Tue, 20 Oct 2009 10:48:31 -0400, Robert Jacques sandf...@jhu.edu wrote: On Tue, 20 Oct 2009 10:05:42 -0400, Steven Schveighoffer schvei...@yahoo.com wrote: I'd think you only want to clear the entries affected by the collection. If it was free and simple to only clear the affected entries, sure. But doing so requires (very heavy?) modification of the GC in order to track and check changes. Why? All you have to do is check whether a block is referenced in the LRU while freeing the block. I don't even think it would be that performance critical. Using my vastly novice assumptions about how the GC collection cycle works: step 1, mark all blocks that are not referenced by any roots. step 2, check which blocks are referenced by the LRU, if they are, then remove them from the LRU. step 3, recycle free blocks. I agree, but my mind hadn't gotten there yet. (It was thinking of the overhead of generational/concurrent collections, for some strange reason) But this requires the LRU to be part of the GC. I think we're already in that boat. If the LRU isn't attached to the GC, then ~= becomes a locking operation even if the GC is thread-local, which makes no sense. -Steve Of course, Andrei just stated the cache should be thread-local (and probably in the function, not the GC) which throws a spanner into the works.
Re: LRU cache for ~=
Brad Roberts wrote: On Mon, 19 Oct 2009, Walter Bright wrote: Denis Koroskin wrote: Safe as in SafeD (i.e. no memory corruption) :) Right. The problems with other definitions of safe is they are too ill-defined. There's SafeD, which has a fairly formal definition. But a fairly generic name, which confuses people repeatedly. I'm not the first to recommend that the name be changed. It does more harm than good.
LRU cache for ~=
I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? Andrei
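A minimal sketch of what this might look like, assuming a thread-local fixed-size cache keyed on both bounds as described above; CacheEntry, cachedAppend and the round-robin eviction are illustrative only, not the actual druntime code:

import core.memory : GC;

struct CacheEntry { void* ptr; size_t length; size_t capacity; }

CacheEntry[8] cache;   // a handful of recently appended slices (thread-local by default)
size_t evictIndex;     // naive eviction cursor

void cachedAppend(T)(ref T[] arr, T value)
{
    foreach (ref e; cache)
    {
        // Both bounds must match the cached slice for its capacity to be trusted.
        if (e.ptr is cast(void*) arr.ptr && e.length == arr.length)
        {
            if (arr.length < e.capacity)          // room left: grow in place, no GC call
            {
                arr = arr.ptr[0 .. arr.length + 1];
                arr[$ - 1] = value;
                e.length = arr.length;            // keep the cached bounds current
                return;
            }
            break;                                // block is full; fall through
        }
    }
    // Cache miss or full block: reallocate conservatively, then cache the new slice.
    arr = arr ~ value;
    cache[evictIndex] = CacheEntry(cast(void*) arr.ptr, arr.length,
                                   GC.sizeOf(cast(void*) arr.ptr) / T.sizeof);
    evictIndex = (evictIndex + 1) % cache.length;
}

The point of the hit path is that it touches no global state and needs no trip to the GC; everything else falls back to a conservative reallocation, which is the safe default.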
Re: LRU cache for ~=
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? Andrei 1. I assume the LRU cache would be thread-local, so no lock would be necessary? 2. I don't understand how this solves the safety problem: // foo lives on the heap b/c we've idup'd it. string foo = "This is only a test.".idup; string bar = foo[0..4]; bar ~= " is _not "; writeln(foo); // prints "This is _not a test." Having access to the capacity in an LRU cache doesn't help if I understand it correctly. 3. I'm pretty familiar with these parts of druntime and would probably be willing to do the implementation after I understand a few details of it a little better.
Re: LRU cache for ~=
dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? Andrei 1. I assume the LRU cache would be thread-local, so no lock would be necessary? Yes. 2. I don't understand how this solves the safety problem: It's rather subtle. I'd figured it out in the morning and forgot it by the time I explained to Walter. // foo lives on the heap b/c we've idup'd it. string foo = This is only a test..idup; string bar = foo[0..4]; bar ~= is _not ; writeln(foo); // prints This is _not a test. This is actually one of the less subtle cases. Upon the call to ~=, bar is not in the LRU cache so ~= conservatively reallocates it. As a rule of thumb (which takes care of the subtler issues): if a growing slice is not found in the LRU cache, it will always be conservatively reallocated. This is exactly because you don't know whether that slice was reduced from a larger slice. Having access to the capacity in an LRU cache doesn't help if I understand it correctly. 3. I'm pretty familiar with these parts of druntime and would probably be willing to do the implementation after I understand a few details of it a little better. That would be awesome!!! Andrei
Re: LRU cache for ~=
Andrei Alexandrescu, el 19 de octubre a las 13:51 me escribiste: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? I think this is moving in the wrong direction. It's a patch, not a solution. And this makes dynamic arrays more coupled with the way the GC works. I think the GC shouldn't even provide a sizeOf(void*), we should move in that direction, removing restrictions to the GC instead of enforcing the current ones; at lest if the goal is to eventually have a better GC (a copying GC doesn't even have to store the size of the cell, for example). I hope you eventually realize that you are complicating things much more than necessary. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ -- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) -- ACCIDENTE FATAL EN FLORES, MUEREN DOS PERSONAS Y UN BOLIVIANO -- Crónica TV
Re: LRU cache for ~=
Leandro Lucarella wrote: Andrei Alexandrescu, el 19 de octubre a las 13:51 me escribiste: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? I think this is moving in the wrong direction. It's a patch, not a solution. And this makes dynamic arrays more coupled with the way the GC works. I think the GC shouldn't even provide a sizeOf(void*), we should move in that direction, removing restrictions to the GC instead of enforcing the current ones; at lest if the goal is to eventually have a better GC (a copying GC doesn't even have to store the size of the cell, for example). The LRU doesn't need GC.sizeOf. It could always conservatively reallocate whenever in doubt. I hope you eventually realize that you are complicating things much more than necessary. I actually did a couple of days ago :o). Andrei
Re: LRU cache for ~=
dsimcha wrote: 2. I don't understand how this solves the safety problem: // foo lives on the heap b/c we've idup'd it. string foo = "This is only a test.".idup; string bar = foo[0..4]; bar ~= " is _not "; writeln(foo); // prints "This is _not a test." Having access to the capacity in an LRU cache doesn't help if I understand it correctly. Let me stress a point harder that I think I expressed poorly: The LRU cache stores the capacity of a given slice given _BOTH_ the slice's left and right bounds. If you later come with a slice that has only one correct bound, the LRU doesn't care about it. That's the safety tidbit. Andrei
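A short trace of that example under the proposed both-bounds cache (this is the behavior being described here, not what the then-current runtime does):

string foo = "This is only a test.".idup; // heap copy; its block has spare capacity
string bar = foo[0 .. 4];                 // bar is "This"; its bounds are (foo.ptr, foo.ptr + 4)
bar ~= " is _not ";                       // no cache entry matches both bounds -> conservative reallocation
// foo still reads "This is only a test."; bar is now "This is _not " in a fresh block.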
Re: LRU cache for ~=
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: 2. I don't understand how this solves the safety problem: // foo lives on the heap b/c we've idup'd it. string foo = This is only a test..idup; string bar = foo[0..4]; bar ~= is _not ; writeln(foo); // prints This is _not a test. Having access to the capacity in an LRU cache doesn't help if I understand it correctly. Let me stress a point harder that I think I expressed poorly: The LRU cache stores the capacity of a given slice given _BOTH_ the slice's left and right bounds. If you later come with a slice that has only one correct bound, the LRU doesn't care about it. That's the safety tidbit. Andrei I think I get it now, although it's very conservative. One more question: Is this going to take the place of ArrayBuilder or be inaddition? The LRU is a good hack to preserve syntactic elegance and ease of use, but it's somewhat of a kludge nonetheless and I'd ideally still like to see a real ArrayBuilder with full array-like semantics if T[new] is definitely out.
Re: LRU cache for ~=
dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: 2. I don't understand how this solves the safety problem: // foo lives on the heap b/c we've idup'd it. string foo = This is only a test..idup; string bar = foo[0..4]; bar ~= is _not ; writeln(foo); // prints This is _not a test. Having access to the capacity in an LRU cache doesn't help if I understand it correctly. Let me stress a point harder that I think I expressed poorly: The LRU cache stores the capacity of a given slice given _BOTH_ the slice's left and right bounds. If you later come with a slice that has only one correct bound, the LRU doesn't care about it. That's the safety tidbit. Andrei I think I get it now, although it's very conservative. That's why we need experimentation. My gut tells me that linear search will scram through 8 elements real fast, and if there's something in the cache it will almost always be in a top position. Also, I think the LRU will solve all cases that matter. One more question: Is this going to take the place of ArrayBuilder or be inaddition? The LRU is a good hack to preserve syntactic elegance and ease of use, but it's somewhat of a kludge nonetheless and I'd ideally still like to see a real ArrayBuilder with full array-like semantics if T[new] is definitely out. Ideally we'd be able to render ArrayBuilder/Appender unnecessary. I think there is a need for a UniqueArray, but that's only loosely related. Andrei
Re: LRU cache for ~=
Andrei Alexandrescu wrote: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. So whenever code calls arr ~= b, the LRU is searched first. If the system finds arr (both bounds) in the LRU, that means the cached capacity is correct and can solve the matter without an actual trip to the GC at all! Otherwise, do the deed and cache the new slice and the new capacity. This also solves the lack of safety: if you request a growth on an array you just grew, it's impossible to have a valid slice beyond that array. This LRU would allow us to keep the slice API as it currently is, and also at excellent efficiency. What do you think? Andrei Its an interesting idea, and if I have time tonight I'll take a crack at it. For those others who may, the function you care about is _d_arrayappendcT in compiler/dmd/lifetime. -- Chris Nicholson-Sauls
Re: LRU cache for ~=
On 2009-10-19 21:53:53 +0200, Andrei Alexandrescu seewebsiteforem...@erdani.org said: dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: 2. I don't understand how this solves the safety problem: // foo lives on the heap b/c we've idup'd it. string foo = This is only a test..idup; string bar = foo[0..4]; bar ~= is _not ; writeln(foo); // prints This is _not a test. Having access to the capacity in an LRU cache doesn't help if I understand it correctly. Let me stress a point harder that I think I expressed poorly: The LRU cache stores the capacity of a given slice given _BOTH_ the slice's left and right bounds. If you later come with a slice that has only one correct bound, the LRU doesn't care about it. That's the safety tidbit. Andrei I think I get it now, although it's very conservative. That's why we need experimentation. My gut tells me that linear search will scram through 8 elements real fast, and if there's something in the cache it will almost always be in a top position. Also, I think the LRU will solve all cases that matter. yes you are probably right One more question: Is this going to take the place of ArrayBuilder or be inaddition? The LRU is a good hack to preserve syntactic elegance and ease of use, but it's somewhat of a kludge nonetheless and I'd ideally still like to see a real ArrayBuilder with full array-like semantics if T[new] is definitely out. Ideally we'd be able to render ArrayBuilder/Appender unnecessary. I think there is a need for a UniqueArray, but that's only loosely related. a library array knowing its capacity is still useful I think. Fawzi
Re: LRU cache for ~=
Andrei Alexandrescu Wrote: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. Shouldn't it be MRU (Most Recently Used)? I assume ~= will be disallowed for shared arrays? Will all arrays overallocate when created in order to allow efficient appending? I feel like bolting on ~= to slices gives mediocre performance and a lot of gotchas. Some users may expect slices to be reference-copied arrays and want to see appended data in other slices. Some may get away with mutating the array in the MRU or a slice and seeing the changes, only to get bitten later. A separate type really makes more sense to me...
Re: LRU cache for ~=
Jason House wrote: Andrei Alexandrescu Wrote: I just wrote this to Sean and Walter and subsequently discussed it with Walter. Walter thinks this should work. Does anyone have the time and inclination to test this out? It would involve hacking into druntime's implementation of ~= (I'm not sure what the function name is). I'd really appreciate this; I'm overloaded as it is. == In wake of the recent demise of T[new], I was thinking of finding ways of making ~= efficient for T[]. Currently ~= is slow because accessing GC.sizeOf(void*) acquires a global lock and generally must figure out a lot of things about the pointer to make a decision. Also, ~= is dangerous because it allows slices to stomp over other slices. I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. Shouldn't it be MRU (Most Recently Used)? Yes. LRU would be the eviction strategy. I assume ~= will be disallowed for shared arrays? I think so. Will all arrays overallicate when created in order to allow efficient appending? Upon the first append. I feel like bolting on ~= to slices gives mediocre performance and a lot of gotchas. You may be right about the gotchas, but you better produce some numbers wrt performance. Some users may expect slices to be reference copied arrays and want to see appended data in other slices. Some may get away with mutating the array in the MRU or a slice and seeing the changes, only to get bitten later. A separate type really makes more sense to me... Making ~= work well for slices does not preclude creating a distinct library type. Andrei
Re: LRU cache for ~=
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article Making ~= work well for slices does not preclude creating a distinct library type. Andrei Yes, I see LRU/MRU/whatever as a kludge to allow convenience without being unsafe or egregiously inefficient. The fact that it is a kludge is not a criticism because there is clearly no easy answer, and thus a kludge is genuinely necessary. However, I think there needs to be a separate array builder type for heavy-duty appending operations. In TDPL I would just say that concatenating slices can be inefficient and that people should use an array builder for heavy-duty appending, length changing, etc., w/o getting into the details.
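For illustration, the kind of builder-based code that advice points at, assuming std.array's Appender interface (appender, put, data) as it existed at the time:

import std.array : appender;

string joinWithSpaces(string[] words)
{
    auto app = appender!(string)();   // one growable buffer; no slice-stomping concerns
    foreach (w; words)
    {
        app.put(w);
        app.put(" ");
    }
    return app.data;                  // hand back the built string once, at the end
}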
Re: LRU cache for ~=
On 2009-10-20 00:13:05 +0200, dsimcha dsim...@yahoo.com said: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article Making ~= work well for slices does not preclude creating a distinct library type. Andrei Yes, I see LRU/MRU/whatever as a kludge to allow convenience without being unsafe egregiously inefficient. The fact that it is a kludge is not a criticism because there is clearly no easy answer, and thus a kludge is genuinely necessary. However, I think there needs to be a separate array builder type for heavy duty appending operations. In TDPL I would just say that concatenating slices can be inefficient and that people should use array builder for heavy duty appending, length changing, etc., w/o getting into the details. I fully agree
Re: LRU cache for ~=
dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article Making ~= work well for slices does not preclude creating a distinct library type. Andrei Yes, I see LRU/MRU/whatever as a kludge to allow convenience without being unsafe egregiously inefficient. The fact that it is a kludge is not a criticism because there is clearly no easy answer, and thus a kludge is genuinely necessary. However, I think there needs to be a separate array builder type for heavy duty appending operations. In TDPL I would just say that concatenating slices can be inefficient and that people should use array builder for heavy duty appending, length changing, etc., w/o getting into the details. But how about the opposite view. What if the *previous* implementation of GC and ~= were a kludge that led to egregious inefficiency and revolting unsafety, kludge that that now is getting fixed. What I'm saying is that with the cache in place we'll have slices that are safe and efficient. Right now they are not safe and not efficient. I can hardly find reasons to characterize the new state of affairs as kludgey. One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Andrei
Re: LRU cache for ~=
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article Making ~= work well for slices does not preclude creating a distinct library type. Andrei Yes, I see LRU/MRU/whatever as a kludge to allow convenience without being unsafe or egregiously inefficient. The fact that it is a kludge is not a criticism because there is clearly no easy answer, and thus a kludge is genuinely necessary. However, I think there needs to be a separate array builder type for heavy-duty appending operations. In TDPL I would just say that concatenating slices can be inefficient and that people should use an array builder for heavy-duty appending, length changing, etc., w/o getting into the details. But how about the opposite view. What if the *previous* implementation of GC and ~= were a kludge that led to egregious inefficiency and revolting unsafety, kludge that that now is getting fixed. What I'm saying is that with the cache in place we'll have slices that are safe and efficient. Right now they are not safe and not efficient. I can hardly find reasons to characterize the new state of affairs as kludgey. One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Andrei Started playing w/ the implementation a little and I see a problem. What about the garbage collector? There are two possibilities: 1. The pointer gets stored as a pointer. The array never gets freed until it gets evicted from the cache. If it's a huge array, this is very bad. 2. The pointer gets cast to a size_t so it doesn't look like a reference. Then, we can have something like: string foo; foreach(i; 0..5) { foo ~= 'a'; } foo = null; GC.collect; // Now we have (foo.ptr, 5) in our LRU cache, but foo has been GC'd. string bar = "123456789"; // bar gets the memory reclaimed from foo. string baz = bar[0..5]; // Same length, ptr as foo. baz ~= '1'; // Finds stale information from foo in cache. writeln(bar); // Prints "123451789". The only possible solutions I see would be to have the GC know everything about the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA to implement, couples things that shouldn't be tightly coupled), or clear the cache every time GC is run (probably would make appending so slow as to defeat the purpose of having the cache).
Re: LRU cache for ~=
Jason House Wrote: Will all arrays overallocate when created in order to allow efficient appending? They effectively do anyway, since the GC block sizes are in powers of 2 and the runtime relies on the GC to determine block capacity. In effect, it's simply the GC that's over-allocating instead of the runtime itself.
Re: LRU cache for ~=
Andrei Alexandrescu Wrote: What I'm saying is that with the cache in place we'll have slices that are safe and efficient. Right now they are not safe and not efficient. I can hardly find reasons to characterize the new state of affairs as kludgey. I'm not sure I agree that they'd be safe. Given only a pointer to the head of a block there's no way to know whether it represents the array or a slice of the array from [0..n].
Re: LRU cache for ~=
Andrei Alexandrescu Wrote: I think GC.collect may simply evict the entire cache. The collection cycle costs so much, the marginal cost of losing cached information is lost in the noise. Right. The GC already has an LRU cache of size 1 (ie. LRU entry) for GC.sizeOf() and GC.query(), and these values are cleared on a collection by necessity, since pages can be repurposed if they're been completely emptied.
Re: LRU cache for ~=
Sean Kelly wrote: Andrei Alexandrescu Wrote: What I'm saying is that with the cache in place we'll have slices that are safe and efficient. Right now they are not safe and not efficient. I can hardly find reasons to characterize the new state of affairs as kludgey. I'm not sure I agree that they'd be safe. Given only a pointer to the head of a block there's no way to know whether it represents the array or a slice of the array from [0..n]. That's the beauty of the cache: with the cache in place, you may actually know that the memory beyond the end of a cached slice is seen by nobody else. Andrei
Re: LRU cache for ~=
Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. -- Rainer Deyke - rain...@eldwood.com
Re: LRU cache for ~=
Andrei Alexandrescu, el 19 de octubre a las 17:24 me escribiste: dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article Making ~= work well for slices does not preclude creating a distinct library type. Andrei Yes, I see LRU/MRU/whatever as a kludge to allow convenience without being unsafe egregiously inefficient. The fact that it is a kludge is not a criticism because there is clearly no easy answer, and thus a kludge is genuinely necessary. However, I think there needs to be a separate array builder type for heavy duty appending operations. In TDPL I would just say that concatenating slices can be inefficient and that people should use array builder for heavy duty appending, length changing, etc., w/o getting into the details. But how about the opposite view. What if the *previous* implementation of GC and ~= were a kludge that led to egregious inefficiency and revolting unsafety, kludge that that now is getting fixed. What I'm saying is that with the cache in place we'll have slices that are safe and efficient. Right now they are not safe and not efficient. I can hardly find reasons to characterize the new state of affairs as kludgey. One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. And for me this is enough to remove appending from slicing and having a proper array type. There is no reason to keep that buggy behaviour. -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ -- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) -- DONAN UN PENE EN NICARAGUA -- Crónica TV
Re: LRU cache for ~=
On Tue, 20 Oct 2009 03:00:57 +0400, Rainer Deyke rain...@eldwood.com wrote: Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. Safe as in SafeD (i.e. no memory corruption) :)
Re: LRU cache for ~=
dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: Started playing w/ the implementation a little and I see a problem. What about the garbage collector? There are two possibilities: [snip] The only possible solutions I see would be to have the GC know everything about the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA to implement, couples things that shouldn't be tightly coupled), or clear the cache every time GC is run (probably would make appending so slow as to defeat the purpose of having the cache). I think GC.collect may simply evict the entire cache. The collection cycle costs so much, the marginal cost of losing cached information is lost in the noise. Andrei But then you have to copy the whole array again, likely triggering another GC if the array is large. Then things really get ugly as, for all practical purposes, you've completely done away with the cache. This happens whether or not a cache is in use. Andrei
Re: LRU cache for ~=
Rainer Deyke wrote: Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. What's yours? Andrei
Re: LRU cache for ~=
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: Started playing w/ the implementation a little and I see a problem. What about the garbage collector? There are two possibilities: [snip] The only possible solutions I see would be to have the GC know everything about the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA to implement, couples things that shouldn't be tightly coupled), or clear the cache every time GC is run (probably would make appending so slow as to defeat the purpose of having the cache). I think GC.collect may simply evict the entire cache. The collection cycle costs so much, the marginal cost of losing cached information is lost in the noise. Andrei But then you have to copy the whole array again, likely triggering another GC if the array is large. Then things really get ugly as, for all practical purposes, you've completely done away with the cache. This happens whether or not a cache is in use. Andrei But the array isn't guaranteed to get reallocated immediately after *every* GC run. If you're appending to a huge array, the GC will likely run several times while you're appending, leading to several unnecessary reallocations. Each of those unnecessary reallocations will increase the memory footprint of your program, possibly triggering another GC run and wiping out your cache again in short order, until, for sufficiently large arrays, a ~= b; is almost equivalent to a = a ~ b;
Re: LRU cache for ~=
dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: Started playing w/ the implementation a little and I see a problem. What about the garbage collector? There are two possibilities: [snip] The only possible solutions I see would be to have the GC know everything about the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA to implement, couples things that shouldn't be tightly coupled), or clear the cache every time GC is run (probably would make appending so slow as to defeat the purpose of having the cache). I think GC.collect may simply evict the entire cache. The collection cycle costs so much, the marginal cost of losing cached information is lost in the noise. Andrei But then you have to copy the whole array again, likely triggering another GC if the array is large. Then things really get ugly as, for all practical purposes, you've completely done away with the cache. This happens whether or not a cache is in use. Andrei But the array isn't guaranteed to get reallocated immediately after *every* GC run. If you're appending to a huge array, the GC will likely run several times while you're appending, leading to several unnecessary reallocations. I don't think I understand this. 1. Request for an append comes that runs out of memory 2. GC runs and clears memory 3. Array is reallocated and the capacity cached. No? Each of those unnecessary reallocations will increase the memory footprint of your program, possibly triggering another GC run and wiping out your cache again in short order, until, for sufficiently large arrays, a ~= b; is almost equivalent to a = a ~ b; I don't understand how the cache makes that all worse. Andrei
Re: LRU cache for ~=
Bill Baxter wrote: On Mon, Oct 19, 2009 at 5:07 PM, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: Rainer Deyke wrote: Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. What's yours? Probably something like safe from easy-to-write hard-to-debug programming errors. I agree that it would stink if all these changes were made and *still* one of the top gotchas with arrays/slices remained. It also makes me think slices and appending just don't belong together. Appending to a view ... just say no. How to reconcile this viewpoint with that of people who find ~= all too darn convenient? Andrei
Re: LRU cache for ~=
Denis Koroskin wrote: Safe as in SafeD (i.e. no memory corruption) :) Right. The problem with other definitions of safe is that they are too ill-defined.
Re: LRU cache for ~=
Denis Koroskin wrote: On Tue, 20 Oct 2009 03:00:57 +0400, Rainer Deyke rain...@eldwood.com wrote: Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. Safe as in SafeD (i.e. no memory corruption) :) If the caller wasn't expecting the array to be modified, then that's a textbook case of memory corruption. If the caller /was/ expecting the array to be modified, then it's the opposite, which isn't much better. -- Rainer Deyke - rain...@eldwood.com
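A caller-side sketch of the behavior Rainer is objecting to may help; which value gets printed depends on whether the runtime could grow the slice in place, which is exactly the non-determinism under discussion (neither outcome is guaranteed):

```d
import std.stdio;

void fun(int[] a)
{
    a[0] = 0;
    a ~= 42;    // may extend the block in place or reallocate and copy
    a[0] = 42;  // reaches the caller's memory only if no reallocation happened
}

void main()
{
    auto arr = new int[3];
    fun(arr);
    // Prints either 42 (the append extended in place) or 0 (the append
    // reallocated, so the final write went to fun's private copy).
    writeln(arr[0]);
}
```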
Re: LRU cache for ~=
Andrei Alexandrescu wrote: Rainer Deyke wrote: Your definition of safe is clearly not aligned with mine. What's yours? Safety is not an absolute, but a question of degree. The harder it is to write incorrect code, the safer the language. One key element of this is deterministic behavior. If you rely on the whim of the runtime to determine if two slices refer to the same data, then it becomes much harder to reason about or test the code. -- Rainer Deyke - rain...@eldwood.com
Re: LRU cache for ~=
On Mon, 19 Oct 2009, Walter Bright wrote: Denis Koroskin wrote: Safe as in SafeD (i.e. no memory corruption) :) Right. The problem with other definitions of safe is that they are too ill-defined. There's SafeD, which has a fairly formal definition. The other side of it is the general principle of D that the right way should be the easy and obvious way. Slices and arrays have issues with that principle.
Re: LRU cache for ~=
Andrei Alexandrescu, on October 19 at 20:12, you wrote: Bill Baxter wrote: On Mon, Oct 19, 2009 at 5:07 PM, Andrei Alexandrescu seewebsiteforem...@erdani.org wrote: Rainer Deyke wrote: Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. What's yours? Probably something like safe from easy-to-write, hard-to-debug programming errors. I agree that it would stink if all these changes were made and *still* one of the top gotchas with arrays/slices remained. It also makes me think slices and appending just don't belong together. Appending to a view ... just say no. How to reconcile this viewpoint with that of people who find ~= all too darn convenient? Use a proper array type, not a view (slice). -- Leandro Lucarella (AKA luca) http://llucax.com.ar/
Re: LRU cache for ~=
Andrei Alexandrescu wrote: I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. Sounds like a bad hack, but at least it would solve the issue of overwritten slices. But for some uses of array appending, you'd have to use a library-based Appender (or whatever you have in Phobos 2): class Something(T) { T[] data; void add(T x) { data ~= x; //chances that data are in the cache are minimal } } There's no cache locality or anything, but obviously you still would like to have efficient appender behavior. Also look at this: string[] t; t ~= somefunction(); t ~= someotherfunction(); Chances are high that those functions will remove t from the cache, and it would all be inefficient again. Nothing solved! Now you could try to make the cache function-local to solve this issue. Whenever a function contains the ~= operator, the compiler allocates a cache struct, and the ~= operator passes a hidden pointer to it to the runtime. But that wouldn't work if you want to pass slices to other functions and append to them (but it would still be safe, I guess?). Looks like these performance hacks just don't work out. It'd be so much simpler to make a ~= b; equivalent to a = a ~ b;. Then there's no mystery about the performance of the ~= operator. You can explain to everyone: yes, this allocates; if you want performance, use ArrayAppender. Hey, you could alias this ArrayAppender type to something like T[new] to make it feel more natural. But then, we're at the beginning again.
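For contrast, here is a minimal hand-rolled appender in the spirit of the library type grauzone alludes to. SimpleAppender is an illustrative name, not the actual Phobos type; the point it demonstrates is that such a type carries its own capacity, so the common-case append is a plain store with no global cache and no GC query.

```d
struct SimpleAppender(T)
{
    private T[] buf;     // allocated storage
    private size_t len;  // number of elements in use

    void put(T x)
    {
        if (len == buf.length)
        {
            // Grow geometrically; the runtime is consulted only on growth,
            // so its cost is amortized over many appends.
            buf.length = buf.length ? buf.length * 2 : 4;
        }
        buf[len++] = x;
    }

    T[] data() { return buf[0 .. len]; }
}

unittest
{
    SimpleAppender!int app;
    foreach (i; 0 .. 100)
        app.put(i);
    assert(app.data.length == 100 && app.data[99] == 99);
}
```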
Re: LRU cache for ~=
== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article dsimcha wrote: Started playing w/ the implementation a little and I see a problem. What about the garbage collector? There are two possibilities: [snip] The only possible solutions I see would be to have the GC know everything about the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA to implement, couples things that shouldn't be tightly coupled), or clear the cache every time GC is run (probably would make appending so slow as to defeat the purpose of having the cache). I think GC.collect may simply evict the entire cache. The collection cycle costs so much, the marginal cost of losing cached information is lost in the noise. Andrei But then you have to copy the whole array again, likely triggering another GC if the array is large. Then things really get ugly as, for all practical purposes, you've completely done away with the cache. This happens whether or not a cache is in use. Andrei But the array isn't guaranteed to get reallocated immediately after *every* GC run. If you're appending to a huge array, the GC will likely run several times while you're appending, leading to several unnecessary reallocations. I don't think I understand this. 1. Request for an append comes that runs out of memory 2. GC runs and clears memory 3. Array is reallocated and the capacity cached. No? This is entirely correct. Each of those unnecessary reallocations will increase the memory footprint of your program, possibly triggering another GC run and wiping out your cache again in short order, until, for sufficiently large arrays, a ~= b; is almost equivalent to a = a ~ b; I don't understand how the cache makes that all worse. Andrei The cache doesn't make anything *worse* than with no cache. The only point I'm trying to make is that, for large arrays, if the GC clears the cache every time it runs, things would start to get *almost as bad as* having no cache because the copy operations become expensive and the GC may run frequently.
Re: LRU cache for ~=
Rainer Deyke wrote: Denis Koroskin wrote: On Tue, 20 Oct 2009 03:00:57 +0400, Rainer Deyke rain...@eldwood.com wrote: Andrei Alexandrescu wrote: One surprising (but safe) behavior that remains with slices is this: void fun(int[] a) { a[0] = 0; a ~= 42; a[0] = 42; } The caller may or may not see 42 in the first slot after the call. Your definition of safe is clearly not aligned with mine. Safe as in SafeD (i.e. no memory corruption) :) If the caller wasn't expecting the array to be modified, then that's a textbook case of memory corruption. [citation needed] Andrei
Re: LRU cache for ~=
Rainer Deyke wrote: Andrei Alexandrescu wrote: Rainer Deyke wrote: Your definition of safe is clearly not aligned with mine. What's yours? Safety is not an absolute, but a question of degree. The harder it is to write incorrect code, the safer the language. Well, you have a point. I think I will from here on talk about soundness instead of safety, because the former is absolute. One key element of this is deterministic behavior. If you rely on the whim of the runtime to determine if two slices refer to the same data, then it becomes much harder to reason about or test the code. I agree. Andrei
Re: LRU cache for ~=
grauzone wrote: Andrei Alexandrescu wrote: I was thinking of solving these issues by keeping an LRU (Least Recently Used) cache inside the implementation of ~=. The LRU would only have a few entries (4-8) and would store the parameters of the last ~= calls, and their cached capacities. Sounds like a bad hack, but at least it would solve the issue about overwritten slices. But for some uses of array appending, you had to use a library based Appender (or whatever you have in Phobos 2): class Something { T[] data; void add(T x) { data ~= x; //chances that data are in the cache are minimal } } There's no cache locality or anything, but obviously you still would like to have efficient appender behavior. Why is there no cache locality? Also look at this: string[] t; t ~= somefunction(); t ~= someotherfunction(); Chances are high that those functions will remove t from the cache, and it would all be inefficient again. Nothing solved! Why are there chances those functions will remove t from the cache? For it to become a performance problem, there must be a loop with nine or more round-robin appends. When that does happen, yes, appends will become slower. (We may be able to make that better by using random eviction.) Now you could try to make the cache function local to solve this issue. Whenever a function contains the ~= operator, the compiler allocates a cache struct, and the ~= operator passes a hidden pointer to it to the runtime. But that wouldn't work if you want pass slices to other functions and to append to them (but it would still be safe, I guess?). Looks like these performance hacks just don't work out. Caches have a long and solid history of working. You'd have a very hard time convincing me that a cache that directly addresses a performance problem is a hack. It'd be so much simpler to make a~=b; equivalent to a=a~b;. Then there's no mystery about the performance of the ~= operator. You can explain everyone: yes, this allocates; if you want performance, use ArrayAppender. Hey, you could alias this ArrayAppender type to something like T[new] to make it feel more natural. But then, we're at the beginning again. I think making ~= faster is worth pursuing. Andrei
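A tiny model of the eviction behavior being debated may make the "nine or more round-robin appends" point concrete. All names here are invented for illustration (this is not druntime code): with strict LRU and eight slots, cycling over nine arrays evicts each entry just before it would have been reused, while random eviction lets some entries survive a full cycle.

```d
import std.random : uniform;

struct CacheModel(size_t slots, bool randomEviction)
{
    size_t[slots] keys;      // stand-in for the cached slices' end pointers
    size_t[slots] lastUse;   // logical clock of each slot's last access
    size_t clock, hits, misses;

    void access(size_t key)
    {
        ++clock;
        foreach (i, k; keys)
            if (k == key) { lastUse[i] = clock; ++hits; return; }
        ++misses;
        size_t victim = 0;
        if (randomEviction)
            victim = uniform(0, slots);
        else
            foreach (i; 1 .. slots)          // strict LRU: evict the stalest slot
                if (lastUse[i] < lastUse[victim]) victim = i;
        keys[victim] = key;
        lastUse[victim] = clock;
    }
}

unittest
{
    CacheModel!(8, false) lru;   // strict LRU eviction
    CacheModel!(8, true)  rnd;   // random eviction
    foreach (round; 0 .. 1_000)
        foreach (arrayId; 1 .. 10)   // nine arrays, eight slots, round-robin
        {
            lru.access(arrayId);
            rnd.access(arrayId);
        }
    assert(lru.hits == 0);   // cyclic access over slots+1 keys defeats LRU entirely
    assert(rnd.hits > 0);    // random eviction keeps some entries alive (virtually certain)
}
```

With the LRU policy every lookup in that loop misses, so each append pays the GC query again; random eviction keeps an expected fraction of entries resident, which is the improvement Andrei hints at.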
Re: LRU cache for ~=
Andrei Alexandrescu wrote: Rainer Deyke wrote: If the caller wasn't expecting the array to be modified, then that's a textbook case of memory corruption. [citation needed] I guess we need to define memory corruption first. Memory corruption is when a piece of code erroneously overwrites memory. That applies here. Do you have a better definition? -- Rainer Deyke - rain...@eldwood.com
Re: LRU cache for ~=
Rainer Deyke wrote: Andrei Alexandrescu wrote: Rainer Deyke wrote: If the caller wasn't expecting the array to be modified, then that's a textbook case of memory corruption. [citation needed] I guess we need to define memory corruption first. Memory corruption is when a piece of code erroneously overwrites memory. Where's that quote from? That definition is too vague to be of any use. Even an int that's written with a value when the spec meant to write another value is erroneously written and would be, by your definition, memory corruption. It is not. That applies here. Do you have a better definition? I do, but since you mentioned a textbook case of memory corruption, I was curious which textbook that would come from. That's why I asked for a _citation_. A vague, useless ad-hoc definition is easy to put together. Andrei
Re: LRU cache for ~=
Andrei Alexandrescu wrote: Rainer Deyke wrote: I guess we need to define memory corruption first. Memory corruption is when a piece of code erroneously overwrites memory. Where's that quote from? It's my own attempt to define memory corruption. It's equivalent to the definition in Wikipedia: Memory corruption happens when the contents of a memory location are unintentionally modified due to programming errors. [...] Do you have a better definition? I do, Then post it. but since you mentioned a textbook case of memory corruption, I was curious which textbook that wold come from. My use of the word textbook was idiomatic, not literal. From http://www.answers.com/topic/textbook: text·book (tĕkst'bʊk') [...] adj. Being a characteristic example of its kind; classic: a textbook case of schizophrenia. -- Rainer Deyke - rain...@eldwood.com