On Tue, 20 Oct 2009 10:05:42 -0400, Steven Schveighoffer <schvei...@yahoo.com> wrote:

On Mon, 19 Oct 2009 22:37:26 -0400, dsimcha <dsim...@yahoo.com> wrote:

== Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
dsimcha wrote:
> == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
>> dsimcha wrote:
>>> == Quote from Andrei Alexandrescu (seewebsiteforem...@erdani.org)'s article
>>>> dsimcha wrote:
>>>>> Started playing w/ the implementation a little and I see a problem.  What about
>>>>> the garbage collector?  There are two possibilities:
>>>> [snip]
>>>>> The only possible solutions I see would be to have the GC know everything about
>>>>> the LRU cache and evict stale entries (probably slows down GC a lot, a huge PITA
>>>>> to implement, couples things that shouldn't be tightly coupled), or clear the
>>>>> cache every time GC is run (probably would make appending so slow as to defeat the
>>>>> purpose of having the cache).
>>>> I think GC.collect may simply evict the entire cache. The collection
>>>> cycle costs so much, the marginal cost of losing cached information is
>>>> lost in the noise.
>>>> Andrei
>>> But then you have to copy the whole array again, likely triggering another GC if
>>> the array is large.  Then things really get ugly as, for all practical purposes,
>>> you've completely done away with the cache.
>> This happens whether or not a cache is in use.
>> Andrei
>
> But the array isn't guaranteed to get reallocated immediately after *every* GC
> run.  If you're appending to a huge array, the GC will likely run several times
> while you're appending, leading to several unnecessary reallocations.
I don't think I understand this.
1. Request for an append comes that runs out of memory
2. GC runs and clears memory
3. Array is reallocated and the capacity cached.
No?

This is entirely correct.
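
In rough code (purely a sketch; capCache below is a plain associative array standing in for the LRU, and none of these names come from the actual patch), the flow is:

size_t[void*] capCache;   // block pointer -> cached capacity, stand-in for the LRU

void appendOne(ref int[] a, int value)
{
   auto cached = cast(void*) a.ptr in capCache;
   if (cached && a.length < *cached)
   {
      // Cache hit with spare room: extend the slice in place, no GC involved.
      a = a.ptr[0 .. a.length + 1];
      a[$ - 1] = value;
   }
   else
   {
      // Steps 1 and 2: the append has run out of known room, so a bigger block
      // is allocated and the data copied; that allocation may trigger a collection.
      auto bigger = new int[](a.length ? a.length * 2 : 4);
      bigger[0 .. a.length] = a[];
      bigger[a.length] = value;
      capCache.remove(cast(void*) a.ptr);
      a = bigger[0 .. a.length + 1];
      // Step 3: the new block and its capacity are cached again.
      capCache[cast(void*) a.ptr] = bigger.length;
   }
}

With that picture, "the GC clears the cache" just means the first append to each array after a collection is forced down the reallocate-and-copy branch.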

> Each of those unnecessary reallocations will increase the memory footprint of your
> program, possibly triggering another GC run and wiping out your cache again in
> short order, until, for sufficiently large arrays,
>
> a ~= b;
>
> is almost equivalent to
>
> a = a ~ b;
I don't understand how the cache makes that all worse.
Andrei

The cache doesn't make anything *worse* than with no cache. The only point I'm trying to make is that, for large arrays, if the GC clears the cache every time it runs, things would start to get *almost as bad as* having no cache because the
copy operations become expensive and the GC may run frequently.
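
To make the degenerate case concrete (assuming, as proposed above, that a collection evicts every cache entry):

int[] big;
foreach (i; 0 .. 10_000_000)
{
   big ~= i;
   // Whenever this append's allocation triggers a collection, the cache is
   // emptied, so the next append can't prove there is spare capacity and has
   // to reallocate and copy all of big, roughly the cost of big = big ~ [i].
}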

The cache can't be "cleared" every time, or else you might as well only keep one LRU entry:

int[] twos, threes;

for(int i = 1; i < 10000; i++)
{
   twos ~= i * 2;
   threes ~= i * 3;
}

At some point, twos or threes needs an allocation triggering a collection, and that clears the cache, making the other array need an allocation, clearing the cache, etc.

I'd think you only want to clear the entries affected by the collection.

-Steve

If it were free and simple to clear only the affected entries, sure. But doing so requires (very heavy?) modification of the GC in order to track and check changes, and it also reduces collection performance. I think that if GC allocations added entries to the LRU, so that the information in the LRU is never stale, you could avoid clearing the LRU. But this requires the LRU to be part of the GC.
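
Roughly, the idea is something like the sketch below. onAlloc and onSweep are hypothetical hooks, not existing druntime APIs; entries are created at allocation time and evicted only when their block is actually swept, which also covers only clearing the affected entries:

struct Entry
{
   void*  block;      // start of the GC block the array lives in
   size_t capacity;   // usable capacity of that block, in elements
}

Entry[8] lru;         // small fixed-size cache, most recently used first
size_t count;

// Called by the GC when it hands out a block: the entry is created here, so the
// cache can never hold information about a block the GC doesn't know about.
// (A real version would also move an existing entry for block to the front.)
void onAlloc(void* block, size_t capacity)
{
   auto n = count < lru.length ? count : lru.length - 1;
   foreach_reverse (i; 0 .. n)   // shift older entries down, dropping the oldest
      lru[i + 1] = lru[i];
   lru[0] = Entry(block, capacity);
   if (count < lru.length)
      ++count;
}

// Called by the GC while sweeping: only entries whose block actually died are
// evicted; arrays that survive the collection keep their cached capacity.
void onSweep(void* freedBlock)
{
   foreach (ref e; lru[0 .. count])
      if (e.block is freedBlock)
         e = Entry.init;
}

The per-block check in onSweep is where the extra collection cost mentioned above would show up.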
