On Thursday, Jul 17, 2003, at 12:48 America/Guayaquil, Berin Loritsch wrote:



The concept translates to the adaptive cache approach almost 1:1. The
individual cost sample is not important; it is the windowed average that
matters. For our cache, a simple mean would best suit our problem space.
The size of the window for our samples will control how responsive our
cache control algorithm is. This would affect your cache prediction
algorithm, as the value k is no longer needed to control responsiveness.
The value k can be set at 0, 1, or 0.5 depending on how you want to affect
the slope. IMO, the window size should be the determining factor in how the
average cost for a resource converges toward the optimal point: large
windows respond more slowly, while smaller windows respond faster.
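Something like the following sketch is what I have in mind for the windowed
average (the class and names here are made up for illustration, not actual
Cocoon code):

    // Fixed-size ring buffer over cost samples; the simple mean over
    // the window is what feeds the cache control algorithm. A larger
    // window converges more slowly, a smaller one reacts faster.
    public final class CostWindow {
        private final double[] samples;
        private int next = 0;   // slot that the next sample overwrites
        private int count = 0;  // number of slots holding real samples

        public CostWindow(int windowSize) {
            this.samples = new double[windowSize];
        }

        // Record one cost sample, silently evicting the oldest.
        public void add(double cost) {
            samples[next] = cost;
            next = (next + 1) % samples.length;
            if (count < samples.length) {
                count++;
            }
        }

        // Simple mean over the samples collected so far (0 if none).
        public double mean() {
            if (count == 0) {
                return 0.0;
            }
            double sum = 0.0;
            for (int i = 0; i < count; i++) {
                sum += samples[i];
            }
            return sum / count;
        }
    }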


BTW, if we use "0" for the value of "k", then the cache prediction algorithm
as defined by your "discriminate caching" sequence produces a purely random
result. The c(eff) function reduces to the constant 0.5, and since we
discriminate based on a randomly generated number being less than the result
of the function, we have (theoretically) a 50/50 chance that the random
number will be greater than 0.5. I say theoretically because not all random
number generators are truly random, or even mostly random.


If we use "1" for the value of "k", then the prediction algorithm follows
the efficiency value much better. There is a higher chance of having a
higher number for the cache prediction, which means we will be evaluating
more entries. In fact, as the value for "eff" reaches +/-38, we will always
be caching because the result of the equation will always be 1.


If we use "0.5" for the value of "k", then the prediction algorithm still
follows the efficiency value, but our range is now to +/-76.
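For concreteness, all three cases are consistent with c(eff) being the
logistic function 1 / (1 + e^(-k * eff)): at k = 0 it is the constant 0.5,
at k = 1 it saturates to 1 (in double precision) around eff = +38, and
halving k to 0.5 doubles that range to +/-76. A sketch of the discrimination
step under that assumption (names are illustrative, not the actual Cocoon
code):

    import java.util.Random;

    // Assumes c(eff) = 1 / (1 + e^(-k * eff)); cache iff a uniform
    // random draw in [0, 1) falls below c(eff). With k = 0 every
    // resource is cached with probability 0.5, exactly as described.
    public final class CacheDiscriminator {
        private final double k;
        private final Random rng = new Random();

        public CacheDiscriminator(double k) {
            this.k = k;
        }

        // Probability of caching at a given windowed efficiency.
        public double c(double eff) {
            return 1.0 / (1.0 + Math.exp(-k * eff));
        }

        public boolean shouldCache(double eff) {
            return rng.nextDouble() < c(eff);
        }
    }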

Your analysis is fully correct.


The question then is: statistically, where will this efficiency value sit most of the time?

The idea behind such a design is that if a resource has been efficient (remember that efficiency in this context means "how much cost we saved so far on that resource"), the chance of avoiding caching should be very small, because it is very likely that the non-cached version would be much less efficient. The same is true, of course, for negative efficiencies (which mean that the cache is inefficient on those resources).


The important concept is that even if the cache is very efficient on that resource (thus efficiency is positive and high), we will be sampling the other side of the fence to understand if its behavior has changed.

Imagine the peak-hour case: the efficiency for a particular resource is, for example, -50 (meaning that by *not* caching we would have saved 50 cost units [the dimension of which depends on the cost function]). This gives us roughly a 99% probability of avoiding caching that resource. But on one of the rare requests that does get cached, we find out that the caching efficiency for that single request was +5.

This changes the overall efficiency to -45, thus increasing the chance of caching to 1.5% (I'm making up the numbers, but you get my point). On the next cached request, another +5 cost units are saved, turning the efficiency into -40 and the caching chance into 2%.

If the caching trend continues, the efficiency will eventually swing positive, changing the caching behavior adaptively.

If the caching trend was just occasional, it will simply be a perturbation that the adaptivity of the system absorbs.
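To see the swing numerically: sticking with the assumed logistic c(eff), a
value of k = 0.1 roughly reproduces the made-up figures above (P(cache) is
about 0.7% at eff = -50, 1.1% at -45, 1.8% at -40). A toy replay of the
peak-hour recovery, where both k and the +5 saving per sampled request are
assumptions, not measured values:

    // Each sampled cached request saves +5 cost units, nudging the
    // efficiency (and with it the caching probability) upward until
    // the behavior flips.
    public final class PeakHourDemo {
        static double c(double eff, double k) {
            return 1.0 / (1.0 + Math.exp(-k * eff));
        }

        public static void main(String[] args) {
            double eff = -50.0;             // cache currently looks inefficient
            final double k = 0.1;           // assumed slope
            final double savedPerHit = 5.0; // assumed saving per cached request
            for (int i = 0; i < 5; i++) {
                System.out.printf("eff = %5.1f  ->  P(cache) = %5.2f%%%n",
                        eff, 100.0 * c(eff, k));
                eff += savedPerHit;         // one sampled cached request pays off
            }
        }
    }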

The hard thing to sell about this algorithm is that even when you *know* you are going to be inefficient on a particular resource, you go and be inefficient anyway, in order to obtain information on *whether or not* your prediction was right.

This tradeoff looks like a sin in the eyes of many, and I believe it is exactly this vision that has prevented truly adaptive caches from being tested.

But, of course, I might be completely wrong.

--
Stefano.


