On Tuesday, Jul 15, 2003, at 17:06 America/Guayaquil, Berin Loritsch wrote:
Dumbing it down a lot will help immensely. The problem is that it is already 18+ pages of hard-to-read stuff. Then, when we don't understand it, we get verbally flogged.
Berin, look: if you said "I don't get it, please rephrase" I would have done it. If you said "I don't get this part" or "why are you doing this?" I would have elaborated.
Here is the deal. When I try to wrap my brain around something, I try to rephrase it in terms and concepts that I understand. I studied AI theory for a program that was being developed by my last company. I can "get" that framing because it works better for me cognitively.
So the response I get is a lot of flak. Quite honestly, I simply state things in terms that I have both observed and have experience with. My statements about the limited value (cost/benefit ratio) of partial pipeline caching have to do with *my* experience. Maybe others have had different experiences, but all my dynamic information was always encapsulated in one place: the generator. The transformers had an infinite ergodic period (infinite in that they did not change until there was a development reason to do so). The serializers were quite simple.
With that combination of facts, the only thing in my pipeline with an ergodic period less than infinity was the generator. For my static pages, I could have simply compiled the results and served them. For my dynamic pages, the generation was pretty darn quick.
Then again, maybe my *experience* is not typical, which means what is good enough for me is not good enough for others.
Sorry I pooh-poohed your idea. I'm sorry I don't have the mental capacity to constructively contribute.
Oh, please, you are way more than smart enough to get it. Maybe the language is not what you are used to, granted, but I'm wide open to outlining the details. If you tell me what you don't understand, I can try to explain it differently. If you don't, and just go on assuming, I can't do anything but be frustrated at all the time I spent on this.
That was in response to your email. By the time I was done reading it, I felt like crap. There are more similarities between what I presented and you presented than you would like to admit.
The basic crux of programming Intelligent Agents is that agents react to the state of the environment, discover the best response, and act (usually in a way that affects the environment).
What you outlined was the "search algorithm" (the discovery method), and the manner in which the agent (cache controller) acted upon the environment.
What you left as an exercise to the reader was the Cost function. The rules-based approach is one way of achieving that cost function declaratively. Any time you incorporate conditional logic in your cost function, you have explicit rules. Any time you have a weighted cost that factors in different elements, you have implicit rules. Explicit rules are easier to debug, understand, and predict. There are different ways of applying explicit rules without resorting to if/then/else or switch/case constructs. In fact, you might be able to come up with a way to translate explicit rules into a function with implicit rules.
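To make the explicit/implicit distinction concrete, here is a rough sketch in Java. Everything in it (the `CacheEntry` fields, the thresholds, the weights) is made up for illustration; it is not taken from the whitepaper or from any actual Cocoon code.

```java
// Sketch: two styles of cache cost function. All names and figures
// here are hypothetical, chosen only to illustrate the contrast.
public class CostFunctionSketch {

    static class CacheEntry {
        long productionTimeNanos;  // time to regenerate the resource
        long sizeBytes;            // memory the cached copy occupies
        double hitRate;            // fraction of lookups that hit

        CacheEntry(long productionTimeNanos, long sizeBytes, double hitRate) {
            this.productionTimeNanos = productionTimeNanos;
            this.sizeBytes = sizeBytes;
            this.hitRate = hitRate;
        }
    }

    // Explicit rules: conditional logic. Easy to read, debug, and predict,
    // because each branch states its rule in plain sight.
    static double explicitCost(CacheEntry e) {
        if (e.hitRate < 0.1) {
            return 100.0;  // rarely hit: caching it is mostly waste
        }
        if (e.productionTimeNanos > 50_000_000L) {
            return 1.0;    // expensive to rebuild: strongly worth caching
        }
        return 10.0;       // everything else: middling value
    }

    // Implicit rules: a weighted sum. The same trade-offs exist, but they
    // hide inside the weights, which makes them harder to debug.
    static double implicitCost(CacheEntry e) {
        return 0.5 * e.sizeBytes / 1024.0
             - 2.0 * e.productionTimeNanos / 1_000_000.0
             - 50.0 * e.hitRate;
    }

    public static void main(String[] args) {
        CacheEntry hot  = new CacheEntry(80_000_000L, 4096L, 0.9);
        CacheEntry cold = new CacheEntry(1_000_000L, 4096L, 0.05);
        System.out.println("explicit: hot=" + explicitCost(hot)
                + " cold=" + explicitCost(cold));
        System.out.println("implicit: hot=" + implicitCost(hot)
                + " cold=" + implicitCost(cold));
    }
}
```

Both versions rank the hot, expensive-to-build entry as cheaper to keep than the rarely hit one; the difference is only in where the rules live.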
Truth be told, your search algorithm might be as efficient as they come. It might not.
But do keep in mind the poor administrator/installer. The person managing the install is more interested than the developer in the efficiency of the cache and in how Cocoon uses resources. In many cases the user won't notice the difference, because their network connection is throttling the response or their browser is the bottleneck. The efficiency of the cache will likely have the most impact on the scalability of the server. That is my OPINION, and should not be taken as gospel. Please do not shoot me for holding it.
I took the 18 pages home and started to play with some code. I appreciate the journey to get to the search algorithm you proposed. The functions as declared in the beginning were not that efficient: on my machine, as soon as there were more than ~410 samples for the invalid/valid cost evaluations, the function took ~10ms (the granularity of my clock) to evaluate. The evaluation used constants for all the cost figures, because that was the computationally cheapest way to measure the efficiency of those functions.
After this exercise came, to my dismay, the better way of evaluating a continual cost value. That way, instead of keeping a sample for each request at a particular time, we only had to work with three samples: a vastly improved search algorithm in terms of being computationally cheap.
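One way to read the "three samples" idea (this is my guess at the bookkeeping, not necessarily the whitepaper's actual formula) is to keep running estimates that are updated in O(1) per request, rather than a full history of per-request samples. A sketch, assuming an exponential moving average with a made-up smoothing factor:

```java
// Sketch: continual cost estimate with constant storage. The three
// stored values (valid-case EMA, invalid-case EMA, request count) are
// an assumption about what the "three samples" might be; the actual
// whitepaper may define them differently.
public class RunningCost {
    private static final double ALPHA = 0.1; // smoothing factor (assumed)

    private double validCost;    // EMA of cost when the cache entry was valid
    private double invalidCost;  // EMA of cost when it was invalid
    private long requests;       // total requests folded in so far

    // Fold one observed request into the running estimates.
    // Note: starting the EMAs at 0 biases the first few estimates low;
    // a real implementation would seed them from the first observation.
    public void record(boolean wasValid, double cost) {
        requests++;
        if (wasValid) {
            validCost += ALPHA * (cost - validCost);
        } else {
            invalidCost += ALPHA * (cost - invalidCost);
        }
    }

    // Positive means caching pays off on average: serving from cache
    // (valid case) is estimated cheaper than regenerating (invalid case).
    public double expectedSaving() {
        return invalidCost - validCost;
    }

    public long requests() {
        return requests;
    }
}
```

The point is the shape of the win: evaluation cost stays flat no matter how many requests have been seen, which is exactly why it beats the ~410-sample version above.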
Before that, it was unclear what type would best represent the cost function; but once you introduced the trigonometric functions into the mix, it was clear that floating-point precision was required.
Perhaps seeing what you want in code would be the best thing. It would help solidify things for those of us who don't do the math, or who are overwhelmed by a very long whitepaper while trying to derive from it how it will practically make our lives better. It would also help determine the cost/benefit ratio of actually developing this thing. As it stands, any idea that takes 18 pages to explain gives the impression of a very high cost. Whether this bears out in practice or not remains to be seen.
--
"They that give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." - Benjamin Franklin