On 09/05/2012 06:58 PM, Amos Jeffries wrote:
> On 06.09.2012 11:58, Eliezer Croitoru wrote:

> We can pause there for the infrastructure to look fine before moving on
> to the store details. I've been waiting on assistance from Henrik or
> Alex on that for a while. They are the ones who know the answers to your
> questions below AFAIK.

FWIW, I have not reviewed the store_url_rewrite code in Squid2 so I
cannot answer the questions related to how it was done. I can suggest
ways of doing this in Squid3, but since somebody already investigated
all the alternatives, it would be better to hear the summary of the
Squid2 implementation (as it relates to Store) before diving into Squid3
development.

The biggest question for me is why Squid2 code was storing multiple URLs
with the cached object (if it was). Why cannot Store just work with the
[rewritten] URL given to it and ignore the fact that some [store] URLs
originated from some other [real] URLs? Are we trying to support going
from a store_url_rewrite config back to regular config without losing
some of the cached objects?

>> 2. Research the workflow of storing objects in memory and store and
>> introduce psudo for a new workflow of storing objects to avoid bad
>> effects on cache objects usage in any form that can be.
>> - I do know that squid uses some hash look-up and I have seen in the
>> things about it.
>> - as far I understood from the code:
>> client_request builds the request of the http object.
>> creates a mem-object and on the way creates a checksum.
>> a transfer from of the mem-object to a "store" happens.
>> if a store rebuild happens it takes all of the data from the file in
>> the store.
>>
>> ? question how cachemgr gets the list of urls in memory?

You might be confusing "cache manager" (the thing that responds to
"squidclient mgr:info" requests) with Store. Also, you should not think
in terms of memory (RAM) because some objects are only cached on disk.
It is best to think of Store as a collection of stored objects, ignoring
their particular location (memory or disk) to the extent possible.

Store can get a list of cached objects by iterating through store_table
and other store indexes.

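As a rough illustration of the two ideas above (a Store that only ever
sees the [rewritten] URL, and a Store that hides whether an object lives
in memory or on disk), here is a toy sketch in Python. It is not Squid
code: ToyStore, cache_key, rewrite_store_url, and the mirror host names
are all hypothetical, and the key is simplified to a digest over the
method and URL only.

```python
import hashlib
import re


def cache_key(method, url):
    """Hypothetical cache key: a digest over the request method and URL."""
    return hashlib.md5(f"{method} {url}".encode()).hexdigest()


def rewrite_store_url(url):
    """Hypothetical rewriter: fold numbered mirror hosts into one store URL."""
    return re.sub(r"^http://mirror\d+\.example\.com/",
                  "http://mirror.example.invalid/", url)


class ToyStore:
    """Toy stand-in for Store: one lookup interface over memory and disk."""

    def __init__(self):
        self.in_memory = {}  # key -> body, stands in for the RAM cache
        self.on_disk = {}    # key -> body, stands in for a cache_dir

    def put(self, method, url, body, to_disk=False):
        # The store only sees the URL it is given; it keeps no record of
        # any "real" URL that the store URL was derived from.
        target = self.on_disk if to_disk else self.in_memory
        target[cache_key(method, url)] = body

    def get(self, method, url):
        # Callers ask "is key K cached?" without caring where it lives.
        key = cache_key(method, url)
        if key in self.in_memory:
            return self.in_memory[key]
        return self.on_disk.get(key)


store = ToyStore()
store.put("GET", rewrite_store_url("http://mirror1.example.com/file.iso"),
          b"payload", to_disk=True)
hit = store.get("GET", rewrite_store_url("http://mirror2.example.com/file.iso"))
```

In this sketch a request for the same file on a different mirror hits
the cached copy because both real URLs map to one store URL before the
store is consulted. Note that the toy store cannot list the URLs it
holds, only answer lookups by key.
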
In general, you should not assume that it is possible to get a list of
all cached URLs in any efficient/practical fashion because not all
in-RAM indexes store URLs. It is only possible to get an answer to the
following question:

  * Is a response with cache key K likely to be in Squid cache now?

Here, the cache key is a hash computed over the request method, request
URI, and other properties.

>> I will look at it later but if someone have solid knowledge on how
>> the store routing was or implemented before i'm rushing into the code
>> every piece of info will help me when looking into it.

The Store is too big and complex an API to accurately describe in an
email IMO. I would be happy to answer specific questions about the stuff
I know, but you may have to research how things work as there is no
comprehensive documentation yet.

Another complication is that such a fundamental Squid2 Store feature as
store_table needs to be removed but it has not been completely removed
from Squid3 yet, so there is some [older] code that relies on it and
some [newer] code that tries hard to stay away from it, all while doing
the same kind of operations.

Finally, the whole Store class hierarchy is ugly to a fault. It needs to
be split into more independent classes instead of everybody and the
kitchen sink inheriting from Store, hiding the intended boundaries among
"store manager", "memory storage manager", "disk storage manager",
"cache_dir manager", etc.


Good luck,

Alex.