"Andrew M. Bishop" wrote: ... > > I think that I will go back to not concatenating all of the headers, > but there is a good reason for making some changes compared to version > 2.6d. > > Consider the following set of headers that are legal for a server to > send: > > Cache-Control: this > Cache-Control: that > Cache-Control: max-age = 10 > > The server could also send these concatenated together in any order, > for example: > > Cache-Control: this, that, max-age = 10 > > or > > Cache-Control: max-age = 10, this, that > > What WWWOFFLE v2.7 is doing is concatenating the headers together so > that there is always only one line that contains the Cache-Control > header if there is any. It then searches through a list (and only one > list) for the max-age header. > > The version 2.6 code would work for separate headers or for the case > where max-age was the first in the list. If max-age was not first in > the list then WWWOFFLE would not see it. This means that it did not > handle the re-requesting of the URL if needed.

It seems to me the solution to this problem is very easy. Why should
WWWOFFLE v2.7 only search through the first list? It's trivial to
rewrite GetHeader2() so that it continues searching for more
headerlines with the same key if the search through the first list
fails. Then you could have something like:

  Cache-Control: this,that
  Cache-Control: foo, max-age = 10, bar

and GetHeader2(head,"Cache-Control","max-age") would still find max-age.

For the case that you need one list containing all the items of all the
headerlines with a particular key, it's not necessary to first
concatenate the header values into one string and then apply
SplitHeaderList(). I've written a function GetHeaderList() that
produces the same effect without needing concatenated headerlines.
For instance, I use

  GetHeaderList(head,"Accept-Language")

where you would use

  SplitHeaderList(GetHeader(head,"Accept-Language")).

The difference is that GetHeaderList(head,"Accept-Language") will also
work correctly if multiple "Accept-Language:" headerlines are not first
combined into one.

> I think that the solution is to have a list of headers that it is safe
> to concatenate (or unsafe). For example concatenation of
> Cache-Control and Etag and Pragma is safe but Content-Type and
> Set-Cookie is not safe (even though the RFC says that any header can
> be concatenated).

I think that's an unnecessary complication. It's much easier not to
combine headerlines at all.

> While writing this e-mail I have finally found the reason for doing
> the concatenation in the first place. The description above is true,
> but not the whole reason. The ChangeLog tells the story:
>
> * src/parse.c, src/wwwoffles.c:
> Change the GetHeader() function to GetHeader() and GetHeader2().
> Change the RemoveHeader() function to RemoveHeader() and RemoveHeader2().
> Use the functions to split up a comma separated list in a header.
> Add handling of If-None-Match headers as well as If-Modified-Since.
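
To make the GetHeader2() suggestion concrete, here is a minimal sketch
of a lookup that keeps going through every headerline with a matching
key instead of stopping at the first one. It is only an illustration:
the HeaderLine struct and the names find_item() and get_header2_all()
are made up for this example and are not the actual WWWOFFLE types or
functions. It also assumes that splitting on plain commas is good
enough, which ignores commas inside quoted strings.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp(), strncasecmp() (POSIX) */

/* Hypothetical stand-in for one parsed headerline (not the WWWOFFLE type). */
typedef struct {
    const char *key;   /* e.g. "Cache-Control" */
    const char *val;   /* e.g. "foo, max-age = 10, bar" */
} HeaderLine;

/* Scan one comma-separated value for an item whose name is `subkey`.
   Returns a pointer to the start of that item, or NULL if absent.
   Assumption: no quoted strings containing commas. */
static const char *find_item(const char *val, const char *subkey)
{
    size_t n = strlen(subkey);
    const char *p = val;

    while (*p) {
        /* skip separators and leading whitespace */
        while (*p == ',' || isspace((unsigned char)*p))
            p++;

        if (strncasecmp(p, subkey, n) == 0) {
            char c = p[n];
            /* accept whole-name matches only, so "max" does not hit "max-age" */
            if (c == '\0' || c == ',' || c == '=' || c == ';' ||
                isspace((unsigned char)c))
                return p;
        }

        /* move on to the next item */
        while (*p && *p != ',')
            p++;
    }
    return NULL;
}

/* Search every headerline whose key matches, not just the first one. */
const char *get_header2_all(const HeaderLine *lines, size_t nlines,
                            const char *key, const char *subkey)
{
    assert(key != NULL && subkey != NULL);

    for (size_t i = 0; i < nlines; i++) {
        if (strcasecmp(lines[i].key, key) == 0) {
            const char *hit = find_item(lines[i].val, subkey);
            if (hit)
                return hit;   /* found in this line, stop here */
        }
    }
    return NULL;              /* not present in any matching line */
}
```

With the two Cache-Control lines from the example above, calling
get_header2_all(lines, 2, "Cache-Control", "max-age") skips the first
line and returns a pointer into the second one, starting at
"max-age = 10". A GetHeaderList()-style function could use the same
loop and collect every item instead of returning at the first hit.
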
>
> The real problem being solved was If-None-Match which is like
> If-Modified-Since and will ask the server to send a newer page if one
> exists or nothing otherwise.
>
> This header relies on one or more headers called Etag that the server
> sent with the original page. The If-None-Match header from WWWOFFLE
> is required to send back to the server all of the Etag header values
> in a list. The easiest way to do this without adding extra code for
> this special case is to concatenate all duplicated headers. This
> allows me to get the complete list of Etag header values without
> needing to concatenate them if there were several.

Is it valid for a single copy of a webpage to have several entity-tags?
It seems to me the only valid reason for WWWOFFLE to send a request
with an If-None-Match headerline containing more than one entity-tag
is that it has more than one cached version of the same URL. And as far
as I know, this is not the case with WWWOFFLE.

-- 
Paul A. Rombouts <[EMAIL PROTECTED]>

My alternative WWWOFFLE implementation page:
http://www.phys.uu.nl/~rombouts/wwwoffle.html
