On 21/03/2013 5:45 p.m., Alex Rousskov wrote:
On 03/20/2013 09:02 PM, Amos Jeffries wrote:

+            rep->header.putStr(HDR_VARY,tmp.termedBuf());
+
+            // Key:Accept-Language;b="foo"
+            // Only supply the match parameters if the language is actually found.
+            // On the generic reply to all 'unknown' inputs we must be vague like Vary
+            // which makes the recipient use the entire Accept-Language as the variant key.
+            if (err_language) {
+                tmp.append(";b=\"");
+                tmp.append(err_language);
+                tmp.append('"');
+            }
+            rep->header.delById(HDR_KEY);
+            rep->header.putStr(HDR_KEY, tmp.termedBuf());

If I am interpreting draft-fielding-http-key-02 correctly, there is no
point in sending

     Vary: Accept-Language
     Key:  Accept-Language

because both of the above header fields have identical semantics
(Section 2). Did I misunderstand? If I did not, you can move the
delById() and putStr() calls inside the err_language if guard (and merge
the two identical if-guards, one after another?).

"

  When a cache
   fully implements this mechanism, it MAY ignore the Vary response
   header field.
"

Which does not limit the ignoring of Vary to just Vary+Key responses. So until (and if) the spec gets updated to mandate following Vary whenever Key is absent, it's best to always emit Key+Vary, for the same reason that it is best to always emit Vary if *any* of the URL's responses needs it. Otherwise we risk causing for Key the same problem that Apache caused for Vary: one response coming back with *no* Key pulls all future traffic to that variant, regardless of the existence of others.
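As a rough illustration of the "always emit Key" behaviour argued for above, here is a minimal standalone sketch. buildKeyValue() is a hypothetical helper using std::string rather than Squid's own string and header classes, and the empty-string check is an extra assumption that the patch itself does not make:

```cpp
#include <string>

// Hypothetical stand-in for the patch logic; this is NOT the Squid API.
// varyValue is whatever was sent in Vary (e.g. "Accept-Language").
// errLanguage is the negotiated language, or nullptr when no match was found.
std::string buildKeyValue(const std::string &varyValue, const char *errLanguage)
{
    // Start from the same value as Vary, so Key is always present.
    std::string key = varyValue;

    // Only add the b="..." modifier when a language was actually found.
    // (The empty-string check is an assumption added for this sketch.)
    if (errLanguage && *errLanguage) {
        key += ";b=\"";
        key += errLanguage;
        key += '"';
    }
    return key;
}
```

With no negotiated language the result is the bare field name, so the recipient falls back to using the whole Accept-Language value as the variant key, as the patch comment describes.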


If the negotiated language was "xfo" and the later requested language
happens to be "xfoobar", will the "xfo" language response be served to
an "xfoobar" reader because of the "prefix match" logic of the b=".."
modifier? Is that a good thing?
Yes and yes. Due to the way the codes are syntaxed the valid ones are
all 2 or 5 bytes long with an optional trailer.
Well, many languages have three letters in the primary tag (e.g., duh
for Dungra Bhil) and other lengths are allowed, but what is important
here is whether a language tag may be a prefix of another, completely
unrelated language tag. If yes, then using the prefix b="..." modifier
would be wrong IMHO.

For example, should a cached English ("en") error page be served to an
Emumu-speaking client?

Hmm. Yes, I see they have extended the tags beyond the ISO-639-1 set now.

The way Squid scans to find and fill err_language, the longer codes will be added in preference to shorter ones. We only run into problems when the full language code is 2 bytes long and is also a prefix of a 3-byte one.

If we send ;b= the 2-byte codes will match against 3-byte codes. But if we send ;p= then the 2-byte codes will fail to match against the more common 5-byte wildcards. For example env vs en-*.
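To make the trade-off concrete, here is a minimal sketch comparing a plain "begins with" comparison (how this thread describes b="...") with a stricter comparison that only accepts a prefix ending on a subtag boundary. Both helpers are hypothetical: they are neither Squid code nor modifiers defined by the Key draft, and they assume the tags have already been lowercased.

```cpp
#include <string>

// Hypothetical helpers for illustration only.

// Plain prefix comparison: true if `requested` starts with `cached`.
bool beginsWith(const std::string &requested, const std::string &cached)
{
    // compare() looks at no more than cached.size() characters of `requested`,
    // so a shorter `requested` can never compare equal.
    return requested.compare(0, cached.size(), cached) == 0;
}

// Stricter variant: the prefix must be the whole tag or be followed by '-',
// so "en" matches "en" and "en-gb" but not "enr".
bool matchesOnSubtagBoundary(const std::string &requested, const std::string &cached)
{
    if (!beginsWith(requested, cached))
        return false;
    return requested.size() == cached.size() || requested[cached.size()] == '-';
}
```

With these definitions, beginsWith("enr", "en") is true, which is the Emumu/English collision discussed here, while matchesOnSubtagBoundary("enr", "en") is false and matchesOnSubtagBoundary("en-gb", "en") is true.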



Type: language
Subtag: enr
Description: Emumu
Description: Emem
Added: 2009-07-29

Type: language
Subtag: en
Description: English
Added: 2005-10-16
Suppress-Script: Latn

I could not find any rules that prohibit one language tag from being a prefix
of another. It is possible that I missed them (not my area of
expertise), but the above counter-example with Emumu/English seems to
indicate that there is no such prohibition?



In practice, in my archive of Accept-Language headers collected from around the net over the last few years, there are no 3-byte codes in use (so far).

So the question is: do we want to cater to the rare case at the cost of breaking a more common one? I think not.

Amos
