* Joey Hess <jo...@debian.org> [2012-03-20 16:26]:
> > According to
> > http://www.w3.org/TR/2011/WD-html-markup-20110113/meta.http-equiv.content-language.html
> > you can specify the language of your content using a header like this:
> > <meta http-equiv="content-language" content="en-GB" />
 
> This could be added to the whitelisted values, if it can be proven that
> it's safe to let users set these values, and if a suitable regex were
> developed to block invalid values (that might attempt javascript or
> other content insertion attacks).

It's fairly easy to check for the most common variants of valid
language tags (e.g. de, en-GB, zh-Hant, zh-cmn-Hans-CN, es-419).  I've
tried to do this in the patch below.
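
As a quick sanity check, here is a minimal standalone sketch (not part
of the patch) that runs the same regex against the example tags above.
The helper name is_common_language_tag and the rejected sample values
are assumptions chosen for illustration; they do not exist in ikiwiki.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in the patch: language (2-3 letters), optional extlang
# (3 letters), optional script (4 letters), optional region (2 letters
# or 3 digits).  This covers only a common subset of BCP 47.
my $lang_tag = qr/^[[:alpha:]]{2,3}(-[[:alpha:]]{3})?(-[[:alpha:]]{4})?(-[[:alpha:]]{2}|-\d{3})?$/;

# Hypothetical helper, named here for illustration only.
sub is_common_language_tag {
    my $value = shift;
    return $value =~ $lang_tag ? 1 : 0;
}

# The example tags mentioned above.
my @expected_ok = qw(de en-GB zh-Hant zh-cmn-Hans-CN es-419);

# Made-up values: two injection attempts, one grandfathered BCP 47 tag
# that this simplified pattern does not cover, and the empty string.
my @expected_bad = ('en"><script>', 'javascript:alert(1)', 'en-GB-oed', '');

for my $tag (@expected_ok, @expected_bad) {
    printf "%-22s %s\n", "'$tag'",
        is_common_language_tag($tag) ? "accepted" : "rejected";
}
```

The five example tags are accepted and the four made-up values are
rejected.  Note that en-GB-oed is a valid (grandfathered) BCP 47 tag,
so the pattern is stricter than the full specification.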

BTW, in preprocess() some values are written to $pagestate{$page}{meta}.
Is this for tags that are relevant for the page template?

Anyway, I noticed the following warning at
http://www.w3.org/TR/2011/WD-html-markup-20110113/meta.http-equiv.content-language.html
| Using the meta element to specify the document-wide default language
| is obsolete. Consider specifying the language on the root element
| instead.

Apparently the preferred way in HTML 5 is to specify something like:
| <html lang="en">

But given that most people are not using HTML 5 yet, it might
still be worthwhile to add content-language support to meta.  I'll
leave it up to you.  In any case, I'd appreciate comments on the patch
(e.g. whether I added the code in the right location) since I'm new to
ikiwiki.

--- a/IkiWiki/Plugin/meta.pm    2012-03-26 00:43:05.257460466 +0100
+++ b/IkiWiki/Plugin/meta.pm    2012-03-26 01:00:27.262627492 +0100
@@ -280,6 +280,17 @@
                        encode_entities($key).
                        '" content="'.encode_entities($value).'" />';
        }
+       elsif ($key eq 'content-language') {
+               # Check if a valid language tag is specified according to
+               # BCP 47 at http://tools.ietf.org/html/bcp47
+               # We don't implement all of BCP 47 but we check for the most
+               # common variants of: language, extlang, script and region
+               if ($value =~ (/^[[:alpha:]]{2,3}(-[[:alpha:]]{3})?(-[[:alpha:]]{4})?(-[[:alpha:]]{2}|-\d{3})?$/)) {
+                       push @{$metaheaders{$page}}, '<meta http-equiv="'.
+                               encode_entities($key).
+                               '" content="'.encode_entities($value).'" />';
+               }
+       }
        elsif ($key eq 'name') {
                push @{$metaheaders{$page}}, scrub('<meta '.$key.'="'.
                        encode_entities($value).

-- 
Martin Michlmayr
http://www.cyrius.com/
