Content-Disposition and utf8 filenames
I would like to use a user-supplied filename when returning a download (e.g. a PDF). For example, it might be filename=$title.pdf, but $title can include any character. Browser support for this seems spotty. See: http://greenbytes.de/tech/tc2231/

Is anyone aware of a way to set this header to allow utf8 filenames that is supported across browsers?

Also, my assumption is that HTTP::Headers expects encoded values -- that is, the values are octets, not characters -- and so I should always encode( 'US-ASCII' ) the value.

I just tried with Google Apps and it seems they turn any character outside A-Za-z into an underscore. I'm not sure whether that means they just didn't try hard or they felt it was not possible to use non-ASCII characters in suggested filenames.

--
Bill Moseley
mose...@hank.org
Re: Content-Disposition and utf8 filenames
* Bill Moseley wrote:
> I would like to use a user-supplied filename when returning a download
> (e.g. pdf). For example, might be filename=$title.pdf. But $title can
> include any character. It seems like support for this in browsers is
> spotty: See: http://greenbytes.de/tech/tc2231/
>
> Is anyone aware of a way to set this header to allow utf8 filenames that
> is supported across browsers?

No, as you can tell from the results, there is no one way supported by all major browsers. The IETF HTTPbis Working Group is currently revising the specification for it, and the recommended way to do this will be through the RFC 5987 style notation, where you can also specify a fallback value by using both the filename* and filename parameters. But there is no silver bullet there.

> Also, my assumption is HTTP::Headers expect encoded values -- that is the
> values are octets not characters and so should always encode( 'US-ASCII' )
> the value.

(That is generally correct, yes.)

> I just tried with Google Apps and it seems they turn any non A-Za-z into an
> underscore. Not sure that means they just didn't try hard or if they felt
> it was not possible to use non ASCII characters in suggested filenames.

As I understand it, Google is currently investigating switching to RFC 5987 style encoding in some of their applications.

--
Björn Höhrmann · mailto:bjo...@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
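For illustration, here is a minimal sketch of what sending both parameters might look like in Perl. The helper name content_disposition is hypothetical and is not part of HTTP::Headers; the set of characters left unescaped is a deliberately conservative subset of what RFC 5987 allows, and the ASCII fallback rule is just one possible choice.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not an HTTP::Headers method): builds a
# Content-Disposition value carrying both an ASCII fallback (filename)
# and an RFC 5987 extended value (filename*).
sub content_disposition {
    my ($name) = @_;    # character string, may contain non-ASCII

    # ASCII fallback for clients that ignore filename*:
    # anything outside a small safe set becomes an underscore.
    ( my $fallback = $name ) =~ s/[^A-Za-z0-9._-]/_/g;

    # Extended value: UTF-8 octets, with every octet outside a
    # conservative subset of RFC 5987 attr-char percent-encoded.
    my $ext = encode( 'UTF-8', $name );
    $ext =~ s/([^A-Za-z0-9._~-])/sprintf '%%%02X', ord $1/ge;

    return qq{attachment; filename="$fallback"; filename*=UTF-8''$ext};
}

print content_disposition("r\x{e9}sum\x{e9} 2010.pdf"), "\n";
```

The returned value contains only ASCII octets, so it can be handed to a header setter without further encoding. Whether a given browser prefers filename* over filename is exactly what the test cases linked above try to document.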
Re: Content-Disposition and utf8 filenames
On Sat, Dec 4, 2010 at 6:40 AM, Bjoern Hoehrmann derhoe...@gmx.net wrote:
> No, as you can tell from the results there is no one way supported by
> all major browsers. The IETF HTTPbis Working Group is currently revising
> the specification for it, and the recommended way to do this will be
> through the RFC 5987 style notation, where you can also specify a
> fallback value by using both the filename* and filename parameters. But
> no silver bullet there.

Thanks very much for the pointers.

Oh, that RFC says 8859-1 is supported, not just ASCII. So, I suppose I could encode to 8859-1 and have encode() use an underscore for the substitution character.

But then do I have to be concerned with removing any characters that might not be appropriate for the client's file system? That is, remove slashes?

Maybe I should just stick to something basic like:

    use Encode qw(encode);

    $filename = substr( $title, 0, 50 );    # But 78 is allowed
    $filename =~ s/[^A-Za-z0-9]/_/g;        # Replace anything else with an underscore
    $filename = 'document' unless $filename =~ /[^_]/;
    $filename =~ s/^_{2,}/_/;               # trim to make less ugly?
    $filename =~ s/_{2,}$/_/;
    $filename = encode( 'US-ASCII', $filename );
    $filename .= '.pdf';                    # for example

> As I understand it, Google is currently investigating switching to RFC
> 5987 style encoding in some of their applications.

That would require some kind of browser detection, right?

Hopefully, that might all get abstracted out into an HTTP::Headers method/subclass in the future.

Thanks,

--
Bill Moseley
mose...@hank.org
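The ISO-8859-1 idea mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name latin1_filename is hypothetical, the list of characters treated as unsafe (slash, backslash, double quote, control characters) is a guess rather than a complete rule for every client file system, and it relies on Encode accepting a code reference as the CHECK argument (Encode 2.12 or later).

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn a character-string title into ISO-8859-1
# octets suitable for a quoted filename parameter.
sub latin1_filename {
    my ($title) = @_;

    # Work on a copy in case encode() modifies its argument.
    my $copy = $title;

    # Characters with no ISO-8859-1 mapping become an underscore;
    # the code reference receives the ordinal of the unmappable character.
    my $octets = encode( 'ISO-8859-1', $copy, sub { '_' } );

    # Assumed set of troublesome characters: path separators, the
    # double quote (it would end the quoted-string), and controls.
    $octets =~ s{[/\\"\x00-\x1f\x7f]}{_}g;

    return $octets . '.pdf';    # extension is just an example
}

my $name = latin1_filename("Caf\x{e9} \x{263A}/menu");
```

Here the accented letter survives as the single octet 0xE9, while the character outside ISO-8859-1 and the slash are both replaced by underscores. How each browser interprets raw ISO-8859-1 octets in the filename parameter is a separate question that this sketch does not settle.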