The current File API spec seems to have a mismatch between type as a member 
of BlobPropertyBag and type as a Blob attribute. The latter states 
declaratively that the type is an ASCII lowercase string. As mentioned by 
Glenn before, WebKit interpreted this by raising an exception in the 
constructor for non-ASCII input, and by lowercasing the string. I think that 
this is a reasonable reading of the spec. I'd be fine with raising exceptions 
for invalid types more eagerly.
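
To make that reading concrete, here is a minimal sketch in Python that models 
it. This is not WebKit's code; the function name and the use of ValueError as 
a stand-in for the exception are assumptions made for illustration.

```python
def normalize_blob_type(type_option: str) -> str:
    """Model of the reading above: reject non-ASCII input, then lowercase."""
    if any(ord(ch) > 0x7F for ch in type_option):
        # Stand-in for the exception the constructor would raise; the exact
        # exception type is an assumption of this sketch.
        raise ValueError("Blob type must contain only ASCII characters")
    # The input is pure ASCII at this point, so only A-Z are affected.
    return type_option.lower()
```

Under this model, "Text/HTML; Charset=UTF-8" is stored as 
"text/html; charset=utf-8", and the empty string passes through unchanged.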

This is the text in question:

(1)
> type, a DOMString which corresponds to the Blob object's type attribute. If 
> not the empty string, user agents must treat it as an RFC2616 media-type 
> [RFC2616], and as an opaque string that can be ignored if it is an invalid 
> media-type. This value must be used as the Content-Type header when 
> dereferencing a Blob URI.
> 


(2)
> type
> The ASCII-encoded string in lower case representing the media type of the 
> Blob, expressed as an RFC2046 MIME type [RFC2046]. On getting, conforming 
> user agents must return the MIME type of the Blob, if it is known. If 
> conforming user agents cannot determine the media type of the Blob, they must 
> return the empty string. A string is a valid MIME type if it matches the 
> media-type token defined in section 3.7 "Media Types" of RFC 2616 [RFC2616]. 
> If not the empty string, user agents must treat it as an RFC2616 media-type 
> [RFC2616], and as an opaque string that can be ignored if it is an invalid 
> media-type. This value must be used as the Content-Type header when 
> dereferencing a Blob URI.


It would be helpful to have the terminology corrected, and to have this 
generally clarified. For example, (2) defines what makes a MIME type valid, 
but nothing seems to use that definition.
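
For reference, a validity check against the RFC 2616 media-type grammar could 
be sketched as below. It is an approximation: the quoted-string pattern is 
simplified, optional whitespace around ";" is tolerated, and all names are 
invented for this sketch.

```python
import re

# token per RFC 2616 section 2.2: ASCII characters except CTLs and separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# quoted-string, approximated: anything but '"' and '\', or a backslash escape.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# parameter = attribute "=" value, with no whitespace around "=".
PARAM = rf"{TOKEN}=(?:{TOKEN}|{QUOTED})"
# media-type = type "/" subtype *( ";" parameter )
MEDIA_TYPE = re.compile(rf"{TOKEN}/{TOKEN}(?:[ \t]*;[ \t]*{PARAM})*")


def is_valid_media_type(value: str) -> bool:
    """Return True if the whole string matches the media-type production."""
    return MEDIA_TYPE.fullmatch(value) is not None
```

With this check, "text/html; charset=utf-8" is valid, while "text", "text/" 
and the empty string are not.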

It seems pretty clear from the normative text that the charset parameter is 
supposed to work. A non-normative example supports that too. I agree with Arun 
that it seems best to keep this as is.

However, <https://bugs.webkit.org/show_bug.cgi?id=111380> is about a different 
case: posting multipart form data that has Blob elements with invalid 
media-types. I'm not even sure which spec is in charge of this behavior. I 
don't think that anything anywhere says that Blob.type affects the media-type 
of posted multipart data, even though that's obviously the intention. The 
XMLHttpRequest spec defers to HTML, which defers to RFC2388, which mentions 
files "returned via filling out a form", but not Blobs (which is no surprise 
given its age).

Making Blobs only hold valid media-types would solve practical issues, but it 
would be helpful to know what formally defines multipart data serialization 
with blobs.
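
To illustrate the question, here is one possible behavior, sketched in Python. 
No spec is being quoted here: the fallback to application/octet-stream for an 
empty or invalid type, the loose validity test, and all names are assumptions 
of this sketch.

```python
import re

_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# Loose check: type "/" subtype, optionally followed by ";" and printable
# ASCII. Parameters are not validated further in this sketch.
_LOOKS_VALID = re.compile(rf"{_TOKEN}/{_TOKEN}(?:;[\t\x20-\x7e]*)?")


def part_headers(field_name: str, filename: str, blob_type: str) -> str:
    """Build the header block for one multipart/form-data part.

    Assumption: an empty or invalid Blob type is replaced by
    application/octet-stream. Escaping of field_name and filename is omitted.
    """
    if blob_type and _LOOKS_VALID.fullmatch(blob_type):
        content_type = blob_type
    else:
        content_type = "application/octet-stream"
    return (
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    )
```

If Blobs could only hold valid media-types, the fallback branch would only be 
reached for the empty string.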

We also previously had 
<https://bugs.webkit.org/attachment.cgi?id=177736&action=review> for sending 
non-multipart data. Back then, we determined that "Content-Type: " should be 
sent when the value is invalid. I'm no longer sure if that's right. For this 
case, XMLHttpRequest authoritatively defines the behavior, although it leans 
heavily on File API to decide when the type attribute is empty:

> If the object's type attribute is not the empty string let mime type be its 
> value.


Note that "mime type" is then used directly as the default media-type for the 
Content-Type header, but it's not parsed to set the encoding variable. The 
encoding could be needed to update a charset in an author-provided 
Content-Type header field in later steps of the algorithm. This is probably 
not right, as a Blob should know its encoding better than code that sets 
header fields on an XMLHttpRequest object.
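
The missing step could look roughly like the following Python sketch, which 
pulls a charset parameter out of the Blob's type so that it could feed the 
encoding variable. This is a hypothetical illustration, not text from the 
XMLHttpRequest spec; the function name and the lenient regex are assumptions.

```python
import re
from typing import Optional

# Matches a charset parameter whose value is either quoted or a bare token.
_CHARSET = re.compile(r';\s*charset=(?:"([^"]*)"|([^\s;"]+))', re.IGNORECASE)


def charset_from_blob_type(blob_type: str) -> Optional[str]:
    """Return the lowercased charset named in blob_type, or None if absent."""
    match = _CHARSET.search(blob_type)
    if match is None:
        return None
    quoted, bare = match.group(1), match.group(2)
    value = quoted if quoted is not None else bare
    # An empty quoted value is treated as "no charset" in this sketch.
    return value.lower() or None
```

For a Blob whose type is "text/plain; charset=UTF-8" this yields "utf-8", and 
for "text/plain" it yields None, leaving any author-provided charset alone.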

- WBR, Alexey Proskuryakov

