Alexey,

On Mar 7, 2013, at 3:02 PM, Alexey Proskuryakov wrote:

> 
> The current File API spec seems to have a mismatch between type in 
> BlobPropertyBag, and type as Blob attribute. The latter declaratively states 
> that the type is an ASCII lower case string. As mentioned by Glenn before, 
> WebKit interpreted this by raising an exception in constructor for non-ASCII 
> input, and lowercasing the string. I think that this is a reasonable reading 
> of the spec. I'd be fine with raising exceptions for invalid types more 
> eagerly.
> 
> This is the text in question:
> 
> (1)
>> type, a DOMString which corresponds to the Blob object's type attribute. If 
>> not the empty string, user agents must treat it as an RFC2616 media-type 
>> [RFC2616], and as an opaque string that can be ignored if it is an invalid 
>> media-type. This value must be used as the Content-Type header when 
>> dereferencing a Blob URI.
>> 
> 
> 
> (2)
>> type
>> The ASCII-encoded string in lower case representing the media type of the 
>> Blob, expressed as an RFC2046 MIME type [RFC2046]. On getting, conforming 
>> user agents must return the MIME type of the Blob, if it is known. If 
>> conforming user agents cannot determine the media type of the Blob, they 
>> must return the empty string. A string is a valid MIME type if it matches 
>> the media-type token defined in section 3.7 "Media Types" of RFC 2616 
>> [RFC2616]. If not the empty string, user agents must treat it as an RFC2616 
>> media-type [RFC2616], and as an opaque string that can be ignored if it is 
>> an invalid media-type. This value must be used as the Content-Type header 
>> when dereferencing a Blob URI.


This is now clarified; the mismatch is a spec. bug.  Thanks for pointing this 
out.


> It would be helpful to have the terminology corrected, and to have this 
> generally clarified - for example, validity is mentioned here, but seems to 
> be unused.
> 


Conditions for validity have been clarified. An invalid type doesn't warrant 
throwing a SyntaxError, but the spec. now says when implementations should 
ignore poorly formed MIME type strings. For example, here is the additional 
clarification in the slice call:

http://dev.w3.org/2006/webapi/FileAPI/#slide-method-algo
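
As a rough illustration of "ignore rather than throw", here is a minimal 
sketch. The helper name normalizeBlobType is hypothetical, and the validity 
check (reject anything outside printable ASCII, U+0020 to U+007E) is an 
assumption made for the example; the spec text is what actually governs.

```typescript
// Hypothetical helper, not part of the File API.
// Assumption: a type string is unusable if it contains any character
// outside the printable ASCII range U+0020..U+007E.
function normalizeBlobType(type: string): string {
  if (/[^\u0020-\u007E]/.test(type)) {
    // Invalid input is ignored (treated as "no type"), not an exception.
    return "";
  }
  // The string is ASCII-only here, so toLowerCase() only touches A-Z.
  return type.toLowerCase();
}
```

Under these assumptions a bad type never reaches the Content-Type header; it 
simply becomes the empty string.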


> It seems pretty clear from normative text that charset parameter is supposed 
> to work. A non-normative example supports that too. I agree with Arun that 
> this seems best to keep as is.

+1.


> However, <https://bugs.webkit.org/show_bug.cgi?id=111380> is about a 
> different case - it's about posting multipart form data that has Blob 
> elements with invalid media-types. I'm not even sure which spec is in charge 
> of this behavior - I don't think that anything anywhere says that Blob.type 
> affects media-type of posted multipart data, even though that's obviously the 
> intention. XMLHttpRequest spec defers to HTML, which defers to RFC2388, which 
> mentions files "returned via filling out a form", but not Blobs (which is no 
> surprise given its age).


In fact, I'm not sure if Blob.type should influence the type of multipart form 
data.  Consider the concatenation of several Blobs into a new Blob, as the Blob 
constructor allows.  What should the type of a newly constructed Blob be if it 
consists of several differently typed Blobs?  The spec. suggests disregarding 
the type of each constituent Blob, and encourages setting the right type 
explicitly in the Blob constructor.
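
To make the concatenation case concrete, here is a small sketch using the 
Blob constructor. The variable names and contents are made up for the 
example, and it assumes an environment that provides a global Blob.

```typescript
// Two differently typed Blobs (contents are illustrative only).
const html = new Blob(["<p>hi</p>"], { type: "text/html" });
const text = new Blob(["hello"], { type: "text/plain" });

// No type in the property bag: the parts' own types are disregarded,
// so the new Blob's type is the empty string.
const untyped = new Blob([html, text]);

// The caller states the type of the concatenation explicitly.
const typed = new Blob([html, text], { type: "text/plain" });
```

Neither "text/html" nor "text/plain" is inherited by untyped; only the 
explicit type passed to the constructor counts.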

I'm also not sure multipart form data falls under the aegis of the File API, 
but at least a Blob with an invalid type is now treated the same as one with 
no type (empty string).


> Making Blobs only hold valid media-types would solve practical issues, but it 
> would be helpful to know what formally defines multipart data serialization 
> with blobs.
> 
> We also previously had 
> <https://bugs.webkit.org/attachment.cgi?id=177736&action=review> for sending 
> non-multipart data. Back then, we determined that "Content-Type: " should be 
> sent when the value is invalid. I'm no longer sure if that's right. For this 
> case, XMLHttpRequest authoritatively defines the behavior, although heavily 
> leaning on File API to decide when the type attribute is empty:
> 
>> If the object's type attribute is not the empty string let mime type be its 
>> value.
> 
> 
> Note that "mime type" is then directly used as default media-type for 
> Content-Type header, but it's not parsed to set encoding variable. The 
> encoding could be needed to update a charset in author provided Content-Type 
> header field in later steps of the algorithm. This is probably not right, as 
> Blob should know its encoding better than code that sets header fields on an 
> XMLHttpRequest object.
> 


Yes, but implementations can't heuristically determine a Blob's type now.  The 
type has to be specified correctly, or it is ignored.  What a "Blob should 
know" is now only as good as the type it was constructed with, though at read 
time, thanks to the Encoding Spec, we can determine a fallback encoding.
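
A simplified sketch of that read-time decision follows. The helper 
charsetOrFallback is hypothetical, the parameter parsing is deliberately 
loose, and UTF-8 as the default fallback is an assumption for the example 
rather than a quote from either spec.

```typescript
// Hypothetical helper: pick the charset named in a Blob's type string,
// or use a fallback when the type carries no charset parameter.
function charsetOrFallback(type: string, fallback: string = "utf-8"): string {
  // Loose match for `; charset=value` or `; charset="value"`.
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(type);
  return match ? match[1].toLowerCase() : fallback;
}
```

A Blob constructed with an empty or ignored type therefore still gets a 
usable encoding when it is read as text.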

-- A*
