Re: [File API] Latest Editor's Draft | Call for Review
On Thu, Aug 11, 2011 at 12:43 PM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, The latest editor's draft of the File API can be found here: http://dev.w3.org/2006/webapi/FileAPI/ Changes are based on feedback on this listserv, as well as the URI listserv (e.g. [1][2][3]).

Chrome team: some of the feedback is to more rigorously define the opaqueString production in Blob URIs. Currently, you generate Blob URIs that look like this: blob:http://localhost/c745ef73-ece9-46da-8f66-ebes574789b1 [4]

For Chromium, we're going to escape those reserved characters that could appear in the opaqueString. I've included language that allows use of this kind, but some review of what is NOT allowed would be appreciated. -- A*

[1] http://lists.w3.org/Archives/Public/uri/2011May/0004.html
[2] http://lists.w3.org/Archives/Public/uri/2011May/0002.html
[3] http://lists.w3.org/Archives/Public/uri/2011May/0006.html
[4] http://www.html5rocks.com/en/tutorials/workers/basics/#toc-inlineworkers-bloburis
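The escaping described for Chromium's opaqueString can be sketched as follows. This is an illustrative sketch, not the spec's normative grammar; the `makeBlobURI` helper and the UUID value are invented for the example.

```javascript
// Hypothetical helper (not part of any spec): build a blob: URI from an
// origin plus an opaque string, percent-encoding RFC 3986 reserved
// characters ("/", "?", "#", ...) so they can't be misparsed.
function makeBlobURI(origin, opaqueString) {
  return 'blob:' + origin + '/' + encodeURIComponent(opaqueString);
}

// A plain UUID passes through unchanged...
console.log(makeBlobURI('http://localhost', 'c745ef73-ece9-46da-8f66-ebe5574789b1'));
// ...while reserved characters are escaped rather than left ambiguous:
console.log(makeBlobURI('http://localhost', 'a?b#c')); // blob:http://localhost/a%3Fb%23c
```

The choice of opaque string stays implementation-defined; the point is only that whatever it is, the reserved characters inside it get escaped.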
Re: [FileAPI] Updates to FileAPI Editor's Draft
I have a couple of questions regarding abort behavior.
- If the reading is completed and the loadend event has been fired, do we want to fire the loadend event again when the abort() method is called?
- Do we want to reset error to null or leave it intact when the abort() method is called?
Thanks, Jian

On Wed, May 11, 2011 at 3:49 PM, Arun Ranganathan a...@mozilla.com wrote: The Editor's Draft of the FileAPI -- http://dev.w3.org/2006/webapi/FileAPI/ -- has had some updates. These are the notable changes:

1. Blob.slice behavior has changed to more closely match String.prototype.slice from ECMAScript (and Array.prototype.slice semantically). I think we're the first host object to have a slice outside of ECMAScript primitives; some builds of browsers have already vendor-prefixed slice till it becomes more stable (and till the new behavior becomes more diffuse on the web -- Blob will soon be used in the Canvas API, etc.). I'm optimistic this will happen soon enough. Thanks to all the browser projects that helped initiate the change -- the consistency is desirable.

2. The read methods on FileReader raise a new exception -- OperationNotAllowedException -- if multiple concurrent reads are invoked. I talked this over with Jonas; we think that rather than reuse DOMException error codes (like INVALID_STATE_ERR), these kinds of scenarios should throw a distinct exception. Some things on the web (as in life) are simply not allowed. It may be useful to reuse this exception in other places.

3. FileReader.abort() behavior has changed.

4. There is a closer integration with event loops as defined by HTML.

For browser projects with open bug databases, I'll log some bugs based on test cases I've run on each implementation. A few discrepancies exist in implementations I've tested; for instance, setting FileReader.result to the empty string vs. setting it to null, and when exceptions are thrown vs. use of the error event. Feedback encouraged!
Draft at http://dev.w3.org/2006/webapi/FileAPI/ -- A*
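The new Blob.slice semantics from point 1 can be seen side by side with String.prototype.slice in any runtime that eventually shipped the change; the snippet assumes a global Blob (a modern browser, or Node 18+).

```javascript
// Blob.slice now mirrors String.prototype.slice: an optional end index,
// negative indices counted from the end, and out-of-range values clamped.
const text = 'hello world';
const blob = new Blob([text]);

console.log(text.slice(6));           // "world"
console.log(blob.slice(6).size);      // 5 -- same region, as a Blob

console.log(text.slice(-5, -1));      // "worl"
console.log(blob.slice(-5, -1).size); // 4

console.log(blob.slice(0, 1000).size); // 11 -- end is clamped to the size
```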
Re: [FileAPI] Updates to FileAPI Editor's Draft
On Tue, Jun 7, 2011 at 11:23 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jun 7, 2011 at 10:43 AM, Jian Li jia...@chromium.org wrote: I have a couple of questions regarding abort behavior. If the reading is completed and the loadend event has been fired, do we want to fire the loadend event again when the abort() method is called? No. Do we want to reset error to null or leave it intact when the abort() method is called? If called after load/abort/error has fired, then calling abort() should just throw an exception and not alter the FileReader object in any way. Do you mean we should throw if abort() is called after the load/abort/error event has been fired but before the loadend event has been fired? If so, what kind of exception should we throw? The spec only mentions that "If readyState = DONE, set result to null." / Jonas
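The rule being discussed can be sketched as a tiny state machine. This is an illustrative stand-in, not the real FileReader; `SketchReader` and its `_startRead`/`_finish` helpers are invented for the example.

```javascript
// Sketch of the proposed rule: abort() is only legal mid-read; once the
// reader is DONE it throws and leaves the object untouched.
const EMPTY = 0, LOADING = 1, DONE = 2;

class SketchReader {
  constructor() {
    this.readyState = EMPTY;
    this.result = null;
  }
  _startRead() { this.readyState = LOADING; }                        // stand-in for readAs*()
  _finish(result) { this.readyState = DONE; this.result = result; } // load fired
  abort() {
    if (this.readyState !== LOADING) {
      // After load/abort/error has fired: just throw, don't alter state.
      throw new Error('InvalidStateError: no read in progress');
    }
    // Mid-read: terminate, and set result to null per the quoted spec text.
    this.readyState = DONE;
    this.result = null;
  }
}

const r = new SketchReader();
r._startRead();
r._finish('file contents');
try { r.abort(); } catch (e) { console.log('threw; result kept:', r.result); }
```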
Re: Updates to FileAPI
On Mon, Dec 20, 2010 at 2:10 PM, Ian Hickson i...@hixie.ch wrote: On Mon, 20 Dec 2010, Arun Ranganathan wrote: http://dev.w3.org/2006/webapi/FileAPI/ Notably:

1. lastModifiedDate returns a Date object. You don't have a conformance requirement for returning a Date object. (The only MUST is for the case of the UA not being able to return the information.) I mention this because for attributes that return objects, it's important to specify whether the same object is returned each time or whether a new object is created each time. Presumably for a Date object you want to require that a new object be created each time. I think it makes more sense to return a new Date object each time. We have the same issue with Metadata.modificationTime.

2. We use the URL object and expose static methods on it for Blob URI creation and revocation. Looks good to me. FYI, I'm probably going to be extending this mechanism for Streams in due course. I expect I'll bring this up again in due course so we can work out how to make sure the specs don't step on each other. I'm a little concerned about the lifetime of these URLs potentially exposing GC behaviour -- we've tried really hard not to expose GC behaviour in the past, for good reason. Can't we jettison the URLs as part of the unloading document cleanup steps? http://www.whatwg.org/specs/web-apps/current-work/complete.html#unloading-document-cleanup-steps (Note that Window objects in some edge cases can survive their Document.)

Also, I've minuted Sam Weinig at TPAC saying he'd prefer us to roll back from using the sequence<T> type WebIDL syntax to index getters. Sam: are you still tightly wed to this? WebIDL has undergone changes since last we spoke. I'm copying what HTML5 is doing, and didn't want to be inconsistent in rolling this back. FWIW, IIRC the HTML spec is a bit out of sync when it comes to WebIDL. -- Ian Hickson http://ln.hixie.ch/ "Things that are impossible just take longer."
Re: Updates to FileAPI
Returning a Date object sounds good. On Sat, Nov 13, 2010 at 6:07 AM, Arun Ranganathan aranganat...@mozilla.com wrote: - Original Message - On 11/12/10 11:53 AM, Boris Zbarsky wrote: OK, then we need to define rules for instantiating a Date object here (in the face of strings that may or may not be valid date format strings, or may be ambiguous, note). Is the proposal to just call new Date(str)? Er, never mind. We're talking about file modification dates, not web resource modification dates, so my concern is unfounded. Right -- to be totally clear, we're talking about http://www.w3.org/TR/FileAPI/#dfn-file having a readonly Date attribute that returns the last-modified-on-disk date. -- A*
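The "new Date object each time" requirement matters because Date is mutable. A stub getter (`FileStub` is invented for illustration, not a real interface) shows why the identity question has to be pinned down:

```javascript
// Returning a fresh Date per access means callers can't corrupt the
// file's metadata by mutating the object they got back.
class FileStub {
  constructor(mtimeMs) { this._mtimeMs = mtimeMs; }
  get lastModifiedDate() { return new Date(this._mtimeMs); }
}

const f = new FileStub(Date.UTC(2010, 11, 20)); // Dec 20, 2010
console.log(f.lastModifiedDate.getTime() === f.lastModifiedDate.getTime()); // true
console.log(f.lastModifiedDate === f.lastModifiedDate); // false -- distinct objects

f.lastModifiedDate.setFullYear(1999); // mutates a throwaway copy only
console.log(f.lastModifiedDate.getUTCFullYear()); // 2010
```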
Re: Updates to FileAPI
I have a question regarding lastModifiedDate. The spec says that this property returns an HTML5 valid date string. Per the HTML5 spec, a valid date string consists of only year, month, and day information; it does not contain any time information. Do we really want this, or do we want to return a valid local date and time string per the HTML5 spec? Thanks, Jian

On Wed, Oct 13, 2010 at 4:46 AM, Arun Ranganathan a...@mozilla.com wrote: WebApps WG, There have been some updates to the File API. http://dev.w3.org/2006/webapi/FileAPI/ Notable changes are:

1. Exception codes are no longer harnessed to DOMException's exception codes, per discussion on this listserv.

2. Metadata attributes creationDate and lastModifiedDate have been added to the File object, per discussion on the WHATWG listserv.

3. Blob no longer has a .url attribute. Instead, Window and WorkerGlobalScope have been extended with methods createBlobURL and revokeBlobURL, per discussion on this listserv.

4. The Blob URI section has undergone changes, per discussion on this listserv. Notably, Blob supports the addition of an HTTP Content-Type, which implementations must return as a response header when Blob URIs are dereferenced.

5. There are (ongoing) fixes to bugs, including incorrect uses of long long (in lieu of unsigned long long), per (ongoing) discussion on the listserv.

In off-list discussion, a few points have come up, which I'll summarize below:

1. The emerging HTML5 Device specification [1] includes a section on streams and an affiliated Stream API, which relies normatively on Blob URIs [2] defined in the File API. Since we've eliminated the .url property on Blob, we should also eliminate the .url property on the Stream object. There has been some discussion on renaming the methods createBlobURL and revokeBlobURL to be more generic, so that use cases such as Stream can be accommodated. This is an ongoing discussion. In general, there's consensus around eliminating the .url attribute from Stream.

2.
There is ongoing discussion about the addition of Content-Disposition as well (as an attribute on Blob, as a parameter to Blob.slice, and as a response header when dereferencing blob: URIs), to facilitate and optimize downloads. The use of a Content-Disposition header allows URLs (both http: and blob:) to be dereferenced straight to download, with the benefit of header-determined processing (e.g. the content is downloaded rather than displayed first). Another suggestion to address this use case is, instead of supporting Content-Disposition, to allow for an additional URL property on the FileSaver constructor, modulo domain restrictions. This discussion is ongoing.

3. In general, there's ongoing discussion on allowing *even more* HTTP response behavior when dereferencing blob: URIs. I strongly favor a strict subset of HTTP, and only a use-case-driven addition of further response headers and response codes. Arguments cited in favor of including all of HTTP are that blob: URIs should be completely indistinguishable from HTTP URIs, thus allowing maximum reuse with XHR and other aspects of the platform. Currently, I believe we allow for a great deal of intermingling without reproducing HTTP in its entirety within Blob URIs. -- A*

[1] http://dev.w3.org/html5/html-device/
[2] http://dev.w3.org/html5/html-device/#stream-api
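Jian's date-string question at the top of this message is concrete: HTML's valid date string is date-only, while a valid local date and time string carries time as well. A sketch of the two serializations follows; the helper names are invented, and UTC is used so the example is deterministic (the spec's "local" means the user's timezone).

```javascript
// Date-only (HTML "valid date string") vs. date-and-time serialization.
function toDateString(date) {
  return date.toISOString().slice(0, 10); // YYYY-MM-DD
}
function toDateAndTimeString(date) {
  return date.toISOString().slice(0, 19); // YYYY-MM-DDTHH:MM:SS
}

const mtime = new Date(Date.UTC(2010, 9, 13, 4, 46, 0));
console.log(toDateString(mtime));        // 2010-10-13
console.log(toDateAndTimeString(mtime)); // 2010-10-13T04:46:00
```

The question, in these terms, is whether lastModifiedDate's serialization should keep the `THH:MM:SS` part or drop it.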
Re: [XHR2] ArrayBuffer integration
I plan to add ArrayBuffer support to BlobBuilder and FileReader. Chris, it's good that you're picking up the work for XHR. We can talk about how we're going to add ArrayBufferView support for reading ArrayBuffers. Jian On Fri, Sep 24, 2010 at 5:23 PM, Kenneth Russell k...@google.com wrote: On Thu, Sep 23, 2010 at 2:42 AM, Anne van Kesteren ann...@opera.com wrote: On Wed, 08 Sep 2010 19:55:33 +0200, Kenneth Russell k...@google.com wrote: Mozilla's experimental name is mozResponseArrayBuffer, so perhaps to avoid collisions the spec could call it responseArrayBuffer. While I do not think there would be a collision (at least not in ECMAScript, which is what we are designing for), naming it responseArrayBuffer is fine with me. It is also now done that way in the draft. Still need to get a saner reference to the ArrayBuffer specification than https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html though. :-) http://dev.w3.org/2006/webapi/XMLHttpRequest-2/ Thanks, this is great and very exciting. This motivates implementing the proposed DataView interface ( https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html#6 ), which will make it easier to read multi-byte values with specified endianness out of an ArrayBuffer. For WebKit I've filed https://bugs.webkit.org/show_bug.cgi?id=46541 . -Ken (You can also do send(ArrayBuffer), obviously. I personally think supporting this for both BlobBuilder and send() makes sense. That way Blob/File etc. work too.) -- Anne van Kesteren http://annevankesteren.nl/
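Ken's point about DataView is easy to demonstrate: unlike the plain typed-array views, DataView reads take an explicit endianness flag, which is what you want when parsing a binary response delivered via something like xhr.responseArrayBuffer. A minimal illustration:

```javascript
// Four bytes written in order, then read back as a 32-bit integer with
// each byte order. The boolean argument selects little-endian.
const buf = new ArrayBuffer(4);
const view = new DataView(buf);
[0x12, 0x34, 0x56, 0x78].forEach((b, i) => view.setUint8(i, b));

console.log(view.getUint32(0, false).toString(16)); // "12345678" (big-endian)
console.log(view.getUint32(0, true).toString(16));  // "78563412" (little-endian)
```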
Re: [XHR2] FormData for form
new FormData(myformelement) sounds good. On Tue, Sep 14, 2010 at 11:00 AM, Jonas Sicking jo...@sicking.cc wrote: Note that you can always do: fd = new FormData; That is agreed by everyone to work. The question is how to instantiate one which is prefilled with the data from a form. / Jonas On Tue, Sep 14, 2010 at 10:55 AM, Alex Russell slightly...@google.com wrote: I have a preference for the second syntax. These sorts of classes should always be new-able. On Tue, Sep 14, 2010 at 10:46 AM, Jonas Sicking jo...@sicking.cc wrote: Hi All, There was some discussion regarding the syntax for generating a FormData object based on the data in an existing form. I had proposed the following syntax: myformelement.getFormData(); however, it was pointed out that the downside of this API is that it's not clear that a new FormData object is created every time. Instead the following syntax was proposed: new FormData(myformelement); however, I don't see this syntax in the new XHR L2 drafts. Is this merely an oversight, or was the omission intentional? I'm fine with either syntax, but since we're getting close to shipping Firefox 4, and I'd like to include this functionality (in fact, it's been shipping for a long time in betas), I'd like to see how much consensus the various proposals carried. / Jonas
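For reference, the two candidate syntaxes side by side, sketched below. The form-element overload needs a DOM, so only the uncontroversial empty constructor plus append() is exercised here (it also works in Node 18+, where FormData is global).

```javascript
// fd = myformelement.getFormData();  // proposed, then dropped: unclear it allocates
// fd = new FormData(myformelement);  // preferred: clearly constructs a new object
// The no-argument form is the agreed-upon baseline everywhere:
const fd = new FormData();
fd.append('user', 'jonas');
fd.append('browser', 'Firefox 4');
console.log(fd.get('user'));    // jonas
console.log(fd.get('browser')); // Firefox 4
```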
Re: ArrayBuffer and ByteArray questions
Yes, we only need to add it to BlobBuilder so that it can be applied to FileReader, XHR.send, and any other place that takes a Blob. On Wed, Sep 8, 2010 at 10:57 AM, Eric Uhrhane er...@google.com wrote: On Tue, Sep 7, 2010 at 4:09 PM, Jian Li jia...@chromium.org wrote: Hi, Several specs, like File API and WebGL, use ArrayBuffer, while other specs, like XMLHttpRequest Level 2, use ByteArray. Should we change to use the same name across all our specs? Since we define ArrayBuffer in the Typed Arrays spec ( https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html ), should we favor ArrayBuffer? In addition, can we consider adding ArrayBuffer support to BlobBuilder, FormData, and XMLHttpRequest.send()? It seems like an obvious addition for BlobBuilder or XHR.send, but do we need it in both, or is one sufficient? Thanks, Jian
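For context, BlobBuilder itself was later dropped in favor of a Blob constructor that accepts ArrayBuffer (and typed-array view) parts directly, which is the shape this addition eventually took. Shown in a runtime with a global Blob (Node 18+ or a modern browser):

```javascript
// Mixing an ArrayBuffer part with a string part in one Blob.
const bytes = new Uint8Array([0x48, 0x69]); // "Hi"
const blob = new Blob([bytes.buffer, ' there']);
console.log(blob.size); // 8 -- 2 raw bytes + 6 UTF-8 bytes
```

Once the bytes are in a Blob, every consumer that takes a Blob (readers, XHR.send, etc.) gets ArrayBuffer support for free, which is Jian's "only add it to BlobBuilder" argument.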
ArrayBuffer and ByteArray questions
Hi, Several specs, like File API and WebGL, use ArrayBuffer, while other specs, like XMLHttpRequest Level 2, use ByteArray. Should we change to use the same name across all our specs? Since we define ArrayBuffer in the Typed Arrays spec ( https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html ), should we favor ArrayBuffer? In addition, can we consider adding ArrayBuffer support to BlobBuilder, FormData, and XMLHttpRequest.send()? Thanks, Jian
Re: Lifetime of Blob URL
The other alternative is to have both FileReader and BlobReader, where the former reads only File objects and the latter reads any Blob object. With that, we would also have FileReaderSync and BlobReaderSync. On Mon, Aug 30, 2010 at 5:17 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Aug 30, 2010 at 5:14 PM, Dmitry Titov dim...@chromium.org wrote: As for wild ideas, it also could be something more generic, let's say DataReader, which can take Blobs and Files (and perhaps something else in the future) -- like XHR, which has overloaded methods for xhr.open(..). It seems possible that web developers may not realize that File is actually a Blob and may be confused by using BlobReader to read a File. (Do I need to make a Blob out of my File first?) They may be equally confused by using FileReader to read a Blob, though. That would address item 1 on my list, but not items 2 through 4. / Jonas On Mon, Aug 30, 2010 at 4:35 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Aug 30, 2010 at 4:22 PM, Darin Fisher da...@chromium.org wrote: On Mon, Aug 30, 2010 at 1:08 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Aug 30, 2010 at 9:59 AM, Arun Ranganathan a...@mozilla.com wrote: In addition, BlobError and BlobException sound better because these names are consistent with the current Blob naming scheme in the File API. So we're also going to adopt these new names in the WebKit implementation when we rename FileReader to BlobReader per the spec. *sigh*. Naming continues to be hard. Not everyone's thrilled with the proliferation of Blob in the API [1], including other major implementors (my co-editor included ;-)), but I changed it mainly due to Darin/Jian/other objections. I suppose you folks are pretty adamant on the Blob* name? I feel pretty strongly that we should name this back to FileReader, for several reasons: 1. Most people that I talk to dislike the name Blob, much less having it spread to things like BlobReader. 2.
My understanding is that the writer counterpart is going to be called FileWriter (is this correct?) Yes, that is what we are currently implementing in WebKit. 3. While it's certainly possible to read Blobs with this, it seems to me that by far the most common case will be to read a real file, or part of a file (i.e. the result from myfile.slice()). 4. There is one shipping implementation of FileReader. It just seems odd to use an interface named FileReader to read blobs, which may not correspond to files. Consider BlobBuilder, which can be used to construct a Blob from a string. I somewhat agree. But it seems equally strange to use BlobReader to read files, and I suspect that it will be vastly more common to read files than blobs-that-aren't-files. Yes, the File interface inherits from Blob, so technically when you're reading a file you're also reading a blob, but I doubt that is the mental model most people will have. Like so many other things, there is no perfect solution here. Another idea (possibly a crazy one) would be to eliminate Blob and just use File for everything. We could rename BlobBuilder to FileBuilder and have it return a File instead of a Blob. The same goes for Blob.slice(). Of course, the File would not necessarily correspond to a real physical file on disk, for performance reasons. I've been thinking about this too. I can't say I feel strongly either way. It feels somewhat strange, but I can't come up with any solid technical reasons against it. / Jonas
Re: Lifetime of Blob URL
Adding explicit methods to window and WorkerGlobalScope seems to be a better solution that avoids the potential problems we currently have with blob.url. Given that, we're going to experiment with the proposed new APIs in the WebKit implementation. That is, we will add the following two methods to window and WorkerGlobalScope in the WebKit implementation: URLString createBlobURL(in Blob blob); void revokeBlobURL(in URLString url); In addition, BlobError and BlobException sound better because these names are consistent with the current Blob naming scheme in the File API. So we're also going to adopt these new names in the WebKit implementation when we rename FileReader to BlobReader per the spec. On Mon, Aug 23, 2010 at 8:19 AM, Eric Uhrhane er...@google.com wrote: I agree with Dmitry: window.createBlobUrl() makes it clearer. Querying blob.url shouldn't have side effects. As Jonas points out, we should keep the creation and destruction methods near each other, so window.destroyBlobUrl() would be the opposite function. As for getBlobUrl vs. createBlobUrl: the latter sounds like it returns a new URL each time; the former is less explicit. If we're returning a unique URL per call, then create is clearly better. Are we requiring that each call to xxxBlobUrl have a matched destroyBlobUrl, even if we're returning the same URL? I think BlobError and BlobException make a bit more sense, but I'm not too adamant about it. On Sat, Aug 21, 2010 at 1:00 PM, Jian Li jia...@google.com wrote: I do not see any more discussions on the blob URL API in recent days. Any more thoughts or conclusions? In addition, do we want to rename FileError and FileException to BlobError and BlobException to match the BlobReader naming, or rather keep them intact?
On Mon, Aug 2, 2010 at 3:22 PM, Dmitry Titov dim...@chromium.org wrote: It feels like developers will make fewer errors with a window.getBlobUrl(blob) kind of API because, unlike blob.url, it doesn't violate pretty common programming assumptions (such as that querying a property of the same object should return the same value if nothing was done to the object, or that the value of a property should not depend on what the global object is in the context of the query if the blob is the same). The spec language describing why and when blob.url returns different values with different lifetimes would be a bit hairy... Agreed, though, that functionally they are the same. On Mon, Aug 2, 2010 at 3:05 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Aug 2, 2010 at 2:19 PM, Michael Nordman micha...@google.com wrote: On Mon, Aug 2, 2010 at 1:39 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 30, 2010 at 12:01 PM, Michael Nordman micha...@google.com wrote: On Thu, Jul 29, 2010 at 4:33 PM, Jonas Sicking jo...@sicking.cc wrote: Sorry about the slow response. I'm currently at Black Hat, so my internet connectivity is somewhat... unreliable, so I'm generally having to try to stay off the webs :) On Tue, Jul 27, 2010 at 1:16 PM, Dmitry Titov dim...@chromium.org wrote: Thanks Jonas, Just to clarify some details we had while discussing this, could you confirm whether this matches your thinking (or not): 1. If a blob was created in window1, blob.url was queried, then the blob was passed (as a JS object) to window2, and window1 was closed -- then the URL gets invalidated when window1 is closed, but immediately re-validated if window2 queries blob.url. The URL string is going to be the same; there will only be a time interval between closing window1 and querying blob.url in window2 during which loading from the URL returns 404. Actually, it might make sense to make blob.url, when queried by window2, return a different string. This makes things somewhat more consistent as to when a URL is working and when not.
Now suppose window2 queries the .url attribute before window1 is closed? I think most people would expect the same value as returned in window1 (yes?). Having the same or a different value depending on whether the attribute was queried before or after another window was closed seems confusing. I think having the .url remain consistent from frame to frame/window to window could help with debugging. The idea would be that we *always* return different URLs depending on which window queries a URL. This gives the most consistent behavior, in that every URL given out is always limited to the lifetime of the current window, no matter what the windows around it do. If that's the idea, then I would vote for a non-instance method somewhere to provide the context-specific URL. Having a simple attribute accessor return different values depending on which context it's being accessed in is very unusual behavior
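The create-vs-get distinction argued for in this thread is what the platform eventually shipped: the createBlobURL/revokeBlobURL pair became URL.createObjectURL and URL.revokeObjectURL, and "create" really does mint a fresh URL per call. This is demonstrable in Node 16.7+ as well as in browsers:

```javascript
const blob = new Blob(['hello']);

const url1 = URL.createObjectURL(blob);
const url2 = URL.createObjectURL(blob);
console.log(url1.startsWith('blob:')); // true
console.log(url1 === url2);            // false -- create, not get

// Explicit revocation, rather than tying each URL's life to GC:
URL.revokeObjectURL(url1);
URL.revokeObjectURL(url2);
```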
Re: Lifetime of Blob URL
I do not see any more discussions on the blob URL API in recent days. Any more thoughts or conclusions? In addition, do we want to rename FileError and FileException to BlobError and BlobException to match the BlobReader naming, or rather keep them intact? On Mon, Aug 2, 2010 at 3:22 PM, Dmitry Titov dim...@chromium.org wrote: It feels like developers will make fewer errors with a window.getBlobUrl(blob) kind of API because, unlike blob.url, it doesn't violate pretty common programming assumptions (such as that querying a property of the same object should return the same value if nothing was done to the object, or that the value of a property should not depend on what the global object is in the context of the query if the blob is the same). The spec language describing why and when blob.url returns different values with different lifetimes would be a bit hairy... Agreed, though, that functionally they are the same. On Mon, Aug 2, 2010 at 3:05 PM, Jonas Sicking jo...@sicking.cc wrote: On Mon, Aug 2, 2010 at 2:19 PM, Michael Nordman micha...@google.com wrote: On Mon, Aug 2, 2010 at 1:39 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jul 30, 2010 at 12:01 PM, Michael Nordman micha...@google.com wrote: On Thu, Jul 29, 2010 at 4:33 PM, Jonas Sicking jo...@sicking.cc wrote: Sorry about the slow response. I'm currently at Black Hat, so my internet connectivity is somewhat... unreliable, so I'm generally having to try to stay off the webs :) On Tue, Jul 27, 2010 at 1:16 PM, Dmitry Titov dim...@chromium.org wrote: Thanks Jonas, Just to clarify some details we had while discussing this, could you confirm whether this matches your thinking (or not): 1. If a blob was created in window1, blob.url was queried, then the blob was passed (as a JS object) to window2, and window1 was closed -- then the URL gets invalidated when window1 is closed, but immediately re-validated if window2 queries blob.url.
The URL string is going to be the same; there will only be a time interval between closing window1 and querying blob.url in window2, during which loading from the URL returns 404. Actually, it might make sense to make blob.url, when queried by window2, return a different string. This makes things somewhat more consistent as to when a URL is working and when not. Now suppose window2 queries the .url attribute before window1 is closed? I think most people would expect the same value as returned in window1 (yes?). Having the same or a different value depending on whether the attribute was queried before or after another window was closed seems confusing. I think having the .url remain consistent from frame to frame/window to window could help with debugging. The idea would be that we *always* return different URLs depending on which window queries a URL. This gives the most consistent behavior, in that every URL given out is always limited to the lifetime of the current window, no matter what the windows around it do. If that's the idea, then I would vote for a non-instance method somewhere to provide the context-specific URL. Having a simple attribute accessor return different values depending on which context it's being accessed in is very unusual behavior. Can't say that it's ideal, but window.getBlobUrl(blob) and window.revokeBlobUrl(...) would be an improvement. I can't say that I'm a big fan of this syntax given that I think the current syntax works fine in most cases. I'm definitely curious to hear what others think though. / Jonas
Re: [File API] Recent Updates To Specification + Co-Editor
One more question. Should we also rename FileError to BlobError and FileException to BlobException in order to be consistent with the naming changes? Thanks, Jian

On Mon, Jun 28, 2010 at 2:20 PM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have made edits to the File API specification [1]. There are a few things of note that I'd like to call the WG's attention to.

1. There is a name change in effect. FileReader has been re-named BlobReader, upon request from Chrome team folks [2][3]. The name BlobReader won't win awards in a beauty pageant, but it tersely describes an object that reads Blobs (which could originate from the underlying file system *or* be generated *within* a Web App). My present understanding is that FileWriter will also undergo a name change. Naming is really hard. Firefox already ships with FileReader, but I see the point of having an object named for what it does, which in this case is certainly more than file reading from the underlying file system. I also abhor bike-shedding, especially over naming, but this is something that's exposed to authors. I have not renamed FileError or FileException. In the case of errors and exceptions, I think *most* scenarios will occur as a result of issues with the underlying file system. These names should remain.

2. I've updated the URL scheme for Blobs using an ABNF that calls for an opaque string, which is a term I define in the specification. There was much discussion about this aspect of the File API specification, and I think the existing scheme does allow user agents to tack origin information onto the URL (though this is not something the spec says you should do). The actual choice of opaque string is left to implementations, though the specification suggests UUID in its canonical form (and provides an ABNF for this). I think this is the most any specification has said on the subject of URLs.

3.
There is an additional asynchronous read method on BlobReader, and an additional synchronous read method on BlobReaderSync, namely readAsArrayBuffer. These use the TypedArrays definition initially defined by the WebGL WG [4]. 4. I am moving on from my full-time role at Mozilla to a part-time consulting role. I'll continue to be an editor of the File API, but I am stepping down as Chair of the WebGL WG. I'll continue to be active in standards communities, though :-) 5. I spoke to Jonas Sicking, who expressed willingness to be a co-editor of the File API specification. Most people who work on HTML5 and WebApps know Jonas' contributions to both WGs; with everyone's consent, I'd like to nominate him as co-editor. His model for an asynchronous event-driven API is what prompted the initial rewrite, and he also works on both File API and IndexedDB implementation (amongst other things). -- A* [1] http://dev.w3.org/2006/webapi/FileAPI/ [2] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0755.html [3] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0716.html [4] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html
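The readAsArrayBuffer addition in point 3 can be illustrated with its eventual successor on Blob itself, blob.arrayBuffer() (BlobReader/BlobReaderSync need a browser, so this stand-in, which also runs in Node 18+, is used here instead):

```javascript
// The blob's bytes surface as an ArrayBuffer, which TypedArray views
// (per the WebGL WG's Typed Arrays spec) then interpret.
const blob = new Blob([new Uint8Array([1, 2, 3])]);
blob.arrayBuffer().then((buf) => {
  console.log(buf.byteLength);         // 3
  console.log(new Uint8Array(buf)[0]); // 1
});
```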
Re: [File API] Recent Updates To Specification + Co-Editor
We've some more questions regarding the blob URL.

1. The spec does not describe how the blob and the blob URL will work in the worker and shared worker scenarios. I think we should allow WorkerGlobalScope to be the binding context for the blob URL, like Document. In addition, we should define how a blob object can be passed to a worker via structured cloning: a new blob object should be created that points to the same underlying data.

2. The current spec says that the lifetime of the blob URL is bound to the lifetime of the spawning context. What happens if we try to access the blob URL from multiple contexts? Say we call parent.blob.url; the lifetime of the URL is bound to the parent context, not the current context, per the spec. This sounds a little unnatural. Could we explicitly provide the context while creating the blob URL, like window.createBlobUrl(blob)?

3. Since the lifetime of the blob URL is bound to a context, the blob URL (the underlying blob data) will get disposed of only when the context dies. When we have long-lived pages or shared workers, we could have leaked blob URLs that result in unclaimed blob storage. It would be nice if we could add the capability to revoke the blob URL programmatically, like window.revokeBlobUrl(url).

4. It would be good if the spec could say more about the lifetimes of the blob object and the blob URL, since they're somewhat orthogonal: the blob object will still be functional as long as it is not GC-ed, even if the associated context dies.

5. The spec does not explicitly describe transient cases, like location.href = blob.url. The spec could probably mention that the resource pointed to by a blob URL should load successfully as long as the blob URL is valid at the time the load starts.

On Mon, Jun 28, 2010 at 2:20 PM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have made edits to the File API specification [1].
There are a few things of note that I'd like to call the WG's attention to. 1. There is a name change in effect. FileReader has been re-named BlobReader, upon request from Chrome team folks[2][3]. The name BlobReader won't win awards in a beauty pageant, but it tersely describes an object to read Blobs (which could originate from the underlying file system *or* be generated *within* a Web App). My present understanding is that FileWriter will also undergo a name change. Naming is really hard. Firefox already ships with FileReader, but I see the point of having an object named for what it does, which in this case is certainly more than file reading from the underlying file system. I also abhor bike shedding, especially over naming, but this is something that's exposed to the authors. I have not renamed FileError or FileException. In the case of errors and exceptions, I think *most* scenarios will occur as a result of issues with the underlying file system. These names should remain. 2. I've updated the URL scheme for Blobs using an ABNF that calls for an opaque string which is a term I define in the specification. There was much discussion about this aspect of the File API specification, and I think the existing scheme does allow for user agents to tack on origin information in the URL (this is not something the spec. says you should do). The actual choice of opaque string is left to implementations, though the specification suggests UUID in its canonical form (and provides an ABNF for this). I think this is the most any specification has said on the subject of URLs. 3. There is an additional asynchronous read method on BlobReader, and an additional synchronous read method on BlobReaderSync, namely readAsArrayBuffer. These use the TypedArrays definition initially defined by the WebGL WG [4]. 4. I am moving on from my full-time role at Mozilla to a part-time consulting role. 
I'll continue to be an editor of the File API, but I am stepping down as Chair of the WebGL WG. I'll continue to be active in standards communities, though :-) 5. I spoke to Jonas Sicking, who expressed willingness to be a co-editor of the File API specification. Most people who work on HTML5 and WebApps know Jonas' contributions to both WGs; with everyone's consent, I'd like to nominate him as co-editor. His model for an asynchronous event-driven API is what prompted the initial rewrite, and he also works on both File API and IndexedDB implementation (amongst other things). -- A* [1] http://dev.w3.org/2006/webapi/FileAPI/ [2] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0755.html [3] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0716.html [4] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html
Re: [File API] Recent Updates To Specification + Co-Editor
Thanks for the update. We have some more questions regarding the blob URL.

1. The spec does not describe how blobs and blob URLs will work in the worker and shared worker scenarios. I think we should allow WorkerGlobalScope to be the binding context for the blob URL, like Document. In addition, we should define how a blob object can be passed to a worker via structured cloning: a new blob object should be created that points to the same underlying data.

2. The current spec says that the lifetime of the blob URL is bound to the lifetime of the spawning context. What happens if we try to access the blob URL from multiple contexts? Say we call parent.blob.url; the lifetime of the URL is bound to the parent context, not the current context, per the spec. This seems a little unnatural. Could we explicitly provide the context when creating the blob URL, like window.createBlobUrl(blob)?

3. Since the lifetime of the blob URL is bound to a context, the blob URL (and the underlying blob data) will only get disposed of when the context dies. With long-lived pages or shared workers, we could leak blob URLs that result in unreclaimed blob storage. It would be nice if we could add the capability to revoke a blob URL programmatically, like window.revokeBlobUrl(url).

4. It would be good if the spec said more about the lifetimes of the blob object and the blob URL, since they're somewhat orthogonal: the blob object will still be functional as long as it is not GC-ed, even if the associated context dies.

5. The spec does not explicitly describe transient cases, like location.href = blob.url. The spec could probably say that the resource pointed to by a blob URL should load successfully as long as the blob URL is valid at the time the load starts.

On Mon, Jun 28, 2010 at 2:20 PM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have made edits to the File API specification [1]. 
There are a few things of note that I'd like to call the WG's attention to.

1. There is a name change in effect. FileReader has been re-named BlobReader, upon request from Chrome team folks [2][3]. The name BlobReader won't win awards in a beauty pageant, but it tersely describes an object that reads Blobs (which could originate from the underlying file system *or* be generated *within* a web app). My present understanding is that FileWriter will also undergo a name change. Naming is really hard. Firefox already ships with FileReader, but I see the point of having an object named for what it does, which in this case is certainly more than reading files from the underlying file system. I also abhor bike-shedding, especially over naming, but this is something that's exposed to authors. I have not renamed FileError or FileException; in the case of errors and exceptions, I think *most* scenarios will occur as a result of issues with the underlying file system, so those names should remain.

2. I've updated the URL scheme for Blobs using an ABNF that calls for an opaque string, which is a term I define in the specification. There was much discussion about this aspect of the File API specification, and I think the existing scheme does allow user agents to tack on origin information in the URL (though this is not something the spec says you should do). The actual choice of opaque string is left to implementations, though the specification suggests UUID in its canonical form (and provides an ABNF for this). I think this is the most any specification has said on the subject of URLs.

3. There is an additional asynchronous read method on BlobReader, and an additional synchronous read method on BlobReaderSync, namely readAsArrayBuffer. These use the TypedArrays definition initially defined by the WebGL WG [4].

4. I am moving on from my full-time role at Mozilla to a part-time consulting role. 
I'll continue to be an editor of the File API, but I am stepping down as Chair of the WebGL WG. I'll continue to be active in standards communities, though :-)

5. I spoke to Jonas Sicking, who expressed willingness to be a co-editor of the File API specification. Most people who work on HTML5 and WebApps know Jonas' contributions to both WGs; with everyone's consent, I'd like to nominate him as co-editor. His model for an asynchronous event-driven API is what prompted the initial rewrite, and he also works on both the File API and IndexedDB implementations (amongst other things). -- A* [1] http://dev.w3.org/2006/webapi/FileAPI/ [2] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0755.html [3] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0716.html [4] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html
Re: Updates to File API
I think encoding the security origin in the URL allows UAs to do the security origin check in place, without routing through another authority to get the origin information, which might make the check take a long time to finish. If we worry about showing double schemes in the URL, we can transform the origin encoded in the URL using base64 or another escaping algorithm. Jian On Wed, Jun 23, 2010 at 8:24 AM, David Levin le...@google.com wrote: On Tue, Jun 22, 2010 at 8:56 PM, Adrian Bateman adria...@microsoft.com wrote: On Tuesday, June 22, 2010 8:40 PM, David Levin wrote: I agree with you Adrian that it makes sense to let the user agent figure out the optimal way of implementing origin and other checks. A logical step from that premise is that the choice/format of the namespace-specific string should be left up to the UA, as embedding information in there may be the optimal way for some UAs of implementing said checks, and it sounds like other UAs may not want to do that. Robin outlined why that would be a problem [1]. My original feeling was that this should be left up to UAs, as you say, but I've been convinced that doing so is a race to the most complex URL scheme. Robin discussed something along these lines in http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html. At the same time, there are implementors who gave specific reasons why encoding certain information (scheme, host, port) in the namespace-specific string (NSS) is useful to various UAs. No other information has been requested, so theories adding more information seem premature. If the format must be specified, it seems reasonable to take both the theoretical and practical issues into account. Encoding the security origin in the NSS isn't complex. If a proposal is needed about how that can be done in a simple way, I'm willing to supply one. Also, UAs that don't care about that information are free to ignore it and don't need to parse it. dave
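The base64-escaping idea from the first paragraph can be sketched as follows. This is purely illustrative: no UA was specified to use this exact scheme name or layout, and base64url is chosen here only because its alphabet avoids the reserved characters that worried people about double schemes.

```javascript
// Sketch (assumed format, not any UA's actual one): encode the security
// origin into the URL's namespace-specific string via base64url, so a UA
// could check origin in place without consulting a central authority.
function encodeOriginUrl(origin, uuid) {
  const enc = Buffer.from(origin, 'utf8').toString('base64url');
  return `blobdata:${enc}/${uuid}`;
}

function decodeOrigin(url) {
  // base64url uses only A-Z a-z 0-9 - _, so '/' safely delimits the parts.
  const enc = url.slice('blobdata:'.length).split('/')[0];
  return Buffer.from(enc, 'base64url').toString('utf8');
}
```

A UA that doesn't care about the embedded origin can treat the whole NSS as opaque and never call the decoder, which is the "free to ignore it" property Dave mentions.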
Re: Updates to File API
One benefit of using the encoded origin is being able to do the security origin check in place, instead of resorting to a centralized authority, especially under a multi-process architecture. Consider getting and checking the origin before hitting the cache for the blob.url item. On Fri, Jun 11, 2010 at 9:09 AM, Adrian Bateman adria...@microsoft.com wrote: On Wednesday, June 02, 2010 5:35 PM, Jonas Sicking wrote: On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 5:06 PM, Jian Li wrote: In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access a blob URL that was created in another process, under a multi-process architecture. This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme. Though we want to avoid introducing the concept of nested schemes to the web. While Mozilla already uses nested schemes (jar:http://... and view-source:http://...), I know others, in particular Apple, have expressed a dislike for this in the past. And with good reason: it's not easy to implement and has been a source of numerous security bugs. That said, it's certainly possible. It's not clear to me what the benefit of encoding the origin into the URL is. Do we expect script to parse out the origin and use it? Even in a multi-process architecture, there's presumably some central store of issued URLs which will need to store origin information as well as other things? Cheers, Adrian
Re: Updates to File API
On Wed, Jun 2, 2010 at 3:48 PM, Eric Uhrhane er...@google.com wrote: On Wed, Jun 2, 2010 at 3:44 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 3:42 PM, Eric Uhrhane wrote: Arun: In the latest version of the spec I see that readAsDataURL, alone among the readAs* methods, still takes a File rather than a Blob. Is that just an oversight, or is that an intentional restriction? That's intentional; readAsDataURL was cited as useful only in the context of File objects. Do you think it makes sense in the context of random Blob objects? Does it make sense on slice calls on a Blob, for example? Sure, why not? Why would this be limited to File objects? A File is supposed to refer to an actual file on the local hard drive. A Blob is a big bunch of data that you might want to do something with. There's nothing special about a File when it comes to what you're doing with the data. Just as we moved File.url up to Blob, I think File.readAsDataURL belongs there too. And we move type from File to Blob.
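Eric's point that "there's nothing special about a File" holds because what readAsDataURL produces is just the blob's bytes, base64-encoded behind the blob's MIME type. A sketch of that construction over arbitrary bytes (nothing here requires a file on disk):

```javascript
// Sketch of the data: URL a readAsDataURL-style read would produce for an
// arbitrary blob of bytes. The type may be empty, as it would be for a
// typeless Blob or slice.
function toDataUrl(bytes, type = '') {
  const b64 = Buffer.from(bytes).toString('base64');
  return `data:${type};base64,${b64}`;
}
```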
Re: Updates to File API
Hi Arun, I have one question regarding the scheme for Blob.url. The latest spec says that the proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob, and it could be used to refer to both file-data blobs and binary-data blobs, should we consider making the scheme blobdata: for better generalization? In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make the security origin check easier when a page tries to access a blob URL that was created in another process, under a multi-process architecture. Indeed, the URL scheme seems to be more of an implementation detail. Different browser vendors can choose an appropriate scheme, as Mozilla does with moz-filedata:. What do you think? Jian On Thu, May 13, 2010 at 5:27 AM, Arun Ranganathan a...@mozilla.com wrote: Greetings WebApps WG, I have updated the editor's draft of the File API to reflect changes that have been in discussion. http://dev.w3.org/2006/webapi/FileAPI Notably:

1. Blobs now allow further binary data operations by exposing an ArrayBuffer property that represents the Blob. ArrayBuffers, and affiliated Typed Array views of data, are specified in a working draft as a part of the WebGL work [1]. This work has been proposed to ECMA's TC-39 WG as well. We intend to implement some of this in the Firefox 4 timeframe, and have reason to believe other browsers will as well. I have thus cited the work as a normative reference [1]. Eventually, we ought to consider further read operations given ArrayBuffers, but for now, I believe exposing Blobs in this way is sufficient.

2. url and type properties have been moved to the underlying Blob interface. Notably, the property is now called 'url' and not 'urn.' Use cases for triggering 'save as' behavior with Content-Disposition have not been addressed [2], although I believe that with FileWriter and BlobBuilder [3] they may be addressed differently. This change reflects lengthy discussion (e.g. start here [4]).

3. The renaming of the property to 'url' also suggests that we should cease to consider an urn:uuid scheme. I solicited implementer feedback about URLs vs. URNs in general. There was a general preference for URLs [5], though this wasn't a strong preference. Moreover, Mozilla's implementation currently uses moz-filedata:. The current draft has an editor's note about the use of HTTP semantics, and origin issues in the context of shared workers. This is work in progress; I have removed the section specifying urn:uuid and hope to have an update with a section covering the filedata: scheme (with filedata:uuid as a suggestion). I welcome discussion about this. I'll point out that we are coining a new scheme, which we originally sought to avoid :-)

4. I have changed event order; loadend now fires after an error event [6]. -- A* [1] https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/doc/spec/TypedArray-spec.html [2] http://www.mail-archive.com/public-webapps@w3.org/msg06137.html [3] http://dev.w3.org/2009/dap/file-system/file-writer.html [4] http://lists.w3.org/Archives/Public/public-webapps/2010JanMar/0910.html [5] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0462.html [6] http://lists.w3.org/Archives/Public/public-webapps/2010AprJun/0062.html
Re: Updates to File API
I got what you mean. Thanks for clarifying it. Do you plan to add the origin encoding to the spec? How about using the more generic scheme name blobdata:? Jian On Wed, Jun 2, 2010 at 5:26 PM, Arun Ranganathan a...@mozilla.com wrote: On 6/2/10 5:06 PM, Jian Li wrote: Hi Arun, I have one question regarding the scheme for Blob.url. The latest spec says that the proposed URL scheme is filedata:. Mozilla already ships with moz-filedata:. Since the URL is now part of the Blob and it could be used to refer to both file-data blobs and binary-data blobs, should we consider making the scheme blobdata: for better generalization? In addition, we're thinking it will probably be a good practice to encode the security origin in the blob URL scheme, like blobdata:http://example.com/33c6401f-8779-4ea2-9a9b-1b725d6cd50b. This will make doing the security origin check easier when a page tries to access a blob URL that was created in another process, under a multi-process architecture. This is a good suggestion. I particularly like the idea of encoding the origin as part of the scheme. Indeed, the URL scheme seems to be more of an implementation detail. Different browser vendors can choose an appropriate scheme, as Mozilla does with moz-filedata:. What do you think? Actually, I'm against leaving it totally up to implementations. Sure, the spec could simply state how the URL behaves without mentioning format much, but we identified in the past [1] that it was wise to specify things reliably, so that developers didn't rely on arbitrary behavior in one implementation and expect something similar in another. It's precisely that genre of underspecified behavior that got us in trouble before ;-) -- A* [1] http://lists.w3.org/Archives/Public/public-webapps/2009OctDec/0743.html
Re: FileReader question about ProgressEvent
The current version of the File API does not refer to the latest version of ProgressEvent, and thus I am seeing unsigned long, instead of unsigned long long, being used. Even with unsigned long long, we could only treat it as an ECMAScript Number for values not greater than 2^53. On Mon, Apr 26, 2010 at 3:01 PM, Olli Pettay olli.pet...@helsinki.fi wrote: On 4/21/10 1:51 AM, Jian Li wrote: According to the spec, we will dispatch a progress event for a read method. But per the Progress Events 1.0 spec, the attributes loaded and total are defined as unsigned long. interface ProgressEvent : events::Event { ... readonly attribute unsigned long loaded; readonly attribute unsigned long total; ... The type unsigned long is not enough to represent the file size. Do we want to update the Progress Events spec to use unsigned long long? Or we could limit the FileReader to only read from files with size less than MAX_UINT. Jian Seems like the latest draft http://dev.w3.org/2006/webapi/progress/Progress.html has some bugs: ... readonly attribute unsigned long long loadedItems; readonly attribute unsigned long long totalItems; ... [Optional] in unsigned long loadedItemsArg, in unsigned long totalItemsArg); long long vs. long. And it also has an init***NS method; those have been removed from DOM 3 Events. -Olli
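The 2^53 bound mentioned above comes from ECMAScript Numbers being IEEE-754 doubles: an IDL unsigned long long survives the trip into script exactly only up to 2^53 - 1 (Number.MAX_SAFE_INTEGER). A two-line demonstration:

```javascript
// Integer file sizes are exact only up to 2^53 - 1 in an ECMAScript Number.
const exact = 2 ** 53 - 1;                 // still exactly representable
const collides = 2 ** 53 === 2 ** 53 + 1;  // precision is lost past 2^53
```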
FileReader question about ProgressEvent
According to the spec, we will dispatch a progress event for a read method. But per the Progress Events 1.0 spec, the attributes loaded and total are defined as unsigned long. interface ProgressEvent : events::Event { ... readonly attribute unsigned long loaded; readonly attribute unsigned long total; ... The type unsigned long is not enough to represent the file size. Do we want to update the Progress Event spec to use unsigned long long? Or we could limit the FileReader to only read from the file with size less than MAX_UINT. Jian
Re: Not making partial result available during FileReader.readAsText()?
This is how I feel as well. I am going to set total and loaded based on the binary data. I just want to make sure we're on the same page, and that those details are spec'd out clearly. On Mon, Apr 12, 2010 at 5:31 PM, Jonas Sicking jo...@sicking.cc wrote: Unfortunately I think decoded data is impossible, as you have no idea what the total amount of decoded data will be until you've decoded the entire file. / Jonas On Mon, Apr 12, 2010 at 4:35 PM, Jian Li jia...@chromium.org wrote: This sounds good. I have one more question related to this. What are we going to set for the total and loaded attributes of ProgressEvent? Is it based on the decoded text or the underlying binary data? On Fri, Apr 9, 2010 at 6:59 PM, Jonas Sicking jo...@sicking.cc wrote: I think the spec should say MUST here, rather than SHOULD. Streaming contents seems as useful for text files as for binary files. The one thing that was tricky for us when implementing this was charset detection. We can't really expose any data until we've decided what charset to decode it with. However, when loading normal pages over HTTP the same situation arises, and all major browsers support streaming there. So I consider it a bug that Firefox doesn't support this yet. It's something we should fix. / Jonas On Fri, Apr 9, 2010 at 6:33 PM, Jian Li jia...@chromium.org wrote: Do we really want to support this? As far as I know, FF 3.6 does not support this in its current FileReader implementation. I just want to understand if there is a strong need for this. On Fri, Apr 9, 2010 at 5:49 PM, Michael Nordman micha...@google.com wrote: Seems pretty clear from the snippet you provided: it says you SHOULD provide partially decoded results in the result attribute as progress is made. On Thu, Apr 8, 2010 at 8:31 PM, Jian Li jia...@chromium.org wrote: For FileReader.readAsText, the spec seems to allow partial file data being decoded and saved in the result attribute when a progress event is fired: Make progress notifications. 
As the bytes from the fileBlob argument are read, user agents SHOULD ensure that on getting, the result attribute returns partial file data representing the number of bytes currently loaded (as a fraction of the total) [ProgressEvents], decoded in memory according to the encoding determination. The partial file data read so far might not get decoded completely. Could we choose not to decode the partial result till we retrieve all the data, just like what FileReader.readAsDataURL does?
Re: Not making partial result available during FileReader.readAsText()?
This sounds good. I have one more question related to this. What are we going to set for the total and loaded attributes of ProgressEvent? Is it based on the decoded text or the underlying binary data? On Fri, Apr 9, 2010 at 6:59 PM, Jonas Sicking jo...@sicking.cc wrote: I think the spec should say MUST here, rather than SHOULD. Streaming contents seems as useful for text files as for binary files. The one thing that was tricky for us when implementing this was charset detection. We can't really expose any data until we've decided what charset to decode it with. However, when loading normal pages over HTTP the same situation arises, and all major browsers support streaming there. So I consider it a bug that Firefox doesn't support this yet. It's something we should fix. / Jonas On Fri, Apr 9, 2010 at 6:33 PM, Jian Li jia...@chromium.org wrote: Do we really want to support this? As far as I know, FF 3.6 does not support this in its current FileReader implementation. I just want to understand if there is a strong need for this. On Fri, Apr 9, 2010 at 5:49 PM, Michael Nordman micha...@google.com wrote: Seems pretty clear from the snippet you provided: it says you SHOULD provide partially decoded results in the result attribute as progress is made. On Thu, Apr 8, 2010 at 8:31 PM, Jian Li jia...@chromium.org wrote: For FileReader.readAsText, the spec seems to allow partial file data being decoded and saved in the result attribute when a progress event is fired: Make progress notifications. As the bytes from the fileBlob argument are read, user agents SHOULD ensure that on getting, the result attribute returns partial file data representing the number of bytes currently loaded (as a fraction of the total) [ProgressEvents], decoded in memory according to the encoding determination. The partial file data read so far might not get decoded completely. Could we choose not to decode the partial result till we retrieve all the data, just like what FileReader.readAsDataURL does?
Ordering of error/load and loadend events for FileReader
The spec says that the loadend event should be dispatched before the error event when an error occurs during a file read:

2. If an error occurs during file read, set readyState to DONE and set result to null. Proceed to the error steps below. 1. Dispatch a progress event called loadend. 2. Dispatch a progress event called error. Set the error attribute; on getting, the error attribute MUST be a FileError object with a valid error code that indicates the kind of file error that has occurred. 3. Terminate this overall set of steps.

However, it also says that the loadend event should be dispatched after the load event when the file data has been completely read:

When this specification says to make progress notifications for a read method, the following steps MUST be followed: 1. While the read method is processing, queue a task to dispatch a progress event called progress about every 50ms or for every byte read into memory, whichever is least frequent. 2. When the data from the file or fileBlob has been completely read into memory, queue a task to dispatch a progress event called load. 3. When the data from the file or fileBlob has been completely read into memory, queue a task to dispatch a progress event called loadend.

The ordering of the error/load and loadend events seems inconsistent. Could we move the dispatch of the loadend event after the error event?
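The ordering being requested (and which the editor later adopted, per the "loadend now fires after an error event" change note earlier in this archive) makes loadend the terminal event on both paths. A minimal sketch of that dispatch logic, with a hypothetical dispatcher callback rather than the spec's task-queue machinery:

```javascript
// Sketch: error or load first depending on outcome, loadend always last.
// `dispatch` is a hypothetical stand-in for queuing a task to fire an event.
function finishRead(ok, dispatch) {
  if (ok) {
    dispatch('load');
  } else {
    dispatch('error');
  }
  dispatch('loadend'); // terminal event on both the success and error paths
}
```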
Not making partial result available during FileReader.readAsText()?
For FileReader.readAsText, the spec seems to allow partial file data being decoded and saved in the result attribute when progress event is fired: Make progress notifications. As the bytes from the fileBlob argument are read, user agents SHOULD ensure that on getting, the result attribute returns partial file data representing the number of bytes currently loaded (as a fraction of the total) [ProgressEvents], decoded in memory according to the encoding determination. The partial file data read so far might not get decoded completely. Could we choose not to decode the partial result till we retrieve all the data, just like what FileReader.readAsDataURL does?
Re: FormData with sliced Blob
I mean UUID. It is the UUID part of the URN in the File API spec. The UA can choose any appropriate way to generate a unique string, like a UUID. On Tue, Mar 23, 2010 at 9:30 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 23 Mar 2010 01:26:32 +0100, Jian Li jia...@google.com wrote: To be safe, probably the UA can choose to create the unique name from a GUID, like blob-5597cb2e-74fb-479a-81e8-10679c523118. Which GUID? Is that in the File API specification? -- Anne van Kesteren http://annevankesteren.nl/
Re: FormData with sliced Blob
Unless we want to treat the blob the same as a string, we might have to provide some sort of filename, since without it the server side might have problems saving it temporarily. On Tue, Mar 23, 2010 at 1:06 AM, Anne van Kesteren ann...@opera.com wrote: On Tue, 23 Mar 2010 02:24:52 +0100, Dmitry Titov dim...@google.com wrote: Seriously though, it would be nice for the XHR2 spec to have these details spelled out, especially the MIME type (I think David meant application/octet-stream). We previously discussed this, and then we decided that for Blob the Content-Type simply would not be present. I don't see why that would be different for multipart/form-data. Not really sure what to do about filename. -- Anne van Kesteren http://annevankesteren.nl/
Re: FormData with sliced Blob
To be safe, probably the UA can choose to create the unique name from a GUID, like blob-5597cb2e-74fb-479a-81e8-10679c523118. On Mon, Mar 22, 2010 at 4:43 PM, David Levin le...@google.com wrote: What about using a filename that is unique with respect to files sent in that FormData (but it is up to the UA)? For example, a UA may choose to do Blob1, Blob2, etc. For the content-type, application/octet-string seems most fitting. Here's the result applied to your example: --SomeBoundary... Content-Disposition: form-data; name=file; filename=Blob1 Content-Type: application/octet-string dave On Fri, Mar 19, 2010 at 6:25 PM, Jian Li jia...@google.com wrote: Hi, I have questions regarding sending FormData with sliced files. When we send a FormData with a regular file, we send out the multipart data for this file, like the following: --SomeBoundary... Content-Disposition: form-data; name=file; filename=test.js Content-Type: application/x-javascript ... However, when it is sliced into a blob, it does not have the file name and type information any more. I am wondering what we should send. Should we just not provide the filename and Content-Type information? Thanks, Jian
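David's "Blob1, Blob2" idea can be sketched as a multipart part builder. Note that the "application/octet-string" in his quoted mail is a slip for application/octet-stream (as Dmitry points out later in this thread), so the sketch uses the real type; the filename convention is the UA-choice assumption from this thread, not anything the spec mandates.

```javascript
// Sketch of one multipart/form-data part for a sliced blob, using a
// UA-generated placeholder filename ("Blob1", "Blob2", ...) as proposed
// in this thread.
function blobPart(boundary, name, index, bytes) {
  return [
    `--${boundary}`,
    `Content-Disposition: form-data; name="${name}"; filename="Blob${index}"`,
    'Content-Type: application/octet-stream',
    '', // blank line separates part headers from the body
    Buffer.from(bytes).toString('latin1'),
  ].join('\r\n');
}
```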
Re: File API: Blob and underlying file changes.
Treating blobs as snapshots sounds like a reasonable approach, and it will make chunked uploads and other scenarios easier. Now the problem is: how do we get the blob (snapshot) out of the file? 1) We can still keep the current relationship between File and Blob. When we slice a file by calling File.slice, a new blob that captures the current file size and modification time is returned. Subsequent Blob operations, like slice, will simply inherit the cached size and modification time. When we access the underlying file data in XHR.send() or FileReader, the modification time will be verified and an exception could be thrown. 2) We can remove the inheritance of Blob from File and introduce File.getAsBlob(), as dimich suggested. This seems more elegant. However, it requires changing the File API spec a lot. On Wed, Jan 20, 2010 at 3:44 PM, Eric Uhrhane er...@google.com wrote: On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org wrote: On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com wrote: I think it could. Here's a third option: Make all blobs, file-based or not, just as async as the blobs in option 2. They never do sync IO, but could potentially fail future read operations if their metadata is out of date [e.g. reading beyond EOF]. However, expose the modification time on File via an async method and allow the user to pass it in to a read call to enforce fail if changed since this time. This keeps all file accesses async, but still allows for chunked uploads without mixing files accidentally. If we allow users to refresh the modification time asynchronously, it also allows for adding a file to a form, changing the file on disk, and then uploading the new file. The user would look up the mod time when starting the upload, rather than when the file's selected. It would be great to avoid sync file I/O on calls like Blob.size. They would simply return a cached value. 
Actual mismatch would be detected during actual read operation. However then I'm not sure how to keep File derived from Blob, since: 1) Currently, in FF and WebKit File.fileSize is a sync I/O that returns current file size. The current spec says File is derived from Blob and Blob has Blob.size property that is likely going to co-exist with File.fileSize for a while, for compat reasons. It's weird for file.size and file.fileSize to return different things. True, but we'd probably want to deprecate file.fileSize anyway and then get rid of it, since it's synchronous. 2) Currently, xhr.send(file) does not fail and sends the version of the file that exists somewhere around xhr.send(file) call was issued. Since File is also a Blob, xhr.send(blob) would behave the same which means if we want to preserve this behavior the Blob can not fail async read operation if file has changed. There is a contradiction here. One way to resolve it would be to break File is Blob and to be able to capture the File as Blob by having file.getAsBlob(). The latter would make a snapshot of the state of the file, to be able to fail subsequent async read operations if the file has been changed. I've asked a few people around in a non-scientific poll and it seems developers expect Blob to be a 'snapshot', reflecting the state of the file (or Canvas if we get Canvas.getBlob(...)) at the moment of Blob creation. Since it's obviously bad to actually copy data, it seems acceptable to capture enough information (like mod time) so the read operations later can fail if underlying storage has been changed. It feels really strange if reading the Blob can yield some data from one version of a file (or Canvas) mixed with some data from newer version, without any indication that this is happening. All that means there is an option 3: 3. Treat all Blobs as 'snapshots' that refer to the range of underlying data at the moment of creation of the Blob. 
Blobs produced further by Blob.slice() operation inherit the captured state w/o actually verifying it against 'live' underlying objects like files. All Blobs can be 'read' (or 'sent') via operations that can fail if the underlying content has changed. Optionally, expose snapshotTime property and perhaps read if not changed since parameter to read operations. Do not derive File from Blob, rather have File.getAsBlob() that produces a Blob which is a snapshot of the file at the moment of call. The advantage here is that it removes need for sync operations from Blob and provides mechanism to ensure the changing underlying storage is detectable. The disadvantage is a bit more complexity and bigger change to File spec. That sounds good to me. If we're treating blobs as snapshots, I retract my suggestion of the read-if-not-changed-since parameter. All reads after the data has changed should fail. If you want to do a chunked upload, don't snapshot your
Re: File API: Blob and underlying file changes.
What we mean by snapshotting here is not to copy all the underlying data. Instead, we only intend to capture the least information needed to verify whether the underlying data has been changed. I agreed with Eric that the first option could cause inconsistent semantics between File.slice and Blob.slice. But how are we going to address the synchronous call to get the file size for Blob.size if the blob is a file? On Thu, Jan 21, 2010 at 12:49 PM, Jonas Sicking jo...@sicking.cc wrote: One thing to remember here is that if we require snapshotting, that will mean paying potentially very high costs every time the snapshotting operation is used. Potentially copying hundreds of megabytes of data (think video). But if we don't require snapshotting, things will only break if the user takes the action to modify a file after giving the page access to it. Also, in general, snapshotting is something that UAs can experiment with without requiring changes to the spec. Even though File.slice is a synchronous function, the UA can implement snapshotting without using synchronous IO. The UA could simply do an asynchronous file copy in the background. If any read operations are performed on the slice, those could simply be stalled until the copy is finished, since reads are always asynchronous. / Jonas On Thu, Jan 21, 2010 at 11:22 AM, Eric Uhrhane er...@google.com wrote: On Thu, Jan 21, 2010 at 11:15 AM, Jian Li jia...@chromium.org wrote: Treating blobs as snapshots sounds like a reasonable approach and it will make the life of the chunked upload and other scenarios easier. Now the problem is: how do we get the blob (snapshot) out of the file? 1) We can still keep the current relationship between File and Blob. When we slice a file by calling File.slice, a new blob that captures the current file size and modification time is returned. The following Blob operations, like slice, will simply inherit the cached size and modification time. 
When we access the underlying file data in XHR.send() or FileReader, the modification time will be verified and an exception could be thrown. This would require File.slice to do synchronous file IO, whereas Blob.slice doesn't do that. 2) We can remove the inheritance of Blob from File and introduce File.getAsBlob(), as dimich suggested. This seems more elegant. However, it requires changing the File API spec a lot.

On Wed, Jan 20, 2010 at 3:44 PM, Eric Uhrhane er...@google.com wrote: On Wed, Jan 20, 2010 at 3:23 PM, Dmitry Titov dim...@chromium.org wrote: On Wed, Jan 20, 2010 at 2:30 PM, Eric Uhrhane er...@google.com wrote: I think it could. Here's a third option: make all blobs, file-based or not, just as async as the blobs in option 2. They never do sync IO, but could potentially fail future read operations if their metadata is out of date (e.g. reading beyond EOF). However, expose the modification time on File via an async method and allow the user to pass it in to a read call to enforce "fail if changed since this time" semantics. This keeps all file accesses async, but still allows for chunked uploads without mixing files accidentally. If we allow users to refresh the modification time asynchronously, it also allows for adding a file to a form, changing the file on disk, and then uploading the new file. The user would look up the mod time when starting the upload, rather than when the file's selected.

It would be great to avoid sync file I/O on calls like Blob.size. They would simply return a cached value. An actual mismatch would be detected during the actual read operation. However, then I'm not sure how to keep File derived from Blob, since: 1) Currently, in FF and WebKit, File.fileSize is a sync I/O call that returns the current file size. The current spec says File is derived from Blob, and Blob has a Blob.size property that is likely going to co-exist with File.fileSize for a while, for compat reasons. It's weird for file.size and file.fileSize to return different things.
True, but we'd probably want to deprecate file.fileSize anyway, and then get rid of it, since it's synchronous. 2) Currently, xhr.send(file) does not fail, and it sends the version of the file that exists around the time the xhr.send(file) call was issued. Since File is also a Blob, xhr.send(blob) would behave the same, which means that if we want to preserve this behavior, the Blob cannot fail an async read operation if the file has changed. There is a contradiction here. One way to resolve it would be to break the "File is a Blob" relationship and to be able to capture the File as a Blob by having file.getAsBlob(). The latter would make a snapshot of the state of the file, to be able to fail subsequent async read operations if the file has been changed. I've asked a few people around in a non-scientific poll and it seems developers expect Blob
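As a rough sketch of the option-1 semantics discussed in this thread (cache size and modification time at slice time, verify at read time), the following is illustrative only, not spec API: the in-memory file stand-in, the function names, and the error string are all assumptions for demonstration.

```javascript
// Illustrative sketch of option 1: slicing a file captures size + mtime,
// and later reads verify that snapshot. The in-memory "file" object and
// the error name are assumptions, not spec-defined API.

function makeFakeFile(bytes) {
  // Stands in for an on-disk file; mtime bumps when the content changes.
  return { bytes: bytes, mtime: 1 };
}

function sliceFile(file, start, length) {
  // Snapshot metadata is captured synchronously at slice time.
  return {
    source: file,
    start: start,
    length: length,
    snapshotMtime: file.mtime,
    size: length, // cached, so blob.size needs no file IO
  };
}

function readBlob(blob) {
  // A real UA performs reads asynchronously; synchronous here for brevity.
  if (blob.source.mtime !== blob.snapshotMtime) {
    throw new Error('NOT_READABLE_ERR: underlying file changed');
  }
  return blob.source.bytes.slice(blob.start, blob.start + blob.length);
}
```

Under these semantics a read after the underlying file changes fails deterministically, instead of silently returning a mix of old and new content.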
Re: File API: Blob and underlying file changes.
It seems that we feel that when a File object is sent via either Form or XHR, the latest underlying version should be used. When we get a slice via Blob.slice, we assume that the underlying file data is stable from then on. So for the uploader scenario, we need to cut a big file into multiple pieces. With the current File API spec, we will have to do something like the following to make sure that all pieces are cut from a stable file: var file = myInputElement.files[0]; var blob = file.slice(0, file.size); var piece1 = blob.slice(0, 1000); var piece2 = blob.slice(1000, 1000); ... The above seems a bit ugly. If we want to make it clean, what Dmitry proposed above seems reasonable, but it would require a non-trivial spec change.

On Wed, Jan 13, 2010 at 11:28 AM, Dmitry Titov dim...@chromium.org wrote: An atomic read is obviously a nice thing: it would be hard to program against an API that behaves as unpredictably as a single read operation that reads half of the old content and half of the new content. On the same note, it would likely be very hard to program against Blob objects if they could change underneath unpredictably. Imagine that we need to build an uploader that cuts a big file into multiple pieces and sends those pieces to the servers so they will be stitched together later. If during this operation the underlying file changes, and this changes all the pieces that Blobs refer to (due to clamping and just silent change of content), all the slicing/stitching assumptions are invalid, and it's hard to even notice, since blobs are simply 'clamped' silently. Some degree of mess is possible then. Another use case could be a JPEG image processor that uses slice() to cut the headers from the image file and then uses info from the headers to cut further JFIF fields from the file (reading EXIF and populating a local database of images, for example). Changing the file in the middle of that is bad.
It seems the typical use cases that need Blob.slice() functionality form 'units of work', where Blob.slice() is used with the likely assumption that the underlying data is stable and does not change silently. Such a 'unit of work' should fail as a whole if the underlying file changes. One way to achieve that is to reliably fail operations on 'derived' Blobs, and perhaps even have an 'isValid' property on them. 'Derived' Blobs are those obtained via slice(), as opposed to 'original' Blobs that are also File. One disadvantage of this approach is that it implies that the same Blob has two possible behaviors: when it is obtained via Blob.slice() (or other methods) vs when it is a File. It all could be a bit cleaner if File did not derive from Blob, but instead had a getAsBlob() method. Then it would be possible to say that Blobs are always immutable but may become 'invalid' over time if the underlying data changes. The FileReader could then be just a BlobReader and have cleaner semantics. If that were the case, then xhr.send(file) would capture the state of the file at the moment of sending, while xhr.send(blob) would fail with an exception if the blob is 'invalid' at the moment of the send() operation. This would keep compatibility with current behavior and avoid the dual behavior of Blob. Quite a change to the spec, though... Dmitry

On Wed, Jan 13, 2010 at 2:38 AM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Jan 12, 2010 at 5:28 PM, Chris Prince cpri...@google.com wrote: For the record, I'd like to make the read atomic, such that you can never get half a file before a change and half after. But it likely depends on what OSs can enforce here. I think *enforcing* atomicity is difficult across all OSes. But implementations can get nearly the same effect by checking the file's last modification time at the start + end of the API call. If it has changed, the read operation can throw an exception. I'm talking about during the actual read, i.e. not related to the lifetime of the File object, just related to the time between the first 'progress' event and the 'loadend' event. If the file changes during this time, there is no way to fake atomicity, since the partial file has already been returned. / Jonas
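The check-modification-time-at-start-and-end approach Chris describes might look like the following outline. This is a sketch under assumptions: the file object here is a stand-in with hypothetical mtime() and read() methods, and a real implementation would stat and read the file asynchronously.

```javascript
// Sketch of near-atomic reads: compare the file's modification time
// before and after the read; if it changed mid-read, discard the result.
// `file` is a stand-in object, not a real File API interface.

function guardedRead(file) {
  var mtimeBefore = file.mtime();
  var data = file.read();        // on a real file this takes time
  var mtimeAfter = file.mtime();
  if (mtimeBefore !== mtimeAfter) {
    // The file changed during the read, so the buffer may hold a mix of
    // old and new content. Fail rather than pretend atomicity.
    throw new Error('file modified during read');
  }
  return data;
}
```

As Jonas notes, this only approximates atomicity: a change that completes between the two checks on a filesystem with coarse mtime granularity can still slip through, which is why it gives "nearly the same effect" rather than a guarantee.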
Re: Proposal for sending multiple files via XMLHttpRequest.send()
Thank you for all your great feedback. Yes, the first approach is simpler: it requires far less work from the author and is thus less error prone. However, I think the second approach does provide more flexibility that might suit different data-assembling and sending purposes. The author can use it to upload multiple attached files, save a set of client-generated items, or even send any combination of string data and file data. For example, suppose a presentation web application wants to save a set of client-generated slides to the server, and some of the slides include attached files, like a video clip. It will be much easier for the author to send all the data to the server via the second approach: var payload = new Array; payload.push(header_for_slide1); payload.push(data_for_slide1); payload.push(header_for_slide2); payload.push(data_for_slide2); payload.push(attached_file1_for_slide2); payload.push(attached_file2_for_slide2); ... xhr.send(payload); Since the XMLHttpRequest spec has already added an overload for send(document), why not just add more overloads for a file and an array of items? IMHO, having separate send*** methods, like sendFile, together with overloads of send() might make the API more complicated.

2009/9/11 Yaar Schnitman y...@chromium.org How is the web application supposed to detect that the browser supports this feature? Maybe instead of an overloaded send, we should create new methods sendFile(File) and sendFiles(File[]). And between the two approaches, the first one is simpler, but the second one allows sending the files one at a time and adding other form data to the request. Maybe sendFiles(File[], [Strings[]])?
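Yaar's detection question is worth spelling out: a new overload of an existing send() method is invisible to feature detection, whereas a distinctly named method can be probed for before use. A minimal sketch, where sendFiles is the hypothetical method from this thread:

```javascript
// Sketch: an overload of send() cannot be feature-detected, but a
// distinct method name such as the hypothetical sendFiles() can be
// probed on the prototype before calling it.

function supportsMethod(proto, name) {
  return !!proto && typeof proto[name] === 'function';
}

// In a browser one might then write (sendFiles is hypothetical):
//   if (supportsMethod(XMLHttpRequest.prototype, 'sendFiles')) {
//     xhr.sendFiles(files);
//   } else {
//     // fall back to assembling a multipart body by hand
//   }
```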
Proposal for sending multiple files via XMLHttpRequest.send()
There has already been a discussion on extending XMLHttpRequest.send() to take a File object. Could we also consider enhancing it further to support sending multiple files, like a FileList from drag and drop? We could make XMLHttpRequest.send() take a FileList object and let the browser add multipart boundary separators automatically. Or, the other, simpler way (thanks to Darin's suggestion) is to extend XMLHttpRequest.send() to take an array of items, where each item is strictly either a string or a file reference. The web application is then responsible for generating the multipart envelope, like the following: var payload = new Array; payload.push(header1); payload.push(file1); payload.push(footer1); ... xhr.send(payload); What do you guys think about these approaches? Thanks, Jian
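The "application generates the envelope" approach might look like the following sketch. The boundary value, function name, and part layout are assumptions for illustration (following ordinary multipart/form-data conventions), and the file objects are placeholders for the File references the browser would serialize.

```javascript
// Sketch: build the string pieces of a multipart envelope that the
// application interleaves with file references in the payload array.
// The boundary string is an arbitrary example value.

function multipartPayload(boundary, parts) {
  // parts: [{ name, filename, file }] where `file` stands in for a
  // File object that the browser would serialize when sending.
  var payload = [];
  parts.forEach(function (part) {
    payload.push(
      '--' + boundary + '\r\n' +
      'Content-Disposition: form-data; name="' + part.name + '"' +
      '; filename="' + part.filename + '"\r\n\r\n');
    payload.push(part.file);          // file reference, pushed as-is
    payload.push('\r\n');
  });
  payload.push('--' + boundary + '--\r\n'); // closing boundary
  return payload;
}
```

The resulting array is exactly the shape of the payload in the example above: header string, file reference, trailing string, repeated per file, which the proposed xhr.send(payload) overload would then flatten into the request body.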