On May 11, 2008, at 6:01 PM, Aaron Boodman wrote:
> On Sun, May 11, 2008 at 5:46 PM, Maciej Stachowiak <[EMAIL PROTECTED]> wrote:
>> Well, that depends on how good the OS buffer cache is at prefetching.
>> But in general, there would be some disk access.
> It seems better if the read API is just async for this case to prevent
> the problem.
It can't entirely prevent the problem. If you read a big enough chunk,
it will cause swapping, which hits the disk just as much as file reads.
Possibly more, because real file access will trigger OS prefetch
heuristics for linear access.
>>> I see what you mean for canvas, but not so much for XHR. It seems
>>> like a valid use case to want to be able to use XHR to download very
>>> large files. In that case, the thing you get back seems like it
>>> should have an async API for reading.
>> Hmm? If you get the data over the network it goes into RAM. Why would
>> you want an async API to in-memory data? Or are you suggesting XHR
>> should be changed to spool its data to disk? I do not think that is
>> practical to do for all requests, so this would have to be a special
>> API mode for responses that are expected to be too big to fit in
>> memory.
> Whether XHR spools to disk is an implementation detail, right? Right
> now XHR is not practical to use for downloading large files because
> the only way to access the result is as a string. Also because of
> this, XHR implementations don't bother spooling to disk. But if this
> API were added, then XHR implementations could be modified to start
> spooling to disk if the response got large. If the caller requests
> responseText, then the implementation just does the best it can to
> read the whole thing into a string and reply. But if the caller uses
> responseBlob (or whatever we call it) then it becomes practical to,
> for example, download movie files, modify them, then re-upload them.
That sounds reasonable for very large files like movies. However,
audio and image files are similar in size to the kinds of text or XML
resources that are currently processed synchronously. In such cases
they are likely to remain in memory.
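As an aside, the spooling behavior described above could be sketched roughly as follows: buffer the response body in memory, and once it crosses a threshold, spill it to a temporary file. This is only an illustration under assumed names (SpoolingBuffer, SPOOL_THRESHOLD); it is not how any real XHR implementation works.

```typescript
// Illustrative sketch only: buffer in memory, spill to disk past a
// threshold. Names are invented, not from any real implementation.
import { appendFileSync, readFileSync, unlinkSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

const SPOOL_THRESHOLD = 1024; // bytes; a real limit would be far larger

class SpoolingBuffer {
  private chunks: Buffer[] = [];
  private size = 0;
  private spoolPath: string | null = null;

  append(chunk: Buffer): void {
    if (this.spoolPath !== null) {
      appendFileSync(this.spoolPath, chunk); // already on disk; keep appending
      return;
    }
    this.chunks.push(chunk);
    this.size += chunk.length;
    if (this.size > SPOOL_THRESHOLD) this.spill();
  }

  // Move everything accumulated so far out to a temporary file.
  private spill(): void {
    this.spoolPath = join(tmpdir(), `xhr-spool-${process.pid}-${Date.now()}`);
    writeFileSync(this.spoolPath, Buffer.concat(this.chunks));
    this.chunks = [];
  }

  // responseText-style access: do the best we can to materialize the
  // whole body as one in-memory string, even if it was spooled.
  text(): string {
    const all = this.spoolPath !== null
      ? readFileSync(this.spoolPath)
      : Buffer.concat(this.chunks);
    return all.toString("utf8");
  }

  isSpooled(): boolean {
    return this.spoolPath !== null;
  }

  dispose(): void {
    if (this.spoolPath !== null) unlinkSync(this.spoolPath);
  }
}
```

A responseBlob-style accessor would then hand back a handle to the spooled file rather than materializing the whole body as a string.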
In general it is sounding like it might be desirable to have at least
two kinds of objects for representing binary data:
1) An in-memory, mutable representation with synchronous access. There
should also be a copying API, possibly copy-on-write for the backing
store.
2) A possibly disk-backed representation that offers only asynchronous
read (possibly in the form of representation #1).
Both representations could be used with APIs that can accept binary
data; currently, most such APIs take only strings. The name of
representation #2 should probably tie it to being a file, since for
anything already in memory you'd want representation #1. Perhaps they
could be called ByteArray and File respectively. Open question: can a
File be stored in a SQL database? If so, does the database store the
data or a reference (such as a path or Mac OS X Alias)?
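To make the distinction concrete, here is a rough sketch of the two representations. ByteArray is the name suggested above; the file-backed one is called FileHandle here only to avoid clashing with built-in File types, and every method name and signature is invented for illustration.

```typescript
// Representation #1: in-memory, mutable, synchronous access. copy()
// is a deep copy here; a real engine might make it copy-on-write.
class ByteArray {
  constructor(private bytes: number[]) {}
  get length(): number { return this.bytes.length; }
  get(i: number): number { return this.bytes[i]; }        // synchronous read
  set(i: number, v: number): void { this.bytes[i] = v; }  // in-place mutation
  copy(): ByteArray { return new ByteArray(this.bytes.slice()); }
}

// Representation #2: possibly disk-backed; the only access is an
// asynchronous read that delivers a chunk as representation #1.
class FileHandle {
  // Simulated in memory for this sketch; a real implementation could
  // hold a path to data on disk instead.
  constructor(private data: number[]) {}
  read(offset: number, length: number,
       callback: (chunk: ByteArray) => void): void {
    // Deliver on a later event-loop turn, as real disk I/O would.
    setTimeout(() => {
      callback(new ByteArray(this.data.slice(offset, offset + length)));
    }, 0);
  }
}
```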
Regards,
Maciej