Re: [whatwg] Offline Web Apps

Maciej Stachowiak Tue, 25 Sep 2007 00:27:49 -0700


On Sep 24, 2007, at 10:45 PM, Robert O'Callahan wrote:

On 9/23/07, Maciej Stachowiak <[EMAIL PROTECTED]> wrote:
Obviously, if the way to get the contents as text requires providing
the encoding, then it has to be a method. My comment was about the no-
argument methods. But you have a point that reading from disk is not a
simple get operation. Probably the methods should have names based on
read or the like (read(), readAsText(), etc) to indicate this. Also,
they should arguably be asynchronous since reading from the disk can
be slow, especially for large files, and it is undesirable to block
the main thread.

For small files, synchronous reading is OK. Perhaps there should be a
separate whiz-bang asynchronous API ... it could support partial reads
too.

What kind of file is small enough is a matter of judgment and depends
on device performance characteristics. I tried the following
experiment to estimate how much time could be taken by synchronous
cold reads of a moderate number of files (assuming multi-file support
in <input type="file"> and naive use of the synchronous read API):


$ time cat ~/Pictures/*.jpg > /dev/null

real    0m1.135s
user    0m0.007s
sys     0m0.076s

This is on a pretty fast machine with a local filesystem. I have
76 .jpg files totaling about 19M in size. 1.13 seconds seems like an
unacceptable length of time to block the UI, and it could easily be
much worse for, say, a batch photo upload or an upload of a moderately
large video file.

So I suspect that, much like synchronous XMLHttpRequest, synchronous
file reads will lead to excessive UI lockups in bad circumstances
unanticipated by the app author.
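
To illustrate the asynchronous shape argued for above, here is a
minimal sketch in Python rather than any proposed web API. The name
read_file_async and the chunk size are assumptions made for the
example. Each blocking read is pushed to a worker thread, so the main
loop stays free between chunks, and reading chunk by chunk is also the
natural place to hang partial reads.

```python
import asyncio


async def read_file_async(path: str, chunk_size: int = 64 * 1024) -> bytes:
    """Read a whole file without blocking the event loop.

    Hypothetical helper for illustration only. Every blocking call runs
    on a worker thread via asyncio.to_thread (Python 3.9+), so other
    tasks on the loop can make progress between chunks.
    """
    chunks = []
    f = await asyncio.to_thread(open, path, "rb")
    try:
        while True:
            chunk = await asyncio.to_thread(f.read, chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            chunks.append(chunk)
    finally:
        await asyncio.to_thread(f.close)
    return b"".join(chunks)
```

A caller would use it as data = asyncio.run(read_file_async(path)), or
await it from inside a running loop alongside other tasks.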

Also, I'm not sure how a web app can be expected to know the encoding
of a text file on disk.
The same way that any other app does --- guess based on the
extension and expected usage? --- now that we've all standardized on
meta-data-less file systems :-(. I suppose an app could examine the
first chunk of the file and then re-read the file with a better guess.

The OS and the UA can often make a better guess, so I think the option
to let the UA decide the encoding should at least be provided. Here
are some sources of info that the UA has but the web app doesn't (at
least without doing a separate binary read of the file first and
possibly significant computation):


1) OS-level metadata, as for example in Mac OS X:
$ xattr -l plan.txt
com.apple.TextEncoding: UTF-8;134217984

2) Checking for a BOM.

3) Heuristics for specific file types, like looking for <meta charset>
in HTML files or the encoding pseudo-attribute in an XML declaration.

4) General character set autodetection algorithms through statistical
methods or similar.

5) Knowledge of the user's locale (useful for some legacy systems
where default text encoding is determined by locale).


6) Knowledge of platform encoding conventions.
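
As a rough sketch of how a few of these sources might be chained, the
Python below checks for a BOM (2), then the XML declaration and
<meta charset> heuristics (3), and finally falls back to the locale's
preferred encoding (5). The function name guess_encoding, the regular
expressions, and the ordering are assumptions for illustration, not
what any UA actually does; OS metadata (1) and statistical detection
(4) are left out.

```python
import codecs
import locale
import re
from typing import Optional

# UTF-32 BOMs are tested before UTF-16 because the UTF-32-LE BOM
# starts with the same two bytes as the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Simplified patterns; real parsers handle far more edge cases.
_XML_DECL = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')
_META = re.compile(
    rb'<meta[^>]*\bcharset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE)


def guess_encoding(head: bytes, fallback: Optional[str] = None) -> str:
    """Guess a text encoding from the first chunk of a file.

    Hypothetical helper: returns a lowercase encoding label, trying the
    BOM first, then in-band declarations, then the given fallback or
    the locale's preferred encoding.
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    for pattern in (_XML_DECL, _META):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    return fallback or locale.getpreferredencoding(False)
```

Even this much requires a binary read of the start of the file before
the text read, which is the extra work a web app would have to do
itself if the UA could not be asked to decide.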

Regards,
Maciej
