Re: [whatwg] Asynchronous database API feedback

Maciej Stachowiak Sun, 09 Dec 2007 03:52:31 -0800


On Dec 9, 2007, at 2:34 AM, Aaron Boodman wrote:

On Dec 9, 2007 1:59 AM, Maciej Stachowiak <[EMAIL PROTECTED]> wrote:

a) Disk access is typically going to be a lot faster than network
access


I think this assumption is, if not exactly incorrect, somewhat
misleading.

For users on network home directories, disk access /is/ network
access. This is a common setup at large corporations and educational
institutions. We have specific experience with WebKit that doing
sqlite database access from the UI thread resulted in frequent long UI
stalls for Apple's users on network home directories.(*)

Now, you might argue that network home directories are not "typical"
and this is true, but the web application has no way to know when it
might hit the atypical case and thereby block the user's UI, just as
with synchronous XMLHttpRequest you have no way to know when the user
is on a slow network connection.

So I continue to believe that it's not safe to do synchronous I/O from
the UI thread.


In the case of Firefox and IE on Windows and Mac at least (I don't
recall the situation with Safari), Gears' sqlite database is stored in
the Caches or "Local Settings" folder which, as I understand it, is
meant to be on the local drive.

I'm not sure what you mean specifically in the Mac case, but ~/Library/Caches isn't guaranteed to be on a local drive on the Mac. I'm also not sure that's quite the right place semantically. It is intended that you should be able to delete all of ~/Library/Caches without any behavior changes besides possibly performance, so for instance http cookies are not stored there.

Since part of the purpose of this API is to allow offline access, it
doesn't seem to make sense to put the data on a network drive, at
least for devices that are mobile.

On a mobile device, it doesn't make sense for your home directory to be on a network drive either. However, if you use one of several shared workstations, you probably want your local data to be there when the rest of your homedir is.

Another important consideration: even ignoring distributed
filesystems, how do web application developers decide when the writes
they are doing are definitely small enough that it's safe to use the
sync API?

Your test shows 3KB written in a tenth of a second, but datastores
could easily be much larger than that. If the time scales linearly,
then even a modest 300KB of data could take 10 seconds to write,
clearly an unacceptable amount of time to stall the UI (I hope it
doesn't scale linearly because that would be alarmingly slow, but
clearly at some size it gets slow).
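
To make the arithmetic above concrete, here is a minimal sketch of the linear extrapolation. It assumes write time is directly proportional to the number of bytes written, which is exactly the assumption the paragraph hopes does not hold; the function name is hypothetical.

```python
# Linear extrapolation of the figure quoted above: 3 KB written in 0.1 s.
# Assumption: write time grows in direct proportion to payload size.
MEASURED_KB = 3
MEASURED_SECONDS = 0.1


def estimated_write_seconds(size_kb: float) -> float:
    """Scale the measured time linearly to a different payload size."""
    return MEASURED_SECONDS * (size_kb / MEASURED_KB)


for size_kb in (3, 30, 300):
    print(f"{size_kb:>4} KB -> {estimated_write_seconds(size_kb):.1f} s")
```

Under that assumption, a datastore 100 times larger stalls the UI 100 times longer, which is where the 10 second figure comes from.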


There are many different use cases for the local database and a
developer can make reasonable assumptions about how large the queries
are going to be in many cases. For example, pulling up all the data
required to render the first view of Reader is a totally different
kind of query than updating the read or starred state of an individual
item.

This doesn't really convince me that web developers have the tools at hand to make the right choice. Even an expert on the topic like you may not test in a very wide variety of hardware and software configurations, and so may assume a particular request is safe because it seems to work.

I'll be more convinced if there's a better answer to this than "make reasonable assumptions".

Thinking about it now, I can imagine a way to make this more concrete: give synchronous transactions a time limit, and if they exceed it they report an error and fail. We can be generous and say the limit is 5 seconds, although that's awful close to unacceptable UI lockup territory. Possible drawbacks:

- I'm not sure it will be any easier to handle timeout errors than it would be to use the async API (since, if your request is too big to complete in reasonable time on this device, you probably have to use the async API as your fallback).

- In practice web developers probably won't handle the timeout error correctly, if it doesn't happen to them on their test setup, so web apps are likely to fail mysteriously when it does occur. But arguably this is better than a long UI hang, since that risks the user's whole browsing session, not just a single web app.

Thoughts? Would you be willing to use a synchronous API with a timeout? Do you think it's reasonable for other web developers? (I'm honestly not sure, I haven't thought it through in detail.)
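
For illustration only, the time-limit idea above could be sketched roughly as follows. This uses Python's standard sqlite3 module rather than the browser API under discussion, and the names run_sync_transaction and TransactionTimeout are hypothetical, not part of any proposal. It relies on SQLite's progress handler, which is only consulted between virtual machine steps, so a single write or fsync that stalls on a slow filesystem would still block past the limit.

```python
import sqlite3
import time


class TransactionTimeout(Exception):
    """Hypothetical error: a synchronous transaction exceeded its budget."""


def run_sync_transaction(conn, statements, limit_seconds=5.0):
    """Run (sql, params) pairs in one transaction, failing if too slow.

    On timeout the whole transaction is rolled back.
    """
    deadline = time.monotonic() + limit_seconds

    # SQLite calls this every 1000 VM instructions; a non-zero return
    # aborts the statement that is currently running.
    def over_budget():
        return 1 if time.monotonic() > deadline else 0

    conn.set_progress_handler(over_budget, 1000)
    try:
        with conn:  # commits on success, rolls back on any exception
            for sql, params in statements:
                conn.execute(sql, params)
                if time.monotonic() > deadline:
                    raise TransactionTimeout("time limit exceeded")
    except sqlite3.OperationalError as exc:
        # An aborted statement surfaces as OperationalError.
        if time.monotonic() > deadline:
            raise TransactionTimeout("time limit exceeded") from exc
        raise
    finally:
        conn.set_progress_handler(None, 1)  # remove the handler


# Usage: a small update of the kind mentioned earlier in the thread.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, starred INTEGER)")
run_sync_transaction(
    db, [("INSERT INTO items (starred) VALUES (?)", (1,))], limit_seconds=5.0
)
```

Note that the caller still needs a fallback path for TransactionTimeout, which is the first drawback listed above.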

Given how wildly hardware varies, I'm not sure how web developers can
safely make this choice. It seems likely that they'd choose whatever
seems to work for them in simple cases, and not test at all on slower
filesystems. If the queries they do vary in size, they may not test at
extreme sizes. These are the same kinds of cases where synchronous XHR
creates surprising problems - it seems ok on the developer's fast
local network, so why expect that end users will see a problem?

It's a similar situation to XHR, but I think the parameters are different:


a) Synchronous network access is almost never a good idea, but in
our experience synchronous local disk access via SQLite is frequently
fine.

I don't think "in our experience ... is frequently fine" translates to "is a good idea", when dealing with something as variable as filesystems. It's like saying "multithreaded programs with complex locking are frequently fine", having only tested on single-CPU machines. (I've seen programs that go from "never deadlocks" to "deadlocks within seconds" when going from single-core to quad-core machines). And I hope most of us would agree that concurrency with shared read-write state is probably not a good API to offer in the browser, even though it's possible a few developers can sometimes get it right.

Getting back to storage, consider devices with a Flash drive as the primary disk. Most web developers won't test on these (did you?), but they have very different performance characteristics than hard drives. While there are no seek latencies to contend with, and reads can be pretty fast, the write throughput can be quite a bit worse, especially for scattered small writes. In many cases, such devices have special filesystems that try to spread writes over the entire device, to increase Flash lifetime. But the result can be that write latencies get much worse than usual at unpredictable times.

I do think a sync API with timeout would adequately handle the variety in hardware, but it would have the significant drawbacks mentioned above.


Regards,
Maciej
