On 08 Mar 2011, at 7:23 AM, Dean Landolt wrote:

> This doesn't seem right. Assuming your WebSQL implementation had all the same 
> indexes isn't it doing pretty much the same things as using separate 
> objectStores in IDB? Why would it be an order of magnitude slower? I'm sure 
> whatever implementation you're using hasn't seen much optimization but you 
> seem to be implying there's something more fundamental? The only thing I can 
> think of to blame would be the fat in the objectStore interface -- like, for 
> instance, the index building facilities. It seems to me your proposed 
> solution is to add yet more fat to the interface (more complex indexing), but 
> wouldn't it be just as suitable to instead strip down objectStores to their 
> bare essentials to make them more suitable to act as indexes? Then the 
> indexing functionality and all the hard decisions could be punted to 
> libraries where they'd be free to innovate.

Exactly. It's not what one would expect, and it's an indication of the poor state 
of the IDB implementation (which is essentially a wrapper around SQLite anyway).

If someone is advising that object stores be used to handle indexes then may I 
be the first to raise a red flag and say that IDB is failing us (and it would 
have been better for the spec team to provide a locking mechanism for 
LocalStorage so it could be used in that way). The whole point of IDB as far as 
I can see is to provide transactional indexed access to a key value store.

> Why? You wouldn't necessarily have to store the whole object in each index, 
> just the index key, a value and some pointer to the original source object. 
> Something to resolve this pointer to the source would need to be spec'd (a la 
> couchdb's include_docs), but that's simple. Even better, say it were possible 
> to define a link relation on an object store that can resolve to its source 
> object -- you could define a source link relation and the property to use -- 
> and this would have the added bonus of being more broadly applicable than 
> just linking an index record to its source instance.

Think of the object creation and JSON serialization/deserialization overhead 
for putting 50 indexes and you have got more than enough waste there already.

>> We can fix all of this right now very simply:
>> 
>> 1. Enable objectStore.put and objectStore.delete to accept a setIndexes 
>> option and an unsetIndexes option. The value passed for either option would 
>> be an array (string list) of index references.
> 
> This would only work for indexes that are arrays of strings, right? Things can get 
> much more complicated than that, and when they do you'd have to use an 
> objectStore to do your indexing anyway, right?

No it would work for pretty much anything. The application would be free to 
determine the indexes, and also to convert query parameters into indexes when 
querying. It's essentially "computed indexes" without the hassles of IDB trying 
to do it (there was an interesting thread last year on the challenges of 
storing an index-computing function in IDB).
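
To make the idea concrete, here is a minimal in-memory sketch in TypeScript. 
It is not IDB API: SketchStore, indexRefsFor and queryToIndexRef are 
hypothetical names, and the "field:value" string encoding of an index 
reference is an assumption made purely for illustration. Only the setIndexes 
and unsetIndexes option names come from the proposal quoted above.

```typescript
// Hypothetical sketch: the application, not the database, decides which
// index references an object belongs to. None of this is real IDB API.

interface Contact {
  id: number;
  name: string;
  tags: string[];
}

// Compute index references for an object (assumed "field:value" encoding).
function indexRefsFor(c: Contact): string[] {
  const refs: string[] = [`name:${c.name.toLowerCase()}`];
  for (const tag of c.tags) {
    refs.push(`tag:${tag.toLowerCase()}`);
  }
  return refs;
}

// Convert a query parameter into the same kind of index reference.
function queryToIndexRef(field: "name" | "tag", value: string): string {
  return `${field}:${value.toLowerCase()}`;
}

// In-memory stand-in for a store whose put() accepts the proposed options.
class SketchStore<T> {
  private objects = new Map<number, T>();
  private indexes = new Map<string, Set<number>>();

  put(
    key: number,
    value: T,
    options: { setIndexes?: string[]; unsetIndexes?: string[] } = {}
  ): void {
    this.objects.set(key, value);
    // Remove the key from every index listed in unsetIndexes.
    for (const ref of options.unsetIndexes || []) {
      const keys = this.indexes.get(ref);
      if (keys) keys.delete(key);
    }
    // Add the key to every index listed in setIndexes.
    for (const ref of options.setIndexes || []) {
      let keys = this.indexes.get(ref);
      if (!keys) {
        keys = new Set<number>();
        this.indexes.set(ref, keys);
      }
      keys.add(key);
    }
  }

  // Return every stored object that is a member of the given index.
  find(ref: string): T[] {
    const out: T[] = [];
    const keys = this.indexes.get(ref);
    if (keys) {
      keys.forEach((key) => {
        const value = this.objects.get(key);
        if (value !== undefined) out.push(value);
      });
    }
    return out;
  }
}

const store = new SketchStore<Contact>();
const alice: Contact = { id: 1, name: "Alice", tags: ["Friend", "Work"] };
store.put(alice.id, alice, { setIndexes: indexRefsFor(alice) });
const friends = store.find(queryToIndexRef("tag", "friend"));
```

The store never needs to know how an index reference was derived, which is 
why this approach is not limited to properties that happen to be strings.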

> Why is it more theoretically performant than using objectStores in the raw?

It's a more direct interface. Think about it for a second. Using objectStores 
in the raw introduces O(n) overhead in the number of indexes, with a separate 
function call for each one, to give just one reason. If IDB can receive a list 
of indexes to add an object to and remove it from, then it can also do things 
like perform a set difference first to save unnecessary IO. I have written a 
database or two with this technique and it's certainly faster.
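
The set difference mentioned above can be sketched as follows. This is only 
an illustration of the idea, not how any IDB implementation works: 
diffIndexRefs is a hypothetical helper, and index references are again 
assumed to be plain strings.

```typescript
// Hypothetical helper: given the index references an object currently has
// and the ones it should have, return only the memberships that change.
function diffIndexRefs(
  current: string[],
  next: string[]
): { toSet: string[]; toUnset: string[] } {
  const currentSet = new Set<string>(current);
  const nextSet = new Set<string>(next);
  return {
    // In next but not in current: these need an index write.
    toSet: next.filter((ref) => !currentSet.has(ref)),
    // In current but not in next: these need an index delete.
    toUnset: current.filter((ref) => !nextSet.has(ref)),
  };
}

const oldRefs = ["name:alice", "tag:friend", "tag:work"];
const newRefs = ["name:alice", "tag:friend", "tag:family"];
const change = diffIndexRefs(oldRefs, newRefs);
// change.toSet is ["tag:family"] and change.toUnset is ["tag:work"];
// the two unchanged references cost no IO at all.
```

With 50 indexes per object and a typical update touching one or two of 
them, skipping the unchanged memberships is where the saving would come from.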

> I don't necessarily understand the stateful vs. stateless distinction here. I 
> don't see how your proposed solution removes the requirement for IDB to 
> enforce constraints when certain indexes are present. Developers would 
> already be able to use IDB statefully (with predefined schemas) -- they'd 
> just use a library that has a schema mechanism. I doubt such a library for 
> IDB already exists, but it'd be quite easy to port perstore, for instance, 
> which is derived from the IDB API and already has this functionality using 
> json-schema. There will no doubt be many ORM-like libraries that will pop up 
> as soon as IDB starts to stabilize (or as soon as it gets a node.js 
> implementation).

The trouble is you always think a database would "be quite easy" until you 
actually try to do it yourself. At first when I dug into IDB I didn't think 
there would be any problems that could not be handled in some way. I have 
actually switched back to WebSQL now and will encourage my users to use Safari 
or Chrome as long as these browsers support WebSQL (and I hope Chrome will at 
least finish up by adding a quota interface for WebSQL). IDB right now is like 
a completely neutered slower SQLite without any of the benefits to be expected 
of a transactional indexed KV store. It's really sad.

For examples of stateless databases see the interfaces for Redis (the best 
example, and a perfect target for IDB), Berkeley DB, and Tokyo Cabinet. For a 
stateful database see MySQL (and read this by Bret Taylor on the subject: 
http://bret.appspot.com/entry/how-friendfeed-uses-mysql). I can understand how 
IDB just inherited this idea of pre-defined indexes from SQL. But I think it's 
an assumption that must be challenged given the complexity it involves and the 
greater power, flexibility, and simplicity to be had from a stateless database.

> ISTM giving library authors the freedom and flexibility to control their own 
> indexes would be a huge win. They already have much of what they need for this 
> (though there are still a few gaps) but complicating the indexing without 
> actually solving the problems would only serve to hamper users. If it's easy 
> to implement, great, but I'm still left wondering why maintaining your own 
> indexes is so slow -- this seems like the use case for IDB to really nail.

I think we both want the same thing. Making IDB stateless is the best step 
towards providing something flexible that library authors can work on top of. 
But this does not appear to be the current goal of IDB, which wants to try and 
tackle things like application state, computing indexes, migrations, the whole 
shebang (all of which seems to be becoming more and more the jurisdiction of 
the application), instead of directly addressing the original goal of providing 
a transactional indexed key value store. IDB is about as high-level as any 
low-level API could be right now.

