Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
What is the minimum that can be in IDB? I am guessing the following:

1. Sorted key-opaque value transactional store
2. Lookup of keys by values (or parts thereof)

#1 is essential. #2 is unavoidable because you would want to efficiently manipulate values by values as opposed to values by key.

I know of no efficient way of doing callbacks with JS. Moreover, avoiding indices completely seems to miss the point.

Yes, IDB can be used without key paths and indices. When you do that, you would not have any headache of setVersion since every version change either adds or removes an object store.

Next, originally, I also had floated the idea of application managed indices, but implementors thought of it as cruft.

On Sun, Mar 20, 2011 at 3:10 PM, Joran Greef jo...@ronomon.com wrote:

> On 20 Mar 2011, at 4:54 AM, Jonas Sicking wrote:
>
>> I don't understand what you are saying about application state though, so please do start that as a separate thread.
>
> At present, there's no way for an application to tell IDB what indexes to modify w.r.t. an object at the exact moment when putting or deleting that object. That's because this behavior is defined in advance using createIndex in a setVersion transaction. And then how IDB extracts the referenced value from the object is done using an IDB idea of key paths.
>
> But right there, in defining the indexes in advance (and not when the index is actually modified, which is when the object itself is modified), you've captured application state (data relationships that should be known only to the application) within IDB. Because this is done in advance (because IDB seems to have inherited this assumption that this is just the way MySQL happens to do it), there's a disconnect between when the index is defined and when it's actually used. And because of key paths you now need to spec out all kinds of things like how to handle compound keys, multiple values. It's becoming a bit of a spec-fest.
>
> That this bubble of state gets captured in IDB, it also means that IDB now needs to provide ways of updating that captured state within IDB when it changes in the application (which will happen, so essentially you now have your indexing logic stuck in the database AND in the application and the application developer now has to try and keep BOTH in sync using this awkward pre-defined indexes interface), thus the need for a setVersion transaction in the first place.
>
> None of this would be necessary if the application could reference indexes to be modified (and created if they don't exist, or deleted if they would then become empty) AT THE POINT of putting or deleting an object. Things like data migrations would also be better served if this were possible since this is something the application would need to manage anyway. Do you follow?
>
> The application is the right place to be handling indexing logic. IDB just needs to provide an interface to the indexing implementation, but not handle extracting values from objects or deciding which indexes to modify. That's the domain of the application. It's a question of encapsulation. IDB is crossing the boundaries by demanding to know ABOUT the data stored, and not just providing a simple way to put an object, and a simple way to put a reference to an object to an index, and a simple way to query an index and intersect or union an index with another. Essentially an object and its index memberships need to be completely opaque to IDB and you are doing the opposite.
>
> Take a look at the BDB interface. Do you see a setVersion or createIndex semantic in there?

BDB has secondary databases, which are the same as indices with a one to many relation between primary and secondary database. Moreover, BDB uses application callbacks to let the application encapsulate the definition of the index.

> Take a look at Redis and Tokyo and many other things. Do you see a setVersion or createIndex semantic in there? Do these databases have any idea about the contents of objects? Any concept of key paths?

I, for one, am not enamored by key paths. However, I am also morbidly aware of the perils in JS land when using callback like mechanisms. Certainly, I would like to hear from developers like you how you find IDB if you were to not use any createIndex at all. Or at least that you would like to manage your own indices.

> No, and that's the whole reason these databases were created in the first place. I'm sure you have read the BDB papers. Obviously this is not the approach of MySQL. But if IDB is trying to be MySQL but saying it wants to be BDB then I don't know. In any event, Firefox would be brave to also embed SQLite. Let the better API win.
>
> How much simpler could it be? At the end of the day, it's all objects and sets and sorted sets, and see Redis' epiphany on this point. IDB just needs to provide transactional access to these sets. The application must decide what goes in and out of these sets, and must be able to do
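
For readers following the BDB comparison in this message, here is a rough in-memory sketch of a secondary index whose contents are defined by an application callback rather than by a key path. It is only an illustration of the idea: `SecondaryIndexedStore`, `associate`, `lookup` and the `example.com` addresses are hypothetical names chosen for this sketch, not part of IndexedDB or of the BDB API, and key ordering and transactions are left out.

```javascript
// Hypothetical in-memory model of a primary store with callback-defined
// secondary indexes. Not IndexedDB and not the real BDB interface.
class SecondaryIndexedStore {
  constructor() {
    this.primary = new Map();     // primary key -> opaque serialized value
    this.secondaries = new Map(); // index name -> { extract, entries }
  }

  // Register a secondary index. `extract` is the application callback that
  // returns an array of secondary keys for a value (one-to-many).
  // Records stored before this call are not back-filled in this sketch.
  associate(name, extract) {
    this.secondaries.set(name, { extract, entries: new Map() });
  }

  put(key, value) {
    this.delete(key); // remove index entries left by any previous value
    this.primary.set(key, JSON.stringify(value));
    for (const { extract, entries } of this.secondaries.values()) {
      for (const secondaryKey of extract(value)) {
        if (!entries.has(secondaryKey)) entries.set(secondaryKey, new Set());
        entries.get(secondaryKey).add(key);
      }
    }
  }

  delete(key) {
    const raw = this.primary.get(key);
    if (raw === undefined) return;
    const value = JSON.parse(raw);
    for (const { extract, entries } of this.secondaries.values()) {
      for (const secondaryKey of extract(value)) {
        const keys = entries.get(secondaryKey);
        if (!keys) continue;
        keys.delete(key);
        if (keys.size === 0) entries.delete(secondaryKey);
      }
    }
    this.primary.delete(key);
  }

  // Primary keys whose values map to `secondaryKey` in the named index.
  lookup(name, secondaryKey) {
    const index = this.secondaries.get(name);
    const keys = index ? index.entries.get(secondaryKey) : undefined;
    return keys ? [...keys].sort() : [];
  }
}

const db = new SecondaryIndexedStore();
db.associate('byEmail', (user) => user.emails);
db.put('user-1', { name: 'Joran', emails: ['a@example.com', 'b@example.com'] });
console.log(db.lookup('byEmail', 'b@example.com')); // logs an array containing 'user-1'
```

The point of the sketch is that the store never inspects the value itself; only the callback knows which parts of it are indexed, which is the encapsulation being argued about in this thread.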
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 26 Mar 2011, at 10:14 AM, Nikunj Mehta wrote:

> What is the minimum that can be in IDB? I am guessing the following:
>
> 1. Sorted key-opaque value transactional store
> 2. Lookup of keys by values (or parts thereof)

Yes, this is what we need. In programmer speak: objects (opaque strings), sets (hash indexes), sorted sets (range indexes).

> I know of no efficient way of doing callbacks with JS. Moreover, avoiding indices completely seems to miss the point.

Callbacks are unnecessary. This is what you would want to do as a developer using the current form of IDB:

    objectStore.putObject(
      { name: 'Joran', emails: ['jo...@gmail.com', 'jo...@ronomon.com'] },
      {
        id: 'arbitraryObjectIdProvidedByTheApplication',
        indexes: ['emails=jo...@gmail.com', 'emails=jo...@ronomon.com', 'name=Joran']
      }
    );

IDB would then store the user object using the id provided by the application, and make sure it's referenced by this id in the emails=jo...@gmail.com, emails=jo...@ronomon.com, name=Joran index references provided (creating these indexes along the way if need be). The application is responsible for passing in the extra id and indexes options to putObject.

Supporting range indexes would be a question of expanding the above to let the developer pass in a sort score along with the index reference.

> Next, originally, I also had floated the idea of application managed indices, but implementors thought of it as cruft.

I can understand how application managed indices would lead to less work on the part of the spec committee. There seems to be some perverse human characteristic that likes to make easy things difficult. Ships will sail around the world but the Flat Earth Society will flourish.

> I, for one, am not enamored by key paths. However, I am also morbidly aware of the perils in JS land when using callback like mechanisms. Certainly, I would like to hear from developers like you how you find IDB if you were to not use any createIndex at all. Or at least that you would like to manage your own indices.

I am begging to be able to manage my indices. I know my data. I do not want to use any createIndex to declare indexes in advance of when I may or may not use them. What advantage would that give me? I want to create/update indexes only when I put or delete objects and I want to have control over which indexes to update accordingly. With one small change to the putObject and deleteObject interfaces, in the form of the indexes option, we can make that possible.

We need these primitives in IDB: opaque strings, sets, sorted sets. Ideally, IDB need simply store these things and provide the standard interfaces (see Redis) to them along with a transactional mechanism. That's the perfect low-level API on which to build almost any database wrapper.
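
To make the proposal in this message concrete, here is a minimal in-memory sketch of what such an interface might look like. Everything in it is an assumption for illustration: `SketchStore`, `intersect` and the `example.com` address are made-up names, `putObject` and `deleteObject` with an `indexes` option exist only in the proposal and not in IndexedDB, and transactions and sorted sets are omitted.

```javascript
// Hypothetical model of the proposed interface: the store keeps opaque
// strings and sets of ids, and the application names the sets to touch.
class SketchStore {
  constructor() {
    this.objects = new Map(); // id -> opaque serialized object
    this.indexes = new Map(); // index reference -> Set of ids
  }

  // Store the object under the application-supplied id and add the id to
  // every index reference passed in, creating indexes on first use.
  putObject(object, { id, indexes = [] }) {
    this.objects.set(id, JSON.stringify(object));
    for (const ref of indexes) {
      if (!this.indexes.has(ref)) this.indexes.set(ref, new Set());
      this.indexes.get(ref).add(id);
    }
  }

  // Remove the object and drop the id from the index references the
  // application passes in. An index that becomes empty is deleted.
  deleteObject(id, { indexes = [] } = {}) {
    this.objects.delete(id);
    for (const ref of indexes) {
      const ids = this.indexes.get(ref);
      if (!ids) continue;
      ids.delete(id);
      if (ids.size === 0) this.indexes.delete(ref);
    }
  }

  // Ids that appear in every listed index (set intersection).
  intersect(...refs) {
    if (refs.length === 0) return [];
    const sets = refs.map((ref) => this.indexes.get(ref) || new Set());
    const [first, ...rest] = sets;
    return [...first].filter((id) => rest.every((set) => set.has(id)));
  }
}

const store = new SketchStore();
const refs = ['name=Joran', 'emails=joran@example.com'];
store.putObject({ name: 'Joran', emails: ['joran@example.com'] }, { id: 'user-1', indexes: refs });
console.log(store.intersect('name=Joran', 'emails=joran@example.com')); // logs an array containing 'user-1'
store.deleteObject('user-1', { indexes: refs });
console.log(store.indexes.size); // 0, both indexes were removed once empty
```

Note the trade-off this makes visible: the store holds no knowledge of which indexes an object belongs to, so the application has to pass the same references on delete (or on overwrite) that it passed on put.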
Re: Offline Web Applications status
2011/3/24 louis-rémi BABE lrb...@gmail.com

> ## Maybe Web devs don't use App Cache because they don't understand what it is... ##
>
> I think most webdevs are expecting more than what is offered. It seems like a half baked solution to a potentially useful requirement.

I thought I'd add half a cent here, from the perspective of one who isn't a professional web developer... just a hobbyist.

When I heard about the app cache, it seemed like a really great thing. Offline web apps! Cool! A way for the web to become even more ubiquitous! But, as the comment above hints, it really doesn't seem to be the full delivery of the solution (even when you get past the browser differences, setting up of mime types, debugging all this, etc).

An offline web app is certainly more than just caching the code and UI files, no? It is also some kind of stand-in for the absent server... data storage, and cross-page state of some sort (e.g. I'd expected something like web workers that can live for a session, not a page). These aren't all coming together at the same time, and aren't really being presented as a unified feature (indeed, I'm not sure that they are being thought of as that).

I'm sure that as HTML continues its forward evolution, these will all come into play and we'll eventually see more use of the feature.

David
Re: Offline Web Applications status
A couple of other app cache observations from a hobbyist who's played around with Google's Gears...

I built an offline web application based on Gears, with the intention to migrate to something a bit more standardized as it became available. That was a good two years ago now, but the existing and proposed implementations still don't offer the capability that I can get from Gears.

If you're in Chrome, point your browser at: http://my.offlinebible.com/. My observations are going to be related to this app. I apologise if they're quite specific, but they present a snapshot of some real-world issues encountered while developing an offline app.

- *App Cache*

The first task that the app undertakes is downloading a large amount of offline data. This data includes the usual web page resources, plus a massive amount of data destined for the Gears/WebSQL database. Ideally, I want to keep this in a single easy installation process.

With Gears I can do this entirely from JavaScript. First, the database is set up. Then, AJAX is used to request data in chunks (JSON, about 60Mb, gzipped for bandwidth gets it down to 10Mb iirc). Once it's all requested, it gets dropped into Gears/WebSQL. Finally all the usual web assets (html, css, js, img) are cached. Throw up a progress bar, run everything in a web worker to keep the UI from freezing, done.

With the existing specs, I'd have to do something slightly different. First, I'd have to hit a page with a manifest, and hook the relevant events for when the cache was ready. If I want to offer the user different offline capabilities (i.e. customize what's cached), afaik I can't do that from JavaScript. It requires some server-side code/processing to output a different manifest. Once the cache was ready, I could carry on with the existing installation. However, I don't (think I) have the script-side control over adding and removing items from the cache to customize the user's offline experience.
- *Search*

This is the use-case that's most important to me. I want to be able to search all the data which I took offline. My current implementation is built using manually indexed items and joins. In theory I could use the full-text capabilities of the underlying SQLite Gears implementation, but this was a step too proprietary for me.

I built all the data indexes to see what performance was like. Throw some words into the search capability, you'll see how long it takes to search. It's *fairly* quick but there's a slight lag (which locks the UI, it's synchronous ATM).

I know full-text indexing is on the cards for IndexedDB. I'd love to see some sample implementations of full-text to compare speed against the manual index. For single words there might not be too much difference, but for more complex multi-word or pattern-matching, the manual index is too slow/won't work.

I don't think that my scenario is particularly unusual. Taking a large amount of data offline and making it available to search seems like a pretty common use-case. To support this, there are three capabilities which I'd like to see:

- Script access to add or remove items from the application cache - document.manifest.add();
- Batch operations (or support for adding a lot of similar data as quickly as possible - this takes ages if you add each record as a single transaction)
- Full-text search on data

I'm looking forward to this coming together eventually, might be worth an IndexedDB implementation soon : )

On 24 March 2011 05:53, David John Burrowes bain...@davidjohnburrowes.com wrote:

> 2011/3/24 louis-rémi BABE lrb...@gmail.com
>
>> ## Maybe Web devs don't use App Cache because they don't understand what it is... ##
>>
>> I think most webdevs are expecting more than what is offered. It seems like a half baked solution to a potentially useful requirement.
>
> I thought I'd add half a cent here, from the perspective of one who isn't a professional web developer... just a hobbyist.
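
As a rough illustration of the "manual index" approach to search described in this message, here is a small word index with multi-word lookup. It is a sketch under assumptions, not the code used by the app: `tokenize`, `buildIndex`, `search` and the sample records are all made up for the example, and the data lives in memory rather than in Gears, WebSQL or IndexedDB.

```javascript
// Hypothetical sketch of a manually maintained word index.

// Split text into lowercase alphanumeric words.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build a map from each word to the set of record ids containing it.
function buildIndex(records) {
  const index = new Map();
  for (const { id, text } of records) {
    for (const word of tokenize(text)) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(id);
    }
  }
  return index;
}

// Return the ids of records containing every word in the query.
function search(index, query) {
  const words = tokenize(query);
  if (words.length === 0) return [];
  // Intersect starting from the smallest set so later checks stay cheap.
  const sets = words
    .map((word) => index.get(word) || new Set())
    .sort((a, b) => a.size - b.size);
  const [smallest, ...rest] = sets;
  return [...smallest].filter((id) => rest.every((set) => set.has(id))).sort();
}

const index = buildIndex([
  { id: 'r1', text: 'offline web applications cache their assets' },
  { id: 'r2', text: 'the application cache stores web assets' },
  { id: 'r3', text: 'full-text search over offline data' },
]);
console.log(search(index, 'web assets'));    // r1 and r2
console.log(search(index, 'offline cache')); // r1 only
```

Each extra query word costs another set intersection (the in-memory analogue of a join), and an exact-word index like this cannot answer prefix or pattern queries at all, which fits the limitation noted above for multi-word and pattern-matching searches.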
Re: Offline Web Applications status
IndexedDB would be more suited to what you're doing, Nathan. I've always seen ApplicationCache as something to only use on the core HTML/JS/CSS and perhaps small images, like icons (none of this would change often, and would generally be rather small), whereas IndexedDB sounds more like what you did with Gears.

Or, there's always the File System API, but I don't think it's widely implemented yet.

Excerpts from Nathan Kitchen's message of Sun Mar 27 08:46:35 +1100 2011:

> A couple of other app cache observations from a hobbyist who's played around with Google's Gears...
>
> I built an offline web application based on Gears, with the intention to migrate to something a bit more standardized as it became available. That was a good two years ago now, but the existing and proposed implementations still don't offer the capability that I can get from Gears.
>
> If you're in Chrome, point your browser at: http://my.offlinebible.com/. My observations are going to be related to this app. I apologise if they're quite specific, but they present a snapshot of some real-world issues encountered while developing an offline app.
>
> - *App Cache*
>
> The first task that the app undertakes is downloading a large amount of offline data. This data includes the usual web page resources, plus a massive amount of data destined for the Gears/WebSQL database. Ideally, I want to keep this in a single easy installation process.
>
> With Gears I can do this entirely from JavaScript. First, the database is set up. Then, AJAX is used to request data in chunks (JSON, about 60Mb, gzipped for bandwidth gets it down to 10Mb iirc). Once it's all requested, it gets dropped into Gears/WebSQL. Finally all the usual web assets (html, css, js, img) are cached. Throw up a progress bar, run everything in a web worker to keep the UI from freezing, done.
>
> With the existing specs, I'd have to do something slightly different. First, I'd have to hit a page with a manifest, and hook the relevant events for when the cache was ready.