Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-26 Thread Nikunj Mehta
What is the minimum that can be in IDB? I am guessing the following:

1. Sorted key-opaque value transactional store
2. Lookup of keys by values (or parts thereof)

#1 is essential.
#2 is unavoidable because you will want to find and manipulate values
efficiently by their contents, not only by their keys.

I know of no efficient way of doing callbacks with JS. Moreover, avoiding
indices completely seems to miss the point. Yes, IDB can be used without key
paths and indices. When you do that, you avoid most of the setVersion
headache, since every version change then only adds or removes an object
store. Also, I had originally floated the idea of application-managed
indices, but implementors thought of it as cruft.

On Sun, Mar 20, 2011 at 3:10 PM, Joran Greef jo...@ronomon.com wrote:


  On 20 Mar 2011, at 4:54 AM, Jonas Sicking wrote:
 
  I don't understand what you are saying about application state though,
  so please do start that as a separate thread.

 At present, there's no way for an application to tell IDB what indexes to
 modify w.r.t. an object at the exact moment when putting or deleting that
 object. That's because this behavior is defined in advance using
 createIndex in a setVersion transaction. And then how IDB extracts the
 referenced value from the object is done using an IDB idea of key paths.
 But right there, in defining the indexes in advance (and not when the index
 is actually modified, which is when the object itself is modified), you've
 captured application state (data relationships that should be known only to
 the application) within IDB. Because this is done in advance (because IDB
 seems to have inherited this assumption that this is just the way MySQL
 happens to do it), there's a disconnect between when the index is defined
 and when it's actually used. And because of key paths you now need to spec
 out all kinds of things like how to handle compound keys, multiple values.
 It's becoming a bit of a spec-fest.

 Because this bubble of state gets captured in IDB, IDB now also needs to
 provide ways of updating that captured state within IDB when it
 changes in the application (which will happen, so essentially you now have
 your indexing logic stuck in the database AND in the application and the
 application developer now has to try and keep BOTH in sync using this
 awkward pre-defined indexes interface), thus the need for a setVersion
 transaction in the first place. None of this would be necessary if the
 application could reference indexes to be modified (and created if they
 don't exist, or deleted if they would then become empty) AT THE POINT of
 putting or deleting an object. Things like data migrations would also be
 better served if this were possible since this is something the application
 would need to manage anyway. Do you follow?

 The application is the right place to be handling indexing logic. IDB just
 needs to provide an interface to the indexing implementation, but not handle
 extracting values from objects or deciding which indexes to modify. That's
 the domain of the application. It's a question of encapsulation. IDB is
 crossing the boundaries by demanding to know ABOUT the data stored, and not
 just providing a simple way to put an object, and a simple way to put a
 reference to an object to an index, and a simple way to query an index and
 intersect or union an index with another. Essentially an object and its
 index memberships need to be completely opaque to IDB and you are doing the
 opposite. Take a look at the BDB interface. Do you see a setVersion or
 createIndex semantic in there?


BDB has secondary databases, which are the same as indices with a one to
many relation between primary and secondary database. Moreover, BDB uses
application callbacks to let the application encapsulate the definition of
the index.
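
To illustrate the callback style described above, here is a minimal in-memory sketch in JavaScript. It is not BDB's or IDB's API: `PrimaryStore`, `associate`, `lookup`, and the `keyCreator` callback are hypothetical names chosen for this example, and it assumes the callback is deterministic so that stale entries can be removed on update.

```javascript
// Hypothetical sketch: a primary store plus secondary indexes whose keys
// are computed by application-supplied callbacks (all names illustrative).
class PrimaryStore {
  constructor() {
    this.records = new Map();     // primary key -> value
    this.secondaries = new Map(); // index name -> { keyCreator, entries }
  }

  // Register a secondary index. keyCreator(value) returns an array of
  // secondary keys, which gives the one-to-many relation.
  associate(name, keyCreator) {
    this.secondaries.set(name, { keyCreator, entries: new Map() });
  }

  put(primaryKey, value) {
    this.delete(primaryKey); // drop stale secondary entries first
    this.records.set(primaryKey, value);
    for (const { keyCreator, entries } of this.secondaries.values()) {
      for (const secondaryKey of keyCreator(value)) {
        if (!entries.has(secondaryKey)) entries.set(secondaryKey, new Set());
        entries.get(secondaryKey).add(primaryKey);
      }
    }
  }

  delete(primaryKey) {
    if (!this.records.has(primaryKey)) return;
    const value = this.records.get(primaryKey);
    for (const { keyCreator, entries } of this.secondaries.values()) {
      for (const secondaryKey of keyCreator(value)) {
        const keys = entries.get(secondaryKey);
        if (!keys) continue;
        keys.delete(primaryKey);
        if (keys.size === 0) entries.delete(secondaryKey);
      }
    }
    this.records.delete(primaryKey);
  }

  // Return the primary keys stored under one secondary key, sorted.
  lookup(name, secondaryKey) {
    const index = this.secondaries.get(name);
    const keys = index ? index.entries.get(secondaryKey) : undefined;
    return keys ? [...keys].sort() : [];
  }
}

const store = new PrimaryStore();
store.associate("byEmail", (user) => user.emails);
store.put("u1", { name: "A", emails: ["a@example.com", "a@example.org"] });
store.put("u2", { name: "B", emails: ["a@example.org"] });
console.log(store.lookup("byEmail", "a@example.org")); // [ 'u1', 'u2' ]
```

The store never inspects the value itself; only the application's callback knows how to derive index keys from it.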


 Take a look at Redis and Tokyo and many other things. Do you see a
 setVersion or createIndex semantic in there? Do these databases have any
 idea about the contents of objects? Any concept of key paths?


I, for one, am not enamored of key paths. However, I am also morbidly aware
of the perils in JS land of using callback-like mechanisms. Certainly, I
would like to hear from developers like you how you would find IDB if you
were not to use createIndex at all, or at least whether you would like to
manage your own indices.


 No, and that's the whole reason these databases were created in the first
 place. I'm sure you have read the BDB papers. Obviously this is not the
 approach of MySQL. But if IDB is trying to be MySQL but saying it wants to
 be BDB then I don't know. In any event, Firefox would be brave to also embed
 SQLite. Let the better API win.

 How much simpler could it be? At the end of the day, it's all objects and
 sets and sorted sets, and see Redis' epiphany on this point. IDB just needs
 to provide transactional access to these sets. The application must decide
 what goes in and out of these sets, and must be able to do 

Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-26 Thread Joran Greef
 On 26 Mar 2011, at 10:14 AM, Nikunj Mehta wrote:
 
 What is the minimum that can be in IDB? I am guessing the following:
 
 1. Sorted key-opaque value transactional store
 2. Lookup of keys by values (or parts thereof)

Yes, this is what we need. In programmer speak: objects (opaque strings), sets 
(hash indexes), sorted sets (range indexes).

 I know of no efficient way of doing callbacks with JS. Moreover, avoiding 
 indices completely seems to miss the point.

Callbacks are unnecessary. This is what you would want to do as a developer 
using the current form of IDB:

objectStore.putObject(
  { name: 'Joran', emails: ['jo...@gmail.com', 'jo...@ronomon.com'] },
  {
    id: 'arbitraryObjectIdProvidedByTheApplication',
    indexes: ['emails=jo...@gmail.com', 'emails=jo...@ronomon.com', 'name=Joran']
  }
);

IDB would then store the user object using the id provided by the application, 
and make sure it's referenced by this id in the emails=jo...@gmail.com, 
emails=jo...@ronomon.com, name=Joran index references provided (creating 
these indexes along the way if need be). The application is responsible for 
passing in the extra id and indexes options to putObject.
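
As a rough illustration of the behavior described above, the following in-memory model shows how such a store might create indexes on demand and drop them once empty. It is only a sketch of the proposal, not the IndexedDB API: the `ObjectStore` class and `getByIndex` are hypothetical, and remembering each object's index memberships inside the store (so that `deleteObject` needs only the id) is a simplifying assumption of this example.

```javascript
// Hypothetical in-memory model of the proposed putObject/deleteObject.
// This is not the IndexedDB API.
class ObjectStore {
  constructor() {
    this.objects = new Map();     // id -> serialized (opaque) object
    this.memberships = new Map(); // id -> index references for that object
    this.indexes = new Map();     // index reference -> Set of ids
  }

  putObject(object, { id, indexes = [] }) {
    this.deleteObject(id); // replace any previous version and memberships
    this.objects.set(id, JSON.stringify(object));
    this.memberships.set(id, indexes);
    for (const ref of indexes) {
      // Create the index on first use.
      if (!this.indexes.has(ref)) this.indexes.set(ref, new Set());
      this.indexes.get(ref).add(id);
    }
  }

  deleteObject(id) {
    for (const ref of this.memberships.get(id) || []) {
      const ids = this.indexes.get(ref);
      if (!ids) continue;
      ids.delete(id);
      if (ids.size === 0) this.indexes.delete(ref); // drop empty indexes
    }
    this.memberships.delete(id);
    this.objects.delete(id);
  }

  // Return the objects referenced by one index reference.
  getByIndex(ref) {
    const ids = this.indexes.get(ref) || new Set();
    return [...ids].map((id) => JSON.parse(this.objects.get(id)));
  }
}

const store = new ObjectStore();
store.putObject({ name: "Joran" }, { id: "user1", indexes: ["name=Joran"] });
console.log(store.getByIndex("name=Joran")); // [ { name: 'Joran' } ]
```

The store treats both the object and the index references as opaque strings; deciding which references apply to an object stays entirely with the application.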

Supporting range indexes would be a question of expanding the above to let the 
developer pass in a sort score along with the index reference.
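
A sort score per index reference could look roughly like the sketch below. The `SortedSetIndex` class and its method names are hypothetical, loosely modeled on the idea of a sorted set. For brevity it sorts on every query; a real implementation would presumably keep an ordered structure such as a B-tree or skip list.

```javascript
// Hypothetical sorted-set index: members ordered by an application score.
class SortedSetIndex {
  constructor() {
    this.scores = new Map(); // member id -> score
  }

  add(id, score) {
    this.scores.set(id, score);
  }

  remove(id) {
    this.scores.delete(id);
  }

  // Ids with min <= score <= max, ascending by score, then by id.
  rangeByScore(min, max) {
    return [...this.scores]
      .filter(([, score]) => score >= min && score <= max)
      .sort(([idA, a], [idB, b]) => a - b || (idA < idB ? -1 : idA > idB ? 1 : 0))
      .map(([id]) => id);
  }
}

const byAge = new SortedSetIndex();
byAge.add("u1", 31);
byAge.add("u2", 25);
byAge.add("u3", 40);
console.log(byAge.rangeByScore(20, 35)); // [ 'u2', 'u1' ]
```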

 Also, I had originally floated the idea of application-managed indices, 
 but implementors thought of it as cruft.

I can understand how application managed indices would lead to less work on the 
part of the spec committee. There seems to be some perverse human 
characteristic that likes to make easy things difficult. Ships will sail around 
the world but the Flat Earth Society will flourish.

 I, for one, am not enamored of key paths. However, I am also morbidly aware 
 of the perils in JS land of using callback-like mechanisms. Certainly, I 
 would like to hear from developers like you how you would find IDB if you 
 were not to use createIndex at all, or at least whether you would like to 
 manage your own indices.

I am begging to be able to manage my indices. I know my data. I do not want to 
use any createIndex to declare indexes in advance of when I may or may not use 
them. What advantage would that give me? I want to create/update indexes only 
when I put or delete objects and I want to have control over which indexes to 
update accordingly. With one small change to the putObject and deleteObject 
interfaces, in the form of the indexes option, we can make that possible.

We need these primitives in IDB: opaque strings, sets, sorted sets. Ideally, 
IDB need simply store these things and provide the standard interfaces (see 
Redis) to them along with a transactional mechanism. That's the perfect 
low-level API on which to build almost any database wrapper.
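
With sets as the primitive, compound queries reduce to set operations on index memberships. A minimal sketch, assuming each index is available to the application as a JavaScript `Set` of object ids (the data and function names here are made up for illustration):

```javascript
// Index memberships as plain Sets of object ids (hypothetical data).
const byName = new Set(["u1", "u2", "u3"]);
const byDomain = new Set(["u2", "u3", "u4"]);

// Ids present in both sets; iterate the smaller set for fewer lookups.
function intersect(a, b) {
  const [small, large] = a.size <= b.size ? [a, b] : [b, a];
  return new Set([...small].filter((id) => large.has(id)));
}

// Ids present in either set.
function union(a, b) {
  return new Set([...a, ...b]);
}

console.log([...intersect(byName, byDomain)]); // [ 'u2', 'u3' ]
console.log(union(byName, byDomain).size); // 4
```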


Re: Offline Web Applications status

2011-03-26 Thread David John Burrowes
 2011/3/24 louis-rémi BABE lrb...@gmail.com
 ## Maybe Web devs don't use App Cache because they don't understand
 what it is... ##
 
 I think most webdevs are expecting more than what is offered. It seems like a 
 half baked solution to a potentially useful requirement.

I thought I'd add half a cent here, from the perspective of one who isn't a 
professional web developer... just a hobbyist.

When I heard about the app cache, it seemed like a really great thing. Offline 
web apps! Cool! A way for the web to become even more ubiquitous!

But, as the comment above hints, it really doesn't seem to be the full delivery 
of the solution (even when you get past the browser differences, setting up of 
MIME types, debugging all this, etc).  An offline web app is certainly more 
than just caching the code and UI files, no?  It is also some kind of stand-in 
for the absent server... data storage, and cross-page state of some sort (e.g. 
I'd expected something like web workers that can live for a session, not a 
page).  These aren't all coming together at the same time, and aren't really 
being presented as a unified feature (indeed, I'm not sure that they are 
being thought of as that).

I'm sure that as HTML continues its forward evolution, these will all come 
into play and we'll eventually see more use of the feature.

David




Re: Offline Web Applications status

2011-03-26 Thread Nathan Kitchen
A couple of other app cache observations from a hobbyist who's played around
with Google's Gears...

I built an offline web application based on Gears, with the intention to
migrate to something a bit more standardized as it became available. That
was a good two years ago now, but the existing and proposed implementations
still don't offer the capability that I can get from Gears.

If you're in Chrome, point your browser at: http://my.offlinebible.com/. My
observations are going to be related to this app. I apologise if they're
quite specific, but they present a snapshot of some real-world issues
encountered while developing an offline app.

   - *App Cache*
   The first task that the app undertakes is downloading a large amount of
   offline data. This data includes the usual web page resources, plus a
   massive amount of data destined for the Gears/WebSQL database.

   Ideally, I want to keep this in a single easy installation process. With
   Gears I can do this entirely from JavaScript. First, the database is set up.
   Then, AJAX is used to request data in chunks (JSON, about 60 MB; gzipping for
   bandwidth gets it down to 10 MB, iirc). Once it's all requested, it gets
   dropped into Gears/WebSQL. Finally all the usual web assets (html, css, js,
   img) are cached. Throw up a progress bar, run everything in a web worker to
   keep the UI from freezing, done.

   With the existing specs, I'd have to do something slightly different.
   First, I'd have to hit a page with a manifest, and hook the relevant events
   for when the cache was ready. If I want to offer the user different offline
   capabilities (i.e. customize what's cached), afaik I can't do that from
   JavaScript. It requires some server-side code/processing to output a
   different manifest.

   Once the cache was ready, I could carry on with the existing
   installation. However, I don't (think I) have the script-side control over
   adding and removing items from the cache to customize the user's offline
   experience.

   - *Search*
   This is the use-case that's most important to me. I want to be able to
   search all the data which I took offline. My current implementation is built
   using manually indexed items and joins. In theory I could use the full-text
   capabilities of the underlying SQLite Gears implementation, but this was a
   step too proprietary for me. I built all the data indexes to see what
   performance was like. Throw some words into the search capability, you'll
   see how long it takes to search. It's *fairly* quick but there's a slight
   lag (which locks the UI, it's synchronous ATM).

   I know full-text indexing is on the cards for IndexedDB. I'd love to see
   some sample implementations of full-text to compare speed against the manual
   index. For single words there might not be too much difference, but for more
   complex multi-word or pattern-matching queries, the manual index is too
   slow or won't work at all.
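
For readers unfamiliar with the "manual index" approach mentioned above, the sketch below shows one common shape it can take: an inverted index from words to record ids, with multi-word queries answered by intersecting the per-word sets. This is an assumption about the general technique, not the code of the app described here; `tokenize`, `buildIndex`, and `search` are hypothetical names, and the tokenizer only handles ASCII letters and digits.

```javascript
// Minimal inverted index: word -> Set of document ids (illustrative only).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// docs is a plain object mapping document id -> text.
function buildIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    for (const word of tokenize(text)) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(id);
    }
  }
  return index;
}

// Ids of documents containing every query word (AND semantics), sorted.
function search(index, query) {
  const words = tokenize(query);
  if (words.length === 0) return [];
  const postings = words.map((w) => index.get(w) || new Set());
  postings.sort((a, b) => a.size - b.size); // start from the rarest word
  let result = [...postings[0]];
  for (const set of postings.slice(1)) {
    result = result.filter((id) => set.has(id));
  }
  return result.sort();
}

const index = buildIndex({
  d1: "Offline web applications",
  d2: "Offline data storage",
  d3: "Web storage",
});
console.log(search(index, "offline web")); // [ 'd1' ]
console.log(search(index, "storage"));     // [ 'd2', 'd3' ]
```

Exact-word lookups like these are cheap, which matches the observation above; prefix or pattern matching would need a different structure, since a hash-style word map cannot answer them without scanning every key.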

I don't think that my scenario is particularly unusual. Taking a large
amount of data offline and making it available to search seems like a pretty
common use-case. To support this, there are three capabilities which I'd
like to see:

   - Script access to add or remove items from the application cache,
   e.g. something like document.manifest.add();
   - Batch operations (or support for adding a lot of similar data as
   quickly as possible - this takes ages if you add each record as a single
   transaction)
   - Full-text search on data
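
On the batch point, one way to avoid a transaction per record in IndexedDB is to queue every write inside a single transaction. The sketch below uses the API as it was later standardized (`db.transaction(storeName, "readwrite")`, `objectStore.put`); the drafts current at the time of this thread may differ in detail. `putAll` is a hypothetical helper name, and it assumes the object store uses in-line keys (a key path) or a key generator, so that `put` needs no explicit key.

```javascript
// Store many records in one readwrite transaction instead of one
// transaction per record. `db` is an already-open IDBDatabase.
// Assumes the store has a keyPath or autoIncrement, so put() needs no key.
function putAll(db, storeName, records) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(storeName, "readwrite");
    const store = tx.objectStore(storeName);
    for (const record of records) {
      store.put(record); // queued; committed together with the transaction
    }
    tx.oncomplete = () => resolve(records.length);
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}
```

A caller would combine this with chunked downloads, calling `putAll` once per downloaded chunk so that each chunk costs a single commit.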

I'm looking forward to this coming together eventually, might be worth an
IndexedDB implementation soon : )

On 24 March 2011 05:53, David John Burrowes
bain...@davidjohnburrowes.com wrote:

 2011/3/24 louis-rémi BABE lrb...@gmail.com

 ## Maybe Web devs don't use App Cache because they don't understand
 what it is... ##


 I think most webdevs are expecting more than what is offered. It seems like
 a half baked solution to a potentially useful requirement.



Re: Offline Web Applications status

2011-03-26 Thread Jack Coulter
IndexedDB would be better suited to what you're doing, Nathan. I've always seen
ApplicationCache as something to use only for the core HTML/JS/CSS and perhaps
small images, like icons (none of which would change often, and all of which
would generally be rather small), whereas IndexedDB sounds more like what you
did with Gears.

Or, there's always the File System API, but I don't think it's widely
implemented yet.

Excerpts from Nathan Kitchen's message of Sun Mar 27 08:46:35 +1100 2011:
 A couple of other app cache observations from a hobbyist who's played around
 with Google's Gears...
 