Re: Offline Web Applications status

2011-03-26 Thread Jack Coulter
IndexedDB would be more suited to what you're doing, Nathan. I've always seen
ApplicationCache as something to use only on the core HTML/JS/CSS and perhaps
small images, like icons (none of this would change often, and it would
generally be rather small), whereas IndexedDB sounds more like what you did
with Gears.

Or, there's always the File System API, but I don't think it's widely
implemented yet.
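
On the batch-insert point in the quoted message: a rough sketch of how a large
import might be grouped so that each batch shares one IndexedDB transaction
instead of paying for one transaction per record. It assumes the IndexedDB API
as currently standardised and an already-open database. The helper names
(chunk, putBatch, importAll) and the store name are made up for illustration,
not taken from any existing app.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Write one batch of records inside a single read-write transaction.
// `db` is assumed to be an already-open IDBDatabase; `storeName` is a
// hypothetical object store created during an earlier upgrade step.
function putBatch(db, storeName, records) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(storeName, "readwrite");
    const store = tx.objectStore(storeName);
    for (const record of records) {
      store.put(record);
    }
    // The transaction commits once all queued requests have finished.
    tx.oncomplete = () => resolve(records.length);
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}

// Import everything batch by batch, reporting progress after each batch.
async function importAll(db, storeName, records, batchSize, onProgress) {
  let done = 0;
  for (const batch of chunk(records, batchSize)) {
    done += await putBatch(db, storeName, batch);
    if (onProgress) onProgress(done, records.length);
  }
  return done;
}
```

Something like importAll(db, "verses", rows, 500, updateProgressBar) would
then drive a progress bar; the batch size of 500 is a guess and would need
measuring.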

Excerpts from Nathan Kitchen's message of Sun Mar 27 08:46:35 +1100 2011:
> A couple of other app cache observations from a hobbyist who's played around
> with Google's Gears...
> 
> I built an offline web application based on Gears, with the intention to
> migrate to something a bit more standardized as it became available. That
> was a good two years ago now, but the existing and proposed implementations
> still don't offer the capability that I can get from Gears.
> 
> If you're in Chrome, point your browser at: http://my.offlinebible.com/. My
> observations are going to be related to this app. I apologise if they're
> quite specific, but they present a snapshot of some real-world issues
> encountered while developing an offline app.
> 
> - *App Cache*
> The first task that the app undertakes is downloading a large amount of
> offline data. This data includes the usual web page resources, plus a
> massive amount of data destined for the Gears/WebSQL database.
>
> Ideally, I want to keep this in a single easy installation process. With
> Gears I can do this entirely from JavaScript. First, the database is set
> up. Then, AJAX is used to request data in chunks (JSON, about 60MB; gzipping
> for bandwidth gets it down to 10MB iirc). Once it's all requested, it gets
> dropped into Gears/WebSQL. Finally all the usual web assets (html, css, js,
> img) are cached. Throw up a progress bar, run everything in a web worker to
> keep the UI from freezing, done.
> 
> With the existing specs, I'd have to do something slightly different.
> First, I'd have to hit a page with a manifest, and hook the relevant events
> for when the cache was ready. If I want to offer the user different offline
> capabilities (i.e. customize what's cached), afaik I can't do that from
> JavaScript. It requires some server-side code/processing to output a
> different manifest.
>
> Once the cache was ready, I could carry on with the existing
> installation. However, I don't (think I) have the script-side control over
> adding and removing items from the cache to customize the user's offline
> experience.
> 
> - *Search*
> This is the use-case that's most important to me. I want to be able to
> search all the data which I took offline. My current implementation is built
> using manually indexed items and joins. In theory I could use the full-text
> capabilities of the underlying SQLite Gears implementation, but this was a
> step too proprietary for me. I built all the data indexes to see what
> performance was like. Throw some words into the search capability, you'll
> see how long it takes to search. It's *fairly* quick but there's a slight
> lag (which locks the UI, it's synchronous ATM).
>
> I know full-text indexing is on the cards for IndexedDB. I'd love to see
> some sample implementations of full-text to compare speed against the manual
> index. For single words there might not be too much difference, but for more
> complex multi-word or pattern-matching queries, the manual index is too
> slow/won't work.
> 
> I don't think that my scenario is particularly unusual. Taking a large
> amount of data offline and making it available to search seems like a pretty
> common use-case. To support this, there are three capabilities which I'd
> like to see:
> 
> - Script access to add or remove items from the application cache -
>   document.manifest.add("");
> - Batch operations (or support for adding a lot of similar data as
>   quickly as possible - this takes ages if you add each record as a single
>   transaction)
> - Full-text search on data
> 
> I'm looking forward to this coming together eventually, might be worth an
> IndexedDB implementation soon : )
> 
> On 24 March 2011 05:53, David John Burrowes
> wrote:
> 
> > 2011/3/24 louis-rémi BABE 
> >
> >> ## Maybe Web devs don't use App Cache because they don't understand
> >> what it is... ##
> >>
> >
> > I think most webdevs are expecting more than what is offered. It seems like
> > a half baked solution to a potentially useful requirement.
> >
> >
> > I thought I'd add half a cent here, from the perspective of one who isn't a
> > professional web developer... just a hobbyist.
> >
> > When I heard about the app cache, it seemed like a really great thing.
> > Offline web apps! Cool! A way for the web to become even more ubiquitous!
> >
> > But, as the comment above hints, it really doesn't seem to be the full
> > delivery of the solution (even when you get past the browser differ

Re: Limited DOM in Web Workers

2011-01-08 Thread Jack Coulter
Excerpts from Boris Zbarsky's message of Sun Jan 09 10:42:46 +1100 2011:
> On 1/8/11 4:07 AM, Jack Coulter wrote:
> You're assuming that none of the DOM implementation code uses any sort
> of non-DOM objects, ever, or that if it does those objects are fully
> threadsafe.  That's just not the case, at least in Gecko.
>
> The issue in this case is not the same DOM object being touched on
> multiple threads.  The issue is two DOM objects on different threads
> both touching some global third object.
>
> For example, the XML parser has to do some things that in Gecko can only
> be done on the main thread (DTD loading, offhand; there are a few others
> that I've seen before but don't recall offhand).
>

Ah, I didn't understand this before, thanks for the clarification.

> > I know of E4X, and while I think it's a really nice language feature, the
> > lack of non-Gecko support makes it substantially less useful.
>
> Well... so we're comparing a feature that's supported in Gecko but not
> other UAs to a feature that's not supported in any UA, right?  ;)
>
> (Fwiw, I think the way E4X was actually done is insane; heck it
> redefines what the |x.y()| syntax means! But perhaps some other API
> along those lines that doesn't actually create DOM nodes with all their
> weird behaviors (e.g. if you create an  it tries to load things off
> the network) and instead just parses XML into objects exposed to JS
> would be a better fit for workers.)

I agree this would probably be the best approach. We need to find or create
some API for *purely* manipulating/parsing/serialising XML documents, with no
loading of resources as happens with the DOM. This is preferable to a
JavaScript-based parser, both for developer ease (a single native
implementation, rather than a whole bunch of different JavaScript libraries)
and for speed.


The real question is: do we want to create something new, perhaps at least
superficially resembling the DOM API for developer familiarity? Or do we
simply want to have E4X universally supported, both in workers and in
the main thread?



Re: Limited DOM in Web Workers

2011-01-08 Thread Jack Coulter
> I would strongly advice using e4x. It seems unlikely to be picked up
> by other browsers, and I'm still hoping that we'll remove support from
> gecko before long.

I assume you meant to say "advise *against*"?

> My question is instead, what part of the DOM is it that you want? One
> of the most important features of the DOM is modifying what is being
> displayed to the user. Obviously that isn't the feature requested
> here. Another important feature is simply holding a tree structure.
> However plain javascript objects do that very well (better than the
> DOM in many ways).
>
> Other features of the DOM include form handling, parsing attribute
> values in the form of integers, floats, comma-separated lists, etc,
> URL resolving and more. Much of this doesn't seem very interesting to
> do on workers, or at least important to have the browser provide an
> implementation for in workers.
> 
> Hence I'm asking, why specifically would you like to access a DOM from
> workers?

Really, only two sections: DOMParser, and holding and manipulating the
tree (appendChild/removeChild/createElement/createTextNode, etc). The
goal here is to allow workers to parse/serialise/manipulate XML with
the same power and flexibility we have with the native JSON parser.
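
To make the "holding and manipulating the tree" half concrete, here is a
minimal sketch of an XML tree kept in plain JavaScript objects, along the
lines suggested in the quoted message. It is not a DOM and not a proposed API:
the node shape and the function names (borrowed from the DOM for familiarity)
are assumptions for illustration, and it covers only elements, attributes and
text. The parsing half is exactly the part that still needs a native
implementation.

```javascript
// Node shapes (hypothetical, not a DOM):
//   element: { name, attrs: { ... }, children: [ ... ] }
//   text:    a plain string

function createElement(name, attrs = {}, children = []) {
  return { name, attrs: { ...attrs }, children: [...children] };
}

function appendChild(parent, child) {
  parent.children.push(child);
  return child;
}

function removeChild(parent, child) {
  const i = parent.children.indexOf(child);
  if (i === -1) throw new Error("not a child of this element");
  parent.children.splice(i, 1);
  return child;
}

// Escape the characters that are unsafe in XML text and attribute values.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn a tree back into an XML string, depth first.
function serialise(node) {
  if (typeof node === "string") return escapeXml(node);
  const attrs = Object.entries(node.attrs)
    .map(([k, v]) => ` ${k}="${escapeXml(v)}"`)
    .join("");
  if (node.children.length === 0) return `<${node.name}${attrs}/>`;
  const inner = node.children.map(serialise).join("");
  return `<${node.name}${attrs}>${inner}</${node.name}>`;
}

// Usage: build a small XMPP-style message stanza (the address is made up).
const msg = createElement("message", { to: "nathan@example.org", type: "chat" });
const body = appendChild(msg, createElement("body"));
appendChild(body, "Tom & Jerry <3");
console.log(serialise(msg));
// prints <message to="nathan@example.org" type="chat"><body>Tom &amp; Jerry &lt;3</body></message>
```

Because the tree is only objects, arrays and strings, it also survives
JSON.stringify/JSON.parse, so it could cross a postMessage boundary without a
separate serialisation step.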



Re: Limited DOM in Web Workers

2011-01-08 Thread Jack Coulter
Excerpts from Boris Zbarsky's message of Sat Jan 08 14:34:14 +1100 2011:
> On 1/7/11 2:29 PM, Jack Coulter wrote:
> > I'm not talking about allowing workers to manipulate the main DOM tree of
> > the page, but rather, exposing DOMParser, and XMLHttpRequest.responseXML,
> > and a few other objects to workers, to allow the manipulation of DOM trees
> > which are never actually rendered to the page.
> 
> Whether they're rendered doesn't necessarily matter if the DOM 
> implementation is not threadsafe (which it's not, in today's UAs).  That 
> said...
> 

Sorry, I wasn't really clear. What I meant was, a private DOM hierarchy. You
still wouldn't be able to access it in multiple places simultaneously, and
you'd still have to serialise it to a string to use it in postMessage. Forgive
my ignorance, but if this were the case, then isn't the thread-safety issue
effectively sidestepped? 

> > This would allow developers to parse and manipulate XML in workers, freeing
> > the main thread of a page to perform other tasks.
> ...
> 

Why '...'? Did I say something in error here?

> > As an example use-case, I'd like to hack on the Strophe.js XMPP
> > implementation to allow it to run in a worker thread. Currently this is
> > impossible without writing my own XML parser, which would undoubtedly
> > be slower than the native DOMParser.
> 
> If you think you could do this with your own XML parser, is there a 
> reason you can't do it with e4x (I never thought I'd say that, but this 
> seems like an actually good use case for something like e4x)?  That 
> should work fine in workers in Gecko-based browsers that support it, and 
> doesn't drag in the entire DOM implementation.

I know of E4X, and while I think it's a really nice language feature, the lack
of non-Gecko support makes it substantially less useful.

> That leaves the problem of convincing developers of those ECMAScript 
> implementations that don't support e4x to support it, of course; while 
> things like http://code.google.com/p/v8/issues/detail?id=235#c42 don't 
> necessarily fill me with hope in that regard it may still be simpler 
> than convincing all browsers to rewrite their DOMs to be threadsafe in 
> the way that would be needed to support exposing an actual DOM in workers.

Heh, some coincidence, I was actually reading through this very thread earlier
today. After thinking about it, I'd say that E4X would be the best solution for
XML in Workers, but it would need to be supported more widely.



Limited DOM in Web Workers

2011-01-07 Thread Jack Coulter
Hi,

I have a proposal of sorts, regarding Workers. As we all know, there's no
access to the DOM from within a Web Worker. While this is ideal for security
purposes, I can't help but think a restricted subset of the available DOM
manipulation methods would be incredibly useful.

I'm not talking about allowing workers to manipulate the main DOM tree of
the page, but rather, exposing DOMParser, and XMLHttpRequest.responseXML,
and a few other objects to workers, to allow the manipulation of DOM trees
which are never actually rendered to the page.

This would allow developers to parse and manipulate XML in workers, freeing
the main thread of a page to perform other tasks.

A possible counter-argument may be "Why not just use JSON instead?" While I
agree that JSON is the easiest method of serialising and parsing objects,
sometimes developers must work with data from a source which only provides
XML.


As an example use-case, I'd like to hack on the Strophe.js XMPP
implementation to allow it to run in a worker thread. Currently this is
impossible without writing my own XML parser, which would undoubtedly
be slower than the native DOMParser.

Any thoughts?


Regards,
Jack Coulter