Re: Allow ... centralized dialog up front

2013-02-06 Thread Keean Schupke
Something is better than nothing, and both the iPhone and Android systems
are better than not asking the user at all. The principle of security in
depth is that you don't rely on a single security feature that may be
flawed, but have a multi layered approach to security.

I think that giving a user control of what information is released is a
necessary part of that model, and I cannot see anything in the documents
you link to that contradicts that. Are you suggesting that users should be
denied control of their information? And what about users in sensitive
environments (politicians, security workers, etc.) who understand privacy?
Are we suggesting we deny them the ability to control access?

For me personally, being able to see all the information an application
will access and be able to return to this information and edit my
preferences at any time seems the best way to do this. Visibility is key -
you must be able to see what permissions have been granted so they can be
revoked, or just to reassure yourself that certain permissions are denied.
Others may have different ideas; however, nearly all online services are
moving to this model, so it must offer some benefits. This model is also essential in
any kind of enterprise setting where the IT department will want to audit
and approve apps for use.


Cheers,
Keean.



On 6 February 2013 11:03, Robin Berjon ro...@w3.org wrote:

 On 06/02/2013 08:36 , Keean Schupke wrote:

 I don't think you can say either an up front dialog or popups do not
 work. There are clear examples of both working, Android and iPhone
 respectively. Each has a different set of trade-offs and is better in
 some circumstances, worse in others.


 If by "working" you mean that it is technically feasible and will provide
 developers with access to features, then sure.

 If however you mean that it succeeds in protecting users against agreeing
 to escalate privileges to malicious applications then, no, it really,
 really does not work at all.

 Security through user prompting is sweeping the problem under the rug.
 Usually this is the point at which someone will say "but we have to
 *educate* the users!". No. We don't. Users don't want to be educated, and
 they shouldn't have to be. We're producing technology for *user* agents. It
 is *our* responsibility to ensure that users remain safe, as much as
 possible even against their own mistakes.

 And I'm sorry to go all Godwin on you, but the prompting approach is the
 Java applet security model all over again. Let's just not go back there,
 shall we?

 It's not as if this debate hasn't been had time and over again. See (old
 and unfinished):


 http://darobin.github.com/api-design-privacy/api-design-privacy.html#privacy-enhancing-api-patterns

 That includes a short discussion of why the Geolocation model is wrong.
 All of this has been extensively discussed in the DAP WG, as well as IIRC
 around the Web Notifications work. There have been a few attempts to work
 out the details (tl;dr they don't fly):

 
  http://w3c-test.org/dap/proposals/request-feature/
  http://dev.w3.org/2009/dap/docs/feat-perms/feat-perms.html

 That's one of the reasons we have a SysApps WG today. As it happens,
 they're working on a security model, too.

 This is not to say that declaring required privileges cannot be useful.
 There certainly are cases in which it can integrate into a larger system.
 But that larger system isn't upfront prompting.


 --
 Robin Berjon - http://berjon.com/ - @robinberjon





Re: Allow ... centralized dialog up front

2013-02-05 Thread Keean Schupke
I don't think you can say either an up front dialog or popups do not work.
There are clear examples of both working, Android and iPhone respectively.
Each has a different set of trade-offs and is better in some circumstances,
worse in others.

In my opinion an API should allow for both, so that the user experience can
be in line with platform standards. It is clear that iPhone users with
Safari will expect one behavior, and Android users with Chrome will expect another.

In order to allow an up front dialog, permissions need to be declared up
front, preferably statically in markup (as you would in a manifest file
for an Android application). Browsers would still be able to delay asking
the user for permission until the first use. The alternative, requiring
browsers wishing to use an up front dialog to do some kind of code
scanning to detect calls to services needing permissions, seems unreliable
and complex. Static permission declarations in markup seem a much simpler
and more lightweight solution, with other benefits such as searchability
and app stores being able to read and display the requirements.
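In code, the idea sketches roughly as follows. This is a hypothetical example only: neither the tag name nor the attribute format here is a real specification; it just shows that a static declaration can be read without executing any application code, which is what makes it usable by browsers, crawlers, and app stores alike.

```javascript
// Hypothetical static declaration in the page markup:
const html = '<meta name="permissions" content="geolocation camera">';

function declaredPermissions(markup) {
  // A crawler or app store could apply the same parse to static markup.
  const m = markup.match(/name="permissions"\s+content="([^"]*)"/);
  return m ? m[1].trim().split(/\s+/) : [];
}
```

A browser could read the same declaration at page load and defer the actual prompt until first use.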

Cheers,
Keean.
On 6 Feb 2013 05:44, Charles Pritchard ch...@jumis.com wrote:

 This direction of placing permissions in the site info expansion in Chrome
 feels right. That spot where they show whether an SSL cert is valid or
 expired.

 Now I can easily see cookies and flip various settings in one click as I
 look at site info.

 I've been working on a web app where I don't need any upfront permissions,
 but the user can elect to elevate to clipboard, XSS and a high disk quota.
 I've certainly felt the cost of multiple dialogs vs a one-time grant
 everything prompt.



 On Feb 5, 2013, at 5:09 PM, Charles McCathie Nevile 
 cha...@yandex-team.ru wrote:

 TL;DR: Being able to declare the permissions that an app asks for might be
 useful. User agents are and should continue to be free to innovate in the
 ways they present the requests to the user, because a block dialogue isn't a
 universal improvement on current practice (which in turn isn't the same
 everywhere).

 On Mon, 04 Feb 2013 01:35:43 +0100, Florian Bösch pya...@gmail.com
 wrote:

 So how exactly do you imagine this going down when an application that
 uses half a dozen such capabilities starts? Clicking through half a dozen
 allow - allow - allow - allow - allow - allow, do you really think the
 user's gonna bother with what the fifth or sixth allow is about?


 Where there are multiple permissions required, the way to ensure user
 attention isn't as simple as a list that doesn't get read, with a single
 button clicked by reflex, or multiple buttons to be clicked by reflex
 without reading.

 At least that seems to be what the research shows.

 You'll end up annoying the user, the developer and scaring people off a
 page. Somehow I can't see that as the function of new capabilities you can
 offer on a page. Furthermore, some capabilities (like pointerlock) actively
 interfere with the idea that when you need it you can click it (such as the
 concept of pointer-lock-drag which requests pointerlock on mousedown and
 releases it on mouseup) where your "click it when you need it" idea will
 always fail on the first usage.


 This may be true. But pointer-lock is an example of something that needs
 the entire UX to be thought through. Simply switching from one to the other
 without the user knowing is also poor UX, since it risks making the user
 think their system is broken. Add to this a user working with e.g.
 mousekeys, or a magnifier at a few hundred percent plus high-contrast.

 The problems are not simple, and it is unlikely the solutions will be
 either. Ian's claim that everything can be done seamlessly without making
 it seem like a security dialog may be over-confident, and as Robin points
 out the first UI developed (well, the second actually) might not be the
 best approach in the long run, but it is certainly the direction we should
 be aiming in.

 So where are we? The single up front dialogue doesn't work. We know
 that. Multiple contextual requests go from being effective to being
 counter-productive at some magic tipping point that is hard to predict.

 To take an example, let's say I have a chat application that can use
 web-cam and geolocation. Some user agents might decide to put the
 permissions up front when you first load the app. And some users will be
 fine with that. Some will be happy to let it use geolocation when it wants,
 but will want to turn the camera on and off explicitly (note that Skype -
 one of the best-known video chat apps there is - allows this as a matter of
 course. I don't know of anyone who has ever complained).

 Some app stores might refuse to offer the service unless you have
 already accepted that you will let any app from the store use geolocation
 and camera. Others will be quite happy with a user agent that (like skype -
 or Opera) puts the permissions interface in front of the user to modify at
 will. And there are various 

Re: Allow ... centralized dialog up front

2013-02-02 Thread Keean Schupke
I would like the permissions to be changeable. Not a one time dialog that
appears and irrevocably commits me to my choices, but a page with
enable/disable toggles where I can return, review the permissions, and
change them at any time.

How about this: instead of a request API, the required permissions are
declared in tags so they can be machine-readable on page load.

The browser can read the required permissions tags at page load and create
a settings page for the app where each permission can be toggled.

This has the advantage that search engines etc. can include permission
requirements in searches. (I want a diary app that does not use my
camera...)
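A minimal sketch of the browser-side half of this proposal (all names are invented for illustration): read the declared permissions once and build a per-site settings model with one toggle per permission, which the review page can change at any time.

```javascript
// Build a settings model from the declared permission tags.
function buildSettings(declared) {
  const settings = {};
  for (const name of declared) settings[name] = false; // default: denied
  return settings;
}

const site = buildSettings(['camera', 'geolocation']);
site.geolocation = true; // user flips one toggle on the review page
```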

Cheers,
Keean.

On 2 Feb 2013 09:09, Florian Bösch pya...@gmail.com wrote:

I do not particularly care what research you will find to support the
UI-flow that the existence of a requestAPIs API will eventually give rise
to. I say simply this: the research presented, and pretty much common
sense as well, easily shows that the current course is foolhardy and
ungainly for both user and developer.


On Sat, Feb 2, 2013 at 3:37 AM, Charles McCathie Nevile 
cha...@yandex-team.ru wrote:

 On Fri, 01 Feb 2013 15:29:16 +0100, Florian Bösch pya...@gmail.com
 wrote:

 Repetitive permission dialog popups at random UI-flows will not solve the
 permission fatigue any more than a centralized one does. However a
 centralized permission dialog will solve the following things fairly
 effectively:

 - repeated popup fatigue


 Sure. And that is valuable in principle.

 - extension of trust towards a site regardless of what they ask for (do I
 trust that indie game developer? Yes! Do I trust Google? No! Or vice versa)


 I don't think so. As Adrienne said, and as I have experienced myself, without
 understanding what the permission is for, trust can be reduced as easily as
 increased.

  - make it easy for developers not to add UI-flows into their application
 leading to things the user didn't want to give. (Do we want a menu entry
 "save to local storage" if the user checked off the box to allow local
 storage? I think not.)


 - make it easy for developers to not waste users time by pretending to
 have a working application, which requires things the user didn't want to
 give. (Do we really want to offer our geolocated, web-camera using chat app
 to users who didn't want to give permission to either? I think not. Do
 we want to make him find that out after he's been entering our UI-flow and
 pressing buttons for 5 minutes? I think not.)

 These are not so clear. As a user, I *do* want to have applications to
 which I will give, and revoke, at my discretion, certain rights. Twitter
 leaps to mind as something that wants access to geolocation, something I
 occasionally grant for specific requests but blanket refuse in general.
 The hypothetical example you offer is something that in general it seems
 people are happy to offer to a user who has turned off both capabilities.

 I think the ability for a page to declare permission requests in a
 standard way, the same as applications and extensions, is worth pursuing,
 because there are now a number of vendors using stuff that seems to only
 differ by syntax.

 The user agent presentation is a more complex question. I believe there is
 more research done and being done than you seem to credit, and as Hallvord
 said, I think this is an area where users evolve too.

 For the reasons outlined already in the thread I don't think an
 Android-style "here are all the requests" dialog is as good a solution in
 practice as it seems, and there is a need for continued research as well as
 implementations we can test.

 cheers

 Chaals





 On Fri, Feb 1, 2013 at 3:22 PM, Charles McCathie Nevile 
 cha...@yandex-team.ru wrote:

 On Fri, 01 Feb 2013 15:16:04 +0100, Florian Bösch pya...@gmail.com
 wrote:

 On Fri, Feb 1, 2013 at 3:02 PM, Adrienne Porter Felt 
 adriennef...@gmail.com wrote:

 My user research on Android found that people have a hard
 time connecting upfront permission requests to the application feature that
 needs the permission. This meant that people have no real basis by which to
 allow or deny the request, except for their own supposition.  IMO, this
 implies that the better plan is to temporally tie the permission request to
 the feature so that the user can connect the two.

 In some circumstances this works, in others, it does not. Consider that
 not every capability has a UI-flow, and that some UI flows are fairly
 obscure. More often than not a page will initiate a flurry of permission
 dialogs up front to get it out of the way. Some of the UI-flows to use a
 capability happen deep inside an application activity and can be severely
 distracting, or crippling to the application.

 If a developer wants to use the blow-by-blow popup dialogs, he can still
 do so by simply not calling an API to get done with the business up front.
 But for those who know their application will not work without features X,
 Y, Z, A, B and C there is 

Re: Allow ... centralized dialog up front

2013-02-02 Thread Keean Schupke
There are benefits to the user, in that it allows all permissions to be
managed from one place.

I am not sure I like the idea of making the popups an application thing. I
think it should be decided by the browser. In any case you would still need
the ...Allow callbacks as the user may have gone to the permission
review/edit page and disabled some permissions since the app started.

Cheers,
Keean.

 On 2 Feb 2013 10:27, Florian Bösch pya...@gmail.com wrote:

 On Sat, Feb 2, 2013 at 11:16 AM, Keean Schupke ke...@fry-it.com wrote:

 I think a static declaration is better for security, so if a permission
 is not there I don't think it should be allowed to request it later. Of
 course how this is presented to the user is entirely separate, and the UI
 could defer the request until the first time the restricted feature is
 used, or allow all permissions that might be needed to be reviewed and
 enabled/disabled in one place.

 That kills any benefit a developer could derive. The very idea is that you
 can figure out up front what your user is gonna let you do, and take
 appropriate steps (not adding parts of the UI, presenting a suitable
 message that the application won't work etc.) as well as that if a user has
 agreed up front, that you can rely on that API and don't need to
 double-check at every step and add a gazillion pointless
 onFeatureYaddaYaddaAllowCallback handlers.
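Florian's argument can be sketched in code. The request API and its shape are invented for illustration, not a real spec; the point is that the app learns the full grant set once, up front, and adapts its UI accordingly, instead of wiring a separate Allow callback to every feature.

```javascript
// Given the set of permissions the user granted up front,
// decide once which parts of the UI to build.
function buildUi(granted) {
  const ui = [];
  if (granted.includes('geolocation')) ui.push('share-location-button');
  if (granted.includes('camera')) ui.push('video-call-button');
  if (ui.length === 0) ui.push('app-wont-work-message'); // fail early, not 5 minutes in
  return ui;
}
```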



Re: Allow ... centralized dialog up front

2013-02-02 Thread Keean Schupke
I didn't think of that. The app would have to maintain its own set of
permission flags updated by the callback. I am not sure that's easier than
just chaining an anonymous function... But I guess that's a programming
style issue.

Cheers,
Keean.
 On 2 Feb 2013 10:47, Florian Bösch pya...@gmail.com wrote:

 And you can have *the* callback (singular, centralized) as
 onAPIPermissionChange just fine.
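The single-callback style being discussed sketches as follows; onAPIPermissionChange is the hypothetical hook from the message above, not a real API. The app mirrors the current grants in one flags object, since the user may revoke any permission from the review page mid-session.

```javascript
// One central mirror of the current grants.
const granted = {};

// Hypothetical single hook the browser would call on any change.
function onAPIPermissionChange(name, allowed) {
  granted[name] = allowed;
}

onAPIPermissionChange('camera', true);
onAPIPermissionChange('camera', false); // revoked on the review page
```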

 If you want to improve things for the user and the developer, you can't go
 with a solution that doesn't make it any easier for the developer. Your
 solution will be ignored, nay ridiculed. If you want developers to play
 along, you've got to give them some carrot as well.


 On Sat, Feb 2, 2013 at 11:43 AM, Keean Schupke ke...@fry-it.com wrote:

 There are benefits to the user, in that it allows all permissions to be
 managed from one place.

 I am not sure I like the idea of making the popups an application thing.
 I think it should be decided by the browser. In any case you would still
 need the ...Allow callbacks as the user may have gone to the permission
 review/edit page and disabled some permissions since the app started.

 Cheers,
 Keean.

  On 2 Feb 2013 10:27, Florian Bösch pya...@gmail.com wrote:

 On Sat, Feb 2, 2013 at 11:16 AM, Keean Schupke ke...@fry-it.com wrote:

 I think a static declaration is better for security, so if a permission
 is not there I don't think it should be allowed to request it later. Of
 course how this is presented to the user is entirely separate, and the UI
 could defer the request until the first time the restricted feature is
 used, or allow all permissions that might be needed to be reviewed and
 enabled/disabled in one place.

 That kills any benefit a developer could derive. The very idea is that
 you can figure out up front what your user is gonna let you do, and take
 appropriate steps (not adding parts of the UI, presenting a suitable
 message that the application won't work etc.) as well as that if a user has
 agreed up front, that you can rely on that API and don't need to
 double-check at every step and add a gazillion pointless
 onFeatureYaddaYaddaAllowCallback handlers.





Re: [IndexedDB] Closing on bug 9903 (collations)

2011-06-01 Thread Keean Schupke
On 1 June 2011 01:37, Pablo Castro pablo.cas...@microsoft.com wrote:


 -Original Message-
 From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of
 Aryeh Gregor
 Sent: Tuesday, May 31, 2011 3:49 PM

  On Tue, May 31, 2011 at 6:39 PM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
   No, that was poor wording on my part, I keep using locale in the
 wrong context. I meant to have the API take a proper collation identifier.
 The identifier can be as specific as the caller wants it to be. The
 implementation could choose to not honor some specific detail if it can't
 handle it (to the extent that doing so is allowed by the specification of
 collation names), or fail because it considers that not handling a
 particular aspect of the collation identifier would severely deviate from
 the caller's expectations.
 
  I'm not sure I understand you.  My personal opinion is that there
  should be no undefined behavior here.  If authors are allowed to pass
  collation identifiers, the spec needs to say exactly how they're to be
  interpreted, so the same identifier passed to two different browsers
  will result in the same collation, i.e., the same strings need to sort
  the same cross-browser.  Having only binary collation is better than
  having non-binary collations but not defining them, IMO.

 I thought BCP47 allowed implementations to drop subtags if needed. I just
 re-read the spec and it seems that it only allows that in constrained
 cases where you can't fit the whole name in your buffer (which wouldn't
 apply to the context discussed here). My first instinct is that this is
 quite a bit to guarantee (full consistency in collation), but it seems that
 that's what the spec is shooting for.

   Given the amount of debate on this, could we at least agree that we
 can do binary for v1? We can then have an open item for v2 on taking
 collation names and sorting according to UCA, or taking callbacks and such.
 
  I'm okay with supporting only binary to start with.

 Great. I'll still wait a bit to see what other folks think, and then update
 the bug one way or the other.

 Thanks
 -pablo


The discussion sounds like it is headed in the right direction. Are there
any issues with non-Unicode encodings that need to be dealt with (HTTP
headers default to ISO-8859-1, I think)? Would people be expected to convert
on read into UTF-16 strings, or use typed arrays?
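As a concrete illustration of what binary collation means here: JavaScript strings are sequences of UTF-16 code units, so a "binary" sort over them is code-unit order, and data arriving in another encoding would have to be converted on read (or kept as a typed array and compared byte-wise).

```javascript
// Default Array.prototype.sort on strings compares UTF-16 code units:
// 'Z' is 0x005A, 'a' is 0x0061, 'é' is 0x00E9.
const codeUnitOrder = ['é', 'Z', 'a'].sort();
```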


Cheers,
Keean.


Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-06 Thread Keean Schupke
On 6 May 2011 03:00, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, May 4, 2011 at 11:12 PM, Keean Schupke ke...@fry-it.com wrote:
  On 5 May 2011 00:33, Aryeh Gregor simetrical+...@gmail.com wrote:
 
  On Tue, May 3, 2011 at 7:57 PM, Jonas Sicking jo...@sicking.cc wrote:
   I don't think we should do callbacks for the first version of
   javascript. It gets very messy since we can't rely on that the script
   function will be returning stable values.
 
  The worst that would happen if it didn't return stable values is that
  sorting would return unpredictable results.
 
  Worst is an infinite loop - no return.
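The instability concern being traded here can be sketched directly: sorting (or a B-tree lookup loop) assumes the comparator is a consistent total order, which a user-supplied callback cannot be forced to be. The function names below are illustrative only.

```javascript
// A safe binary comparator: same inputs always give the same answer.
function binaryCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// A comparator that violates the total-order contract; sorting with it
// gives unpredictable results, and an index built on it cannot be trusted.
function unstableCompare(a, b) {
  return Math.random() < 0.5 ? -1 : 1;
}

const sorted = ['b', 'a', 'c'].sort(binaryCompare);
```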
 
 
   So the choice here really is between only supporting some form of
   binary sorting, or supporting a built-in set of collations. Anything
   else will have to wait for version 2 in my opinion.
 
  I think it would be a mistake to try supporting a limited set of
  natural-language collations.  Binary collation is fine for a first
  version.  MySQL only supported binary collation up through version 4,
  for instance.
 
  A good point about MySQL.
 
 
  On Wed, May 4, 2011 at 3:49 AM, Keean Schupke ke...@fry-it.com wrote:
   I thought only the app that created the db could open it (for security
   reasons)... so it becomes the app's responsibility to do version
   control.
   The comparison function is not going to change by itself - someone has
   to go
   into the code and change it, when they do that they should up the
   revision
   of the database, if that change is incompatible.
 
  Why should we let such a pitfall exist if we can just store the
  function and avoid the issue?
 
  I don't see it as a pitfall; it has the advantage of transparency.
 
 
   There is exactly the same problem with object properties. If the app
   changes
   to expect a new property on all objects stored, then the app has to
   correctly deal with the update.
 
  If a requested property doesn't exist, I assume the API will fail
  immediately with a clear error code.  It will not fail silently and
  mysteriously with no error code.  (Again, I haven't looked at it
  closely, or tried to use it.)
 
  What if the new version uses the same property name for a different thing?
  For example in V1 'Employer' is a string name, and in V2 'Employer' is a
  reference to another object. You may say 'you should change the column
  name'? Right, that's just the same as me saying you should change the DB
  version number when you change the collation algorithm. It's the same
  thing.
  People seem to be making a big fuss about having a non-persisted collation
  function defined in user code, when many many things require the code to
  have the correct model of the data stored in the database to work properly.
  It seems illogical to make a special case for this function, and not do
  anything about all the other cases. IMHO either the database should have a
  stored schema, or it should not. If IndexedDB is going the direction of not
  having a stored schema, then the designers should have the confidence in
  their decision to stick with it and at least produce something with a
  consistent approach to the problem.
 
 
   2) making things easy for the user - for me a simpler more predictable
   API
   is better for the user. Having a function stored inside the database
 is
   bad,
   because you cannot see what function might be stored in there...
 
  We could let you query the stored function.
 
  Why would you need to read it? Every time you open the database you would
  need to check the function is the one you expect. The code would have to
  contain the function so it can compare it with the one in the DB and update
  it if necessary. If the code contains the function there are two copies of
  the function, one in the database and one in the code. Which one is
  correct? Which one is it using? So sometimes you will write the new
  function to the database, and sometimes you will not? More paths to test
  in code coverage, more complexity. It's simpler to just always set the
  function when opening the database.
 
 
   it might be
   a function from a previous version of the code and cause all sorts of
   strange bugs (which will only affect certain users with a certain
   version of
   the function stored in their DB).
 
  It will cause *much* less strange bugs than if you have one index that
  used two different collations, which is the alternative possibility.
  If the function is stored, the worst case will be that the collation
  function is out of date.  In practice, authors will mostly want to use
  established collation functions like UCA and won't mind if they're out
  of date.  They'll also only very rarely have occasion to deliberately
  change the function.
 
  As I said, you will end up querying the function to see if it is the one
  you want to use; if you do that, you may as well set it every time.
  Thinking about this a bit more. If you change the collation function you
  need to re-sort

Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-06 Thread Keean Schupke
On 6 May 2011 00:22, Aryeh Gregor simetrical+...@gmail.com wrote:

 On Thu, May 5, 2011 at 2:12 AM, Keean Schupke ke...@fry-it.com wrote:
  What if the new version uses the same property name for a different
 thing?

 Yes, obviously it's going to be possible for code changes to cause
 hard-to-catch bugs due to not updating the database correctly.  We
 don't have to add more cases where that's possible than necessary,
 without good reason.  Maybe there's good reason here, but the added
 potential for error can't be neglected as a cost.


I have seen many bugs in real databases due to stored procedures.



  Why would you need to read it. Every time you open the database you would
  need to check the function is the one you expect.

 Not if you never intend to change it, or don't care if it's outdated.
 I expect this to be the most common case.


People don't change the language setting in an application?



 Consider the case of someone using CLDR-tailored UCA and a new version
 comes out.  You want to use the newest version for new indexes, if
 multiple versions are available, but there's no pressing need to
 automatically update existing indexes.  The old version is almost
 certainly good enough, unless your users use obscure languages.  So in
 my scheme, you can just update the function in your code and do
 nothing else.  In your scheme, you'd have to either stick to the old
 version across the board, or include both versions in your code
 indefinitely and include out-of-band logic to choose between them, or
 write a script that rebuilds the whole index on update (which would
 take a long time for a large index).


At least then the logic to choose between collations is visible in the code,
rather than hidden. This is all about transparency and making sure the
programmer has control of what is happening, rather than locking them into
limiting patterns, and giving them the ability to see exactly what the code
will do by reading and code-reviewing it.

With a stored procedure, what happens when a function you call (that is not
stored) changes?

The only way to be sure is to run a validation check on the index (run from
beginning to end, checking the order is consistent with the comparison
function). That is the same whether you use stored procedures or not.
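The validation check described above sketches as a single scan: confirm every adjacent pair of index keys agrees with the comparison function currently in the code (function names here are illustrative).

```javascript
// Scan the index keys from start to end; any adjacent pair that the
// comparator says is out of order means the index needs a rebuild.
function indexIsConsistent(keys, cmp) {
  for (let i = 1; i < keys.length; i++) {
    if (cmp(keys[i - 1], keys[i]) > 0) return false;
  }
  return true;
}

const cmp = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
```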



  The code would have to
  contain the function so it can compare it with the one in the DB and
 update
  it if necessary. If the code contains the function there are two copies
 of
  the function, one in the database and one in the code? which one is
 correct?
  which one is it using? So sometimes you will write the new function to
 the
  database, and sometimes you will not? More paths to test in code
 coverage,
  more complexity. Its simpler to just always set the function when opening
  the database.

 If the collation function is stored in the database, then I'd expect
 setting the function to rebuild the index if the new and old functions
 differ.  This could happen as a background operation, with the
 existing index still usable (with the old collation function) in the
 meantime.  So if you always wanted collations up-to-date, in my scheme
 authors could just set the function every time they open the database,
 as with your scheme.  But this could trigger a silent rebuild whenever
 necessary, so the author doesn't have to worry about it.  In your
 scheme, the author has to do the rebuild himself, and if he gets it
 wrong, the index will be corrupted.

 So as I see it, my approach is easier to use across the board.  It
 lets you not update collations on old tables without requiring you to
 keep track of multiple collation function versions, and it also
 potentially lets you update collations on old tables to the latest
 versions with rebuilding done for you in the background.  Critically,
 it does not let you change a sort function without rebuilding, since
 that will always cause bugs and you never want to do it (to a first
 approximation).

 Of course, maybe an initial implementation wouldn't do rebuilds for
 you, to keep it simple.  Then the collation function would be
 immutable after index creation, so you'd still have to do rebuilds
 yourself.  But it would still be easier and safer: the old index will
 still work in the interim even if you don't have the old version of
 your collation function around, and you can't mess up and get a
 corrupted index.

  Thinking about this a bit more. If you change the collation function you
  need to re-sort the index to make sure it will work (and avoid those
 strange
  bugs). Storing the function in the DB enables you to compare the function
  and only change it when you need to, thus optimising the number of
 re-sorts.
  That is the _only_ advantage to storing the function - as you still need
 to
  check the function stored is the one you expect to guarantee your code
 will
  run properly. So with a non-persisted function we need to sort every time
 we
  open to make sure the order is correct

Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-06 Thread Keean Schupke
On 6 May 2011 10:18, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, May 5, 2011 at 11:36 PM, Keean Schupke ke...@fry-it.com wrote:
  On 6 May 2011 03:00, Jonas Sicking jo...@sicking.cc wrote:
 
  On Wed, May 4, 2011 at 11:12 PM, Keean Schupke ke...@fry-it.com
 wrote:
   On 5 May 2011 00:33, Aryeh Gregor simetrical+...@gmail.com wrote:
  
   On Tue, May 3, 2011 at 7:57 PM, Jonas Sicking jo...@sicking.cc
 wrote:
I don't think we should do callbacks for the first version of
javascript. It gets very messy since we can't rely on that the
 script
function will be returning stable values.
  
   The worst that would happen if it didn't return stable values is that
   sorting would return unpredictable results.
  
   Worst is an infinite loop - no return.
  
  
So the choice here really is between only supporting some form of
binary sorting, or supporting a built-in set of collations.
 Anything
else will have to wait for version 2 in my opinion.
  
   I think it would be a mistake to try supporting a limited set of
   natural-language collations.  Binary collation is fine for a first
   version.  MySQL only supported binary collation up through version 4,
   for instance.
  
   A good point about MySQL.
  
  
   On Wed, May 4, 2011 at 3:49 AM, Keean Schupke ke...@fry-it.com
 wrote:
I thought only the app that created the db could open it (for
security
reasons)... so it becomes the app's responsibility to do version
control.
The comparison function is not going to change by itself - someone
has
to go
into the code and change it, when they do that they should up the
revision
of the database, if that change is incompatible.
  
   Why should we let such a pitfall exist if we can just store the
   function and avoid the issue?
  
   I don't see it as a pitfall; it has the advantage of transparency.
  
  
There is exactly the same problem with object properties. If the
 app
changes
to expect a new property on all objects stored, then the app has to
correctly deal with the update.
  
   If a requested property doesn't exist, I assume the API will fail
   immediately with a clear error code.  It will not fail silently and
   mysteriously with no error code.  (Again, I haven't looked at it
   closely, or tried to use it.)
  
    What if the new version uses the same property name for a different
    thing? For example, in V1 'Employer' is a string name, and in V2
    'Employer' is a reference to another object. You may say 'you should
    change the column name'? Right, that's just the same as me saying you
    should change the DB version number when you change the collation
    algorithm. It's the same thing.
    People seem to be making a big fuss about having a non-persisted
    collation function defined in user code, when many, many things require
    the code to have the correct model of the data stored in the database
    to work properly. It seems illogical to make a special case for this
    function, and not do anything about all the other cases. IMHO either
    the database should have a stored schema, or it should not. If
    IndexedDB is going in the direction of not having a stored schema, then
    the designers should have the confidence in their decision to stick
    with it and at least produce something with a consistent approach to
    the problem.
  
  
2) making things easy for the user - for me a simpler more
predictable
API
is better for the user. Having a function stored inside the
 database
is
bad,
because you cannot see what function might be stored in there...
  
   We could let you query the stored function.
  
    Why would you need to read it? Every time you open the database you
    would need to check the function is the one you expect. The code would
    have to contain the function so it can compare it with the one in the
    DB and update it if necessary. If the code contains the function there
    are two copies of the function, one in the database and one in the
    code. Which one is correct? Which one is it using? So sometimes you
    will write the new function to the database, and sometimes you will
    not? More paths to test in code coverage, more complexity. It's simpler
    to just always set the function when opening the database.
  
  
it might be
a function from a previous version of the code and cause all sorts
 of
strange bugs (which will only affect certain users with a certain
version of
the function stored in their DB).
  
   It will cause *much* less strange bugs than if you have one index
 that
   used two different collations, which is the alternative possibility.
   If the function is stored, the worst case will be that the collation
   function is out of date.  In practice, authors will mostly want to
 use
   established collation functions like UCA and won't mind if they're
 out
   of date.  They'll also only very rarely

Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-04 Thread Keean Schupke
On 3 May 2011 23:59, Aryeh Gregor simetrical+...@gmail.com wrote:

 On Tue, May 3, 2011 at 10:56 AM, Keean Schupke ke...@fry-it.com wrote:
  Why does it need to be persisted? I would prefer the database to be
  stateless. Obviously all users of the database need to use the same
  function.

 And if they don't use exactly the same function, maybe due to a
 transient bug, the index is silently and permanently corrupted, until
 all affected rows happen to be updated again?  That doesn't sound like
 a good idea to me.


I thought only the app that created the db could open it (for security
reasons)... so it becomes the app's responsibility to do version control.
The comparison function is not going to change by itself - someone has to go
into the code and change it; when they do, they should bump the revision
of the database if that change is incompatible.

There is exactly the same problem with object properties. If the app changes
to expect a new property on all objects stored, then the app has to
correctly deal with the update.

There are two issues here:

1) doing things correctly - there is no problem here, providing the closure
works.

2) making things easy for the user - for me, a simpler, more predictable API
is better for the user. Having a function stored inside the database is bad,
because you cannot see what function might be stored in there... it might be
a function from a previous version of the code and cause all sorts of
strange bugs (which will only affect certain users with a certain version of
the function stored in their DB). By having the sort function in plain sight
in the source code, it is visible and readable. Yes, there is a risk that the
code is changed and the order method differs from that in the DB, which
will cause breakage, but so can a function hidden in the database. Of the
two I would always choose to have everything clearly visible in the source
code, where you can check it.


Cheers,
Keean.
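[Editorial sketch: the "never persist the comparator, supply it on every open" position argued above can be made concrete with a toy in-memory index. Everything here — openIndex, its shape, the sample data — is invented for illustration and is not part of IndexedDB.]

```javascript
// A toy stateless "index": the comparator is passed by the caller every
// time the index is opened, never stored, so the function in the source
// code is always the function actually in use.
function openIndex(entries, compare) {
  // Re-sort on every open; correctness never depends on a stored function.
  const sorted = [...entries].sort(compare);
  return {
    all: () => sorted,
    first: () => sorted[0],
  };
}

// The one visible, auditable ordering rule for this app.
const byCaseInsensitive = (a, b) => {
  const x = a.toLowerCase(), y = b.toLowerCase();
  return x < y ? -1 : x > y ? 1 : 0;
};

const idx = openIndex(["banana", "Apple", "cherry"], byCaseInsensitive);
console.log(idx.all()); // ["Apple", "banana", "cherry"]
```

The cost is the re-sort on open that the thread discusses; the benefit is that there is never a second, hidden copy of the ordering rule to diverge from the code.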


Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-04 Thread Keean Schupke
On 4 May 2011 00:57, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, May 3, 2011 at 12:19 AM, Keean Schupke ke...@fry-it.com wrote:
  The more I think about it, the more I want a user-specified comparison
  function. Efficiency should not be an issue here - the engines should
 tweak
  the JIT compiler to fix any efficiency issues. Just let the user pass a
  closure (remember functions are first-class in JavaScript so this is not
 a
  callback nor an event).

 I don't think we should do callbacks for the first version of
 javascript. It gets very messy since we can't rely on that the script
 function will be returning stable values.

 Additionally we'd either have to ask that the callback function is
 re-registered each time the database is opened, or somehow store a
 serialized copy of the callback function in the browser so that it's
 available the next time the database is opened. Neither of these
 things have been done in other APIs in the past, so if we hold up v1
 until we solve the challenges involved I think it will delay the
 release of a stable spec.

 So the choice here really is between only supporting some form of
 binary sorting, or supporting a built-in set of collations. Anything
 else will have to wait for version 2 in my opinion.

 / Jonas


That's fine with me, providing the other issues around collation orders are
solved. If something like the Unicode algorithm is used (and if not, I would
want to be convinced there is a good reason for doing something different
from everyone else), there is the issue of what orderings are provided by
everyone (maybe DUCET + current CLDR). Then there is how often the CLDR
should be updated. Should there be a live fetch / version check every time
the DB is started (which seems like a sensible route to me, where possible),
or should the CLDR version be specified by the standard and updated with
each version of the standard?


Cheers,
Keean.
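[Editorial note: the built-in collation set debated here later shipped in browsers as Intl.Collator (ECMA-402), years after this thread. A sketch of what built-in, locale-tailored comparison looks like in practice:]

```javascript
// CLDR tailorings exposed through Intl.Collator: the same two strings
// sort differently under Swedish and German rules.
const sv = new Intl.Collator("sv"); // Swedish: "ä" is a letter after "z"
const de = new Intl.Collator("de"); // German: "ä" sorts with "a"

console.log(["z", "ä"].sort(sv.compare)); // Swedish order
console.log(["z", "ä"].sort(de.compare)); // German order
```

This is exactly the "standardised vocabulary of collations" shape discussed above: the app names a locale, and the browser supplies the (versioned, CLDR-derived) tailoring.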


Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-04 Thread Keean Schupke
On 4 May 2011 00:57, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, May 3, 2011 at 12:19 AM, Keean Schupke ke...@fry-it.com wrote:
  The more I think about it, the more I want a user-specified comparison
  function. Efficiency should not be an issue here - the engines should
 tweak
  the JIT compiler to fix any efficiency issues. Just let the user pass a
  closure (remember functions are first-class in JavaScript so this is not
 a
  callback nor an event).

 I don't think we should do callbacks for the first version of
 javascript. It gets very messy since we can't rely on that the script
 function will be returning stable values.



Garbage in = garbage out. The programmer's job is to write a correct
comparison function. All functions have this problem. By this argument we
had all better give up programming, because there is a risk we may write a
function that returns incorrect results.



 Additionally we'd either have to ask that the callback function is
 re-registered each time the database is opened, or somehow store a



I still think re-registering is a non-issue. It is trivial to declare a
local open function openNameIndex that calls openIndex with the correct
callback and provide that as a software module - either in the main code, or
in a separate JS file that can be included in each page. Modular programming
is a good thing, should be encouraged, and is the traditional software
engineering solution to this kind of problem.


serialized copy of the callback function in the browser so that it's
 available the next time the database is opened. Neither of these
 things have been done in other APIs in the past, so if we hold up v1
 until we solve the challenges involved I think it will delay the
 release of a stable spec.

 So the choice here really is between only supporting some form of
 binary sorting, or supporting a built-in set of collations. Anything
 else will have to wait for version 2 in my opinion.

 / Jonas



Cheers,
Keean.
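[Editorial sketch of the "garbage in = garbage out" point: nothing stops a buggy comparator, but one is cheap to sanity-check. checkComparator below is invented for illustration; it probes antisymmetry and transitivity over sample data.]

```javascript
// Property-check a comparator: a lawful compare(a, b) must be
// antisymmetric (sign(compare(a,b)) === -sign(compare(b,a))) and
// transitive over any sample set.
function checkComparator(compare, samples) {
  for (const a of samples)
    for (const b of samples) {
      if (Math.sign(compare(a, b)) !== -Math.sign(compare(b, a))) return false;
      for (const c of samples)
        if (compare(a, b) < 0 && compare(b, c) < 0 && !(compare(a, c) < 0))
          return false;
    }
  return true;
}

const numeric = (a, b) => a - b; // a lawful comparator
const broken = (_a, _b) => 1;    // always "greater": violates antisymmetry

console.log(checkComparator(numeric, [3, 1, 2])); // true
console.log(checkComparator(broken, [3, 1, 2]));  // false
```

A check like this catches the "unstable values" failure mode in the author's own code, which is where both sides of the thread agree the responsibility lies.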


Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-04 Thread Keean Schupke
On 4 May 2011 21:01, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, May 4, 2011 at 1:10 AM, Keean Schupke ke...@fry-it.com wrote:
  On 4 May 2011 00:57, Jonas Sicking jo...@sicking.cc wrote:
 
  On Tue, May 3, 2011 at 12:19 AM, Keean Schupke ke...@fry-it.com
 wrote:
   The more I think about it, the more I want a user-specified comparison
   function. Efficiency should not be an issue here - the engines should
   tweak
   the JIT compiler to fix any efficiency issues. Just let the user pass
 a
   closure (remember functions are first-class in JavaScript so this is
 not
   a
   callback nor an event).
 
  I don't think we should do callbacks for the first version of
  javascript. It gets very messy since we can't rely on that the script
  function will be returning stable values.
 
  garbage in = garbage out. The programmers job is to write a correct
  comparison function. All functions have this problem. By this argument we
  had all better give up programming because there is a risk we may write a
  function that returns incorrect results.

 Browsers can certainly deal with this, and ensure that the only one
 suffering is the author of the buggy algorithm. However this comes at
 a cost in that the browser sorting algorithm can't go into infinite
 loops or crash even in the face of the most ridiculous comparison
 algorithm. In other words, the browser will likely have to use a
 slower sorting implementation in order to be robust.

 Additionally, there is a significant cost involved in transitioning
 between the C++ code implementing the sorting algorithm, and the
 javascript implemented callback. That is on top of the cost of
 implementing the comparison function in javascript. Even in the best
 JITs, there is a significant overhead to both these parts.

 So rather than repeating myself, i'll just quote myself:

  So the choice here really is between only supporting some form of
  binary sorting, or supporting a built-in set of collations. Anything
  else will have to wait for version 2 in my opinion.

 :)

 / Jonas


I gave my answer, and some follow-up questions, in a previous email, so I am
not avoiding the question. My point was that any event handler (onMouseDown?)
could have an infinite loop - why so fussy about this one function when so
many others have the same problem?

The performance point of calling into JavaScript is a valid one, but is this
a problem? Perhaps it is fast enough. I have seen no evidence that it will be
too slow for people to use - perhaps the bottleneck will be the disk/flash
access speed for fetching the blocks, and not the JavaScript comparison
function.


Cheers,
Keean.


Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-03 Thread Keean Schupke
The more I think about it, the more I want a user-specified comparison
function. Efficiency should not be an issue here - the engines should tweak
the JIT compiler to fix any efficiency issues. Just let the user pass a
closure (remember, functions are first-class in JavaScript, so this is not a
callback nor an event).


Keean.


On 2 May 2011 19:57, Aryeh Gregor simetrical+...@gmail.com wrote:

 On Fri, Apr 29, 2011 at 3:19 PM, Keean Schupke ke...@fry-it.com wrote:
  As long as we have a binary mode I am happy.

 Something I didn't think to mention: what exactly is binary mode for
 DOMStrings?  I guess it means you encode as big-endian UTF-16, then
 sort bytewise?  This is kind of evil, but it matches what sort() does,
 so I guess it should be the required behavior.  (It's kind of evil
 because it doesn't match code-point order, unlike if you encoded as
 UTF-8.  E.g., U+10000 is encoded as 0xd800dc00 and U+E000 is 0xe000,
 so U+E000 sorts after U+10000.)

 Perhaps this should be spelled out more clearly in the spec.
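[Editorial sketch: the U+10000 / U+E000 example above can be checked directly in JavaScript, whose default string comparison is UTF-16 code-unit order. The `\u{...}` escape syntax is modern, used here for clarity.]

```javascript
const supplementary = "\u{10000}"; // U+10000 → surrogate pair 0xD800 0xDC00
const privateUse = "\uE000";       // U+E000  → single code unit 0xE000

// Code-unit (UTF-16) order, which JavaScript string comparison uses:
console.log(supplementary < privateUse); // true, because 0xD800 < 0xE000

// Code-point order disagrees:
console.log(supplementary.codePointAt(0) < privateUse.codePointAt(0)); // false: 0x10000 > 0xE000
```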



Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-03 Thread Keean Schupke
Why does it need to be persisted? I would prefer the database to be
stateless. Obviously all users of the database need to use the same
function. I would recommend modular programming - create a .js script you
can include in all pages that provides 'collated' versions of the method
calls by adding the collation argument. In fact, for good programming in
general, make this API your model: if you were writing a shopping cart,
this '.js' would provide methods like 'addToCart' and 'removeFromCart', and
all collation settings would be hidden in this layer and kept out of
individual pages, whilst not needing to be stored in the database at all.

Cheers,
Keean.
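[Editorial sketch of the suggested layering: a hypothetical cart.js module that fixes the ordering rule in one place. localeCompare stands in for whatever collation the layer chooses; names and data are invented.]

```javascript
// cart.js (hypothetical): pages call addToCart / cartSortedByName and
// never mention a collation; the ordering rule lives only here.
function addToCart(cart, item) {
  return [...cart, item];
}

function cartSortedByName(cart) {
  // The one place the collation choice appears.
  return [...cart].sort((a, b) => a.name.localeCompare(b.name));
}

let cart = [];
cart = addToCart(cart, { name: "pear" });
cart = addToCart(cart, { name: "apple" });
console.log(cartSortedByName(cart).map(i => i.name)); // ["apple", "pear"]
```

Changing the collation is then a one-line edit to this module, rather than a hunt through every page — which is the maintenance property being argued for.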


On 3 May 2011 15:27, Aryeh Gregor simetrical+...@gmail.com wrote:

 On Tue, May 3, 2011 at 3:19 AM, Keean Schupke ke...@fry-it.com wrote:
  The more I think about it, the more I want a user-specified comparison
  function. Efficiency should not be an issue here - the engines should
 tweak
  the JIT compiler to fix any efficiency issues. Just let the user pass a
  closure (remember functions are first-class in JavaScript so this is not
 a
  callback nor an event).

 Wouldn't it be a bit more complicated than just passing a regular
 closure?  The function has to be persisted in the database across page
 views, but a JavaScript closure is going to contain references to all
 sorts of objects (like document, or local variables) that are very
 specific to the current page view.  It makes no sense to persist those
 objects in general.  You'd need to serialize the function somehow,
 possibly putting restrictions on the sorts of variables it can access,
 so that it can be sensibly restored later.  Is there some established
 way of doing this yet in JavaScript?  It might be useful in other
 contexts too.

 I still agree that this is the correct direction to go in, though.



Re: [IndexedDB] Closing on bug 9903 (collations)

2011-05-02 Thread Keean Schupke
On Sunday, 1 May 2011, Aryeh Gregor simetrical+...@gmail.com wrote:
 On Fri, Apr 29, 2011 at 3:32 PM, Jonas Sicking jo...@sicking.cc wrote:
 I agree that we will eventually want to standardize the set of allowed
 collations. Similarly to how we'll want to standardize on one set of
 charset encodings supported. However I don't think we, in this spec
 community, have enough experience to come up with a good such set. So
 it's something that I think we should postpone for now. As I
 understand it there is work going on in this area in other groups, so
 hopefully we can lean on that work eventually.

 (Disclaimer: I never really tried to figure out how IndexedDB works,
 and I haven't seen the past discussion on this topic.  However, I know
 a decent amount about database collations in practice from my work
 with MediaWiki, which included adding collation support to category
 pages last summer on a contract with Wikimedia.  Maybe everything I'm
 saying has already been brought up before and/or everyone knows it
 and/or it's wrong, in which case I apologize in advance.)

 The Unicode Collation Algorithm is the standard here:

 http://www.unicode.org/reports/tr10/

 It's pretty stable (I think), and out of the box it provides *vastly*
 better sorting than binary sort.  Binary sort doesn't even work for
 English unless you normalize case and avoid punctuation marks, and
 it's basically useless for most non-English languages.  Some type of
 UCA support in browsers would be the way to go here.

 UCA doesn't work perfectly for all locales, though, because different
 locales sort the same strings differently (French handling of accents,
 etc.).  The standard database of locale-specific collations is CLDR:

 http://cldr.unicode.org/

 CLDR tends to have several new releases per year.  For instance, 1.9.1
 was released this March, three versions were released last year, and
 five were released in 2009.  Just looking at the release notes, it
 seems that most if not all of these releases update collation details.
  Because of how collations are actually used in databases, any change
 to the collation version will require rebuilding any index that uses
 that collation.

 I don't think it's a good idea for browsers to try packaging such
 rapidly-changing locale data.  If everyone had Chrome's release and
 support schedule, it might work okay -- if you figured out a way to
 handle updates gracefully -- but in practice, authors deal with a wide
 range of browser ages.  It's not good if every user has a different
 implementation of each collation.  Nor if browsers just use a frozen
 and obsolescent collation version.  I also don't know how realistic
 implementers would find it to ship collation support for every
 language CLDR supports -- the CLDR download is a few megabytes zipped,
 but I don't know how much of that browsers would need to ship to
 support all its tailorings.

 The general solution here would be to allow the creation of indexes
 based on a user-supplied function.  I.e., the user-supplied function
 would (in SQL terms) take the row's data as input, and output some
 binary string.  That string would be used as the key in the index,
 instead of any of the column values for the row.  PostgreSQL allows
 this, or so I've heard.  Then you could implement UCA (optionally with
 CLDR tailorings) or any other collation algorithm you liked in
 JavaScript.

 Of course, we can't expect authors to reimplement the UCA if they want
 to get decent sorting.  It would make sense for browsers to expose
 some default sort functions, but I'm not familiar enough with UCA or
 CLDR to say which ones would be best in practice.  It might make sense
 to expose some medium-level primitives that would allow authors to
 easily overlay tailoring on the basic UCA algorithm, or something.  Or
 maybe it would really make sense to expose all of CLDR's tailored
 collations.  I'm not familiar enough with the specs to say.  But for
 the sake of flexibility, allowing indexes based on user-defined
 functions is the way to go.  (They're useful for things other than
 collations, too.)

 The proposed ECMAScript LocaleInfo.Collator looks like it doesn't
 currently support this use-case, since it provides only sort functions
 and not sortkey generation functions:

 http://wiki.ecmascript.org/doku.php?id=strawman:i18n_api

 If browsers do provide sortkey generation functions based on UCA, some
 versioning mechanism will need to be used, particularly if it supports
 tailored sortkeys.


 FWIW, MySQL provides some built-in collation support, but MediaWiki
 doesn't use it, because it supports too few languages and is too
 inflexible.  MediaWiki's stock localization has 99% support for the
 500 most-used messages in 175 different languages, and the couple
 dozen locales that MySQL supports aren't acceptable for us.  Instead,
 we store everything with a binary collation, and are moving to a
 system where we compute the UCA sortkeys ourselves and put them in
 
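[Editorial sketch of the "compute the sortkeys ourselves and index on them" pattern described above. makeSortKey is a crude stand-in for a real UCA sort-key function — that substitution is the assumption here; a binary index over the keys then gives collated order for free.]

```javascript
// Toy sort-key generation: case-fold and strip punctuation. Real code
// would produce UCA (optionally CLDR-tailored) sort keys instead.
function makeSortKey(title) {
  return title.toLowerCase().replace(/[^\p{L}\p{N} ]/gu, "");
}

// Store the key alongside the row; the index sorts keys binary-wise.
const rows = ["Éclair", "apple", "Banana!"].map(title => ({
  title,
  sortKey: makeSortKey(title),
}));

rows.sort((a, b) => (a.sortKey < b.sortKey ? -1 : a.sortKey > b.sortKey ? 1 : 0));
console.log(rows.map(r => r.title)); // ["apple", "Banana!", "Éclair"]
```

When the collation version changes, the keys are recomputed and the index rebuilt — exactly the maintenance cost the email describes.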

Re: [IndexedDB] Closing on bug 9903 (collations)

2011-04-29 Thread Keean Schupke
On Friday, 29 April 2011, Jonas Sicking jo...@sicking.cc wrote:
 On Fri, Apr 29, 2011 at 11:16 AM, Pablo Castro
 pablo.cas...@microsoft.com wrote:
 We've had quite a bit of debate on this but I don't think we've reached 
 closure. At this point I would be fine with either one of a) postpone to v2 
 and agree that for now we'll just do binary collation everywhere or b) the 
 last form of the proposal sent around: extra collation argument (following 
 BCP47 plus whatever the UA wants to allow) in createObjectStore/createIndex, 
 plus a collation property to interrogate it; no way to change the collation 
 of a store/index once created.

 Given that this turned out to be a more elaborate topic than I had 
 originally expected and that it doesn't seem to have a lot of traction right 
 now, my preference would be to postpone to v2. Thoughts? Once we make a call 
 I'll make sure the spec reflects it.

 I'd be fine with postponing it. However I don't think that the counter
 proposals that we've received will work, so I don't think that there
 is a reason to postpone.

 / Jonas



As long as we have a binary mode I am happy. If it is to support other
collations, then all browsers must support the same set of options.
The question then becomes: what set of collation modes to standardise
on? Allowing non-standard collations will result in apps that will
only run correctly on one browser, and that does not seem a good idea
to me.

Cheers,
Keean.



Re: [IndexedDB] Closing on bug 9903 (collations)

2011-04-29 Thread Keean Schupke
There is always something like UCA:

http://www.unicode.org/reports/tr10/

which looks interesting.

Cheers,
Keean.


On 29 April 2011 20:32, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Apr 29, 2011 at 12:19 PM, Keean Schupke ke...@fry-it.com wrote:
  On Friday, 29 April 2011, Jonas Sicking jo...@sicking.cc wrote:
  On Fri, Apr 29, 2011 at 11:16 AM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
  We've had quite a bit of debate on this but I don't think we've reached
 closure. At this point I would be fine with either one of a) postpone to v2
 and agree that for now we'll just do binary collation everywhere or b) the
 last form of the proposal sent around: extra collation argument (following
 BCP47 plus whatever the UA wants to allow) in createObjectStore/createIndex,
 plus a collation property to interrogate it; no way to change the collation
 of a store/index once created.
 
  Given that this turned out to be a more elaborate topic than I had
 originally expected and that it doesn't seem to have a lot of traction right
 now, my preference would be to postpone to v2. Thoughts? Once we make a call
 I'll make sure the spec reflects it.
 
  I'd be fine with postponing it. However I don't think that the counter
  proposals that we've received will work, so I don't think that there
  is a reason to postpone.
 
  / Jonas
 
 
 
  As long as we have a binary mode I am happy. If it is to support other
  collations, then all browsers must support the same set of options.
  The question then becomes what set of collation modes to standardise
  on? Allowing non standard collations will result in apps that will
  only run correctly on one browser, and that does not seem a good idea
  to me.

 I agree that we will eventually want to standardize the set of allowed
 collations. Similarly to how we'll want to standardize on one set of
 charset encodings supported. However I don't think we, in this spec
 community, have enough experience to come up with a good such set. So
 it's something that I think we should postpone for now. As I
 understand it there is work going on in this area in other groups, so
 hopefully we can lean on that work eventually.

 Of course, we still do need to have a standardized vocabulary for the
 collations though.

 / Jonas



Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-04 Thread Keean Schupke
This is ignoring the possibility that something like RelationalDB could be
used, where a well-defined common subset of SQL can be used (and I use
well-defined in the formal sense). This would allow a relatively thin
wrapper on top of most SQL implementations and would allow SQLite (or BDB)
to be used as the backend.

As a seasoned C++ programmer, I could even write a Firefox plugin using
XPCOM as a reference implementation using the same API as the JavaScript
RelationalDB implementation on my GitHub. Although I am not keen on putting
in the time to do this if nobody is interested.

To me it seems this thread is going in circles. RelationalDB does not have
the standardisation problem that WebSQL has, but is still a relatively thin
API layer that can be implemented over the top of a fast and well tested SQL
implementation. It is based on sound theory and research defining the
abstraction layer, and has a relationally complete API, so there should be
no need to change the core API in the development of a standard.


Cheers,
Keean.


On 4 April 2011 14:39, Jonas Sicking jo...@sicking.cc wrote:

 On Saturday, April 2, 2011, Joran Greef jo...@ronomon.com wrote:
  I am incredibly uncomfortable with the idea of putting the
  responsibility of the health of the web in the hands of one project.
  In fact, one of the main reasons I started working at Mozilla was to
  prevent this.
 
  / Jonas
 
  I agree with you. All the more reason to support both WebSQL and
 IndexedDB. It is not a case of either/or. It would be healthy to have
 competing APIs.

 Competition might be a great thing. But it doesn't address the issue
 in the least. It would still be the case that some developers would
 choose to use WebSQL, and browser makers would still have to support
 it, including support the SQL dialect it uses.

 Hence it would still be the case that we would be relying on the
 SQLite developers to maintain a stable SQL interpretation to keep a
 healthy and functional web.

 / Jonas




Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-04 Thread Keean Schupke
Yes, it already has well-defined set operations. Solid is a matter of
testing by enough people (and if you wanted to try it and give feedback,
that would be a start). Fast should not be a problem, as the SQL database
does all the heavy lifting.

In more detail, Codd's six primitive operators are project, restrict,
cross-product, union and difference. Relations are an extension of Sets, so
intersection and difference on compatible relations behave like they would
on sets.

RelationalDB already implements the following 5 methods making it
relationally-complete. Meaning it can do anything you could possibly want to
do with relations using combinations of these 5 methods.

Relation.prototype.project = function(attributes) {};   // implements rename as well
Relation.prototype.restrict = function(exp) {};
Relation.prototype.join = function(relation, exp) {};
Relation.prototype.union = function() {};
Relation.prototype.difference = function() {};

Of course some things can be made easier, so the following methods, although
they can be defined in terms of the above 5, will be provided (in future
implementations) to keep user code concise and implementations thin and
fast.

// derived methods
Relation.prototype.intersection = function() {};
Relation.prototype.thetajoin = function() {};
Relation.prototype.semijoin = function() {};
Relation.prototype.antijoin = function() {};
Relation.prototype.divide = function() {};
Relation.prototype.leftjoin = function() {};
Relation.prototype.rightjoin = function() {};
Relation.prototype.fulljoin = function() {};


We also hope to provide the lattice operators meet and join:

http://en.wikipedia.org/wiki/Lattice_(order)

Just these two operators can replace all 5 of Codd's primitives (including
all set operations). With just these two you can do anything that you can
with _all_ the above. Meet is actually the same as Codd's natural-join
(unfortunately terminology in different mathematical fields is not
consistent here) and Join is a generalised union operator.

See:

http://www.arxiv.com/pdf/cs/0501053v2

for how Meet and Join can be used to construct each of the above
operators.


Cheers,
Keean.
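[Editorial sketch: the five primitives above can be given toy semantics in a few lines, with relations modelled as arrays of plain objects. This is illustrative only — it is not the RelationalDB implementation, and the helper names and sample data are invented.]

```javascript
// Tuple equality and set-style deduplication for the toy model.
const eq = (a, b) => JSON.stringify(a) === JSON.stringify(b);
const dedupe = rel => rel.filter((r, i) => rel.findIndex(s => eq(r, s)) === i);

// The five relationally-complete primitives.
const project = (rel, attrs) =>
  dedupe(rel.map(row => Object.fromEntries(attrs.map(a => [a, row[a]]))));
const restrict = (rel, pred) => rel.filter(pred);
const join = (relA, relB, pred) =>
  relA.flatMap(a => relB.filter(b => pred(a, b)).map(b => ({ ...a, ...b })));
const union = (relA, relB) => dedupe([...relA, ...relB]);
const difference = (relA, relB) => relA.filter(a => !relB.some(b => eq(a, b)));

const emp = [{ name: "ann", dept: 1 }, { name: "bob", dept: 2 }];
const dept = [{ dept: 1, dname: "sales" }, { dept: 2, dname: "eng" }];

console.log(join(emp, dept, (a, b) => a.dept === b.dept));
console.log(project(emp, ["dept"])); // [{ dept: 1 }, { dept: 2 }]
```

Every derived operator listed in the email (intersection, semijoin, antijoin, ...) can be written as a combination of these five, which is what relational completeness means.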


On 4 April 2011 15:36, Joran Greef jo...@ronomon.com wrote:

 On 04 Apr 2011, at 5:26 PM, Keean Schupke wrote:

  This is ignoring the possibility that something like RelationalDB could
 be used, where a well defined common subset of SQL can be used (and I use
 well-defined in the formal sense). This would allow a relatively thin
 wrapper on top of most SQL implementations and would allow SQLite (or BDB)
 to be used as the backend.

 Yes, if an implementation of RelationalDB arrives which is solid and fast
 with support for set operations that would be great. The important thing is
 that we have two competing APIs (and preferably a strong API with a great
 track record).


Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-04 Thread Keean Schupke
On 4 April 2011 15:55, Keean Schupke ke...@fry-it.com wrote:

 Yes, it already has well defined set operations. Solid is a matter of
 testing by enough people (and if you wanted to try it and feed back that
 would be a start). Fast should not be a problem, as the SQL database does
 all the heavy lifting.

 In more detail, Codd's six primitive operators are project, restrict,
 cross-product, union and difference. Relations are an extension of Sets, so
 intersection and difference on compatible relations behave like they would
 on sets.


I missed 'rename' from my list of Codd's operators. Our 'project' function
provides both project and rename, so I overlooked it.


 ...



Cheers,
Keean.


Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-04 Thread Keean Schupke
Some thoughts:

On 4 April 2011 16:10, Mikeal Rogers mikeal.rog...@gmail.com wrote:

 i've mostly stayed out of this thread because i felt like i'd just be
 fanning the flames but i really can't stay out anymore.

 databases are more than SQL, always have been.


 SQL is a DSL for relational database access. all implementations of SQL
 have a similar set of tools they implement first and layer SQL on top of.
 those tools tend to be a storage engine, btree, and some kind of
 transactional model between them. under the ugly covers, most databases look
 like berkeleydb and the layer you live in is just sugar on top.


SQL is a standard language (or API) for talking to databases. Why should a
developer need to learn a different API for each database? W3C is about
standardising APIs, and SQL is just an API standardised as a DSL. It is good
for all the reasons any standard is good. Add to that the sound mathematical
theory of relational algebra, and it has a lot going for it. Although, like
any standard, it has its problems; most of those seem to be where it has
deviated away from the pure relational algebra.



 creating an in-browser specification/implementation on top of a given
 relational/SQL story is a terrible idea. it's unnecessarily limiting to a
 higher level api and can't be easily extended the way a simple set of tools
 like IndexedDB can.


It's not limiting; it provides a more powerful (higher-level) interface that
allows developers to concentrate on what to do with the data, not how to do
it.



 suggesting that other databases be implemented on top of SQL rather than on
 top of the tools in which SQL is built is just backwards to anyone who's
 built a database.


RelationalDB is not a database; it's a relational data model.



 it's not very hard to write the abstraction you're talking about on top of
 IndexedDB, and until you do it i'm going to have a hard time taking you
 seriously because it's clearly doable.


Surely it's the API that is important, not how it is implemented? You can try
the API now, implemented on top of WebSQL. The API will stay the same no
matter what underlying technology is used.
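To make that concrete, here is a rough sketch of what a backend-agnostic relational API can look like. The names and shapes below are illustrative only, not the actual RelationalDB API: callers see only relational operations (selection, projection), never the storage engine, so the backend can change without touching their code.

```javascript
// Illustrative sketch of a backend-agnostic relational API.
// Callers only see relational operations, never the storage engine.
function makeRelation(rows) {
  return {
    rows: rows,
    // Relational selection: keep rows matching a predicate.
    restrict: function (pred) {
      return makeRelation(rows.filter(pred));
    },
    // Relational projection: keep only the named attributes.
    project: function (attrs) {
      return makeRelation(rows.map(function (r) {
        var out = {};
        attrs.forEach(function (a) { out[a] = r[a]; });
        return out;
      }));
    }
  };
}

var people = makeRelation([
  { name: "ann", age: 34 },
  { name: "bob", age: 27 }
]);
var adults = people.restrict(function (r) { return r.age >= 30; })
                   .project(["name"]);
console.log(adults.rows); // [ { name: 'ann' } ]
```

Whether `rows` is ultimately backed by WebSQL, IndexedDB, or memory is invisible at this level, which is the point being made above.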



 i implemented a CouchDB compatible datastore on top of IndexedDB, it took
 me less than a week at a time when there was only one implementation that
 was still changing and still had bugs. it would be much easier now.

 https://github.com/mikeal/idbcouch

 it needs to be updated to use the latest version of the spec which is a day
 of work i just haven't gotten to yet.


I am not overly impressed by CouchDB.


 the constructs in IndexedDB are pretty low level but sufficient if you know
 how to implement databases. performance is definitely an issue, but making
 these constructs faster would be much easier than trying to tweak an off the
 shelf SQL implementation to your use case.


I look at the amount of man-hours that have gone into developing SQLite and
BDB, and I think: hey, if it's so easy to write a high-performance database,
those guys must have been wasting a lot of time?


Cheers,
Keean.


Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-04 Thread Keean Schupke
On 4 April 2011 16:04, Tab Atkins Jr. jackalm...@gmail.com wrote:

 On Mon, Apr 4, 2011 at 8:07 AM, Joran Greef jo...@ronomon.com wrote:
  On 04 Apr 2011, at 4:39 PM, Jonas Sicking wrote:
  Hence it would still be the case that we would be relying on the
  SQLite developers to maintain a stable SQL interpretation...
 
  SQLite has a fantastic track record of maintaining backwards
 compatibility.
 
  IndexedDB has as yet no track record, no consistent implementations, no
 widespread deployment,

 It's new.


  only measurably poor performance

 Ironically, the poor performance is because it's using sqlite as a
 backing-store in the current implementation.  That's being fixed by
 replacing sqlite.


  and a lukewarm indexing and querying API.

 Kinda the point, in that the power/complexity of SQL confuses a huge
 number of developers, who end up coding something which doesn't
 actually use the relational model in any significant way, but still
 pays the cost of it in syntax.

 (I found normalization forms and similar things completely trivial
 when I was learning SQL, but for some reason almost every codebase
 I've looked at has a horribly-structured db.  As far as I can tell,
 the majority of developers just hack SQL into being a linear object
 store and do the rest in their application code.  We can reduce the
 friction here by actually giving them a linear object store, which is
 what IndexedDB is.)


 ~TJ


SQLite has seen really good use in the mobile app community on both iPhone
and Android. I would have thought that if we wanted the same kind of
thriving app-developer community around HTML5 web apps, taking a few leaves
from the mobile developers' book would not be a bad idea.

IMHO it's those kinds of developers HTML5 should be trying to attract, in
addition to the current web developers.


Cheers,
Keean.


Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-04 Thread Keean Schupke
I would point out that RelationalDB is relationally complete, and the API
does not depend on the SQLite spec at all.

Cheers
Keean
On Apr 1, 2011 8:58 PM, Jonas Sicking jo...@sicking.cc wrote:
 On Fri, Apr 1, 2011 at 12:39 PM, Glenn Maynard gl...@zewt.org wrote:
 Lastly, some vendors have expressed unwillingness to embed SQLite for
 legal reasons. Embedding other peoples code definitely exposes you to
 risk of copyright and patent lawsuits. While I can't say that I fully
 agree with this reasoning, I'm also not the one that would be on the
 receiving end of a lawsuit. Nor am I a lawyer and so ultimately will
 have to defer to people that know better. In the end it doesn't really
 matter as if a browser won't embed SQLite then it doesn't matter why,
 the end result is that the same SQL dialect won't be available cross
 browser which is bad for the web.

 If SQLite was to be used as a web standard, I'd hope that it wouldn't show
 up in a spec as simply "do what SQLite does", but as a complete spec of
 SQLite's behavior.  Browser vendors could then, if their lawyers insisted,
 implement their own compatible implementation, just as they do with other
 web APIs.  I'd expect large portions of SQLite's test suite to be adaptable
 as a major starting point for spec tests, too.

 Have you read the WebSQL spec?

 Creating such a spec would be a formidable task, of course.

 Indeed. One that the SQL community has failed in doing so far. And
 they have a lot more experience with SQL than we do.

 / Jonas



Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-04 Thread Keean Schupke
On 4 April 2011 22:55, Aryeh Gregor simetrical+...@gmail.com wrote:

 On Fri, Apr 1, 2011 at 2:39 PM, Jonas Sicking jo...@sicking.cc wrote:
  There are several reasons why we don't want to rely exclusively on
  SQLite, other than solely W3C formalia.
 
  First of all, what should we do once the SQLite team releases a new
  version which has some modifications in its SQL dialect? We generally
  always need to embed the latest version of the library since it
  contains critical bug fixes, however SQLite makes no guarantees that
  it will always support exactly the same SQL statements. . . .

 These are good reasons, and I have no problem with them.  SQLite is
 designed with very different compatibility and security needs than the
 web platform has, and its performance goals might be different in some
 respects as well.  There are various ways that you could address this
 short of making up something completely different, but I'm not sure
 whether it would be a good idea.

 Anyway, I didn't intend to reignite this whole discussion.  The
 decision has been made, now we get to see what comes of it.

 On Mon, Apr 4, 2011 at 11:07 AM, Joran Greef jo...@ronomon.com wrote:
  SQLite has a fantastic track record of maintaining backwards
 compatibility.

 SQLite does not face anything close to the compatibility requirements
 that web browsers face.  There are perhaps billions of independent web
 pages, which don't have any control over what browser versions they're
 being run in.  These pages are expected to work in all browsers even
 if they were written ten years ago and no one has looked at them
 since, and even if they were written incompetently.  Just because
 something has an excellent compatibility track record by the standards
 of application libraries doesn't mean it's compatible enough for the
 web.


Something like RelationalDB gives you the power of a relational DB with no
dependence on a specific implementation of SQL, so it would be compatible
enough for the web.  It fixes all the problems with the standardisation of
WebSQL that have been discussed so far, and I think it would face no
technical issues blocking its standardisation.  As a high-level DB API it
does not need all the low-level features of IndexedDB, so its API can be
much simpler and cleaner. RelationalDB can at least be provided as a library
on top of IndexedDB, and it can use WebSQL where it is supported. My concern
with the library approach is performance when implemented on top of
IndexedDB.


Cheers,
Keean.


Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-02 Thread Keean Schupke
Pity.

Anyway RelationalDB defines its API without reference to the underlying SQL
or non-SQL database... So as a candidate for replacing WebSQL, it does not
suffer from that problem.


Cheers,
Keean.


On 2 April 2011 14:56, Glenn Maynard gl...@zewt.org wrote:

 On Sat, Apr 2, 2011 at 5:24 AM, Keean Schupke ke...@fry-it.com wrote:

 In fact, now that BDB supports the SQLite 3.0 API, you can have two
 implementations that conform to the same API. So the original reason for
 abandoning WebSQL no longer seems valid. As there is now more than one
 implementation of the SQLite 3.0 API, it is a de facto (open) standard.


 Based on
 http://download.oracle.com/docs/cd/E17076_02/html/installation/upgrade_11gr2_51_sqlite_ver.html,
 it's not like an independent implementation.

 --
 Glenn Maynard





Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-04-01 Thread Keean Schupke
Hi Shawn

I would be interested in this. What would need to be done to make this a
Firefox plugin? I've done XPCOM stuff before in xulrunner if that's any
help.

Cheers,
Keean
 On Apr 1, 2011 6:09 PM, Shawn Wilsher sdwi...@mozilla.com wrote:
 On 4/1/2011 5:40 AM, Nathan Kitchen wrote:
 Are there any browser vendor representatives on the mailing list who would
 care to comment on the criteria for implementing something akin to Keean's
 RelationalDB (https://github.com/keean/RelationalDB) idea? What would need
 to be in place to start work on such an implementation?
 It wouldn't be terribly difficult to prototype this as an add-on for
 Firefox, I don't think (and I'd be happy to provide technical assistance
 to anyone wishing to do so). Doing this would allow web developers to
 install the add-on and play with it, which can give us useful feedback.

 I'm not saying we'd move it into the tree at that point, but it's a good
 first step to building a case to take it.

 1. Opportunity to explore more solutions to offline data than *just *
 IndexedDB.
 There is also http://dev.w3.org/html5/spec/offline.html and
 http://dev.w3.org/html5/webstorage/ (even if you don't like them, they
 are other solutions to the offline problem). Browser vendors are not
 just looking at IndexedDB.

 2. Many web developers have a working knowledge of SQL, so the concepts
 of a relational database may be more familiar. If adoption could be
 considered a proxy for the success of a standard, I'd suggest that aiming
 for something the web development community understands would be a large
 factor in adoption.
 I don't really think IndexedDB is that dissimilar to a relational
 database. There are a lot of one-to-one mappings of concepts of one to
 the other.

 3. It's probably (!) easier to implement RelationalDB than IndexedDB, as
 it maps fairly cleanly to existing relational database technologies. This
 would allow vendors to implement it using SQLite, Access, etc. independent
 of the spec.
 Given that most vendors already have working implementations of
 IndexedDB, I don't think this is a good argument ;)

 Cheers,

 Shawn



Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
I was the one that asked for callbacks.

 but what do we do if those callbacks don't
 return consistent results? Or even do evil things like modify the
 stores where data is being inserted?

If the callback maps all values to a sort order of '1' there could only ever
be one entry in the index... it's not hard: the callback is passed an
immutable copy of the object and returns a sort order as a binary blob. If
you capture the object store in the closure then of course you could do evil
things as side effects. But that is true in any non-purely-functional
language: you can always do evil things with side effects.
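A minimal sketch of this idea (the callback shape here is hypothetical, not the IndexedDB API): the index callback is a pure function from a record to a sort key, and the degenerate constant-key callback is merely useless, not dangerous.

```javascript
// Sketch: a computed-index callback is a pure function from an
// (immutable) record to a sort key. Hypothetical shape, not a real API.
function ageIndexKey(obj) {
  // Fixed-width decimal string so lexicographic order matches numeric order.
  return String(obj.age).padStart(10, "0");
}

// Degenerate callback: maps every record to the same key "1", so the
// index can only ever hold one distinct entry -- useless, but harmless.
function constantKey(obj) {
  return "1";
}

var a = { name: "ann", age: 9 };
var b = { name: "bob", age: 27 };
console.log(ageIndexKey(a) < ageIndexKey(b));   // true
console.log(constantKey(a) === constantKey(b)); // true
```

As long as the callback stays a pure function of the record, the worst it can do is produce a bad ordering, which is the point made above.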

 In short, I don't think we'll get much further here without a concrete proposal.

Which basically means that nobody working on the current implementations
understands the issues, or that they think the issues are unimportant?

Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 08:38, Joran Greef jo...@ronomon.com wrote:

 On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote:

  We have made an effort to understand other contributions to the field.
 
  I'm not convinced that these are essential database concepts and having
 personally spent quite some time working with the API in JS and implementing
 it, I feel pretty confident that what we have for v1 is pretty solid.  There
 are definitely some things I wouldn't mind re-visiting or looking at closer,
 possibly even for v1, but they all seem reasonable to study further for v2
 as well.
 
  We've spent a lot of time over the last year and a half talking about
 IndexedDB.  But now it's shipping in Firefox 4 and soon Chrome 11.  So
 realistically v1 is not going to change much unless we are convinced that
 what's there is fundamentally broken.
 
  We intentionally limited the scope of v1, which is why we know there'll
 be a v2.  We can't solve all the problems at once, and the difficulty of
 speccing something is typically exponential to the size of the API.
 
  Maybe a constructive way to discuss this would be to look at what use
 cases will be difficult or impossible to achieve with the current design?

 Application-managed indices for starters. I would consider that to be
 essential when designing indexed key/value stores, and I would consider that
 to be the contribution made by almost every other indexed key/value store to
 date. If we have to use IDB the way FriendFeed used MySQL to achieve
 application-managed indices then I would argue that the API is in fact
 fundamentally broken and we would be better off with an embedding of
 SQLite by Mozilla.

 Regarding "the difficulty of speccing something is typically exponential to
 the size of the API": if people want to build a Rube Goldberg device then
 they must deal with the spec issues of that.

 If we were provided with the primitives for an indexed key/value store with
 application-managed indices (as Nikunj suggested at the time), we would have
 been well out of the starting blocks by now, and issues such as computed
 indexes, indexing array values etc. would have been non-issues.

 Summary:

 1. There's a problem.
 2. It can still be fixed with a minimum of fuss.


I totally agree with everything so far...


 3. This requires an adjustment to the putObject and deleteObject interfaces
 (see previous threads).


I disagree that a simple API change is the answer. The problem is
architectural, not just a superficial API issue.


Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 12:41, Joran Greef jo...@ronomon.com wrote:

 On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote:

  I totally agree with everything so far...
 
  3. This requires an adjustment to the putObject and deleteObject
 interfaces (see previous threads).
 
  I disagree that a simple API change is the answer. The problem is
 architectural, not just a superficial API issue.

 Yes, for IndexedDB to be stateless with respect to application schema, one
 would need to:

 1. Provide the application with a first-class means to manage indexes at
 time of putting/deleting objects.
 2. Treat objects as opaque (remove key path, structured clone mechanisms,
 application must provide an id and JSON value to put/delete calls, reduces
 serialization/deserialization overhead where application already has the
 object as a string).
 3. Remove setVersion (redundant, application migrates objects and indexes
 using transactions as it needs to).
 4. Remove createIndex.

 This would rip so much from the spec as to reduce it to a bunch of tatters,
 defining nothing more than an interface for index/key/value primitives in
 terms of well-established interfaces.

 Essentially, we need LocalStorage with asynchronous IO (based on Node's
 callback style), large quota support, and a BTree API. Failing that, a
 decent FileSystem API on which to build these.


Stateless indexes can be provided differently from how you suggest. You can
have a 'validate_index' call that checks that the index exists and creates
it if it does not. It is stateless in the sense that you make the same call
to open an existing index or create a new one; you don't care whether the
database already has it.

In fact you can make SQL stateless by providing a validate_schema call that
succeeds if the schema of the database matches the passed schema, can be
modified with no data loss to match it, or needs to be created.

The RelationalDB wrapper for WebSQL provides this kind of stateless approach
for SQL... you can check it out on GitHub if you like (it's a work in
progress, though):

https://github.com/keean/RelationalDB
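A toy sketch of the validate_index idea against a plain in-memory store (all names here are illustrative, not a real API): the same call opens the index if it exists and builds it if it does not, so the caller never has to track state.

```javascript
// Sketch of 'validate': open-or-create in one call, so the caller never
// needs to know whether the index already exists. The store is an
// in-memory stand-in; names are illustrative only.
function validateIndex(db, name, keyFn) {
  if (!db.indexes[name]) {
    // Index missing: build it from the existing records.
    var idx = {};
    Object.keys(db.records).forEach(function (id) {
      idx[keyFn(db.records[id])] = id;
    });
    db.indexes[name] = idx;
  }
  return db.indexes[name];
}

var db = { records: { r1: { email: "a@x.com" } }, indexes: {} };

// First call creates the index; a second call just returns it.
var byEmail = validateIndex(db, "byEmail", function (r) { return r.email; });
console.log(byEmail["a@x.com"]);                      // 'r1'
console.log(validateIndex(db, "byEmail") === byEmail); // true
```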


Cheers,
Keean.


Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-03-31 Thread Keean Schupke
No real reason - just trying to implement a minimal framework. Date objects
would be a definite must-have going forward.

I was interested in trying to get something like this standardised, as I
believe it has none of the issues that stopped WebSQL, since it defines a
complete relational API independent of the implementation of SQL behind it.

The key thing is to get the browser implementors interested in implementing
it. If even one of the main browser implementors is not interested in
implementing it, then it will suffer the same fate as WebSQL.

Independent of standardisation (which I would like), I intend to try to
implement the same API on top of WebSQL and IndexedDB as a library, so
people are free to use the backend with the best performance without
changing their code. It aims to be as stateless as possible, and to
implement relational algebra on relations.
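Such a library could select its backend with a simple feature test. A sketch (the adapter names are hypothetical, not from RelationalDB):

```javascript
// Sketch of backend selection for a library targeting both WebSQL and
// IndexedDB: feature-detect once, then route every call through the
// chosen adapter. Adapter names are hypothetical.
function pickBackend(global) {
  if (typeof global.openDatabase === "function") return "websql";
  if (global.indexedDB) return "indexeddb";
  return "memory"; // fallback so the library still works in tests
}

// Simulated environments (a real library would pass `window`).
console.log(pickBackend({ openDatabase: function () {} })); // 'websql'
console.log(pickBackend({ indexedDB: {} }));                // 'indexeddb'
console.log(pickBackend({}));                               // 'memory'
```

Application code would only ever see the library's relational API; which adapter was picked is an internal detail.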


Cheers,
Keean.


On 31 March 2011 15:54, Nathan Kitchen w...@nathankitchen.com wrote:

 That's nice, pretty much what I was thinking but somewhat more complete : )
 Is there not a w3 group progressing something like this? And if not, who
 would need to be lobbied to get one started?!

 As an aside, I note you didn't implement date as a supported data type.
 Was that a conscious decision, and if so what was the reasoning behind it?

 N


 On 31 March 2011 16:33, Keean Schupke ke...@fry-it.com wrote:

 Have a look at my RelationalDB API

 https://github.com/keean/RelationalDB

 In particular examples/candy.html

 A lot of work went into the underlying concepts - it's work originally
 published by myself and others at the 2004 Haskell Workshop, and follows on
 from HaskellDB (which was the original inspiration behind C#'s LINQ
 functionality).

 It implements the relational-algebra operators as methods that operate on
 relation objects.

 Let me know what you think.


 Cheers,
 Keean.


 On 31 March 2011 15:19, Nathan Kitchen w...@nathankitchen.com wrote:

 Hi.

 I've been watching discussions on IndexedDB for a while now, and wondered
 if anyone would mind spending a few moments to explain how IndexedDB is
 related (or not) to WebSQL. Is IndexedDB seen as replacing the functionality
 originally offered by WebSQL? If not, are there any plans to make a
 cross-platform variant of Web SQL?

 If (?) most web developers know SQL, is there a case to be made for
 abstracting SQL into JSON/JavaScript rather than moving to IndexedDB
 document storage? Reasons for asking this:

 - Many of the posts appearing to come from the dev community rather than
   W3C seem to expect more SQL-esque functionality from IndexedDB. If the
   enthusiasts who get involved enough to post to the board are expecting a
   SQL/query-type experience, maybe there is a driver for a native database
   API supporting this.
 - Several people have noted that third-party frameworks could implement
   this functionality. This might be a daft question, but isn't it easier
   to implement an "IndexedDB-like" framework on top of WebSQL than a
   "WebSQL-like" framework on top of IndexedDB (overuse of quotes to
   indicate the general concept)?

 I had a ponder on how I'd like to see such a framework implemented (in
 both Access & SQLite :p ), and came up with a stack of pseudo-code below in
 my lunch break. Might make an interesting discussion point. It's not really
 IndexedDB, it's WebSQL v2. Or maybe WebJSQL or something. I'd be really
 interested to understand what advantages IndexedDB has over an
 implementation like the one below though.

 // DATABASE
 // First, open a database with the specified name. The number at the
 // end denotes the version of the specification that the application
 // plans to use. This allows forward-compatibility with vNext.
 var db = window.openDatabase("shoppinglist", "1.0");

 // MIGRATIONS
 // Next, create some migrations. These are predefined structures which
 // are validated by the browser database engine. A migration consists
 // of two actions: one up, one down. Each action specifies some
 // operations and parameters. It's up to the browser database to read
 // these and perform the appropriate action, as defined in the spec.

 // Other actions may include a batch add for static data. It could
 // also be valid to have key and index creation and removal as separate
 // actions.

 // SHOPPING TRIP
 var createTripTableAction = {
   action: "create-table",
   params: {
     name: "trip",
     columns: [
       { name: "id", type: "whole-number", primaryKey: true },
       { name: "name", type: "string", length: 32,
         regex: "[A-Z]{1,32}" } // regex: wouldn't that be nice...
     ],
     indexes: [
       {
         columns: [
           { name: "name", type: "full-text" }
         ]
       } // More indexes here if required
     ]
   }
 };

 var removeTripTableAction = {
   action: "remove-table",
   params: {
     name: "shopping",
     cleardata: true
   }
 };


 // SHOPPING
 var

Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.com wrote:

 On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote:
  On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote:
 
  I previously have asked for a detailed proposal, but so far you have
  not supplied one but instead keep referring to other unnamed database
  APIs.
 
  I have already provided an adequate interface proposal for putObject
 and deleteObject.

 That is hardly a comprehensive proposal, but rather just one small part
 of it.


 I wanted to make a few comments about these points :-



 I do really think the idea of not having the implementation keep track
 of the set of indexes for a objectStore is a really interesting one.
 As is the idea of not even having a set set of objectStores. However,
 there are several problems that needs to be solved. In particular how
 do you deal with collations?


 no indexes, no object stores... well I for one prefer the
 validate_object_store / validate_index approach, in that it can hide
 statefulness if necessary (like I do with RelationalDB) whilst presenting a
 stateless API. It also keeps the size of the put statements down.



 I.e. we have concluded that there are important use cases which
 require using different collations for different indexes and
 objectStores. Even for different indexes attached to the same
 objectStore.

 Additionally, if we're getting rid of setVersion, how do we expect
 pages dealing with the (application managed) schema changing while the
 page has a connection open to the database?


 1 - there is no schema
 2 - don't allow it to change whilst the database is open

 In reality a schema is implicitly tied to a code version. In other words,
 the source code of the application assumes a certain schema. If the assumed
 schema and the schema in the DB do not match, things are going to go very
 wrong very quickly. Schema changes _always_ accompany code changes
 (otherwise they are not schema changes, just data changes). As such they
 never happen while a DB is open. The way I handle this in RelationalDB, by
 validating the actual schema against the source-code schema on db-open
 (the method is actually called validate), is probably the best way to
 handle this. If the database does not exist, we create it according to the
 schema. If it exists, we check that it matches the schema. If there is a
 difference, we see if we can 'upgrade' the database automatically (certain
 changes, like adding a new column with a default value, can be done
 automatically); if we cannot automatically upgrade, we exit with an error -
 as allowing the program to run would corrupt the data already in the
 database. At that point it is up to the application to figure out how to
 upgrade the database (by opening one database with the old schema and
 another with the new schema)... There is no point in ever allowing a
 database to be opened with the wrong schema.


 So pretty please, with sugar on top, please come up with a proposal
 for the full API rather than bits and pieces.

 And I should mention that I have as an absolute requirement that you
 should be able to specify collation by simply saying that you want to
 use en-US or sv-SV sorting. Using callbacks or other means is ok
 *in addition to this*, but callback mechanisms tend to be a lot more
 complex since they have to deal with the callback doing all sorts of
 evil things such as returning inconsistent results (think return
 Math.random()), or simply do evil things like navigate the current
 page, deleting the database, or modifying the record that is in the
 process of being stored.


 The core API only needs to deal with sorting binary-blob sort orders. A
 library wrapper could provide all the collation-ordering goodness that
 people want. For example, RelationalDB will have to deal with sort orders;
 it does not need the browser to provide that functionality. In fact,
 browser-provided functionality may limit what can be done in libraries on top.


 This is difficult if not impossible to do.  See previous threads on the
 matter.

 J


I can find a lot of stuff on collation, but not a lot about why it could not
be done in a library. Could you summarise for me the reasons why this needs
to be core functionality?

A library could choose to use an object store as metadata to store the
collation orders that it is using for its various indexes, for example.
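A sketch of how a library might keep such metadata (the layout is purely illustrative): per index, it records which collation was used to build the sort keys, in an ordinary store of its own, defaulting to raw binary order when nothing is recorded.

```javascript
// Sketch: a library records, per index, which collation it used to
// build the sort keys, in a metadata store of its own. Names and
// layout are illustrative, not a spec.
function rememberCollation(metaStore, indexName, locale) {
  metaStore[indexName] = { collation: locale };
}

function collationOf(metaStore, indexName) {
  var m = metaStore[indexName];
  return m ? m.collation : "binary"; // default: raw binary sort order
}

var meta = {}; // stand-in for a dedicated metadata object store
rememberCollation(meta, "byName", "en-US");
console.log(collationOf(meta, "byName")); // 'en-US'
console.log(collationOf(meta, "byDate")); // 'binary'
```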


Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
On 31 March 2011 18:36, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Mar 31, 2011 at 11:24 AM, Keean Schupke ke...@fry-it.com wrote:

 On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.comwrote:

 On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com
 wrote:
  On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote:
 
  I previously have asked for a detailed proposal, but so far you have
  not supplied one but instead keep referring to other unnamed
 database
  APIs.
 
  I have already provided an adequate interface proposal for putObject
 and deleteObject.

 That is hardly a comprehensive proposal, but rather just one small part
 of it.


 I wanted to make a few comments about these points :-



 I do really think the idea of not having the implementation keep track
 of the set of indexes for a objectStore is a really interesting one.
 As is the idea of not even having a set set of objectStores. However,
 there are several problems that needs to be solved. In particular how
 do you deal with collations?


 no indexes, no object stores... well I for one prefer the
 validate_object_store, validate_index approach, in that it can hide
 statefullness if necessary (like I do with RelationalDB) whilst presenting 
 a
 stateless API. It also keeps the size of the put statements down.



 I.e. we have concluded that there are important use cases which
 require using different collations for different indexes and
 objectStores. Even for different indexes attached to the same
 objectStore.

 Additionally, if we're getting rid of setVersion, how do we expect
 pages dealing with the (application managed) schema changing while the
 page has a connection open to the database?


 1 - there is no schema
 2 - dont allow it to change whilst the database is open

 In reality a schema is implicitly tied to a code version. In other words
 the source code of the application assumes a certain schema. If the assumed
 schema and the schema in the DB do not match things are going to go very
 wrong very quickly. Schema changes _always_ accompany code changes
 (otherwise they are not schema changes just data changes). As such they
 never happen when a DB is open. The way I handle this in RelationalDB, by
 validating the actual schema against the source-code schema in the db-open
 (actually the method is called validate), is probably the best way to 
 handle
 this. If the database does not exist we create it according to the schema.
 If it exists we check it matches the schema. If there is a difference we 
 see
 if we can 'upgrade' the database automatically (certain changes like adding
 a new column with a default value can be done automaticall), if we cannot
 automaticall upgrade, we exit with an error - as allowing the program to 
 run
 will result in corruption of the data already in the database. At this 
 point
 it is up to the application to figure out how to upgrade the database (by
 opening one database with an old schema and another with a new schema)...
 There is not point in ever allowing a database to be opened with the wrong
 schema.


 So pretty please, with sugar on top, please come up with a proposal
 for the full API rather than bits and pieces.

 And I should mention that I have as an absolute requirement that you
 should be able to specify collation by simply saying that you want to
 use en-US or sv-SV sorting. Using callbacks or other means is ok
 *in addition to this*, but callback mechanisms tend to be a lot more
 complex since they have to deal with the callback doing all sorts of
 evil things such as returning inconsistent results (think return
 Math.random()), or simply do evil things like navigate the current
 page, deleting the database, or modifying the record that is in the
 process of being stored.


 The core API only needs to deal with sorting binary-blob sort orders. A
 library wrapper could provide all the collation ordering goodness that
 people want. For example RelationalDB will have to deal with sorting 
 orders,
 it does not need the browser to provide that functionality. In fact browser
 provided functionality may limit what can be done in libraries on top.


 This is difficult if not impossible to do.  See previous threads on the
 matter.

 J


 I can find a lot of stuff on collation, but not a lot about why it could
 not be done in a library. Could you summerise the reasons why this needs to
 be core functionality for me?


 Sorry, but that stuff is paged out of my brain.  Pablo, can you?


 A library could chose to use an object store as meta-data to store the
 collation orders that it is using for various indexes for example.


 Cheers,
 Keean.




Thanks, that would help me understand. As long as there is a way to turn
default collation off and just have a binary string sort order, that's fine
for my needs.


Cheers,
Keean.


Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?

2011-03-31 Thread Keean Schupke
On 31 March 2011 19:08, Joran Greef jo...@ronomon.com wrote:

  This is painful to read.  WebSQL development died because SQLite, the
 most widely-deployed database software in the world, was too good?  That
 sounds like a catastrophic failure of the W3C process.
 
  --
  Glenn Maynard

 Hear.

 I am starting to think that Mozilla will step up and provide an embedding
 of SQLite, even if it has to only think of it as such. It will have to.

 People would rather use a working database than something crippled albeit
 specced (see LocalStorage or IndexedDB).

 It was things like XHR in all their unspecced glory that brought the web to
 where it is today.


Do you want to take a look at my RelationalDB library? It could form the
basis of a replacement for WebSQL, and as it is based on relational algebra
rather than SQL, it has no user-visible dependencies on the particular SQL
implementation.

For a usage example, have a look at:
https://github.com/keean/RelationalDB/blob/master/examples/candy.html

This should run in Chrome right now (using WebSQL as a backend).

I would appreciate any thoughts, comments etc.


Cheers,
Keean.


Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque

2011-03-31 Thread Keean Schupke
 Currently there are no APIs in JavaScript to compare strings using
specific collations

We don't actually need this, just a mapping from a UTF-16 string to a
sort score (binary blob).

It's true that downloading the collation tables might take time, so we could
just provide:

var blob = string_to_score('utf-16 string', 'en-US');

as a built-in function to make this efficient.
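To make the idea concrete, here is a toy sketch of how such a built-in might behave. The name string_to_score comes from the proposal above, but the folding rules are illustrative only; a real implementation would use full locale collation tables rather than the simple case/accent folding shown here.

```javascript
// Hypothetical sketch: string_to_score maps a string to a sort key such that
// plain binary comparison of the keys matches the desired collation order.
// Real collation needs full locale tables; this toy stand-in only folds case
// and strips combining accents (the locale argument is unused here).
function string_to_score(str, locale) {
  return str.normalize('NFD')                 // decompose accented characters
            .replace(/[\u0300-\u036f]/g, '')  // drop combining accent marks
            .toLowerCase();                   // fold case
}

// Binary comparison of the scores now ignores case and accents:
var words = ['Zebra', 'apple', 'Éclair'];
words.sort(function (a, b) {
  var ka = string_to_score(a, 'en-US');
  var kb = string_to_score(b, 'en-US');
  return ka < kb ? -1 : ka > kb ? 1 : 0;
});
// words is now ['apple', 'Éclair', 'Zebra']
```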

I agree with the other points though.


Cheers,
Keean.


On 31 March 2011 22:38, Pablo Castro pablo.cas...@microsoft.com wrote:


 From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy
 Orlow
 Sent: Thursday, March 31, 2011 11:36 AM

  I can find a lot of stuff on collation, but not a lot about why it could
 not be done in a library. Could you summarise the reasons why this needs to
 be core functionality for me?
 
  Sorry, but that stuff is paged out of my brain.  Pablo, can you?
  
  A library could choose to use an object store as meta-data to store the
 collation orders that it is using for various indexes for example.

 - Currently there are no APIs in JavaScript to compare strings using
 specific collations. There are folks that are looking into this, but it will
 need time.
 - I'm far from an expert in the topic, but from talking to folks that
 understand this well it seems that to actually implement this entirely in
 JavaScript it would mean you have to download collation tables and apply
 them as needed in callbacks. Not only this means a hit in download size/time
 for the app but also that callbacks have to either download stuff or inline
 collation rules/tables in the callback itself.
 - In pure practical terms, I suspect the 80% scenario can be covered by
 implementing this natively, having it be fast and simple to use for common
 cases. Not pushing back on the callback stuff, just saying that I find it
 valuable to have users simply say en-US and get what they wanted.
 - Also from the practical perspective, simple cases that don't require the
 flexibility can avoid having to take care of making the callbacks perfectly
 consistent even as you roll out updates that may hit only some of the
 pages, use components written by someone else, etc.
 - By default we would still do binary collation (there was a question in
 the thread, I forget exactly where).

 Thanks
 -pablo




Re: [Bug 12321] New: Add compound keys to IndexedDB

2011-03-18 Thread Keean Schupke
I like BDB's solution. You have one primary key you cannot mess with (say an
integer for fast comparisons); you can then add any number of secondary
indexes. With a secondary index there is a callback to generate a binary
blob that is used for indexing. The callback has access to all the fields of
the object plus any info in the closure and can use that to generate the
index data any way it likes.

This has the advantage of supporting any indexing schemes the user may wish
to implement (by writing a custom callback), whilst allowing a few common
options to be provided for the user (say a hash of all fields, or a field
name, international char set, and direction captured in a closure). The user
gets the power, the core implementation is simple, and common cases can be
implemented in an easy to use way.

var lex_order = function(field, charset, direction) {
  return function(object) {
    /* map indexed 'field' to a blob in the required order */
    return key;
  };
};

Then create a new index:

object_store.validate_index(1, lex_order('name', 'us', 'ascending'))
  .on_done(function(status) {/* status ok or error */});

validate_index checks whether the requested secondary index (1) exists. If
it does not, it creates the index and calls the done callback (with a status
code indicating successful creation); if it does exist and passes some
validation checks, it also calls the done callback (with a status code
indicating successful validation). If anything goes wrong with either the
creation or validation of the secondary index, it calls the done callback
with an error status code.
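A minimal in-memory sketch of this scheme (hypothetical API, not the IndexedDB spec): an integer primary key plus secondary indexes whose sort keys come from user-supplied callbacks. The names validate_index and lex_order, and the synchronous status return, are illustrative stand-ins for the asynchronous API sketched above.

```javascript
// Order index entries by their generated sort key.
function byKey(a, b) {
  return a.key < b.key ? -1 : a.key > b.key ? 1 : 0;
}

function ObjectStore() {
  this.records = {};   // primary key -> object
  this.nextKey = 1;
  this.indexes = {};   // index id -> { keyFn, entries }
}

// Create the secondary index if missing, building its entries from all
// existing records; a stand-in for the asynchronous validate_index above.
ObjectStore.prototype.validate_index = function (id, keyFn) {
  if (!this.indexes[id]) {
    var entries = [];
    for (var pk in this.records) {
      entries.push({ key: keyFn(this.records[pk]), pk: Number(pk) });
    }
    entries.sort(byKey);
    this.indexes[id] = { keyFn: keyFn, entries: entries };
  }
  return { status: 'ok' };
};

// Store an object under a fresh integer primary key and update every
// secondary index via its callback.
ObjectStore.prototype.put = function (object) {
  var pk = this.nextKey++;
  this.records[pk] = object;
  for (var id in this.indexes) {
    var idx = this.indexes[id];
    idx.entries.push({ key: idx.keyFn(object), pk: pk });
    idx.entries.sort(byKey);
  }
  return pk;
};

// A lex_order factory like the one above, capturing the field in a closure
// (real code would map through proper collation tables, not toLowerCase):
var lex_order = function (field) {
  return function (object) { return String(object[field]).toLowerCase(); };
};

var store = new ObjectStore();
store.put({ name: 'Zysk' });
store.put({ name: 'andersson' });
store.validate_index(1, lex_order('name'));
// The index now lists andersson (pk 2) before Zysk (pk 1).
```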


Cheers,
Keean.


On 18 March 2011 02:03, Jeremy Orlow jor...@chromium.org wrote:

 Here's one ugliness with A: There's no way to specify ascending
 or descending for the individual components of the key.  So there's no way
 for me to open a cursor that looks at one field ascending and the other
 field descending.  In addition, I can't think of any easy/good ways to hack
 around this.

 Any thoughts on how we could address this use case?

 J

 On Wed, Mar 16, 2011 at 4:50 PM, bugzi...@jessica.w3.org wrote:

 http://www.w3.org/Bugs/Public/show_bug.cgi?id=12321

   Summary: Add compound keys to IndexedDB
   Product: WebAppsWG
   Version: unspecified
  Platform: PC
OS/Version: All
Status: NEW
  Severity: normal
  Priority: P2
 Component: Indexed Database API
AssignedTo: dave.n...@w3.org
ReportedBy: jor...@chromium.org
 QAContact: member-webapi-...@w3.org
CC: m...@w3.org, public-webapps@w3.org


 From the thread "[IndexedDB] Compound and multiple keys" by Jonas
 Sicking,
 we're going to go with both options A and B.

 =

 Hi IndexedDB fans (yay!!),

 Problem description:

 One of the current shortcomings of IndexedDB is that it doesn't
 support compound indexes. I.e. indexing on more than one value. For
 example it's impossible to index on, and therefore efficiently search
 for firstname and lastname in an objectStore which stores people. Or
 index on to-address and date sent in an objectStore holding emails.

 The way this is traditionally done is that multiple values are used as
 key for each individual entry in an index or objectStore. For example
 the CREATE INDEX statement in SQL can list multiple columns, and
 CREATE TABLE statement can list several columns as PRIMARY KEY.

 There have been a couple of suggestions how to do this in IndexedDB

 Option A)
 When specifying a key path in createObjectStore and createIndex, allow
 an array of key-paths to be specified. Such as

 store = db.createObjectStore("mystore", ["firstName", "lastName"]);
 store.add({firstName: "Benny", lastName: "Zysk", age: 28});
 store.add({firstName: "Benny", lastName: "Andersson", age: 63});
 store.add({firstName: "Charlie", lastName: "Brown", age: 8});

 The records are stored in the following order
 Benny, Andersson
 Benny, Zysk
 Charlie, Brown

 Similarly, createIndex accepts the same syntax:
 store.createIndex("myindex", ["lastName", "age"]);

 Option B)
 Allowing arrays as an additional data type for keys.
 store = db.createObjectStore("mystore", "fullName");
 store.add({fullName: ["Benny", "Zysk"], age: 28});
 store.add({fullName: ["Benny", "Andersson"], age: 63});
 store.add({fullName: ["Charlie", "Brown"], age: 8});

 Also allows out-of-line keys using:
 store = db.createObjectStore("mystore");
 store.add({age: 28}, ["Benny", "Zysk"]);
 store.add({age: 63}, ["Benny", "Andersson"]);
 store.add({age: 8}, ["Charlie", "Brown"]);

 (the sort order here is the same as in option A).

 Similarly, if an index used a keyPath which points to an
 array, this would create an entry in the index which used a compound
 key consisting of the values in the array.
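Under either option, the stored sort order comes down to comparing compound keys component by component; a sketch of that comparison:

```javascript
// Element-wise compound-key comparison; a shorter key that is a prefix of a
// longer one sorts first.
function compareCompoundKeys(a, b) {
  var n = Math.min(a.length, b.length);
  for (var i = 0; i < n; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

var keys = [['Benny', 'Zysk'], ['Charlie', 'Brown'], ['Benny', 'Andersson']];
keys.sort(compareCompoundKeys);
// Sorted: [Benny, Andersson], [Benny, Zysk], [Charlie, Brown], matching the
// order given in the example above.
```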

 There are of course advantages and disadvantages with both options.

 Option A advantages:
 * Ensures that at objectStore/index creation time the number of keys
 are known. This allows the implementation to create and optimize the
 index using this 

Re: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Keean Schupke
See my proposal in another thread. The basic idea is to copy BDB. Have a
primary index that is based on an integer, something primitive and fast.
Allow secondary indexes which use a callback to generate a binary index key.
IDB shifts the complexity out into a library. Common use cases can be
provided (a hash of all fields in the object, internationalised
bidirectional lexicographic etc...), but the user is free to write their own
for less usual cases (for example indexing by the last word in a name string
to order by surname).


Cheers,
Keean.


On 18 March 2011 02:19, Jonas Sicking jo...@sicking.cc wrote:

 2011/3/17 Pablo Castro pablo.cas...@microsoft.com:
 
  From: Jonas Sicking [mailto:jo...@sicking.cc]
  Sent: Tuesday, March 08, 2011 1:11 PM
 
  All in all, is there anything preventing adding the API Pablo suggests
  in this thread to the IndexedDB spec drafts?
 
  I wanted to propose a couple of specific tweaks to the initial proposal
 and then unless I hear pushback start editing this into the spec.
 
  From reading the details on this thread I'm starting to realize that
 per-database collations won't do it. What did it for me was the example that
 has a fuzzier matching mode (case/accent insensitive). This is exactly the
 kind of index I would want to sort people's names in my address book, but
 most likely not the index I'll want to use for my primary key.
 
  Refactoring the API to accommodate for this would mean to move the
 setCollation() method and the collation property to the object store and
 index objects. If we were willing to live without the ability to change them
 we could take collation as one of the optional parameters to
 createObjectStore()/createIndex() and reduce a bit of surface area...

 Unfortunately I think you bring up good use cases for
 per-objectStore/index collations. It's definitely tempting to just add
 it as a optional parameter to createObjectStore/createIndex. The
 downside is obviously pushing more complexity onto web developers.
 Complexity which will be duplicated across sites.

 However there is another problem to consider here. Can switching
 collation on an objectStore or a unique index affect its validity?
 I.e. if you switch from a case sensitive to a case insensitive
 collation, does that mean that if you have two entries with the
 primary keys "Sweden" and "sweden", they collide and thus the change of
 collation must result in an error (or aborted transaction)?

 I do seem to recall that there are ways to do at least case
 sensitivity such that you generally don't take case into account when
 sorting, unless two entries are exactly the same, in which case you do
 look at casing to differentiate them. However I don't really know a
 whole lot about this and so defer to people that know
 internationalization better.
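The scheme Jonas describes can be sketched as a comparator that ignores case for ordering but falls back to a binary comparison when the case-folded forms are equal (a simplification of real collation tailoring):

```javascript
// Compare case-insensitively first; only when the case-folded forms are
// equal fall back to a binary comparison, so "Sweden" and "sweden" stay
// distinct but adjacent rather than colliding.
function tiebreakCompare(a, b) {
  var la = a.toLowerCase(), lb = b.toLowerCase();
  if (la < lb) return -1;
  if (la > lb) return 1;
  return a < b ? -1 : a > b ? 1 : 0;
}

var names = ['sweden', 'Sweden', 'Norway'];
names.sort(tiebreakCompare);
// names is now ['Norway', 'Sweden', 'sweden']
```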

  I don't have a strong preference there. In any case both would use BCP47
 names as discussed in this thread (as Jonas pointed out, implementations can
 also do their thing as long as they don't interfere with BCP47).
 
  Another piece of feedback I heard consistently as I discussed this with
 various folks at Microsoft is the need to be able to pick up what the UA
 would consider the collation that's most appropriate for the user
 environment (derived from settings, page language or whatever). We could
 support this by introducing a special value that  you can pass to
 setCollation that indicates pick whatever is the right for the
 environment's language right now. Given that there is no other way for
 people to discover the user preference on this, I think this is pretty
 important.

 I would be fine with this as long as it's an explicit opt-in. There is
 definitely a risk that people will do this and then only do testing in
 one language, but it seems to me like a useful use case to support,
 and I don't see a way of supporting this while completely avoiding the
 risk of internationalization bugs.

 / Jonas




Re: [IndexedDB] Spec changes for international language support

2011-03-18 Thread Keean Schupke
On 18 March 2011 19:29, Pablo Castro pablo.cas...@microsoft.com wrote:


 From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com]
 On Behalf Of Keean Schupke
 Sent: Friday, March 18, 2011 1:53 AM

  See my proposal in another thread. The basic idea is to copy BDB. Have a
 primary index that is based on an integer, something primitive and fast.
 Allow secondary indexes which use a callback to generate a binary index key.
 IDB shifts the complexity out into a library. Common use cases can be
 provided (a hash of all fields in the object, internationalised
 bidirectional lexicographic etc...), but the user is free to write their own
 for less usual cases (for example indexing by the last word in a name string
 to order by surname).

 I agree with Jeremy's comments on the other thread for this. Having the
 callback mechanism definitely sounds interesting but there are a ton of
 common cases that we can solve by just taking a language identifier, I'm not
 sure we want to make people work hard to get something that's already
 supported in most systems. The idea of having a callback to compute the
 index value feels incremental to this, so we could take on it later on
 without disrupting the explicit international collation stuff.


The idea would be to provide pre-defined implementations of the callback for
common use cases; then it is just as simple to register a callback as to set
any other option. All this means to the API is that you pass a function
instead of a string. It is also better for modularity, as all the code
relating to the sort order is kept in the callback functions.

The difference comes down to something like:

index.set_order_lexicographic('us');

vs

index.set_order_method(order_lexicographic('us'));

So rather than just setting a property as in the first case (where
presumably all the ordering code is mixed in with the indexing code), the
second case encapsulates all the ordering code in the function returned by
order_lexicographic('us'). This function would represent a mapping from the
object being indexed to a binary blob that is the actual stored index data.

So doing it this way does not necessarily make things harder, and it
improves the encapsulation, type-safety, and flexibility of the API.
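A sketch of the shape of the callback form (the names set_order_method and order_lexicographic are hypothetical, as above): the factory captures its options in a closure and returns the mapping function the index stores.

```javascript
// With the callback form, all ordering logic lives in the function returned
// by the factory.
function order_lexicographic(locale) {
  // Returns a mapping from the indexed object to its stored sort key.
  // (A real version would produce a properly collated binary key.)
  return function (object) {
    return object.name.toLocaleLowerCase(locale);
  };
}

// A stand-in index object exposing only set_order_method:
var index = {
  set_order_method: function (fn) { this.keyFor = fn; }
};

index.set_order_method(order_lexicographic('en-US'));
var key = index.keyFor({ name: 'Ångström' }); // key comes purely from the callback
```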


Cheers,
Keean.


Re: [IndexedDB] Compound and multiple keys

2011-03-09 Thread Keean Schupke
Getting pgsql people involved sounds a great idea. Having some more people
to argue for formalised and standardised database APIs like SQL, with
experience of relational operations and optimisation, would be good (that
is an assumption on my part, but then they are writing PostgreSQL, not
CouchDB). Do you know some people you could invite?

More generally though, I think BerkeleyDB would make a much better target
for IDB. I don't think it should be trying to be PostgreSQL or MySQL. I
think IDB should implement a good low-level API like BerkeleyDB, with enough
functionality to allow SQL to be implemented on top.

The problem with trying to implement IDB on top of PostgreSQL is that IDB
has a very narrow interface, that does not support any of the powerful
features of pgsql. It would give you the worst of both. BDB would make a
much better implementation.

Far more sensible would be to target the feature set of BDB for IDB, then
PostgreSQL could be re-implemented in JavaScript on top (a massive and
impractical task, but I am trying to express the relationship between high
level and low level database APIs).


If we wanted to go fully relational, and avoid the correctness problems with
string processing SQL commands, take a look at my relational library,
currently implemented on top of WebSQL but an IDB version is in the works:
https://github.com/keean/RelationalDB


Cheers,
Keean.


On 9 March 2011 04:10, Charles Pritchard ch...@jumis.com wrote:

  On 3/8/2011 6:12 PM, Jeremy Orlow wrote:

 On Tue, Mar 8, 2011 at 5:55 PM, Pablo Castro 
 pablo.cas...@microsoft.comwrote:


 From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org]
 On Behalf Of Keean Schupke
 Sent: Tuesday, March 08, 2011 3:03 PM

  No objections here.
 
  Keean.
 
  On 8 March 2011 21:14, Jonas Sicking 
  jo...@sicking.ccjo...@sicking.ccwrote:
  On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow jor...@chromium.org
 wrote:
   On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow jor...@chromium.org
 wrote:

   After thinking about it a bunch and talking to others, I'm actually
 leaning
   towards both option A and B.  Although this will be a little harder
 for
   implementors, it seems like there are solid reasons why some users
 would
   want to use A and solid reasons why others would want to use B.
   Any objections to us going that route?
  Not from me. If I don't hear objections I'll write up a spec draft and
  attach it here before committing to the spec.

  Option A is pretty well understood, I like that one.

 For option B, at some point we had a debate on whether when indexing an
 array value we should consider it a single key value or we should unfold it
 into multiple index records. The first option makes it very similar to A in
 that an array is just a composite value (it is quite a bit more painful to
 implement...), the second option is interesting in that it allows for new
 scenarios such as objects with an array for tags, where you want to look up
 by tag (even after doing options A and B as currently defined, in order to
 support multiple tags you'd need a second store that keeps the tags + key
 for the objects you want to tag). Is there any interest in that scenario?


  Yes.  Once we're settled on this, I'm going to send an email on that.
  :-)  Option b won't get in the way of my proposal.

  J


 At some point, I really would like to get people from the PostgreSQL
 project involved with indexeddb.

 They have a wealth of experience to bring to the discussion. For the
 moment, like many server-side packages, they're at quite a distance from
 the w3.

 FWIW, pgsql is a perfectly valid 'host' for idb calls.





Re: [IndexedDB] Compound and multiple keys

2011-03-09 Thread Keean Schupke
I have already said I have no specific concerns regarding this change. It's
difficult to predict problems that will emerge when people actually try to
use an API. That's why there are so many bad APIs out there. One way to
mitigate this risk is to look at well-used existing APIs (in languages like
'c') to see what works well. Many people often write different APIs for the
same task, and the best win. I would look to existing winners (like BDB) for
guidance on the total API, as due to the standardisation process (and the
nature of web browsers) there is no opportunity for competition to choose
the best API. It would be nice if Node.js were more advanced; then there
might be many database API implementations in JavaScript we could look at to
see which are preferred, and use as a starting point.

Looking at the requirements for IDB, BerkeleyDB would seem to be an ideal
candidate from which to port the API; it's popular, widely used, has stood
the test of time, and is easy to use, and would be even easier in JavaScript
with garbage collection.


Cheers,
Keean.


On 9 March 2011 09:41, Jeremy Orlow jor...@chromium.org wrote:

 Keean/Charles:

 I definitely think the more people involved the better, but let's not get
 too hung up on the specifics of PostgreSQL, BDB, etc.  Our goal here should
 be to make a great API for web developers while balancing practical
 considerations like how difficult it'll be to implement and/or use
 efficiently.

 That said, I'm not understanding what your comments have to do with this
 proposal.  Do you have specific concerns?

 J


 On Wed, Mar 9, 2011 at 12:55 AM, Keean Schupke ke...@fry-it.com wrote:

 Getting pgsql people involved sounds a great idea. Having some more people
 to argue for formalised and standardised database APIs like SQL, with
 experience of relational operations and optimisation, would be good (that
 is an assumption on my part, but then they are writing PostgreSQL, not
 CouchDB). Do you know some people you could invite?

 More generally though, I think BerkeleyDB would make a much better target
 for IDB. I don't think it should be trying to be PostgreSQL or MySQL. I
 think IDB should implement a good low-level API like BerkeleyDB, with enough
 functionality to allow SQL to be implemented on top.

 The problem with trying to implement IDB on top of PostgreSQL is that IDB
 has a very narrow interface, that does not support any of the powerful
 features of pgsql. It would give you the worst of both. BDB would make a
 much better implementation.

 Far more sensible would be to target the feature set of BDB for IDB, then
 PostgreSQL could be re-implemented in JavaScript on top (a massive and
 impractical task, but I am trying to express the relationship between high
 level and low level database APIs).


 If we wanted to go fully relational, and avoid the correctness problems
 with string processing SQL commands, take a look at my relational library,
 currently implemented on top of WebSQL but an IDB version is in the works:
 https://github.com/keean/RelationalDB


 Cheers,
 Keean.


 On 9 March 2011 04:10, Charles Pritchard ch...@jumis.com wrote:

  On 3/8/2011 6:12 PM, Jeremy Orlow wrote:

 On Tue, Mar 8, 2011 at 5:55 PM, Pablo Castro pablo.cas...@microsoft.com
  wrote:


 From: public-webapps-requ...@w3.org [mailto:
 public-webapps-requ...@w3.org] On Behalf Of Keean Schupke
 Sent: Tuesday, March 08, 2011 3:03 PM

  No objections here.
 
  Keean.
 
  On 8 March 2011 21:14, Jonas Sicking 
  jo...@sicking.ccjo...@sicking.ccwrote:
  On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow jor...@chromium.org
 wrote:
   On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow jor...@chromium.org
 wrote:

   After thinking about it a bunch and talking to others, I'm actually
 leaning
   towards both option A and B.  Although this will be a little harder
 for
   implementors, it seems like there are solid reasons why some users
 would
   want to use A and solid reasons why others would want to use B.
   Any objections to us going that route?
  Not from me. If I don't hear objections I'll write up a spec draft
 and
  attach it here before committing to the spec.

  Option A is pretty well understood, I like that one.

 For option B, at some point we had a debate on whether when indexing an
 array value we should consider it a single key value or we should unfold it
 into multiple index records. The first option makes it very similar to A in
 that an array is just a composite value (it is quite a bit more painful to
 implement...), the second option is interesting in that it allows for new
 scenarios such as objects with an array for tags, where you want to look up
 by tag (even after doing options A and B as currently defined, in order to
 support multiple tags you'd need a second store that keeps the tags + key
 for the objects you want to tag). Is there any interest in that scenario?


  Yes.  Once we're settled on this, I'm going to send an email on that.
  :-)  Option b won't get in the way of my

Re: [IndexedDB] Two Real World Use-Cases

2011-03-08 Thread Keean Schupke
On 8 March 2011 06:33, Joran Greef jo...@ronomon.com wrote:

 On 08 Mar 2011, at 7:23 AM, Dean Landolt wrote:

  This doesn't seem right. Assuming your WebSQL implementation had all the
 same indexes isn't it doing pretty much the same things as using separate
 objectStores in IDB? Why would it be an order of magnitude slower? I'm sure
 whatever implementation you're using hasn't seen much optimization but you
 seem to be implying there's something more fundamental? The only thing I can
 think of to blame would be the fat in the objectStore interface -- like, for
 instance, the index building facilities. It seems to me your proposed
 solution is to add yet more fat to the interface (more complex indexing),
 but wouldn't it be just as suitable to instead strip down objectStores to
 their bare essentials to make them more suitable to act as indexes? Then the
 indexing functionality and all the hard decisions could be punted to
 libraries where they'd be free to innovate.

 Exactly. It's not what one would expect, and an indication of the poor state
 of the IDB implementation (which is essentially a wrapper around SQLite
 anyway).

 If someone is advising that object stores be used to handle indexes then
 may I be the first to raise a red flag and say that IDB is failing us (and
 it would have been better for the spec team to provide a locking mechanism
 for LocalStorage so it could be used in that way). The whole point of IDB as
 far as I can see is to provide transactional indexed access to a key value
 store.

  Why? You wouldn't necessarily have to store the whole object in each
 index, just the index key, a value and some pointer to the original source
 object. Something to resolve this pointer to the source would need to be
 spec'd (a la couchdb's include_docs), but that's simple. Even better, say it
 were possible to define a link relation on an object store that can resolve
 to its source object -- you could define a source link relation and the
 property to use -- and this would have the added bonus of being more broadly
 applicable than just linking an index record to its source instance.

 Think of the object creation and JSON serialization/deserialization
 overhead for putting 50 indexes and you have got more than enough waste
 there already.

  We can fix all of this right now very simply:
 
  1. Enable objectStore.put and objectStore.delete to accept a setIndexes
 option and an unsetIndexes option. The value passed for either option would
 be an array (string list) of index references.
 
  This would only work for indexes arrays of strings, right? Things can get
 much more complicated than that, and when they do you'd have to use an
 objectStore to do your indexing anyway, right?

 No it would work for pretty much anything. The application would be free to
 determine the indexes, and also to convert query parameters into indexes
 when querying. It's essentially computed indexes without the hassles of
 IDB trying to do it (there was an interesting thread last year on the
 challenges of storing an index-computing function in IDB).

  Why is it more theoretically performant than using objectStores in the
 raw?

 It's a more direct interface. Think about it for a second. Using
 objectStores in the raw is interpolating O(n) complexity with multiple
 function calls, to give just one reason. If IDB can receive a list of
 indexes to add and remove an object to and from, then it can also do things
 like perform a set difference first to save unnecessary IO. I have written a
 database or two with this technique and it's certainly faster.
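The set-difference point can be sketched as follows (illustrative only, using a modern JavaScript Set for brevity): given the index lists an object was in and should now be in, only the changed entries need any IO.

```javascript
// Given the indexes an object was in and the indexes it should now be in,
// compute which index entries actually need touching.
function diffIndexes(oldIndexes, newIndexes) {
  var oldSet = new Set(oldIndexes);
  var newSet = new Set(newIndexes);
  return {
    remove: oldIndexes.filter(function (i) { return !newSet.has(i); }),
    add: newIndexes.filter(function (i) { return !oldSet.has(i); })
  };
}

var delta = diffIndexes(['byName', 'byDate'], ['byName', 'byTag']);
// delta.remove is ['byDate'], delta.add is ['byTag']; 'byName' is untouched.
```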

  I don't necessarily understand the stateful vs. stateless distinction
 here. I don't see how your proposed solution removes the requirement for IDB
 to enforce constraints when certain indexes are present. Developers would
 already be able to use IDB statefully (with predefined schemas) -- they'd
 just use a library that has a schema mechanism. I doubt such a library for
 IDB already exists, but it'd be quite easy to port perstore, for instance,
 which is derived from the IDB API and already has this functionality using
 json-schema. There will no doubt be many ORM-like libraries that will pop up
 as soon as IDB starts to stabilize (or as soon as it gets a node.js
 implementation).

 The trouble is you always think a database would be quite easy until you
 actually try to do it yourself. At first when I dug into IDB I didn't think
 there would be any problems that could not be handled in some way. I have
 actually switched back to WebSQL now and will encourage my users to use
 Safari or Chrome as long as these browsers support WebSQL (and I hope Chrome
 will at least finish up by adding a quota interface for WebSQL). IDB right
 now is like a completely neutered slower SQLite without any of the benefits
 to be expected of a transactional indexed KV store. It's really sad.

 For examples of stateless databases see the interfaces for Redis (the best
 example, and a perfect 

Re: [IndexedDB] Two Real World Use-Cases

2011-03-08 Thread Keean Schupke
Actually, I am not sure if SQLite uses BDB (they might be moving to it,
though). However, BDB definitely has an SQLite-3.0-compatible API now and
supports better concurrency, as well as AES encryption. So at the moment it
looks like I'm moving to using BDB instead of SQLite (apart from when the
size of the app package file is an issue and SQLite is provided as part of
the platform).


Cheers,
Keean.



On 8 March 2011 17:54, Dean Landolt d...@deanlandolt.com wrote:



 On Tue, Mar 8, 2011 at 1:33 AM, Joran Greef jo...@ronomon.com wrote:

 On 08 Mar 2011, at 7:23 AM, Dean Landolt wrote:

  This doesn't seem right. Assuming your WebSQL implementation had all the
 same indexes isn't it doing pretty much the same things as using separate
 objectStores in IDB? Why would it be an order of magnitude slower? I'm sure
 whatever implementation you're using hasn't seen much optimization but you
 seem to be implying there's something more fundamental? The only thing I can
 think of to blame would be the fat in the objectStore interface -- like, for
 instance, the index building facilities. It seems to me your proposed
 solution is to add yet more fat to the interface (more complex indexing),
 but wouldn't it be just as suitable to instead strip down objectStores to
 their bare essentials to make them more suitable to act as indexes? Then the
 indexing functionality and all the hard decisions could be punted to
 libraries where they'd be free to innovate.

 Exactly. It's not what one would expect, and an indication of the poor state
 of the IDB implementation (which is essentially a wrapper around SQLite
 anyway).


 Which implementation? Why do you think it's a wrapper around SQLite? I
 doubt it could be implemented efficiently this way (due to its schema-free
 nature), so that would explain your benchmarks. But why would you judge the
 spec on one poor implementation?



 If someone is advising that object stores be used to handle indexes then
 may I be the first to raise a red flag and say that IDB is failing us (and
 it would have been better for the spec team to provide a locking mechanism
 for LocalStorage so it could be used in that way).


 This is hyperbole. The critical feature IDB gives us is efficient range
 retrieval -- try that with LocalStorage.


 The whole point of IDB as far as I can see is to provide transactional
 indexed access to a key value store.


 You say indexed, I say ordered. An objectStore is more than a kv store
 -- the keys are stored and traversed in order. This is the win, what makes
 IDB objectStores so special. This also makes them look an awful lot like
 indexes too!

 (Which reminds me: last time I checked collation is still up in the air --
 this could be very problematic for interop. Anyone know of any plans to
 correct this in the first version?)



  Why? You wouldn't necessarily have to store the whole object in each
 index, just the index key, a value and some pointer to the original source
 object. Something to resolve this pointer to the source would need to be
 spec'd (a la couchdb's include_docs), but that's simple. Even better, say it
 were possible to define a link relation on an object store that can resolve
 to its source object -- you could define a source link relation and the
 property to use -- and this would have the added bonus of being more broadly
 applicable than just linking an index record to its source instance.

 Think of the object creation and JSON serialization/deserialization
 overhead for putting 50 indexes and you have got more than enough waste
 there already.


 How does your proposal avoid this?



  We can fix all of this right now very simply:
 
  1. Enable objectStore.put and objectStore.delete to accept a setIndexes
 option and an unsetIndexes option. The value passed for either option would
 be an array (string list) of index references.
 
  This would only work for indexes arrays of strings, right? Things can
 get much more complicated than that, and when they do you'd have to use an
 objectStore to do your indexing anyway, right?

 No it would work for pretty much anything. The application would be free
 to determine the indexes, and also to convert query parameters into indexes
 when querying. It's essentially computed indexes without the hassles of
 IDB trying to do it (there was an interesting thread last year on the
 challenges of storing an index-computing function in IDB).

  Why is it more theoretically performant than using objectStores in the
 raw?

 It's a more direct interface. Think about it for a second. Using
 objectStores in the raw is interpolating O(n) complexity with multiple
 function calls, to give just one reason.


 Huh? If an objectStore is backed by something like a BDB btree, as is
 implied by the design of the spec, retrieval ought to be O(log_b n), where
 the base b reflects the average page fan-out. Writing would have O(n)
 complexity where n is the number of indexes, but the same is true for your
 proposal, right?


 If 

Re: [IndexedDB] Compound and multiple keys

2011-03-08 Thread Keean Schupke
No objections here.

Keean.


On 8 March 2011 21:14, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow jor...@chromium.org
 wrote:
 
  On Thu, Jan 20, 2011 at 6:29 PM, Tab Atkins Jr. jackalm...@gmail.com
  wrote:
 
  On Thu, Jan 20, 2011 at 10:12 AM, Keean Schupke ke...@fry-it.com
 wrote:
   Compound primary keys are commonly used afaik.
 
  Indeed.  It's one of the common themes in the debate between natural
  and synthetic keys.
 
  Fair enough.
  Should we allow explicit compound keys?  I.e. myOS.put({...}, ['first
  name', 'last name'])?  I feel pretty strongly that if we do, we should
  require this be specified up-front when creating the objectStore.  I.e.
 add
  some additional parameter to the optional options object.  Otherwise,
 we'll
  force implementations to handle variable compound keys for just this one
  case, which seems kind of silly.
  The other option is to just disallow them.
 
  After thinking about it a bunch and talking to others, I'm actually
 leaning
  towards both option A and B.  Although this will be a little harder for
  implementors, it seems like there are solid reasons why some users would
  want to use A and solid reasons why others would want to use B.
  Any objections to us going that route?

 Not from me. If I don't hear objections I'll write up a spec draft and
 attach it here before committing to the spec.

 / Jonas



Re: [IndexedDB] Two Real World Use-Cases

2011-03-03 Thread Keean Schupke
On 3 March 2011 09:15, Joran Greef jo...@ronomon.com wrote:

 Hi Jonas

 I have been trying out your suggestion of using a separate object store to
 do manual indexing (and so support compound indexes or index object
 properties with arrays as values).

 There are some problems with this approach:

 1. It's far too slow. To put an object and insert 50 index records (typical
 when updating an inverted index) this way takes 100ms using IDB versus 10ms
 using WebSQL (with a separate indexes table and compound primary key on
 index name and object key). For instance, my application has a real
 requirement to replicate 4,000,000 emails between client and server and I
 would not be prepared to accept latencies of 100ms to store each object.
 That's more than the network latency.

 2. It's a waste of space.

 Using a separate object store to do manual indexing may work in theory but
 it does not work in practice. I do not think it can even be remotely
 suggested as a panacea, however temporary it may be.

 We can fix all of this right now very simply:

 1. Enable objectStore.put and objectStore.delete to accept a setIndexes
 option and an unsetIndexes option. The value passed for either option would
 be an array (string list) of index references.

 2. The object would first be removed as a member from any indexes
 referenced by the unsetIndexes option. Any referenced indexes which would be
 empty thereafter would be removed.

 3. The object would then be added as a member to any indexes referenced by
 the setIndexes option. Any referenced indexes which do not yet exist would
 be created.
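The proposed semantics (steps 1-3 above) can be simulated with an in-memory model. This `put` is a stand-in for the suggested objectStore.put extension, not an existing IDB call; all names here are hypothetical:

```javascript
// In-memory sketch of the proposed put(key, value, {setIndexes, unsetIndexes}).
const store = { objects: new Map(), indexes: new Map() };

function put(key, value, opts = {}) {
  // Step 2: remove membership from each index in unsetIndexes;
  // any referenced index left empty afterwards is removed.
  for (const name of opts.unsetIndexes || []) {
    const idx = store.indexes.get(name);
    if (idx) {
      idx.delete(key);
      if (idx.size === 0) store.indexes.delete(name);
    }
  }
  // Step 3: add membership to each index in setIndexes,
  // creating any referenced index that does not yet exist.
  for (const name of opts.setIndexes || []) {
    if (!store.indexes.has(name)) store.indexes.set(name, new Set());
    store.indexes.get(name).add(key);
  }
  store.objects.set(key, value);
}
```

The application fully controls the index references, so nothing schema-like needs to be declared up front.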

 This would provide the much-needed indexing capabilities presently lacking
 in IDB without sacrificing performance.

 It would also enable developers to use IDB statefully (MySQL-like
 pre-defined schemas with the DB taking on the complexities of schema
 migration and data migration) or statelessly (See Berkeley DB with the
 application responsible for the complexities of data maintenance) rather
 than enforcing an assumption at such an early stage.

 Regards

 Joran Greef



Why would this be faster? Surely most of the time in inserting the 50
indexes is the search time of the index, and the JavaScript function call
overhead would be minimal (it's only 50 calls)?

Cheers,
Keean.


Re: [IndexedDB] Two Real World Use-Cases

2011-03-02 Thread Keean Schupke
If you are operating on indexes then you do not have a 'join' language, as
you are operating on sets. To have a join you need to be operating on
relations. A relation is commonly visualised as a row in a table in a
relational database. With IDB this would be the union of all the
property-sets of the objects in the index. A complete set of relational
operators would be:

project
restrict
rename
join
union
difference

In most useful syntaxes you don't need rename as the other methods handle
renaming attributes already. Join is traditionally a Cartesian-product, but
a natural-join can be substituted without losing completeness. Intersection
is not included as it is easily derived from union and
(symmetric) difference.
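As an illustration of the algebra being described (not a proposed IDB API), here is a minimal sketch of these operators over relations represented as plain arrays of records:

```javascript
// Relations as arrays of records (plain objects).
const project = (rel, attrs) =>
  rel.map((r) => Object.fromEntries(attrs.map((a) => [a, r[a]])));

const restrict = (rel, pred) => rel.filter(pred);

// Natural join: combine rows that agree on all shared attribute names.
const join = (r1, r2) => {
  const out = [];
  for (const a of r1) {
    for (const b of r2) {
      const shared = Object.keys(a).filter((k) => k in b);
      if (shared.every((k) => a[k] === b[k])) out.push({ ...a, ...b });
    }
  }
  return out;
};

// Canonical string key so records can be compared by value.
const key = (r) => JSON.stringify(Object.entries(r).sort());

const union = (r1, r2) => {
  const seen = new Map(r1.map((r) => [key(r), r]));
  for (const r of r2) seen.set(key(r), r);
  return [...seen.values()];
};

const difference = (r1, r2) => {
  const drop = new Set(r2.map(key));
  return r1.filter((r) => !drop.has(key(r)));
};
```

Rename is omitted, as the message notes: projection and join syntax can subsume attribute renaming, and intersection is derivable from union and difference.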


Cheers,
Keean.

On 2 March 2011 06:35, Joran Greef jo...@ronomon.com wrote:

 On 01 Mar 2011, at 7:27 PM, Jeremy Orlow wrote:

  1. Be able to put an object and pass an array of index names which must
 reference the object. This may remove the need for a complicated indexing
 spec (perhaps the reason why this issue has been pushed into the future) and
 give developers all the flexibility they need.
 
  You're talking about having multiple entries in a single index that point
 towards the same primary key?  If so, then I strongly agree, and I think
 others agree as well.  It's mostly a question of syntax.  A while ago we
 brainstormed a couple possibilities.  I'll try to send out a proposal this
 week.  I think this + compound keys should probably be our last v1 features
 though.  (Though they almost certainly won't make Chrome 11 or Firefox 4,
 unfortunately, hopefully they'll be done in the next version of each, and
 hopefully that release will be fairly soon after for both.)

 Yes, for example this user object { name: "Joran Greef", emails:
 ["jo...@ronomon.com", "jorangr...@gmail.com"] } with indexes on the emails
 property, would be found in the "jo...@ronomon.com" index as well as in
 the "jorangr...@gmail.com" index.

 What I've been thinking though is that the problem even with formally
 specifying indexes in advance of object put calls, is that this pushes too
 much application model logic into the database layer, making the database
 enforce a schema (at least in terms of indexes). Of course IDB facilitates
 migrations in the form of setVersion, but most schema migrations are also
 coupled with changes to the data itself, and this would still have to be
 done by the application in any event. So at the moment IDB takes too much
 responsibility on behalf of the application (computing indexes, pre-defined
 indexes, pseudo migrations) and not enough responsibility for pure database
 operations (index intersections and index unions).

 I would argue that things like migrations and schemas are best handled by
 the application, even if this is more work for the application, as most
 people will write wrappers for IDB in any event and IDB is supposed to be a
 core-level API. The acid-test must be that the database is oblivious to
 schemas or anything pre-defined or application-specific (i.e. stateless).
 Otherwise IDB risks being a database for newbies who wouldn't use it, and a
 database that others would treat as a KV anyway (see MySQL at FriendFeed).

 A suggested interface then for putting or deleting objects, would be:
 objectStore.put(object, ["indexname1", "indexname2", "indexname3"]) and then
 IDB would need to ensure that the object would be referenced by the given
 index names. When removing the object, the application would need to provide
 the indexes again (or IDB could keep track of the indexes associated with an
 object).

 Using a function to compute indexes would not work as this would entrap
 application-specific schema knowledge within the function (which would need
 to be persisted) and these may subsequently change in the application, which
 would then need a way to modify the function again. The key is that these
 things must be stateless.

 The objects must be opaque to IDB (no need for
 serialization/deserialization overhead at the DB layer). Things like
 key-paths etc. could be removed and the object id just passed in to put or
 delete calls.

  2. Be able to intersect and union indexes. This covers a tremendous
 amount of ground in terms of authorization and filtering.
 
  Our plan was to punt some sort of join language to v2.  Could you give a
 more concrete proposal for what we'd add?  It'd make it easier to see if
 it's something realistic for v1 or not.

 If you can perform intersect or union operations (and combinations of
 these) on indexes (which are essentially sets or sorted sets), then this
 would be the join language. It has the benefit that the interface would then
 be described in terms of operations on data structures (set operations on
 sets) rather than a custom language which would take longer to spec out.
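Because index entries come back sorted by key, intersection and union can each be done in a single merge pass, which is part of what makes set operations attractive as the "join language". A sketch over sorted arrays of primary keys (plain JavaScript, not an IDB API):

```javascript
// Intersect two sorted key arrays in O(n + m) with a merge pass.
function intersectSorted(a, b) {
  const out = [];
  let i = 0, j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) { out.push(a[i]); i++; j++; }
    else if (a[i] < b[j]) i++;
    else j++;
  }
  return out;
}

// Union of two sorted key arrays, keeping the result sorted and duplicate-free.
function unionSorted(a, b) {
  const out = [];
  let i = 0, j = 0;
  while (i < a.length || j < b.length) {
    if (j >= b.length || (i < a.length && a[i] < b[j])) out.push(a[i++]);
    else if (i >= a.length || b[j] < a[i]) out.push(b[j++]);
    else { out.push(a[i]); i++; j++; }
  }
  return out;
}
```

Combinations of these two operations over index result sets would cover the authorization and filtering cases described above without a custom query language.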

 I've written databases over append-only files, S3, WebSQL and even
 LocalStorage (!) and from what I've found with my own applications, you
 could handle 

Re: [IndexedDB] Two Real World Use-Cases

2011-03-02 Thread Keean Schupke
On 2 March 2011 11:31, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Mar 1, 2011 at 10:35 PM, Joran Greef jo...@ronomon.com wrote:
  On 01 Mar 2011, at 7:27 PM, Jeremy Orlow wrote:
 
  1. Be able to put an object and pass an array of index names which must
 reference the object. This may remove the need for a complicated indexing
 spec (perhaps the reason why this issue has been pushed into the future) and
 give developers all the flexibility they need.
 
  You're talking about having multiple entries in a single index that
 point towards the same primary key?  If so, then I strongly agree, and I
 think others agree as well.  It's mostly a question of syntax.  A while ago
 we brainstormed a couple possibilities.  I'll try to send out a proposal
 this week.  I think this + compound keys should probably be our last v1
 features though.  (Though they almost certainly won't make Chrome 11 or
 Firefox 4, unfortunately, hopefully they'll be done in the next version of
 each, and hopefully that release with be fairly soon after for both.)
 
  Yes, for example this user object { name: Joran Greef, emails: [
 jo...@ronomon.com, jorangr...@gmail.com] } with indexes on the emails
 property, would be found in the jo...@ronomon.com index as well as in
 the jorangr...@gmail.com index.
 
  What I've been thinking though is that the problem even with formally
 specifying indexes in advance of object put calls, is that this pushes too
 much application model logic into the database layer, making the database
 enforce a schema (at least in terms of indexes). Of course IDB facilitates
 migrations in the form of setVersion, but most schema migrations are also
 coupled with changes to the data itself, and this would still have to be
 done by the application in any event. So at the moment IDB takes too much
 responsibility on behalf of the application (computing indexes, pre-defined
 indexes, pseudo migrations) and not enough responsibility for pure database
 operations (index intersections and index unions).
 
  I would argue that things like migrations and schema's are best handled
 by the application, even if this is more work for the application, as most
 people will write wrappers for IDB in any event and IDB is supposed to be a
 core-level API. The acid-test must be that the database is oblivious to
 schemas or anything pre-defined or application-specific (i.e. stateless).
 Otherwise IDB risks being a database for newbies who wouldn't use it, and a
 database that others would treat as a KV anyway (see MySQL at FriendFeed).
 
  A suggested interface then for putting or deleting objects, would be:
 objectStore.put(object, [indexname1, indexname2, indexname3]) and then
 IDB would need to ensure that the object would be referenced by the given
 index names. When removing the object, the application would need to provide
 the indexes again (or IDB could keep track of the indexes associated with an
 object).
 
  Using a function to compute indexes would not work as this would entrap
 application-specific schema knowledge within the function (which would need
 to be persisted) and these may subsequently change in the application, which
 would then need a way to modify the function again. The key is that these
 things must be stateless.
 
  The objects must be opaque to IDB (no need for
 serialization/deserialization overhead at the DB layer). Things like
 key-paths etc. could be removed and the object id just passed in to put or
 delete calls.

 I agree that we are currently enforcing a bit of schema due to the way
 indexes work. However I think it's a good approach for an initial
 version of this API as it covers the most simple use cases. Note that
 the more complex use cases are still very possible by simply using a
 separate objectStore as an index and manually add/remove things there.

 I still believe that using a function, which is persisted in the
 database, is very doable. And yes, the function needs to be stateless
 and it needs to be possible to change the set of functions which
 manage the set of indexes associated with a given objectStore
 (probably by simply allowing indexes to be created and removed, which
 is already the case).

 / Jonas


I would recommend against storing functions in the database (not saying it
should not be possible, but stored procedures obscure functionality and
cause surprises, which are both bad things IMHO). For this kind of thing I
would create a master index from object-id to object, and then create
multiple secondary indexes from property to object-id. Removing an object is
simply removing it from the master index. You would avoid the slow scan of
the secondary indexes (slow because you have to visit each object to delete
by value) by simply leaving the entries there, they would be filtered out of
any results because the object-id is no longer in the master-index (a fast
lookup). You would then occasionally do a scan of the secondary indexes to
remove several dead references in one 
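The scheme described here (a master index from object-id to object, secondary indexes holding possibly-stale object-ids that reads filter against the master, plus an occasional sweep) can be sketched in memory; the function names are illustrative, not an IDB API:

```javascript
const master = new Map();   // object-id -> object (the master index)
const byEmail = new Map();  // email -> Set of object-ids (a secondary index)

function put(id, obj) {
  master.set(id, obj);
  for (const e of obj.emails || []) {
    if (!byEmail.has(e)) byEmail.set(e, new Set());
    byEmail.get(e).add(id);
  }
}

// Removal only touches the master index; secondary entries go stale.
function remove(id) { master.delete(id); }

// Reads filter out dead references via the fast master lookup.
function lookupByEmail(email) {
  const ids = byEmail.get(email) || new Set();
  return [...ids].filter((id) => master.has(id)).map((id) => master.get(id));
}

// Occasional sweep drops the accumulated dead references in one pass.
function compact() {
  for (const [email, ids] of byEmail) {
    for (const id of ids) if (!master.has(id)) ids.delete(id);
    if (ids.size === 0) byEmail.delete(email);
  }
}
```

This trades a little read-time filtering and some temporary garbage for avoiding a slow scan of every secondary index on each delete.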

Re: [IndexedDB] Two Real World Use-Cases

2011-03-02 Thread Keean Schupke
On 2 March 2011 12:09, Joran Greef jo...@ronomon.com wrote:

 On 02 Mar 2011, at 1:31 PM, Jonas Sicking wrote:

  I agree that we are currently enforcing a bit of schema due to the way
  indexes work. However I think it's a good approach for an initial
  version of this API as it covers the most simple use cases. Note that
  the more complex use cases are still very possible by simply using a
  separate objectStore as an index and manually add/remove things there.
 
  I still believe that using a function, which is persisted in the
  database, is very doable. And yes, the function needs to be stateless
  and it needs to be possible to change the set of functions which
  manage the set of indexes associated with a given objectStore
  (probably by simply allowing indexes to be created and removed, which
  is already the case).
 
  / Jonas

 Thank you Jonas, I'm using your multi objectStore trick at the moment to
 store indexes.

 It just seems that the most direct way of doing all of this, would just be
 to let the application pass in the relevant index references when it makes
 put or delete calls. IDB is almost becoming a Rube Goldberg device trying to
 find other ways of doing this.

 The reason I bring it up, is because I just made this same change with my
 server database, which used to require schema knowledge, so it could compute
 indexes etc., and then I realized this could all be eliminated completely by
 just passing indexes per put and delete call.

 I really don't think IDB should try and dip its toes into application
 state in the first place, let alone try and keep up with application state
 thereafter. What is the motivation for doing that? It's not absolutely
 necessary. It's an assumption that is bloating almost every part of the
 spec. It's not the killer feature of IDB, and it's getting in the way of
 things that could be, such as indexing and querying. If version 1 is done
 right, there will be no need for version 2. There's been a tremendous amount
 of discussion regarding IDB and people like yourself and Jeremy have
 certainly contributed massively, but I do get the feeling (as may you) that
 version 2 is becoming a stopover for things that have not been thought
 through completely, for which a solution is not yet clear, something's not
 right. I only say this from recently re-writing a database after making the
 same mistake.



Personally I think allowing multiple index entries for a single object
breaks referential transparency. I would have one index where objects are
indexed by a unique object ID, and another index where object-ids are
indexed by email-address. I suspect this is what you are doing now?

To improve on this situation (and keep referential transparency) would
require multiple indexes on a single object (so you can have a unique
primary key (object-id), and a secondary index on email-address), but as I
said earlier you are then well on the way to re-inventing a relational
database. IMHO you are then better off implementing relations properly,
rather than producing something not entirely unlike a relational
database.

Cheers,
Keean.


Re: IndexedDB: updates through cursors on indexes that change the key

2011-02-01 Thread Keean Schupke
Surely the cursor should be atomic, representing the instant in time the
query executed. Any updates or deletes etc would not be visible to the
cursor, only to later queries. Then you can allow any modifications
including to keys and indexes.
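The difference between the snapshot cursor suggested here and the live cursor IndexedDB actually specifies (see Ben Turner's reply below) can be shown with plain JavaScript: a snapshot copies the key list when the cursor opens, so later inserts are invisible to the iteration. This is an illustration, not IDB code:

```javascript
// Snapshot cursor: copies the key list when iteration starts, so inserts
// made during iteration are not visited; keys deleted meanwhile are skipped.
function* snapshotCursor(store) {
  for (const k of [...store.keys()]) {
    if (store.has(k)) yield [k, store.get(k)];
  }
}
```

A live cursor, by contrast, would visit the mid-iteration insert, which is the behavior the follow-up messages say IDB chose for consistency with run-to-completion semantics.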

Cheers,
Keean

On 2 Feb 2011 00:05, Jeremy Orlow jor...@chromium.org wrote:

On Tue, Feb 1, 2011 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote:



 On Tue, Feb 1, 2011 at 11:44 AM, Jeremy Orlow jor...@chromium.org wrote:
  On Tue, Feb 1, 2...
Good points (against having it remove the original key if it changes).

After some more thought: The original idea behind cursor.delete() and
cursor.update() was that they would basically just be aliases for
objectStore.delete() and objectStore.put().  Maybe calling .update() with a
changed primary key should simply have the same behavior as .put().  Thus
the value corresponding to the original key would be left unmodified and the
new key would then correspond to the new value.

I can't think of any examples where the current behavior would get in
someone's way though.  So I guess maybe we should just leave it as is.  But
I still hate the idea of it being subtly different from being a straight up
alias to put.

J


Re: IndexedDB: updates through cursors on indexes that change the key

2011-02-01 Thread Keean Schupke
That seems to be different from accepted practice in databases. I

On 2 Feb 2011 00:39, ben turner bent.mozi...@gmail.com wrote:

No, that idea was rejected a while ago. IndexedDB cursors are live, so
any change made during the cursor are visible to the cursor as well as
later queries.

-Ben Turner


On Tue, Feb 1, 2011 at 4:35 PM, Keean Schupke ke...@fry-it.com wrote:
 Surely the cursor should ...


Re: IndexedDB: updates through cursors on indexes that change the key

2011-02-01 Thread Keean Schupke
Sorry, sent that before I was finished.

Seems prone to problems in environments with multiple parallel accesses to
the same database.

I guess I would need to do an atomic copy of the elements to a separate
object store to iterate through? Is there a way of atomically copying a set
of objects?

Cheers,
Keean.

On 2 Feb 2011 00:41, Keean Schupke ke...@fry-it.com wrote:

That seems to be different from accepted practice in databases. I



 On 2 Feb 2011 00:39, ben turner bent.mozi...@gmail.com wrote:

 No, that idea was rejecte...



 On Tue, Feb 1, 2011 at 4:35 PM, Keean Schupke ke...@fry-it.com wrote:
 Surely the cursor should ...


Re: IndexedDB: updates through cursors on indexes that change the key

2011-02-01 Thread Keean Schupke
So what's the benefit of allowing a cursor to modify the data under it?

Cheers,
Keean.

On 2 February 2011 01:17, Jonas Sicking jo...@sicking.cc wrote:

 On Tue, Feb 1, 2011 at 4:48 PM, Keean Schupke ke...@fry-it.com wrote:
  Sorry, sent that before I was finished.
 
  Seems prone to problems in environments with multiple parallel accesses
 to
  the same database.

 As long as you're inside a transaction, no other environments (be they
 separate tabs running in a separate process, workers running in a
 separate thread, or separate components running in the same page) will
 be able to mutate the data under you.

 / Jonas



Re: IndexedDB: updates through cursors on indexes that change the key

2011-02-01 Thread Keean Schupke
I see. I suppose for the relational stuff that I am doing I will have to
copy all the data in the cursor, otherwise it will mess up updates and
inserts with nested selects.

Cheers,
Keean.

On 2 Feb 2011 01:32, Jeremy Orlow jor...@chromium.org wrote:

Please look at the mail archives.  IIRC, it seemed confusing that you could
be looking at old data.  Iterating on live data seems more consistent with
run-to-completion semantics.

J



On Tue, Feb 1, 2011 at 5:26 PM, Keean Schupke ke...@fry-it.com wrote:

 So whats the benefit o...


Re: [IndexedDB] Compound and multiple keys

2011-01-20 Thread Keean Schupke
Out of line keys (B) for me. You can have a key that is not an object
property that way... and you can include the key in the object optionally.
There is also no need to give the key fields in advance. These two things
together make this the best option IMHO.

Keean
 On 20 Jan 2011 10:52, Jeremy Orlow jor...@chromium.org wrote:
 Ok. So what's the resolution? Let's bug it!

 On Fri, Dec 10, 2010 at 12:34 PM, Jeremy Orlow jor...@chromium.org
wrote:

 Any other thoughts on this issue?


 On Thu, Dec 2, 2010 at 7:19 AM, Keean Schupke ke...@fry-it.com wrote:

 I think I prefer A. Declaring the keys in advance is starting to sound a
 little like a schema, and when you go down that route you end up at SQL
 schemas (which is a good thing in my opinion). I understand however that
 some people are not so comfortable with the idea of a schema, and these
 people seem to be the kind of people that like IndexedDB. So, although I
 prefer A for me, I would have to say B for IndexedDB.

 So in conclusion: I think B is the better choice for IndexedDB, as it is
 more consistent with the design of IDB.

 As for the cons of B, sorting an array is just like sorting a string,
 and it already supports string types.

 Surely there is also option C:

 store.add({firstName: "Benny", lastName: "Zysk", age: 28},
 ["firstName", "lastName"]);
 store.add({firstName: "Benny", lastName: "Andersson", age: 63},
 ["firstName", "lastName"]);

 Like A, but listing the properties to include in the composite index
 with each add, therefore avoiding the schema...


 As for layering the Relational API over the top, It doesn't make any
 difference, but I would prefer whichever has the best performance.


 Cheers,
 Keean.


 On 2 December 2010 00:57, Jonas Sicking jo...@sicking.cc wrote:

 Hi IndexedDB fans (yay!!),

 Problem description:

 One of the current shortcomings of IndexedDB is that it doesn't
 support compound indexes. I.e. indexing on more than one value. For
 example it's impossible to index on, and therefore efficiently search
 for, firstname and lastname in an objectStore which stores people. Or
 index on to-address and date sent in an objectStore holding emails.

 The way this is traditionally done is that multiple values are used as
 key for each individual entry in an index or objectStore. For example
 the CREATE INDEX statement in SQL can list multiple columns, and
 CREATE TABLE statement can list several columns as PRIMARY KEY.

 There have been a couple of suggestions how to do this in IndexedDB

 Option A)
 When specifying a key path in createObjectStore and createIndex, allow
 an array of key-paths to be specified. Such as

 store = db.createObjectStore("mystore", ["firstName", "lastName"]);
 store.add({firstName: "Benny", lastName: "Zysk", age: 28});
 store.add({firstName: "Benny", lastName: "Andersson", age: 63});
 store.add({firstName: "Charlie", lastName: "Brown", age: 8});

 The records are stored in the following order
 Benny, Andersson
 Benny, Zysk
 Charlie, Brown

 Similarly, createIndex accepts the same syntax:
 store.createIndex("myindex", ["lastName", "age"]);

 Option B)
 Allowing arrays as an additional data type for keys.
 store = db.createObjectStore("mystore", "fullName");
 store.add({fullName: ["Benny", "Zysk"], age: 28});
 store.add({fullName: ["Benny", "Andersson"], age: 63});
 store.add({fullName: ["Charlie", "Brown"], age: 8});

 Also allows out-of-line keys using:
 store = db.createObjectStore("mystore");
 store.add({age: 28}, ["Benny", "Zysk"]);
 store.add({age: 63}, ["Benny", "Andersson"]);
 store.add({age: 8}, ["Charlie", "Brown"]);

 (the sort order here is the same as in option A).

 Similarly, if an index used a keyPath which points to an
 array, this would create an entry in the index which used a compound
 key consisting of the values in the array.

 There are of course advantages and disadvantages with both options.

 Option A advantages:
 * Ensures that at objectStore/index creation time the number of keys
 are known. This allows the implementation to create and optimize the
 index using this information. This is especially useful in situations
 when the indexedDB implementation is backed by a SQL database which
 uses columns as a way to represent multiple keys.
 * Easy to use when key values appear as separate properties on the
 stored object.
 * Obvious how to sort entries.

 Option A disadvantages:
 * Doesn't allow compound out-of-line keys.
 * Requires multiple properties to be added to stored objects if the
 components of the key aren't available there (for example if the key is
 out-of-line or stored in an array).

 Option B advantages:
 * Allows compound out-of-line keys.
 * Easy to use when the key values are handled as an array by other
 code. Both when using in-line and out-of-line keys.
 * Maximum flexibility since you can combine single-value keys and
 compound keys in one objectStore, as well as arrays of different
 length (we couldn't come up with use cases for this though).

 Option B disadvantages:
 * Requires defining sorting between single values
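That sorting requirement can be pinned down with an explicit comparator. A sketch of one possible ordering (numbers before strings before arrays, arrays element-wise; a simplified subset of the type ordering, offered as an illustration rather than the spec's eventual definition):

```javascript
// Rank key types so cross-type comparisons are well-defined.
function typeRank(k) {
  if (typeof k === "number") return 0;
  if (typeof k === "string") return 1;
  if (Array.isArray(k)) return 2;
  throw new TypeError("unsupported key type");
}

// Compare two keys; arrays compare element-wise, shorter prefix first.
function cmpKeys(a, b) {
  const ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra !== 2) return a < b ? -1 : a > b ? 1 : 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    const c = cmpKeys(a[i], b[i]);
    if (c !== 0) return c;
  }
  return a.length - b.length;
}
```

Sorting the option B keys with this comparator reproduces the order given earlier in the message (Benny/Andersson, Benny/Zysk, Charlie/Brown).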

Re: [chromium-html5] LocalStorage inside Worker

2011-01-11 Thread Keean Schupke
I think the idea is that JavaScript should not do unexpected things. The
suggestion to only make local storage accessible from inside callbacks seems
the best suggestion so far.


Cheers,
Keean.


On 11 January 2011 06:20, Felix Halim felix.ha...@gmail.com wrote:

 On Tue, Jan 11, 2011 at 1:02 PM, Glenn Maynard gl...@zewt.org wrote:
  localStorage should focus on simplicity and performance and ignore
  thread safety since, IMHO, localStorage is used for UI purposes or
  preferences settings (not data itself). If you open two tab, you
  change settings in one tab, you can just refresh the other tab and I
  believe both of them will have the same UI state again.
 
  It's used for data storage, too, particularly since it's widely
  available in production; IndexedDB is not.

 Then, why don't introduce a new storage, like localStorageNTS (NTS =
 non thread safe), and allow this storage to be used everywhere...

 Felix Halim



Re: [chromium-html5] LocalStorage inside Worker

2011-01-11 Thread Keean Schupke
I think I already came to the same conclusion... JavaScript has no control
over effects, which devalues STM. In the absence of effect control, apparent
serialisation (of transactions) is the best you can do.

What we need is a purely functional JavaScript, it makes threading so much
easier ;-)


Cheers,
Keean.


On 10 January 2011 23:42, Robert O'Callahan rob...@ocallahan.org wrote:

 STM is not a panacea. Read
 http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx
 if you haven't already.

 In Haskell, where you have powerful control over effects, it may work well,
 but Javascript isn't anything like that.

 Rob
 --
 Now the Bereans were of more noble character than the Thessalonians, for
 they received the message with great eagerness and examined the Scriptures
 every day to see if what Paul said was true. [Acts 17:11]



Re: [IndexedDB] Events and requests

2011-01-11 Thread Keean Schupke
Comments inline:

On 11 January 2011 07:11, Axel Rauschmayer a...@rauschma.de wrote:

 Coming back to the initial message in this thread (at the very bottom):
 = General rule of thumb: clearly separate input data and output data.

 Using JavaScript dynamic nature, things could look as follows:

 indexedDB.open('AddressBook', 'Address Book', {
 success: function(evt) {
 },
 error: function(evt) {
 }
 });


Personally I prefer a single callback passed an object.

indexedDB.open('AddressBook', 'Address Book', function(event) {
switch(event.status) {
case EVENT_SUCCESS: 
break;
case EVENT_ERROR: 
break;
}
});

As it allows callbacks to be composed more easily.

- The last argument is thus the request and clearly input.

 - If multiple success handlers are needed, success could be an array of
 functions (same for error handlers).


Multiple handlers can be passed using a composition function:

// can be defined in the library
var all = function(flist) {
    return function(event) {
        for (var i = 0; i < flist.length; i++) {
            flist[i](event);
        }
    };
};

indexedDB.open('AddressBook', 'Address Book', all([fn1, fn2, fn3]));


Cheers,
Keean.



 - I would eliminate readyState and move abort() to IDBEvent (=output and
 an interface to the DB client).

 - With subclasses of IDBEvent one has the choice of eliminating them by
 making their fields additional parameters of success() and error().
 event.result is a prime candidate for this!

 - This above way eliminates the need of manipulating the request *after* (a
 reference to) it has been placed in the event queue.

 Questions:

 - Is it really necessary to make IDBEvent a subclass of Event and thus drag
 the DOM (which seems to be universally hated) into IndexedDB?

 - Are there any other asynchronous DB APIs for dynamic languages that one
 could learn from (especially from mistakes that they have made)? They must
 have design principles and rationales one might be able to use. WebDatabase
 (minus schema plus cursor) looks nice.

 On Jan 10, 2011, at 23:40 , Keean Schupke wrote:

 Hi,

 I did say it was for fun!  If you think it should be suggested somewhere I
 am happy to do so. Note that I renamed 'onsuccess' to 'bind' to show how it
 works as a monad; there is no need to do this (although I prefer it, to
 explicitly show it is a monad).

 The definition of unit is simply:

 var unit = function(v) {
 return {
 onsuccess: function(f) {f(v);}
 };
 };

 And then you can compose callbacks using 'onsuccess'...

 you might like to keep onsuccess, and use result instead of unit... So
 simply using the above definition you can compose callbacks:

 var y = db.transaction(["foo"]).objectStore("foo").getM("mykey1")
   .onsuccess(function(result1) {
     db.transaction(["foo"]).objectStore("foo").getM("mykey2")
       .onsuccess(function(result2) {
         result(result1 + result2);
       });
   });
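For completeness, here is a runnable sketch of the unit/bind pair being described, independent of IDB. The `getM` here is a synchronous stub standing in for the hypothetical async lookup, and `bind` is the composition the message calls 'onsuccess':

```javascript
// unit wraps a value in a "callback object".
var unit = function (v) {
  return { onsuccess: function (f) { f(v); } };
};

// bind(m, f): run m, feed its result to f, which returns a new callback
// object; the result is itself a callback object, so chains compose.
var bind = function (m, f) {
  return {
    onsuccess: function (g) {
      m.onsuccess(function (v) { f(v).onsuccess(g); });
    }
  };
};

// Stub for the hypothetical async lookup; a real getM would be asynchronous.
var getM = function (v) { return unit(v); };

// Compose two "lookups" and sum their results, as in the example above.
var y = bind(getM(1), function (result1) {
  return bind(getM(2), function (result2) {
    return unit(result1 + result2);
  });
});
```

The composed `y` is used exactly like the result of a single lookup: `y.onsuccess(function (r) { /* display r */ });`.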


 Cheers,
 Keean.


 On 10 January 2011 22:31, Jonas Sicking jo...@sicking.cc wrote:

 This seems like something better suggested to the lists at ECMA where
 javascript (or rather ECMAScript) is being standardized. I hardly
 think that a database API like indexedDB is the place to redefine how
 javascript should handle asynchronous programming.

 / Jonas

 On Mon, Jan 10, 2011 at 2:26 PM, Keean Schupke ke...@fry-it.com wrote:
  Just to correct my cut and paste error, that was of course supposed to
 be:
  var y = do {
  result1 <- db.transaction(["foo"]).objectStore("foo").getM("mykey1");
  result2 <- db.transaction(["foo"]).objectStore("foo").getM("mykey2");
  unit(result1 + result2);
  }
 
  Cheers,
  Keean.
  On 10 January 2011 22:24, Keean Schupke ke...@fry-it.com wrote:
 
  Okay, sorry, the original change seemed sensible, I guess I didn't see
 how
  you got from there to promises.
 
  Here's some fun to think about as an alternative though:
 
  Interestingly the pattern of multiple callbacks, providing each
 callback
  is passed zero or one parameter forms a Monad.
  So for example if 'unit' is the constructor for the object returned
 from
  get, then onsuccess is 'bind', and I can show that these obey the 3
 monad
  laws. Allowing composability of callbacks. So you effectively have:
  var x = db.transaction([foo]).objectStore(foo).getM(mykey);
  var y =
 
 db.transaction([foo]).objectStore(foo).getM(mykey1).bind(function(result1)
  {
 
 
  
 db.transaction([foo]).objectStore(foo).getM(mykey2).bind(function(result2)
  {
  unit(result1 + result2);
  });
  });
  The two objects returned x and y are both the same kind of object.
 y
  represents the sum or concatenation of the results of the lookups
 mykey1
  and mykey2. You would use it identically to using the result of a
 single
  lookup:
  x.bind(function(result) {... display the result of a single lookup
 ...});
  y.bind(function(result) {... display the result of both lookups ...});
 
  If we could then have some syntactic

Re: [IndexedDB] Events and requests

2011-01-11 Thread Keean Schupke
If one handler changes the state, who knows what will happen. I guess the
order in which handlers are called is significant. That's one advantage to
using a function like all to compose callbacks - it's very clear what order
they get called in. You could call it 'sequence' to make it even clearer
(that they are called one at a time, left to right, not in parallel).

You could make the callback an optional parameter: use it if supplied,
and return an object (for the existing API) if none is supplied.
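
As an illustration, here is a minimal sketch of the 'sequence'/'all' combinator idea from this thread (plain JavaScript; the names and the event shape are hypothetical, not part of any spec):

```javascript
// A minimal sketch of the 'sequence'/'all' combinator described above:
// it composes several handlers into one, calling them strictly
// left-to-right with the same event object.
function sequence(flist) {
  return function (event) {
    for (var i = 0; i < flist.length; i++) {
      flist[i](event);
    }
  };
}

// The call order is explicit in the array:
var order = [];
var handler = sequence([
  function (e) { order.push('first:' + e.type); },
  function (e) { order.push('second:' + e.type); }
]);
handler({ type: 'success' });
// order is now ['first:success', 'second:success']
```

The point of naming it 'sequence' is exactly this: the array literal makes the left-to-right call order visible at the call site.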


Cheers,
Keean.


On 11 January 2011 09:31, Axel Rauschmayer a...@rauschma.de wrote:

 Looks great, I just tried to stay as close to the current API as possible.

 A single handler should definitely be enough. Can, say, a cursor be read
 multiple times (if there are several success handlers)? Doesn’t that make
 things more complicated?

 On Jan 11, 2011, at 10:22 , Keean Schupke wrote:

 Comments inline:

 On 11 January 2011 07:11, Axel Rauschmayer a...@rauschma.de wrote:

 Coming back to the initial message in this thread (at the very bottom):
 = General rule of thumb: clearly separate input data and output data.

 Using JavaScript's dynamic nature, things could look as follows:

 indexedDB.open('AddressBook', 'Address Book', {
 success: function(evt) {
 },
 error: function(evt) {
 }
 });


 Personally I prefer a single callback passed an object.

 indexedDB.open('AddressBook', 'Address Book', function(event) {
     switch (event.status) {
     case EVENT_SUCCESS:
         break;
     case EVENT_ERROR:
         break;
     }
 });

 As it allows callbacks to be composed more easily.

 - The last argument is thus the request and clearly input.

 - If multiple success handlers are needed, success could be an array of
 functions (same for error handlers).


 multiple handlers can be passed using a composition function:

 // can be defined in the library
 var all = function(flist) {
    return function(event) {
        for (var i = 0; i < flist.length; i++) {
            flist[i](event);
        }
    };
 };

 indexedDB.open('AddressBook', 'Address Book', all([fn1, fn2, fn3]));


 Cheers,
 Keean.



 - I would eliminate readyState and move abort() to IDBEvent (=output and
 an interface to the DB client).

 - With subclasses of IDBEvent one has the choice of eliminating them by
 making their fields additional parameters of success() and error().
 event.result is a prime candidate for this!

 - The approach above eliminates the need to manipulate the request *after*
 (a reference to) it has been placed in the event queue.

 Questions:

 - Is it really necessary to make IDBEvent a subclass of Event and thus
 drag the DOM (which seems to be universally hated) into IndexedDB?

 - Are there any other asynchronous DB APIs for dynamic languages that one
 could learn from (especially from mistakes that they have made)? They must
 have design principles and rationales one might be able to use. WebDatabase
 (minus schema plus cursor) looks nice.

 On Jan 10, 2011, at 23:40 , Keean Schupke wrote:

 Hi,

 I did say it was for fun!  If you think it should be suggested somewhere I
 am happy to do so. Note that I renamed 'onsuccess' to 'bind' to show how it
 works as a monad; there is no need to do this (although I prefer it, to
 explicitly show it is a Monad).

 The definition of unit is simply:

 var unit = function(v) {
 return {
 onsuccess: function(f) {f(v);}
 };
 };

  And then you can compose callbacks using 'onsuccess'...

 you might like to keep onsuccess, and use result instead of unit... So
 simply using the above definition you can compose callbacks:

 var y =
 db.transaction([foo]).objectStore(foo).getM(mykey1).onsuccess(function(result1)
 {

  
 db.transaction([foo]).objectStore(foo).getM(mykey2).onsuccess(function(result2)
 {
 result(result1 + result2);
 });
 });


 Cheers,
 Keean.


 On 10 January 2011 22:31, Jonas Sicking jo...@sicking.cc wrote:

 This seems like something better suggested to the lists at ECMA where
 javascript (or rather ECMAScript) is being standardized. I hardly
 think that a database API like indexedDB is the place to redefine how
 javascript should handle asynchronous programming.

 / Jonas

 On Mon, Jan 10, 2011 at 2:26 PM, Keean Schupke ke...@fry-it.com wrote:
  Just to correct my cut and paste error, that was of course supposed to
 be:
  var y = do {
  result1 - db.transaction([foo]).objectStore(foo).getM(mykey1);
  result2 - db.transaction([foo]).objectStore(foo).getM(mykey2);
  unit(result1 + result2);
  }
 
  Cheers,
  Keean.
  On 10 January 2011 22:24, Keean Schupke ke...@fry-it.com wrote:
 
  Okay, sorry, the original change seemed sensible, I guess I didn't see
 how
  you got from there to promises.
 
  Here's some fun to think about as an alternative though:
 
  Interestingly the pattern of multiple callbacks, providing each
 callback
  is passed zero or one parameter forms a Monad.
  So for example

Re: [chromium-html5] LocalStorage inside Worker

2011-01-11 Thread Keean Schupke
Would each 'name' storage have its own thread to improve parallelism?


would:

withNamedStorage('x', function(store) {...});

make more sense from a naming point of view?


Cheers,
Keean.
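
The callback-scoped, serialised storage being discussed could be sketched roughly as follows (a toy, single-threaded sketch; 'withNamedStorage' and all internals here are hypothetical illustrations, not the proposed browser API, which dispatches callbacks asynchronously):

```javascript
// Each named store is reachable only inside a queued callback, so
// accesses are serialised: no two callbacks touch a store at once.
var stores = {};   // name -> storage object
var queues = {};   // name -> callbacks waiting to run
var running = {};  // name -> whether a callback is currently draining

function withNamedStorage(name, callback) {
  stores[name] = stores[name] || {};
  queues[name] = queues[name] || [];
  queues[name].push(callback);
  if (running[name]) return;   // the active drain loop will pick it up
  running[name] = true;
  while (queues[name].length) {
    // A real implementation would dispatch this asynchronously (or on a
    // dedicated storage thread); it is synchronous here to stay runnable.
    queues[name].shift()(stores[name]);
  }
  running[name] = false;
}

withNamedStorage('x', function (store) { store.counter = 1; });
withNamedStorage('x', function (store) { store.counter += 1; });
// stores.x.counter is now 2
```

Because the store object never escapes the callback, there is no way to read or write it outside the serialised region, which is the property that makes the model safe for workers.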


On 11 January 2011 20:58, Jonas Sicking jo...@sicking.cc wrote:

 With localStorage being the way it is, I personally don't think we can
 ever allow localStorage access in workers.

 However I do think we can and should provide access to a separate
 storage area (or several named storage areas) which can only be
 accessed from callbacks. On the main thread those callbacks would be
 asynchronous. In workers those callbacks can be either synchronous or
 asynchronous. Here is the API I'm proposing:

 getNamedStorage(in DOMString name, in Function callback);
 getNamedStorageSync(in DOMString name, in Function callback);

 The latter is only available in workers. The former is available in
 both workers and in windows. When the callback is called it's given a
 reference to the Storage object which has the exact same API as
 localStorage does.

 Also, you're not allowed to nest getNamedStorageSync and/or
 IDBDatabaseSync.transaction calls.

 This has the added advantage that it's much more implementable without
 threading hazards than localStorage already is.

 / Jonas

 On Tue, Jan 11, 2011 at 6:40 AM, Jeremy Orlow jor...@chromium.org wrote:
  So what's the plan for localStorage in workers?
  J
 
  On Tue, Jan 11, 2011 at 9:10 AM, Keean Schupke ke...@fry-it.com wrote:
 
  I think I already came to the same conclusion... JavaScript has no
 control
  over effects, which devalues STM. In the absence of effect control,
 apparent
  serialisation (of transactions) is the best you can do.
  What we need is a purely functional JavaScript, it makes threading so
 much
  easier ;-)
 
  Cheers,
  Keean.
 
  On 10 January 2011 23:42, Robert O'Callahan rob...@ocallahan.org
 wrote:
 
  STM is not a panacea. Read
 
 http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx
  if you haven't already.
 
  In Haskell, where you have powerful control over effects, it may work
  well, but Javascript isn't anything like that.
 
  Rob
  --
  Now the Bereans were of more noble character than the Thessalonians,
 for
  they received the message with great eagerness and examined the
 Scriptures
  every day to see if what Paul said was true. [Acts 17:11]
 
 
 



Re: [IndexedDB] Events and requests

2011-01-10 Thread Keean Schupke
Okay, sorry, the original change seemed sensible, I guess I didn't see how
you got from there to promises.


Here's some fun to think about as an alternative though:


Interestingly, the pattern of multiple callbacks, provided each callback is
passed zero or one parameters, forms a Monad.

So for example, if 'unit' is the constructor for the object returned from
get, then onsuccess is 'bind', and I can show that these obey the 3 monad
laws, allowing composability of callbacks. So you effectively have:

var x = db.transaction(["foo"]).objectStore("foo").getM("mykey");

var y =
db.transaction(["foo"]).objectStore("foo").getM("mykey1").bind(function(result1)
{

 db.transaction(["foo"]).objectStore("foo").getM("mykey2").bind(function(result2)
{
unit(result1 + result2);
});
});

The two objects returned, x and y, are both the same kind of object. y
represents the sum or concatenation of the results of the lookups mykey1
and mykey2. You would use it identically to using the result of a single
lookup:

x.bind(function(result) {... display the result of a single lookup ...});

y.bind(function(result) {... display the result of both lookups ...});


If we could then have some syntactic sugar for this like haskell's do
notation we could write:

var y = do {
db.transaction(["foo"]).objectStore("foo").getM("mykey1");
result1 <- db.transaction(["foo"]).objectStore("foo").getM("mykey2");
result2 <- db.transaction(["foo"]).objectStore("foo").getM("mykey2");
unit(result1 + result2);
}

Which would be a very neat way of chaining callbacks...
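
For what it's worth, the unit/bind pattern above can be sketched in plain JavaScript using a synchronous stand-in for the database, so it actually runs ('getM' and 'fakeStore' are hypothetical helpers for illustration; a real asynchronous bind would defer the call rather than invoke the function directly):

```javascript
// unit wraps a value; bind feeds it to the next computation.
function unit(v) {
  return { bind: function (f) { return f(v); } };
}

// Pretend database lookup: returns a monadic value for a key.
var fakeStore = { mykey1: 1, mykey2: 2 };
function getM(key) {
  return unit(fakeStore[key]);
}

// Composing two lookups, exactly as in the message above:
var y = getM('mykey1').bind(function (result1) {
  return getM('mykey2').bind(function (result2) {
    return unit(result1 + result2);
  });
});

y.bind(function (result) {
  // result is 3 here (1 + 2)
  console.log(result);
});
```

The nesting is the manual form of what a do-notation would flatten.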


Cheers,
Keean.


On 10 January 2011 22:00, Keean Schupke ke...@fry-it.com wrote:

 What's wrong with callbacks? To me this seems an unnecessary complication.

 Presumably you would do:

 var promise = db.transaction(["foo"]).objectStore("foo").get("mykey");
 var result = promise.get();
 if (!result) {
 promise.onsuccess(function(res) {...X...});
 } else {
 ...Y...
 }


 So you end up having to duplicate code at X and Y to do the same thing
 directly or in the context of a callback. Or you define a function to
 process the result:

 var f = function(res) {...X...};
 var promise = db.transaction(["foo"]).objectStore("foo").get("mykey");
 var result = promise.get();
 if (!result) {
 promise.onsuccess(f);
 } else {
 f(result)
 };

 But in which case what advantage does all this extra clutter offer over:

 db.transaction(["foo"]).objectStore("foo").get("mykey").onsuccess(function(res)
 {...X...});


 I am just wondering whether the change is worth the added complexity?
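
The request/promise object being debated could be sketched roughly like this (a hypothetical illustration of the pattern, not the IndexedDB spec; names like `makeRequest` and `_complete` are invented here):

```javascript
// A result can be read synchronously once available, or a callback
// can be registered for when it arrives - the two paths X and Y above.
function makeRequest() {
  var result;            // undefined until the operation completes
  var handlers = [];
  return {
    get: function () { return result; },
    onsuccess: function (f) {
      if (result !== undefined) { f(result); }  // already done: call now
      else { handlers.push(f); }                // not yet: call later
    },
    // invoked by the "database" when the operation completes
    _complete: function (r) {
      result = r;
      handlers.forEach(function (f) { f(r); });
      handlers = [];
    }
  };
}

var req = makeRequest();
var seen = [];
req.onsuccess(function (r) { seen.push(r); }); // registered before completion
req._complete(42);
req.onsuccess(function (r) { seen.push(r); }); // runs immediately
// seen is [42, 42]; req.get() is 42
```

Note the callback-only API avoids the duplicated X/Y code paths entirely, which is the objection being made above.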


 Cheers,
 Keean.


 On 10 January 2011 21:31, Jonas Sicking jo...@sicking.cc wrote:

 I did some outreach to developers and while I didn't get a lot of
 feedback, what I got was positive to this change.

  The basic use-case that was brought up was implementing promises
 which, as I understand it, works similar to the request model I'm
 proposing. I.e. you build up these promise objects which represent a
 result which may or may not have arrived yet. At some point you can
 either read the value out, or if it hasn't arrived yet, register a
 callback for when the value arrives.

 It was pointed out that this is still possible with how the spec is
 now, but it will probably result in that developers will come up with
 conventions to set the result on the request themselves. This wouldn't
 be terribly bad, but also seems nice if we can help them.

 / Jonas

 On Mon, Jan 10, 2011 at 8:13 AM, ben turner bent.mozi...@gmail.com
 wrote:
  FWIW Jonas' proposed changes have been implemented and will be
  included in Firefox 4 Beta 9, due out in a few days.
 
  -Ben
 
  On Fri, Dec 10, 2010 at 12:47 PM, Jonas Sicking jo...@sicking.cc
 wrote:
  I've been reaching out to get feedback, but no success yet. Will
 re-poke.
 
  / Jonas
 
  On Fri, Dec 10, 2010 at 4:33 AM, Jeremy Orlow jor...@chromium.org
 wrote:
  Any additional thoughts on this?  If no one else cares, then we can go
 with
  Jonas' proposal (and we should file a bug).
  J
 
  On Thu, Nov 11, 2010 at 12:06 PM, Jeremy Orlow jor...@chromium.org
 wrote:
 
  On Tue, Nov 9, 2010 at 11:35 AM, Jonas Sicking jo...@sicking.cc
 wrote:
 
  Hi All,
 
  One of the things we briefly discussed at the summit was that we
  should make IDBErrorEvents have a .transaction. This since we are
  allowing you to place new requests from within error handlers, but
 we
  currently provide no way to get from an error handler to any useful
  objects. Instead developers will have to use closures to get to the
  transaction or other object stores.
 
  Another thing that is somewhat strange is that we only make the
 result
  available through the success event. There is no way after that to
 get
   it from the request. So instead we use special event interfaces which
   supply access to source, transaction and result.
 
  Compare this to how XMLHttpRequests work. Here the result and error
  code is available on the request object itself. The 'load' event,
  which is equivalent to our 'success' event didn't supply any

Re: [IndexedDB] Events and requests

2011-01-10 Thread Keean Schupke
Just to correct my cut and paste error, that was of course supposed to be:

var y = do {
result1 <- db.transaction(["foo"]).objectStore("foo").getM("mykey1");
result2 <- db.transaction(["foo"]).objectStore("foo").getM("mykey2");
unit(result1 + result2);
}


Cheers,
Keean.

On 10 January 2011 22:24, Keean Schupke ke...@fry-it.com wrote:

 Okay, sorry, the original change seemed sensible, I guess I didn't see how
 you got from there to promises.


 Here's some fun to think about as an alternative though:


 Interestingly the pattern of multiple callbacks, providing each callback is
 passed zero or one parameter forms a Monad.

  So for example, if 'unit' is the constructor for the object returned from
  get, then onsuccess is 'bind', and I can show that these obey the 3 monad
  laws, allowing composability of callbacks. So you effectively have:

 var x = db.transaction([foo]).objectStore(foo).getM(mykey);

 var y =
 db.transaction([foo]).objectStore(foo).getM(mykey1).bind(function(result1)
 {

  
 db.transaction([foo]).objectStore(foo).getM(mykey2).bind(function(result2)
 {
 unit(result1 + result2);
 });
 });

 The two objects returned x and y are both the same kind of object. y
 represents the sum or concatenation of the results of the lookups mykey1
 and mykey2. You would use it identically to using the result of a single
 lookup:

 x.bind(function(result) {... display the result of a single lookup ...});

 y.bind(function(result) {... display the result of both lookups ...});


 If we could then have some syntactic sugar for this like haskell's do
 notation we could write:

 var y = do {
 db.transaction(["foo"]).objectStore("foo").getM("mykey1");
 result1 <- db.transaction(["foo"]).objectStore("foo").getM("mykey2");
 result2 <- db.transaction(["foo"]).objectStore("foo").getM("mykey2");
 unit(result1 + result2);
 }

 Which would be a very neat way of chaining callbacks...


 Cheers,
 Keean.


 On 10 January 2011 22:00, Keean Schupke ke...@fry-it.com wrote:

 What's wrong with callbacks? To me this seems an unnecessary complication.

 Presumably you would do:

 var promise = db.transaction([foo]).objectStore(foo).get(mykey);
 var result = promise.get();
 if (!result) {
 promise.onsuccess(function(res) {...X...});
 } else {
 ...Y...
 }


 So you end up having to duplicate code at X and Y to do the same thing
 directly or in the context of a callback. Or you define a function to
 process the result:

 var f = function(res) {...X...};
 var promise = db.transaction([foo]).objectStore(foo).get(mykey);
 var result = promise.get();
 if (!result) {
  promise.onsuccess(f);
 } else {
 f(result)
 };

 But in which case what advantage does all this extra clutter offer over:

 db.transaction([foo]).objectStore(foo).get(mykey).onsuccess(function(res)
 {...X...});


 I am just wondering whether the change is worth the added complexity?


 Cheers,
 Keean.


 On 10 January 2011 21:31, Jonas Sicking jo...@sicking.cc wrote:

 I did some outreach to developers and while I didn't get a lot of
 feedback, what I got was positive to this change.

  The basic use-case that was brought up was implementing promises
 which, as I understand it, works similar to the request model I'm
 proposing. I.e. you build up these promise objects which represent a
 result which may or may not have arrived yet. At some point you can
 either read the value out, or if it hasn't arrived yet, register a
 callback for when the value arrives.

 It was pointed out that this is still possible with how the spec is
 now, but it will probably result in that developers will come up with
 conventions to set the result on the request themselves. This wouldn't
 be terribly bad, but also seems nice if we can help them.

 / Jonas

 On Mon, Jan 10, 2011 at 8:13 AM, ben turner bent.mozi...@gmail.com
 wrote:
  FWIW Jonas' proposed changes have been implemented and will be
  included in Firefox 4 Beta 9, due out in a few days.
 
  -Ben
 
  On Fri, Dec 10, 2010 at 12:47 PM, Jonas Sicking jo...@sicking.cc
 wrote:
  I've been reaching out to get feedback, but no success yet. Will
 re-poke.
 
  / Jonas
 
  On Fri, Dec 10, 2010 at 4:33 AM, Jeremy Orlow jor...@chromium.org
 wrote:
  Any additional thoughts on this?  If no one else cares, then we can
 go with
  Jonas' proposal (and we should file a bug).
  J
 
  On Thu, Nov 11, 2010 at 12:06 PM, Jeremy Orlow jor...@chromium.org
 wrote:
 
  On Tue, Nov 9, 2010 at 11:35 AM, Jonas Sicking jo...@sicking.cc
 wrote:
 
  Hi All,
 
  One of the things we briefly discussed at the summit was that we
  should make IDBErrorEvents have a .transaction. This since we are
  allowing you to place new requests from within error handlers, but
 we
  currently provide no way to get from an error handler to any useful
  objects. Instead developers will have to use closures to get to the
  transaction or other object stores.
 
  Another thing that is somewhat strange is that we only make the
 result
  available through the success

Re: [chromium-html5] LocalStorage inside Worker

2011-01-08 Thread Keean Schupke
On 8 January 2011 00:57, Glenn Maynard gl...@zewt.org wrote:

  On Thu, Jan 6, 2011 at 6:06 PM, Charles Pritchardch...@jumis.com
  wrote:
  I don't think localStorage should be (to web workers), but
 sessionStorage
  seems
  a reasonable request.

  It's not arbitrary: the names local and session convey some meaning.
  localStorage works well enough, out in the wild. sessionStorage is not in
  wide use.
 
  I don't think it's restrictive, it just creates a wider implementation
  divide between session and local.

 What I meant was: you said that you don't think localStorage should be
 available to workers, but I don't understand why.  Why should
 sessionStorage be available, but localStorage not?

 --
 Glenn Maynard


There is also the issue that current localStorage implementations may be
broken by multiple tabs/windows. To say it works well enough in the wild
seems to ignore this brokenness.

If access had to be from inside an atomic block (a callback from a single
storage-thread) then this would fix access from multiple tabs/windows as
well as from worker threads.

This could be implemented as a single threaded callback serialising access
to the storage, but implementers could choose to use Software Transactional
Memory techniques to give their browser a speed advantage.


Cheers,
Keean.


Re: Limited DOM in Web Workers

2011-01-08 Thread Keean Schupke
Hi, sorry for this small aside, but it is (slightly) relevant.

What do you suggest people use instead of e4x in general. For example:

var x = <table><tr><td>something</td></tr></table>;

Is a lot more elegant than:

var x2 = document.createTextNode('something');
var x1 = document.createElement('td');
x1.appendChild(x2);
var x0 = document.createElement('tr');
x0.appendChild(x1);
var x = document.createElement('table');
x.appendChild(x0);
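
One middle ground (a sketch, not a standard API; `build` is a name invented here) is a small helper that builds a DOM fragment from a nested-array description, recovering most of the brevity of the e4x literal:

```javascript
// Build a DOM subtree from a spec: a string becomes a text node,
// an array ['tag', child, child, ...] becomes an element.
function build(doc, spec) {
  if (typeof spec === 'string') return doc.createTextNode(spec);
  var el = doc.createElement(spec[0]);
  for (var i = 1; i < spec.length; i++) {
    el.appendChild(build(doc, spec[i]));
  }
  return el;
}

// The six-statement example above collapses to one line:
// var x = build(document, ['table', ['tr', ['td', 'something']]]);
```

Unlike e4x this needs no language extension, and unlike the hidden-node trick it never touches the rendered document.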

The only thing I can think of is having the table attached to the document
but hidden, and then copying the HTML fragment:

var x = document.getElementById('hiddentable').cloneNode(true);

But how do you ensure the renderer and DOM traversal ignore the hidden
node? In an HTML5 app with multiple UI elements that need to be on screen
at different times it could slow things down a lot.


Cheers,
Keean.


On 8 January 2011 09:09, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Jan 7, 2011 at 7:34 PM, Boris Zbarsky bzbar...@mit.edu wrote:
  On 1/7/11 2:29 PM, Jack Coulter wrote:
 
  I'm not talking about allowing Worker's to manipulate the main DOM tree
 of
  the page, but rather, exposing DOMParser, and
 XMLHttpRequest.responseXML,
  and a few other objects to workers, to allow the manipulation of DOM
 trees
  which are never actually rendered to the page.
 
  Whether they're rendered doesn't necessarily matter if the DOM
  implementation is not threadsafe (which it's not, in today's UAs).  That
  said...
 
  This would allow developers to parse and manipulate XML in workers,
  freeing
  the main thread of a page to perform other tasks.
 
  ...
 
  An example of a use-case, I'd like to hack on the Strope.js XMPP
  implementation to allow it to run in a worker thread, currently this is
  impossible, without writing my own XML parser, which would undoubtedly
  be slower than the native DOMParser)
 
  If you think you could do this with your own XML parser, is there a
 reason
  you can't do it with e4x (I never thought I'd say that, but this seems
 like
  an actually good use case for something like e4x)?  That should work fine
 in
  workers in Gecko-based browsers that support it, and doesn't drag in the
  entire DOM implementation.
 
  That leaves the problem of convincing developers of those ECMAScript
  implementations that don't support e4x to support it, of course; while
  things like http://code.google.com/p/v8/issues/detail?id=235#c42 don't
  necessarily fill me with hope in that regard it may still be simpler than
  convincing all browsers to rewrite their DOMs to be threadsafe in the way
  that would be needed to support exposing an actual DOM in workers.

  I would strongly advise against using e4x. It seems unlikely to be picked up
 by other browsers, and I'm still hoping that we'll remove support from
 gecko before long.

 My question is instead, what part of the DOM is it that you want? One
 of the most important features of the DOM is modifying what is being
 displayed to the user. Obviously that isn't the features requested
 here. Another important feature is simply holding a tree structure.
 However plain javascript objects do that very well (better than the
 DOM in many ways).

 Other features of the DOM include form handling, parsing attribute
 values in the form of integers, floats, comma-separated lists, etc,
 URL resolving and more. Much of this doesn't seem very interesting to
 do on workers, or at least important to have the browser provide an
 implementation for in workers.

 Hence I'm asking, why specifically would you like to access a DOM from
 workers?

 / Jonas

 / Jonas




Re: [chromium-html5] LocalStorage inside Worker

2011-01-08 Thread Keean Schupke
On 8 January 2011 10:00, Glenn Maynard gl...@zewt.org wrote:

 On Sat, Jan 8, 2011 at 4:06 AM, Keean Schupke ke...@fry-it.com wrote:
  If access had to be from inside an atomic block (a callback from a
 single
  storage-thread) then this would fix access from multiple tabs/windows as
  well as from worker threads.

 Your suggestion and Jonas's are very similar.  I think the difference
 is that you're suggesting an API that would permit non-serialized
 access to the objects, by using transactional methods, where Jonas's
 completely serializes access.  Jonas's is much simpler; I don't think
 the complexity of this type of transactional access is needed, or
 appropriate for simple Storage objects.

 --
 Glenn Maynard


I am suggesting that, as the semantics are the same, people can think of
this like serialised access, but implementers can use STMs to make their
browser faster than the competition (if they want). To the user it will
look the same.

Cheers,
Keean.


Re: [chromium-html5] LocalStorage inside Worker

2011-01-07 Thread Keean Schupke


 So long as you only allow asynchronous access the implementation can
 ensure that a worker and the main thread doesn't have access to the
 storage at the same time. Then it is safe to allow everyone to modify
 the storage area.

 / Jonas


This is true; serialising access would have the same semantics as STM.
In fact, you could consider STM to be a performance enhancement to sequential
access by optimistically allowing concurrent modifications and only doing
something special if there is a collision (a read from a location written by
another thread during the transaction). In which case STM works like a
database and rolls back the transaction. It is really putting a thread local
log between the user and the storage. The main storage is then only locked
during the log commit, reducing resource contention. A rollback is simply
discarding the log.

But this would behave identically (apart from the extra features in STM like
guards and retry) to serialisation of requests.
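
For illustration only, here is a toy sketch of that log-based scheme (all names hypothetical; in single-threaded JavaScript the collision/retry path can never actually trigger, so this only demonstrates the read-set/write-log bookkeeping):

```javascript
// Versioned store: data holds values, versions counts commits per key.
function makeStore() {
  return { data: {}, versions: {} };
}

function runTransaction(store, body) {
  for (;;) {
    var readSet = {};   // key -> version observed during the transaction
    var writeLog = {};  // key -> new value (the thread-local log)
    var tx = {
      get: function (k) {
        if (k in writeLog) return writeLog[k];  // read our own writes
        readSet[k] = store.versions[k] || 0;
        return store.data[k];
      },
      set: function (k, v) { writeLog[k] = v; }
    };
    body(tx);
    // Commit: valid only if nothing we read was written meanwhile.
    var ok = Object.keys(readSet).every(function (k) {
      return (store.versions[k] || 0) === readSet[k];
    });
    if (ok) {
      Object.keys(writeLog).forEach(function (k) {
        store.data[k] = writeLog[k];
        store.versions[k] = (store.versions[k] || 0) + 1;
      });
      return;
    }
    // Collision: discard the log and retry (the 'rollback' above).
  }
}

var s = makeStore();
runTransaction(s, function (tx) { tx.set('n', 1); });
runTransaction(s, function (tx) { tx.set('n', tx.get('n') + 1); });
// s.data.n is now 2
```

The store is only touched inside the commit step, which is the "main storage is then only locked during the log commit" point above.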

A simple (non STM) implementation would be to have a single thread
associated with the localStorage and require all accesses to be executed by
that thread (in callbacks). You could use the main UI thread, but it would
make worker threads wait for storage access during DOM processing in
callbacks etc...


Cheers,
Keean.


Re: [chromium-html5] LocalStorage inside Worker

2011-01-07 Thread Keean Schupke



 Race conditions still happen if you (jarringly) forgot to wrap your
 shared object inside atomic block :P. So, maybe it's a good idea to
 only allow localStorage to be accessed inside an atomic block (even in
 workers)?



Yes, that was in my original suggestion.

atomic(function(shared) {...});

The callback-scoped variable 'shared' is the only way to access the shared
namespace.


Cheers,
Keean.


Re: [chromium-html5] LocalStorage inside Worker

2011-01-06 Thread Keean Schupke
There is always Software Transactional Memory that provides a safe model for
memory shared between threads.

http://en.wikipedia.org/wiki/Software_transactional_memory

This has been used very successfully in Haskell for overcoming threading /
state issues. Combined with Haskells Channels (message queues) it provides
for very elegant multi-threading.


Cheers,
Keean.


On 6 January 2011 22:44, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Jan 6, 2011 at 2:25 PM, João Eiras joao.ei...@gmail.com wrote:
  On , Jonas Sicking jo...@sicking.cc wrote:
 
  On Thu, Jan 6, 2011 at 12:01 PM, Jeremy Orlow jor...@chromium.org
 wrote:
 
  public-webapps is probably the better place for this email
 
  On Sat, Jan 1, 2011 at 4:22 AM, Felix Halim felix.ha...@gmail.com
  wrote:
 
  I know this has been discussed  1 year ago:
 
  http://www.mail-archive.com/whatwg@lists.whatwg.org/msg14087.html
 
  I couldn't find the follow up, so I guess localStorage is still
  inaccessible from Workers?
 
  Yes.
 
 
  I have one other option aside from what mentioned by Jeremy:
 
  http://www.mail-archive.com/whatwg@lists.whatwg.org/msg14075.html
 
  5: Why not make localStorage accessible from the Workers as read
 only
  ?
 
  The use case is as following:
 
  First, the user in the main window page (who has read/write access to
  localStorage), dumps a big data to localStorage. Once all data has
  been set, then the main page spawns Workers. These workers read the
  data from localStorage, process it, and returns via message passing
  (as they cannot alter the localStorage value).
 
  What are the benefits?
  1. No lock, no deadlock, no data race, fast, and efficient (see #2
  below).
  2. You only set the data once, read by many Worker threads (as opposed
  to give the big data again and again from the main page to each of the
  Workers via message).
  3. It is very easy to use compared to using IndexedDB (i'm the big
  proponent in localStorage).
 
  Note: I was not following the discussion on the spec, and I don't know
  if my proposal has been discussed before? or is too late to change
  now?
 
  I don't think it's too late or has had much discussion any time
 recently.
   It's probably worth re-exploring.
 
  Unfortunately this is not possible. Since localStorage is
  synchronously accessed, if we allowed workers to access it that would
  mean that we no longer have a shared-nothing-message-passing threading
  model. Instead we'd have a shared memory threading model which would
  require locks, mutexes, etc.
 
  Making it readonly unfortunately doesn't help. Consider worker code
 like:
 
   var x = 0;
   if (localStorage.foo < 10) {
    x += localStorage.foo;
   }
 
  would you expect x ever being something other than 0 or 1?
 
 
  Not different from two different tabs/windows running the same code. So
 the
  same solution for that case would work for Workers.
 
  Making the API async would make it more hard to use, which is, I believe,
  one of the design goals of localStorage: to be simple.

 Exposing the web platform to shared memory multithreading is the exact
 opposite of simple.

  If two consecutive reads of the same localStorage value can yield
 different
  values, then that's something that developers have to cope with. If they
 do
  code that is sensible to that issue, then they can take a snapshot of the
  storage object, and apply it back later.

 Multithreaded shared memory programming is extremely complex.
 Multithreaded shared memory programming without the use of locks is
 beyond what I'd ever want to expose anyone to. Much less web
 developers.

 We've been down this discussion before. Please read the threads on why
 workers were designed as a shared-nothing message passing model rather
 than a pthreads or similar model.

 / Jonas




Re: [chromium-html5] LocalStorage inside Worker

2011-01-06 Thread Keean Schupke
Did you see section 7 in the link I posted?

7 Implementations
7.1 C/C++
7.2 C#
7.3 Common Lisp
7.4 Haskell
7.5 Java
7.6 OCaml
7.7 Perl
7.8 Python
7.9 Scala
7.10 Smalltalk

JavaScript, as a functional language (first-class functions, closures,
anonymous functions), has a lot in common with Haskell and other functional
languages (Lisp)... although, as you can see, there are plenty of OO
implementations too.


Cheers,
Keean.


2011/1/6 Jonas Sicking jo...@sicking.cc

 2011/1/6 Keean Schupke ke...@fry-it.com:
  There is always Software Transactional Memory that provides a safe model
 for
  memory shared between threads.
  http://en.wikipedia.org/wiki/Software_transactional_memory
  This has been used very successfully in Haskell for overcoming threading
 /
  state issues. Combined with Haskells Channels (message queues) it
 provides
  for very elegant multi-threading.

 Can you provide a link to the Haskell API which you think has been
 working well for haskell. Or even better, considering that haskell is
 a vastly different language from javascript, could you propose a
 javascript API based on Software Transactional Memory.

 / Jonas



Re: [chromium-html5] LocalStorage inside Worker

2011-01-06 Thread Keean Schupke
Here's a link to some papers on STM:

http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/

A simple example:

http://www.haskell.org/haskellwiki/Simple_STM_example

Here's a tutorial:

http://book.realworldhaskell.org/read/software-transactional-memory.html

Here's a link to the docs:

http://hackage.haskell.org/package/stm


Cheers,
Keean.


2011/1/6 Keean Schupke ke...@fry-it.com

 Did you see section 7 in the link I posted?

 7 Implementations
 7.1 C/C++
 7.2 C#
 7.3 Common Lisp
 7.4 Haskell
 7.5 Java
 7.6 OCaml
 7.7 Perl
 7.8 Python
 7.9 Scala
 7.10 Smalltalk

 JavaScript as a functional language (first class functions, closures,
 anonymous functions) has a lot in common with Haskell and other functional
 languages (Lisp)... Although as you can see there are plenty of OO
 implementations too.


 Cheers,
 Keean.


 2011/1/6 Jonas Sicking jo...@sicking.cc

 2011/1/6 Keean Schupke ke...@fry-it.com:
  There is always Software Transactional Memory that provides a safe model
 for
  memory shared between threads.
  http://en.wikipedia.org/wiki/Software_transactional_memory
  This has been used very successfully in Haskell for overcoming threading
 /
  state issues. Combined with Haskells Channels (message queues) it
 provides
  for very elegant multi-threading.

 Can you provide a link to the Haskell API which you think has been
 working well for haskell. Or even better, considering that haskell is
 a vastly different language from javascript, could you propose a
 javascript API based on Software Transactional Memory.

 / Jonas





Re: [chromium-html5] LocalStorage inside Worker

2011-01-06 Thread Keean Schupke
Applying this to JavaScript (ignoring local storage and just implementing an
STM) would come up with something like:

1) Objects from one thread should not be visible to another. A global
variable 'test' defined in the UI thread or any worker thread should not be
in scope in any other worker thread.

2) Shared objects could be accessed only through the 'atomic' method
(implemented natively).

atomic(function(shared) {
    shared.x += 1;
    shared.y -= 2;
});

Here, the callback is the transaction, and 'shared' is the shared
namespace... That's all you need for a basic implementation. The clever
stuff is all hidden from the user.

We could implement retry by returning true... the guard could just be a
boolean function too:

atomic(function(shared) {
    if (queueSize > 0) {
        // remove item from queue and use it
        return false; // no retry
    } else {
        return true; // retry
    }
});

That's pretty much the entire user-visible API that would be needed. Of
course the implementation behind the scenes is more complex.
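As an illustration only (not the native implementation the post envisages), the user-visible shape of `atomic` with retry-by-return-value can be approximated in a single-threaded host by running the callback on a transaction-local copy and committing only on success; `sharedState` here is a stand-in for the engine-managed shared namespace:

```javascript
// Single-threaded approximation of the proposed atomic() API.
// The callback sees a transaction-local copy; returning true means
// "retry" (modelled here as abort: the shared state is left untouched).
const sharedState = { x: 0, y: 0 };

function atomic(body) {
  const working = Object.assign({}, sharedState); // transaction-local copy
  const retry = body(working);
  if (!retry) {
    Object.assign(sharedState, working); // commit on success
  }
  return !retry; // true if the transaction committed
}

// The example from the post: both updates commit together or not at all.
const committed = atomic(function (shared) {
  shared.x += 1;
  shared.y -= 2;
  return false; // no retry
});
```

A real STM would block and re-run a retried transaction when the shared state changes; here retry simply aborts, which is enough to show the API shape.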


Cheers,
Keean.

2011/1/6 Keean Schupke ke...@fry-it.com

 Here's a link to some papers on STM:

 http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/

 A simple example:

 http://www.haskell.org/haskellwiki/Simple_STM_example

 Here's a tutorial:

 http://book.realworldhaskell.org/read/software-transactional-memory.html

 Here's a link to the docs:

 http://hackage.haskell.org/package/stm


 Cheers,
 Keean.


 2011/1/6 Keean Schupke ke...@fry-it.com

 Did you see section 7 in the link I posted?

 7 Implementations
 7.1 C/C++
 7.2 C#
 7.3 Common Lisp
 7.4 Haskell
 7.5 Java
 7.6 OCaml
 7.7 Perl
 7.8 Python
 7.9 Scala
 7.10 Smalltalk

 JavaScript as a functional language (first class functions, closures,
 anonymous functions) has a lot in common with Haskell and other functional
 languages (Lisp)... Although as you can see there are plenty of OO
 implementations too.


 Cheers,
 Keean.


 2011/1/6 Jonas Sicking jo...@sicking.cc

  2011/1/6 Keean Schupke ke...@fry-it.com:
  There is always Software Transactional Memory that provides a safe
 model for
  memory shared between threads.
  http://en.wikipedia.org/wiki/Software_transactional_memory
  This has been used very successfully in Haskell for overcoming
 threading /
  state issues. Combined with Haskells Channels (message queues) it
 provides
  for very elegant multi-threading.

 Can you provide a link to the Haskell API which you think has been
 working well for haskell. Or even better, considering that haskell is
 a vastly different language from javascript, could you propose a
 javascript API based on Software Transactional Memory.

 / Jonas






Re: [IndexedDB] Why rely on run-to-completion?

2010-12-30 Thread Keean Schupke
This is very similar


window.indexedDB.open(..., {
onsuccess: function(event) { ... };
});


Except it requires an extra level of indenting for the callback definitions.
In both this and the current implementation there is the additional overhead
of an object creation for every call, when compared to simply having plain
function arguments:


window.indexedDB.open(..., function(event) { ... });


However, just passing the callback as a function argument does not make it
as clear what is happening when reading the code.



There may be a point in making this more generic. As there is only a single
thread, there is no way any callback code can be executed before the current
function returns. Consider:


var f = function() {
    setTimeout(function g() {...}, 1000);
    while(true) {};
};


In this code 'g' will never get called... how can it be, when the single
thread is busy in the while loop? Technically this could be possible if the
interpreter implemented interrupts and continuations, so that the timeout
stops the JS interpreter (saving a continuation that allows it to resume
later) and then executes the callback in a fresh context. However,
interpreter-level continuations are not a feature of standard JavaScript.
If interpreter continuations were implemented they would break the current
API... This could be fixed by deferring the execution of the initial
function like so:


var request = window.indexedDB.open(...); // request object stores parameters
                                          // (maybe some pre-computation is done)
request.onsuccess = function(event) { ... }; // set callback
request.run(); // execute the part of the open function
               // that can cause the callback


So this would keep the current style, but also be more generic.
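The run-to-completion guarantee under discussion can be demonstrated with a stand-alone sketch; `fakeOpen` is a hypothetical stand-in for `indexedDB.open` that queues its completion as a task, so the handler assigned after the call is always in place before the event fires:

```javascript
// Stand-in for an async API call: completion is queued as a task,
// so it cannot run until the current script finishes.
function fakeOpen() {
  const request = { onsuccess: null };
  setTimeout(function () {
    // By the time this task runs, the caller's script has completed
    // and has had the chance to assign onsuccess.
    if (request.onsuccess) request.onsuccess({ target: request });
  }, 0);
  return request;
}

const request = fakeOpen();
request.onsuccess = function () { console.log("opened"); };
```

Under run-to-completion the handler is never missed; in a hypothetical preemptive engine the timeout could fire between the two statements, which is exactly the hazard the thread is debating.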


Cheers,
Keean.


On 30 December 2010 08:45, Axel Rauschmayer a...@rauschma.de wrote:

 Right. But is there anything one loses by not relying on it, by making the
 API more generic?

 On Dec 30, 2010, at 7:58 , Jonas Sicking wrote:

  On Wed, Dec 29, 2010 at 2:44 PM, Axel Rauschmayer a...@rauschma.de
 wrote:
  Can someone explain a bit more about the motivation behind the current
 design of the async API?
 
  var request = window.indexedDB.open(...);
  request.onsuccess = function(event) { ... };
 
  The pattern of assigning the success continuation after invoking the
 operation seems to be too closely tied to JavaScript’s current
 run-to-completion event handling. But what about future JavaScript
 environments, e.g. a multi-threaded Node.js with IndexedDB built in or Rhino
 with IndexedDB running in parallel? Wouldn’t a reliance on run-to-completion
 unnecessarily limit future developments?
 
  Maybe it is just me, but I would like it better if the last argument was
 an object with the error and the success continuations (they could also be
 individual arguments). That is also how current JavaScript RPC APIs are
 designed, resulting in a familiar look. Are there any arguments *against*
 this approach?
 
  Whatever the reasoning behind the design, I think it should be explained
 in the spec, because the current API is a bit tricky to understand for
 newbies.
 
  Note that almost everyone relies on this anyway. I bet that almost all
  code out there depends on the fact that the code in, for example, onload
  handlers for XHR requests runs after the current thread of execution has
  fully finished.
 
  Asynchronous events isn't something specific to javascript.
 
  / Jonas
 

 --
 Dr. Axel Rauschmayer
 axel.rauschma...@ifi.lmu.de
 http://hypergraphs.de/
 ### Hyena: organize your ideas, free at hypergraphs.de/hyena/







Re: [IndexedDB] Why rely on run-to-completion?

2010-12-30 Thread Keean Schupke
The JavaScript engine we have implemented has interpreter continuations. So
at bytecode boundaries it is able to process pending events. (not saying it
currently does this, but it may in the future). This is not multi-threading,
there is only one thread per engine which maintains an interpreter
environment and communicates with other engines by message passing (we
already have a worker API, although non-standard).

This could cause a problem with the current API. The fix for this is to make
sure the callbacks are defined before the function using the callbacks is
called.

I think keeping away from multi-threading in JS is sensible (perhaps Erlang
style multi-processing would be good though). However, interrupting the
interpreter to process callbacks still uses just a single thread and causes
no problems, provided the callbacks are initialised before the call that
starts the background process that will generate the asynchronous event.


Cheers,
Keean.


On 30 December 2010 20:44, Jonas Sicking jo...@sicking.cc wrote:

 Even if we decide to make the environment in which we run webpage
 script multithreaded the current API will work fine. Generally
 speaking in multithreaded environments you do callbacks on the same
 thread as which the initial function is called.

 Alternatively you'd want to pass in the thread on which you want
 callbacks, along with the callbacks you want called. But in that case
 using EventTargets doesn't make sense as you don't know if a callback
 has already happened by the time you call addEventListener. Likewise,
 the readyState property also would need to be removed as by the time
 you check it it can already be out of date. In short, a complete
 revamping of the API would be needed, the small modification you are
 proposing would be nowhere near enough.

 However most of all I'm not terribly worried that we'll make the
 browser scripting environment multithreaded. Multithreading is
 extremely complicated. To this day research is still happening on how
 to implement even the most simple data structures, such as queues and
 hash tables, effectively in a multithreaded environment. See the
 discussions on the WhatWG list which took place when we designed the
 workers API.

 I find it much more likely that we'll stick with the approach that
 workers have introduced of having separate environments which run on
 different threads and with no shared state. Communication between
 threads happens through message passing. This is similar to languages
 such as Google's Go and Mozilla's Rust.

 / Jonas

 On Thu, Dec 30, 2010 at 12:45 AM, Axel Rauschmayer a...@rauschma.de
 wrote:
  Right. But is there anything one loses by not relying on it, by making
 the API more generic?
 
  On Dec 30, 2010, at 7:58 , Jonas Sicking wrote:
 
  On Wed, Dec 29, 2010 at 2:44 PM, Axel Rauschmayer a...@rauschma.de
 wrote:
  Can someone explain a bit more about the motivation behind the current
 design of the async API?
 
  var request = window.indexedDB.open(...);
  request.onsuccess = function(event) { ... };
 
  The pattern of assigning the success continuation after invoking the
  operation seems to be too closely tied to JavaScript’s current
 run-to-completion event handling. But what about future JavaScript
 environments, e.g. a multi-threaded Node.js with IndexedDB built in or Rhino
 with IndexedDB running in parallel? Wouldn’t a reliance on run-to-completion
 unnecessarily limit future developments?
 
  Maybe it is just me, but I would like it better if the last argument
 was an object with the error and the success continuations (they could also
 be individual arguments). That is also how current JavaScript RPC APIs are
 designed, resulting in a familiar look. Are there any arguments *against*
 this approach?
 
  Whatever the reasoning behind the design, I think it should be
 explained in the spec, because the current API is a bit tricky to understand
 for newbies.
 
  Note that almost everyone relies on this anyway. I bet that almost all
  code out there depends on the fact that the code in, for example, onload
  handlers for XHR requests runs after the current thread of execution has
  fully finished.
 
  Asynchronous events isn't something specific to javascript.
 
  / Jonas
 
 
  --
  Dr. Axel Rauschmayer
  axel.rauschma...@ifi.lmu.de
  http://hypergraphs.de/
  ### Hyena: organize your ideas, free at hypergraphs.de/hyena/
 
 
 
 




Re: [IndexedDB] Why rely on run-to-completion?

2010-12-30 Thread Keean Schupke
On 30 December 2010 23:08, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Dec 30, 2010 at 2:19 PM, Keean Schupke ke...@fry-it.com wrote:
  The JavaScript engine we have implemented has interpreter continuations.
 So
  at bytecode boundaries it is able to process pending events. (not saying
 it
  currently does this, but it may in the future). This is not
 multi-threading,
  there is only one thread per engine which maintains an interpreter
  environment and communicates with other engines by message passing (we
  already have a worker API, although non-standard).
  This could cause a problem with the current API. The fix for this is to
 make
  sure the callbacks are defined before the function using the callbacks is
  called.
  I think keeping away from multi-threading in JS is sensible (perhaps
 Erlang
  style multi-processing would be good though). However interrupting the
  interpreter to process callbacks is just a single thread and causes no
  problems providing the callbacks are initialised before the call that
  initialises the background process that will generate the asynchronous
  event.

 If you are interrupting at arbitrary points in the execution and
 running other script contexts which can synchronously call into the
 first javascript context, then you are implementing multithreading.
 This is in fact exactly how multithreading works on single-core CPUs.
 It means that you are exposing race conditions and all other threading
 hazards to webpages.

 / Jonas


That makes complete sense to me, although not all threading hazards would be
exposed, since partial writes will not be a problem: all variable accesses
will automatically be atomic. But yes, race conditions would be a problem
with interrupts, so I agree it's a bad idea. (The interpreter continuations
are currently used to store interpreter state when executing a blocking IO
action, so that other engines can carry on running in a single-threaded
environment.)

In that case I can't see any limitations to the current API.

As for the aesthetic considerations: since JavaScript works by events, it
makes more sense to expose the API as events rather than callbacks, as
callbacks give the false impression that they can happen at any time.


Cheers,
Keean.


Re: FileAPI use case: making an image downloading app

2010-12-19 Thread Keean Schupke
On 18 December 2010 17:38, Charles Pritchard ch...@jumis.com wrote:

  On 12/17/2010 5:03 PM, Gregg Tavares (wrk) wrote:



 On Fri, Dec 17, 2010 at 4:16 PM, Charles Pritchard ch...@jumis.comwrote:

 We're actively developing such functionality.

 The limit per directory is for the sake of the os file system. If you want
 to create a data store, use indexedDB or use standard file system practices
 (create a subdirectory tree).


  I think you're missing the point. If I have a folder with 6000 files on
 some server and I want to mirror that through the FileAPI to the user's
 local machine I'm screwed. I can't **mirror** it if it's not a mirror.

 I'm not missing the point. I'm actively developing an app that downloads
 images from photo sites.

 A strict Mirror (let's use a capital M) is something you're not going to
 pull off with the File System API at this time.
 You can't set meta data, like permissions/flags and modification/creation
 dates. Developing a Mirror is not feasible with the current API.
 You can't create a direct Mirror, one which would work seamlessly with
 rsync.

 You can mirror the data on a remote server, and check whether the data
 already exists on your file system, by using a subdirectory system, much
 like the two-level directory structure that many cache apps use (like
 Squid).
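The Squid-style two-level layout mentioned above can be sketched as a pure path-mapping function; `cachePath`, the string hash, and the fan-out of 16 are illustrative choices, not part of any spec:

```javascript
// Map an arbitrary file name to a two-level subdirectory path, so that
// no single directory accumulates too many entries (Squid-style layout).
function cachePath(name, fanout = 16) {
  // Cheap deterministic string hash; any stable hash would do here.
  let h = 0;
  for (let i = 0; i < name.length; i++) {
    h = (h * 31 + name.charCodeAt(i)) >>> 0;
  }
  const level1 = (h % fanout).toString(16);
  const level2 = (Math.floor(h / fanout) % fanout).toString(16);
  return level1 + "/" + level2 + "/" + name;
}
```

With a fan-out of 16 at each level, 6000 files spread over 256 leaf directories averages under 25 files per directory, comfortably inside a per-directory limit.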

   Disk space availability (quota) is an issue no matter what happens. When
 downloading 1000 images, you'll still only be doing so x at a time.


  I don't see the point you're trying to make here. I don't know the size
 of the images before hand. Many internet APIs make getting the sizes
 prohibitively expensive. One REST call per file. So before I can download a
 single file I'd have to issue 1000 REST XHRs to find the sizes of the files
 (waiting the several minutes for that to complete) before I can ask the user
 for the space needed.  That's not a good user experience. If on the other
 hand the user can give me permission for unlimited space then I can just
 start downloading the files without having to find out their ultimate size.

  I suppose I can just request 1 terabyte up front. Or query how much space
 is free and ask for all of it.

 Yes, that's correct, you'd hope the space is available.
 When you run out of space, you can use a limited amount of RAM while
 waiting.

 There are few resource management APIs available for memory/bandwidth
 hungry applications.

 My point was that these questions you're bringing up are common to all
 cross-platform applications.

 Regarding your issue of a ray tracing program: you'd also want to either
 create subdirectories, or simply create a large file with its own methods.


 These issues are inherent in the design of any application at scale. At
 this point, the file system API does work for the use case you're
 describing.

 It'd be nice to see Blob storage better integrated with the Web Storage APIs.
 Ian has already spoken to this, but no followers yet (afaik).

 -Charles



 On Dec 17, 2010, at 3:34 PM, Gregg Tavares (wrk) g...@google.com
 wrote:

  Sorry if this has been covered before.
 
  I've been wanting to write an app to download images from photo sites
 and I'm wondering if this use case has been considered for the FileAPI wrt
 Directories and System.
 
  If I understand the current spec it seems like there are some possible
 issues.
 
  #1) The spec says there is a 5000 file limit per directory.
 
  #2) The spec requires an app to specify how much storage it needs
 
  I understand the desire for the limits.  What about an app being able to
 request unlimited storage and unlimited files? The UA can decide how to
 present something to the user to grant permission if they want.
 
  Arguments against leaving it as is:
 
  The 5000 file limit seems arbitrary. Any app that hits that limit will
 probably require serious re-engineering to work around it. It will not only
 have to somehow describe a mapping between files on a server that may not
 have that limit; it also has the issue that the user might have something
 organized that way and will require the user to re-organize. I realize that
 5000 is a large number. I'm sure the author of csh thought 1700 entries in a
 glob was a reasonable limit as well. We all know how well that turned out
 :-(   It's easy to imagine a video editing app that edits and composites
 still images. If there are a few layers and 1 image per layer it could
 easily add up to more than 5000 files in a single folder.
 
  The size limit also has issues. For many apps the size limit will be no
 problem, but for others... Example: You make a ray tracing program; it traces
 each frame and saves the files to disc. When it runs out of room, what are
 its options? (1) fail. (2) request a file system of a larger size. The
 problem is (2) will require user input. Imagine you start your render before
 you leave work expecting it to finish by the time you get in only to find
 that 2 hours after you left the UA popped up a confirmation this app
 

Re: [cors] 27 July 2010 CORS feedback

2010-11-22 Thread Keean Schupke
Is this spec the place to fix cross-site vulnerabilities? Would it not be
better to restrict cookies to only be sent when the domain of the page you
are navigating away from matches the cookie domain, as well as the page you
are navigating to?


Cheers,
Keean.


On 22 November 2010 09:10, Mark Nottingham m...@mnot.net wrote:


 On 22/11/2010, at 7:53 PM, Jonas Sicking wrote:
 
  Practically speaking, the only constraints on form submissions'
  request entities are that they contain a '='. Using text/plain encoded
  forms you can submit any content with that restriction.
 
  Further, I believe that flash allows cross site POST submission with
  arbitrary data, i.e. even data without a '='. But I haven't looked
  into that in more detail.

 Perhaps. I still don't think it's great for the W3C to standardise yet
 another method of sending cross-site POSTs without permission.


  3) When a server changes the headers in a response based upon the
 value of the incoming Origin header (as outlined in sections 5.1 and 5.2),
 it must insert Vary: Origin into *all* responses for that resource;
 otherwise, downstream caches will incorrectly store it.
 
  Be aware that doing so will cause many versions of IE not to cache
 those responses at all. Another option would be to disallow varying the
 response based upon the Origin header.
 
  Disallowing varying by origin seems like a bigger problem than IE not
 caching.
 
  Either way, it needs to be addressed.
 
  You mean by adding a note in the spec? Are you adding a similar note
  to http-bis about the Vary header?

 RFC2616 already defines Vary:
  http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44
 ... and bis refines it:
  http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-12#section-3.5



  5) Using a preflight check in combination with a cache exposes sites
 to DNS rebinding, man-in-the-middle, and potentially other attacks that did
 not exist before. This should be noted.
 
  The DNS rebinding issue is a quality of implementation issue. It's no
  problem simply rerequesting the preflight if the DNS resolves to a
  different IP between the preflight and the actual request. I agree
  that noting this in spec is a good idea.
 
  Could you describe the new man-in-the-middle attacks which did not
  exist before with cross-origin communications?
 
  I suspect we're quibbling over the definition of 'new' here, but can
 agree that CORS is going to be another tool to attack sites with (which to
 be fair isn't really its fault; it's just that we should give people fair
 warning).
 
  I'm not sure adding ominous There might be ways that this spec can be
  used for cross site attacks. Try to take precautions notes to the
  spec are much more useful then the There is a general threat, but we
  don't have any more specific information at this time. People should
  be aware of their surroundings. alerts that the Department of
  Homeland Security sends out :-)

 That's the second straw man you've used, Jonas. Please stop.




  6) Requiring a preflight check per-URI is not an efficient use of
 network resources for applications that use a large number of URIs, as is
 becoming more prevalent; effectively, it introduces another round-trip for
 each unsafe request. Handling OPTIONS is also somewhat specialised on many
 servers. It's also awkward to handle OPTIONS per-URI on many servers. I've
 raised this several times before, and am still not convinced that the
 underlying requirement (#8) justifies such a convoluted and ill-conceived
 design, or indeed is effectively met by this design.
 
  Allowing a site to define a 'map' of where cross-origin requests are
 allowed to go would be more efficient in most cases, would be vastly simpler
 to implement for servers, and would be similar to many other site-wide
 policy mechanisms on the Web.
 
  We had a design in place which allowed preflights to apply to multiple
  URIs. However there were too many issues with servers resolving URIs
  in weird ways which made us drop it. One concrete example was that
  some versions of IIS UTF8 decoded URIs and then ignored bits above the
  lower 8 bits. This made it treat URIs as if they contained .. when
  the browser had no idea of this.
 
  In short, CORS felt like the wrong spec to start relying on servers
  not to do strange URI handling.
 
  I'm not sure what you're referring to, but there are clean ways to do
 this without resorting to depending on how servers interpret URIs.
 
  I'm sure proposals are welcome.


 I'm pretty sure they're not, based upon past experience. We've been through
 this a few times already.

 Regards,

 --
 Mark Nottingham   http://www.mnot.net/







Re: [cors] 27 July 2010 CORS feedback

2010-11-22 Thread Keean Schupke
I guess I didn't put that very well. This is more a general comment on the
discussion than a reply to a specific post.

If I can forge requests to web sites (for example using curl), then any
site that does not impose security checks on its input values is asking for
trouble. Making security depend on headers which can be forged seems like
false security to me.

So I see no problem allowing POSTs to any URL, providing the POST cannot
assume any authority on the part of the user. That way a POST can do no more
harm than a script using curl.

The solution to the POST authority problem would seem to be to apply the
origin rule for cookies (which seems sufficient to me) consistently (perhaps
it already is?)

So CORS seems to me to be about permitting exceptions to a security policy,
not about fixing that underlying security policy. I don't think any changes
to the CORS spec can fix a broken underlying security policy. Perhaps I am
missing something though?


Cheers,
Keean.





On 22 November 2010 09:28, Keean Schupke ke...@fry-it.com wrote:

 Is this spec the place to fix cross-site vulnerabilities? Would it not be
 better to restrict cookies to only be sent when the domain of the page you
 are navigating away from matches the cookie domain, as well as the page you
 are navigating to?


 Cheers,
 Keean.


 On 22 November 2010 09:10, Mark Nottingham m...@mnot.net wrote:


 On 22/11/2010, at 7:53 PM, Jonas Sicking wrote:
 
  Practically speaking, the only constraints on form submissions'
  request entities are that they contain a '='. Using text/plain encoded
  forms you can submit any content with that restriction.
 
  Further, I believe that flash allows cross site POST submission with
  arbitrary data, i.e. even data without a '='. But I haven't looked
  into that in more detail.

 Perhaps. I still don't think it's great for the W3C to standardise yet
 another method of sending cross-site POSTs without permission.


  3) When a server changes the headers in a response based upon the
 value of the incoming Origin header (as outlined in sections 5.1 and 5.2),
 it must insert Vary: Origin into *all* responses for that resource;
 otherwise, downstream caches will incorrectly store it.
 
  Be aware that doing so will cause many versions of IE not to cache
 those responses at all. Another option would be to disallow varying the
 response based upon the Origin header.
 
  Disallowing varying by origin seems like a bigger problem than IE not
 caching.
 
  Either way, it needs to be addressed.
 
  You mean by adding a note in the spec? Are you adding a similar note
  to http-bis about the Vary header?

 RFC2616 already defines Vary:
  http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44
 ... and bis refines it:
  http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-12#section-3.5



  5) Using a preflight check in combination with a cache exposes
 sites to DNS rebinding, man-in-the-middle, and potentially other attacks
 that did not exist before. This should be noted.
 
  The DNS rebinding issue is a quality of implementation issue. It's no
  problem simply rerequesting the preflight if the DNS resolves to a
  different IP between the preflight and the actual request. I agree
  that noting this in spec is a good idea.
 
  Could you describe the new man-in-the-middle attacks which did not
  exist before with cross-origin communications?
 
  I suspect we're quibbling over the definition of 'new' here, but can
 agree that CORS is going to be another tool to attack sites with (which to
 be fair isn't really its fault; it's just that we should give people fair
 warning).
 
  I'm not sure adding ominous There might be ways that this spec can be
  used for cross site attacks. Try to take precautions notes to the
  spec are much more useful then the There is a general threat, but we
  don't have any more specific information at this time. People should
  be aware of their surroundings. alerts that the Department of
  Homeland Security sends out :-)

 That's the second straw man you've used, Jonas. Please stop.




  6) Requiring a preflight check per-URI is not an efficient use of
 network resources for applications that use a large number of URIs, as is
 becoming more prevalent; effectively, it introduces another round-trip for
 each unsafe request. Handling OPTIONS is also somewhat specialised on many
 servers. It's also awkward to handle OPTIONS per-URI on many servers. I've
 raised this several times before, and am still not convinced that the
 underlying requirement (#8) justifies such a convoluted and ill-conceived
 design, or indeed is effectively met by this design.
 
  Allowing a site to define a 'map' of where cross-origin requests are
 allowed to go would be more efficient in most cases, would be vastly simpler
 to implement for servers, and would be similar to many other site-wide
 policy mechanisms on the Web.
 
  We had a design in place which allowed preflights to apply to multiple
  URIs

Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2010-11-20 Thread Keean Schupke
Just a thought: the fact that the spec does not limit the key size does not
mean the implementation has to index on huge keys. For example, you may
choose to index only the first 1000 characters of string keys, and then link
the values of key collisions together in the storage node. This way things
are kept fast and compact for the more normal key sizes, and there is a
sensible limit.

As long as the implementation behaves as if it admits arbitrary key sizes,
it can actually implement things however it likes.

Another example would be one index for keys less than size X, and a separate
oversize key index for keys of size greater than X. These could use a
different internal structure and disk layout.
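A sketch of the prefix-index idea (all names here are assumed; the 1000-character prefix follows the example above): keys are bucketed by their prefix, and only keys that collide on the prefix are compared in full:

```javascript
// Entries whose keys share the first PREFIX_LEN characters land in the
// same bucket and are chained, so full-key comparisons only happen
// within one bucket. Illustrative in-memory version of the idea.
const PREFIX_LEN = 1000;

function makeIndex() {
  return new Map(); // prefix -> array of { key, value }
}

function put(index, key, value) {
  const prefix = key.slice(0, PREFIX_LEN);
  let bucket = index.get(prefix);
  if (!bucket) { bucket = []; index.set(prefix, bucket); }
  const existing = bucket.find(e => e.key === key);
  if (existing) existing.value = value;
  else bucket.push({ key, value });
}

function get(index, key) {
  const bucket = index.get(key.slice(0, PREFIX_LEN)) || [];
  const entry = bucket.find(e => e.key === key);
  return entry && entry.value;
}
```

From the caller's point of view this admits arbitrary key sizes, while the index itself only ever hashes a bounded prefix.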


Cheers,
Keean.


On 20 November 2010 04:13, Bjoern Hoehrmann derhoe...@gmx.net wrote:

 * Jonas Sicking wrote:
 The question is in part where the limit for ridiculous goes. 1K keys
 are sort of ridiculous, though I'm sure it happens.

 By ridiculous I mean that common systems would run out of memory. That
 is different among systems, and I would expect developers to consider it
 up to an order of magnitude, but not beyond that. Clearly, to me, a DB
 system should not fail because I want to store 100 keys á 100KB.

  Note that, since JavaScript does not offer key-value dictionaries for
  complex keys, and now that JSON.stringify is widely implemented, it's
  quite common for people to emulate proper dictionaries by using that to
  work around this particular JavaScript limitation. Which would likely
  extend to more persistent forms of storage.
 
 I don't understand what you mean here.

 I am saying that it's quite natural to want to have string keys that are
 much, much longer than someone might envision the length of string keys,
 mainly because their notion of string keys is different from the key
 length you might get from serializing arbitrary objects.
 --
 Björn Höhrmann · mailto:bjo...@hoehrmann.de · http://bjoern.hoehrmann.de
 Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/




Re: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-13 Thread Keean Schupke
Why not return the full 64bit ID in an opaque object? Maths and comparing
IDs is meaningless anyway.

Cheers,
Keean.


On 12 November 2010 21:05, Jeremy Orlow jor...@chromium.org wrote:

 On Fri, Nov 12, 2010 at 10:09 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Nov 12, 2010 at 12:36 AM, Jeremy Orlow jor...@chromium.org
 wrote:
  On Fri, Nov 12, 2010 at 11:27 AM, Keean Schupke ke...@fry-it.com
 wrote:
 
  You can do it in SQL because tables that hold a reference to an ID can
  declare the reference in the schema. I guess without the meta-data to
 do
  this it cannot be done.
 
  Even in SQL, I'd be very hesitant to do this.
 
 
  Why not get the auto-increment to wrap and skip collisions? What about
  signed numbers?
 
  Exactly.  If we're going to support this, let's keep it super simple.
  As
  Jonas mentioned, it's very unlikely that anyone would hit the 64bit
 limit in
  legitimate usage, so it's not worth trying to gracefully handle such a
  situation and adding a lot of surface area.

 Indeed. I'd prefer to fail fatally to trying to do something
 complicated and clever here. I'd be surprised if anyone ever ran into
 this issue unintentionally (i.e. when not explicitly testing to see
 what happens).

 One way to look at it is that before we run into 2^64 limit, we'll run
 into the limit that javascript can't represent all integers above
 2^53. So once IDs get above that you basically won't be able to use
 the object store anyway.


 Good point.  Actually we probably need to spec the limit to be 2^52ish so
 that the auto number is never anything greater than what javascript can
 address.

 J



Re: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-13 Thread Keean Schupke
Hi,

On 13 November 2010 08:33, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Nov 12, 2010 at 11:59 PM, Keean Schupke ke...@fry-it.com wrote:
  Why not return the full 64bit ID in an opaque object? Maths and comparing
  IDs is meaningless anyway.

 Then we'd have to overload both the structured clone algorithm and the
 == javascript operator.


Is that a problem? I can't see performance being an issue: the engine has
to determine which type of '==' to use anyway, and JavaScript does not
appear to support unboxed types.

I accept that a range of 2^53 is enough.

To me, though, there is an advantage in not exposing the ID as an integer
type. The ID is essentially an opaque identifier: the only valid operators are
'==' and '!='. Ordered comparisons (greater, less) and arithmetic mean nothing. I
would think it better to use an opaque type so that people do not mistakenly
think they can use these operators. It also allows implementers much more
flexibility (and optimisation potential) in how they actually implement the
IDs.
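A rough sketch of what an opaque ID type could look like (RecordId is a hypothetical name; since, as Jonas notes, '==' cannot be overloaded in JavaScript, equality would have to be exposed as a method instead):

```javascript
// Sketch of an opaque ID: equality is the only meaningful operation,
// and arithmetic/ordering are simply not available to callers.
class RecordId {
  #value; // private: callers cannot reach the underlying number
  constructor(value) { this.#value = value; }
  equals(other) {
    return other instanceof RecordId && other.#value === this.#value;
  }
}

const a = new RecordId(42);
const b = new RecordId(42);
console.log(a.equals(b)); // true
```

Because the representation is hidden, an implementation would be free to back such IDs with anything it likes, which is the flexibility argument made above.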


Cheers,
Keean.


Re: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-13 Thread Keean Schupke
Having said that, if it's an opaque type, you could not supply values
yourself, which is where this all started... and I think being able to
supply values is a good idea (for example when importing data).

So whilst I think all the points I made in favour of an opaque type are true
for this kind of thing in general, for this case I think the need to supply
a value is more important.

Personally I would like to see JavaScript support arbitrary-precision
integers like Python has, but proper integer support would be a start.


Cheers,
Keean.


On 13 November 2010 11:13, Keean Schupke ke...@fry-it.com wrote:

 Hi,

 On 13 November 2010 08:33, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Nov 12, 2010 at 11:59 PM, Keean Schupke ke...@fry-it.com wrote:
  Why not return the full 64bit ID in an opaque object? Maths and
 comparing
  IDs is meaningless anyway.

 Then we'd have to overload both the structured clone algorithm and the
 == javascript operator.


 Is that a problem? I can't see performance being an issue it has to
 determine which type of '==' to use anyway, and JavaScript does not appear
 to support Unboxing or Unboxed types.

 I accept that 2^53 bits is enough.

 To me though there is an advantage in not having the ID as an integer type.
 Basically the ID is an unordered sequence type. The only valid operators are
 '==' and '!='. ordered comparisons (greater, less) and maths mean nothing. I
 would think it better to use an opaque type so that people do not mistakenly
 think they can use these operators. It also allows implementers much more
 flexibility (and optimisation potential) in how they actually implement the
 IDs.


 Cheers,
 Keean.






Re: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-12 Thread Keean Schupke
You can do it in SQL because tables that hold a reference to an ID can
declare the reference in the schema. I guess without the meta-data to do
this it cannot be done.

Why not get the auto-increment to wrap and skip collisions? What about
signed numbers?
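A sketch of the wrap-and-skip idea, under the assumption that the key generator can test whether an ID is already in use (nextId and its parameters are illustrative only):

```javascript
// Wrap-and-skip auto-increment: on overflow, wrap the counter to 1 and
// keep skipping IDs that already exist; fail only when the store is full.
function nextId(store, counter, max) {
  let id = counter;
  do {
    id = id >= max ? 1 : id + 1;       // wrap at the maximum
    if (id === counter) throw new Error('object store is full');
  } while (store.has(id));             // skip collisions
  return id;
}
```

The obvious cost is that each allocation near the wrap point may need existence checks, which is one reason the thread leans towards failing fatally instead.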

Cheers,
Keean.

On 12 November 2010 08:23, Jeremy Orlow jor...@chromium.org wrote:

 We can't compact because the developer may be expecting to look items up by
 ID with IDs in another table, on the server, in memory, etc.  There's no way
 to do it.

 J


 On Fri, Nov 12, 2010 at 10:56 AM, Keean Schupke ke...@fry-it.com wrote:

 The other thing you could do is specify that when you get a wrap (i.e.
 someone inserts a key of MAXINT - 1) you auto-compact the table. If you
 really have run out of indexes there is not a lot you can do.

 The other thing to consider is that because JS uses signed arithmetic, it's
 really a 63-bit number... unless you want negative indexes appearing? (And
 how would that affect ordering and sorting)?


 Cheers,
 Keean.


 On 12 November 2010 07:36, Jeremy Orlow jor...@chromium.org wrote:

 On Fri, Nov 12, 2010 at 10:08 AM, Jonas Sicking jo...@sicking.ccwrote:

 On Thu, Nov 11, 2010 at 9:22 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  On Fri, Nov 12, 2010 at 12:32 AM, Jonas Sicking jo...@sicking.cc
 wrote:
 
  On Thu, Nov 11, 2010 at 11:41 AM, Jeremy Orlow jor...@chromium.org
  wrote:
   On Thu, Nov 11, 2010 at 6:41 PM, Tab Atkins Jr. 
 jackalm...@gmail.com
   wrote:
  
   On Thu, Nov 11, 2010 at 4:20 AM, Jeremy Orlow 
 jor...@chromium.org
   wrote:
What would we do if what they provided was not an integer?
  
   The behavior isn't very important; throwing would be fine here.
  In
   mySQL, you can only put AUTO_INCREMENT on columns in the integer
   family.
  
  
What happens if
the number they insert is so big that the next one causes
 overflow?
  
   The same thing that happens if you do ++ on a variable holding a
   number that's too large.  Or, more directly, the same thing that
   happens if you somehow fill up a table to the integer limit
 (probably
   deleting rows along the way to free up space), and then try to add
 a
   new row.
  
  
What is
the use case for this?  Do we really think that most of the time
users
do
this it'll be intentional and not just a mistake?
  
   A big one is importing some data into a live table.  Many smaller
 ones
   are related to implicit data constraints that exist in the
 application
   but aren't directly expressed in the table.  I've had several
 times
   when I could normally just rely on auto-numbering for something,
 but
   occasionally, due to other data I was inserting elsewhere, had to
   specify a particular id.
  
   This assumes that your autonumbers aren't going to overlap and is
 going
   to
   behave really badly when they do.
   Honestly, I don't care too much about this, but I'm skeptical we're
   doing
   the right thing here.
 
  Pablo did bring up a good use case, which is wanting to migrate
  existing data to a new object store, for example with a new schema.
  And every database examined so far has some ability to specify
  autonumbered columns.
 
  overlaps aren't a problem in practice since 64bit integers are really
  really big. So unless someone maliciously sets a number close to
 the
  upper bound of that then overlaps won't be a problem.
 
  Yes, but we'd need to spec this, implement it, and test it because
 someone
  will try to do this maliciously.

 I'd say it's fine to treat the range of IDs as a hardware limitation.
 I.e. similarly to how we don't specify how much data a webpage is
 allowed to put into DOMStrings, at some point every implementation is
 going to run out of memory and effectively limit it. In practice this
 isn't a problem since the limit is high enough.

 Another would be to define that the ID is 64 bit and if you run out of
 IDs no more rows can be inserted into the objectStore. At that point
 the page is responsible for creating a new object store and compacting
 down IDs. In practice no page will run into this limitation if they
 use IDs increasing by one. Even if you generate a new ID a million
 times a second, it'll still take you over half a million years to run
 out of 64bit IDs.


 This seems reasonable.  OK, let's do it.


   And, in the email you replied right under, I brought up the point
 that this
  feature won't help someone who's trying to import data into a table
 that
  already has data in it because some of it might clash.  So, just to
 make
  sure we're all on the same page, the use case for this is restoring
 data
  into an _empty_ object store, right?  (Because I don't think this is a
 good
  solution for much else.)

 That's the main scenario I can think of that would require this yes.

 / Jonas







Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-12 Thread Keean Schupke
Yes, I prefer it due to the symmetry, and agree that it's a judgment call. I
guess the advantage of allowing it is that libraries can disallow it if they
like. The reverse is not true: if you disallow it, a library cannot allow it.

Cheers,
Keean
On 12 Nov 2010 09:00, Jeremy Orlow jor...@chromium.org wrote:
 On Fri, Nov 12, 2010 at 12:06 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Nov 11, 2010 at 11:44 AM, Jeremy Orlow jor...@chromium.org
 wrote:
  The email I responded to: It would make sense if you make setting a
key
 to
  undefined semantically equivalent to deleting the value (and no error
if
 it
  does not exist), and return undefined on a get when no such key exists.
 That
  way 'undefined' cannot exist as a value in the object store, and is a
 safe
  marker for the key not existing in that index.
  undefined should be symmetric. If something not existing returns
 undefined
  then passing in undefined should make it not exist. Overloading the
 meaning
  of a get returning undefined is ugly. And simply disallowing a value
 also
  seems a bit odd. But I think this is pretty elegant semantically.

 As I've asked previously in the tread. What problem are you trying to
 solve? Can you describe the type of application that gets easier to
 write/possible to write/has cleaner code/runs faster if we make this
 change?

 It seems like deleting on .put(undefined) creates a very unexpected
 behavior just to try to cover a rare edge case, wanting to both store
 undefined,


 This is not correct. The proposal was trying to remove an asymmetry within
 the API.


 and tell it apart from the lack of value. In fact, the
 proposal doesn't even solve that edge case since it no longer is
 possible to store undefined. Which brings me back to the question
 above of what problem you are trying to solve.


 ...this is trying to solve an asymmetry within the API.

 I know this is something I've gone back and forth on, but you'll remember
 that both Pablo and I (and maybe Andrei?) were not very excited about
 the asymmetry to begin with.

 Anyway, I'll differ to you since I think this (along with several other of
 the issues I've raised) are mostly judgement calls rather than issues with
a
 clearly technically superior solution and you have been doing most of the
 hard spec work lately.

 J


Re: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-11 Thread Keean Schupke
Integers can be big; 8 bytes is common. It is generally assumed that the
auto-increment counter will be big enough, overflow would wrap, and if the
ID already exists there would be an error. In my experience auto-increment
columns must be integers.

Cheers,
Keean.

On 11 November 2010 12:20, Jeremy Orlow jor...@chromium.org wrote:

 On Thu, Nov 11, 2010 at 2:37 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Nov 10, 2010 at 3:15 PM, Tab Atkins Jr. jackalm...@gmail.com
 wrote:
  On Wed, Nov 10, 2010 at 2:07 PM, Jonas Sicking jo...@sicking.cc
 wrote:
  On Wed, Nov 10, 2010 at 1:50 PM, Tab Atkins Jr. jackalm...@gmail.com
 wrote:
  On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
 
  From: public-webapps-requ...@w3.org [mailto:
 public-webapps-requ...@w3.org] On Behalf Of bugzi...@jessica.w3.org
  Sent: Monday, November 08, 2010 5:07 PM
 
  So what happens if trying save in an object store which has the
 following
  keypath, the following value. (The generated key is 4):
 
  foo.bar
  { foo: {} }
 
  Here the resulting object is clearly { foo: { bar: 4 } }
 
  But what about
 
  foo.bar
  { foo: { bar: 10 } }
 
  Does this use the value 10 rather than generate a new key, does it
 throw an
  exception or does it store the value { foo: { bar: 4 } }?
 
  I suspect that all options are somewhat arbitrary here. I'll just
 propose that we error out to ensure that nobody has the wrong expectations
 about the implementation preserving the initial value. I would be open to
 other options except silently overwriting the initial value with a generated
 one, as that's likely to confuse folks.
 
  It's relatively common for me to need to supply a manual value for an
  id field that's automatically generated when working with databases,
  and I don't see any particular reason that my situation would change
  if using IndexedDB.  So I think that a manually-supplied key should be
  kept.
 
  I'm fine with either solution here. My database experience is too weak
  to have strong opinions on this matter.
 
  What do databases usually do with columns that use autoincrement but a
  value is still supplied? My recollection is that that is generally
  allowed?
 
  I can only speak from my experience with mySQL, which is generally
  very permissive, but which has very sensible behavior here imo.
 
  You are allowed to insert values manually into an AUTO_INCREMENT
  column.  The supplied value is stored as normal.  If the value was
  larger than the current autoincrement value, the value is increased so
  that the next auto-numbered row will have an id one higher than the
  row you just inserted.
 
  That is, given the following inserts:
 
  insert row(val) values (1);
  insert row(id,val) values (5,2);
  insert row(val) values (3);
 
  The table will contain [{id:1, val:1}, {id:5, val:2}, {id:6, val:3}].
 
  If you have uniqueness constraints on the field, of course, those are
  also used.  Basically, AUTO_INCREMENT just alters your INSERT before
  it hits the db if there's a missing value; otherwise the query is
  treated exactly as normal.

 This is how sqlite works too. It'd be great if we could make this
 required behavior.


 What would we do if what they provided was not an integer?  What happens if
 the number they insert is so big that the next one causes overflow?  What is
 the use case for this?  Do we really think that most of the time users do
 this it'll be intentional and not just a mistake?

 J
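The MySQL behaviour quoted above can be sketched in a few lines (AutoIncrementTable is an invented name; this is an illustration of the rule, not how any engine implements it): a supplied id is stored as-is, and the counter jumps past it so the next auto-numbered row does not collide.

```javascript
// Auto-increment with manual override: keep supplied ids, and advance
// the counter past the largest id seen so far.
class AutoIncrementTable {
  constructor() { this.rows = []; this.next = 1; }
  insert(row) {
    const id = row.id !== undefined ? row.id : this.next;
    this.rows.push({ ...row, id });
    if (id >= this.next) this.next = id + 1; // counter jumps past supplied ids
    return id;
  }
}

const t = new AutoIncrementTable();
t.insert({ val: 1 });        // id 1
t.insert({ id: 5, val: 2 }); // id 5 (supplied)
t.insert({ val: 3 });        // id 6, matching the mySQL example above
```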



Re: Relational Data Model Example

2010-11-11 Thread Keean Schupke
Hi,

Here are the Mozilla IndexedDB examples converted to use the relational data
model. Points to note:

- The database is validated (that is the schema in the JavaScript is either
used to create the database if it does not exit, or to make sure that the
database conforms to the schema if it does exist. Currently we require an
exact match for validation to succeed, however the final version will use
nullable and default values to allow attributes to be added to existing
relations, or attributes ignored providing the required pre-conditions are
met).

- The 'true' at the end of the validate function tells it to drop the
existing relations, so we always start with an empty database.

- We add more data than the original insert example so there are some
results from the join query.

- There is no single-value-per-group test yet for project. But effectively
when grouping by a unique attribute (like id) any attribute in the same
(pre-join) relation is acceptable, as well as the attribute joined to in the
other relation, but no other attribute if the joined to column is not unique
(the case in the example).


var rdm = new RelationalDataModel;
var rdb = new rdm.WebSQLiteDataAdapter;

var kids = rdm.relation('kids', {
    id: rdm.attribute('id', rdm.integer, {auto_increment: true}),
    name: rdm.attribute('name', rdm.string)
});

var candy = rdm.relation('candy', {
    id: rdm.attribute('id', rdm.integer, {auto_increment: true}),
    name: rdm.attribute('name', rdm.string)
});

var candySales = rdm.relation('candySales', {
    kid: rdm.attribute('kid', rdm.integer),
    candy: rdm.attribute('candy', rdm.integer),
    date: rdm.attribute('date', rdm.string)
});

var v = rdb.validate('CandyDB', 1.0, [kids, candy, candySales],
        true).onsuccess = function(db) {
    // new database has been created, or existing database has been _validated_

    var i = db.transaction(function(tx) {
        [
            {id: 1, name: 'Anna'},
            {id: 2, name: 'Betty'},
            {id: 3, name: 'Christine'}
        ].forEach(function(k) {
            tx.insert(kids, k).onsuccess = function(t, id) {
                document.getElementById('display').textContent +=
                    '\tSaved record for ' + k.name + ' with id ' + id + '\n';
            };
        });

        [
            {id: 1, name: 'toffee-apple'},
            {id: 2, name: 'bonbon'}
        ].forEach(function(c) {
            tx.insert(candy, c).onsuccess = function(t, id) {
                document.getElementById('display').textContent +=
                    '\tSaved record for ' + c.name + ' with id ' + id + '\n';
            };
        });

        [
            {kid: 1, candy: 1, date: '1/1/2010'},
            {kid: 1, candy: 2, date: '2/1/2010'},
            {kid: 2, candy: 2, date: '2/1/2010'},
            {kid: 3, candy: 1, date: '1/1/2010'},
            {kid: 3, candy: 1, date: '2/1/2010'},
            {kid: 3, candy: 1, date: '3/1/2010'}
        ].forEach(function(s) {
            tx.insert(candySales, s).onsuccess = function(t, id) {
                document.getElementById('display').textContent +=
                    '\tSaved record for ' + s.kid + '/' + s.candy +
                    ' with id ' + id + '\n';
            };
        });
    });

    i.onsuccess = function() {
        var q1 = db.transaction(function(tx) {
            tx.query(kids.project(kids.attributes.name)).onsuccess =
                    function(t, names) {
                names.forEach(function(name) {
                    document.getElementById('kidList').textContent +=
                        '\t' + name + '\n';
                });
            };
        });

        q1.onsuccess = function() {
            var q2 = db.transaction(function(tx) {
                tx.query(
                    kids.join(candySales,
                              kids.attributes.id.eq(candySales.attributes.kid))
                        .group(candySales.attributes.kid)
                        .project({name: kids.attributes.name,
                                  count: kids.attributes.name.count()})
                ).onsuccess = function(t, results) {
                    var display = document.getElementById('purchaseList');
                    results.forEach(function(item) {
                        display.textContent += '\t' + item.name +
                            ' bought ' + item.count + ' pieces\n';
                    });
                };
            });
        };
    };
};


Cheers,
Keean.


On 9 November 2010 17:13, Keean Schupke ke...@fry

Re: Relational Data Model Example

2010-11-11 Thread Keean Schupke
Well the implementation is not running on IndexedDB yet... however I can see
no fundamental problems that will stop the implementation. I am sure once I
get into the details there will be issues - but I expect these to be
performance related.

The plan is to continue to refine the common abstraction part of the
prototype - I want to complete the relational data model - then start the
IndexedDB backend.

I'll let you know when I have something on IndexedDB.


Cheers,
Keean.


On 11 November 2010 17:35, Jonas Sicking jo...@sicking.cc wrote:

 Hi Keean,

 This is awesome stuff! Very excited to see libraries that can run both
 on top of IndexedDB and on top of WebSQL.

 Would love to hear more about your experience working against the IndexedDB
 API.

 / Jonas

 On Thu, Nov 11, 2010 at 5:42 AM, Keean Schupke ke...@fry-it.com wrote:
  Hi,
  Here are the Mozilla IndexedDB examples converted to us the relational
 data
  model. Points to note:
  - The database is validated (that is the schema in the JavaScript is
 either
  used to create the database if it does not exist, or to make sure that the
  database conforms to the schema if it does exist. Currently we require an
  exact match for validation to succeed, however the final version will use
  nullable and default values to allow attributes to be added to existing
  relations, or attributes ignored providing the required pre-conditions
 are
  met).
  - The 'true' at the end of the validate function tells it to drop the
  existing relations, so we always start with an empty database.
  - We add more data than the original insert example so there are some
  results from the join query.
  - There is no single-value-per-group test yet for project. But
 effectively
  when grouping by a unique attribute (like id) any attribute in the same
  (pre-join) relation is acceptable, as well as the attribute joined to in
 the
  other relation, but no other attribute if the joined to column is not
 unique
  (the case in the example).
 
  var rdm = new RelationalDataModel;
  var rdb = new rdm.WebSQLiteDataAdapter;
  var kids = rdm.relation('kids', {
  id: rdm.attribute('id', rdm.integer, {auto_increment: true}),
  name: rdm.attribute('name', rdm.string)
  });
  var candy =  rdm.relation('candy', {
  id: rdm.attribute('id', rdm.integer, {auto_increment: true}),
  name: rdm.attribute('name', rdm.string)
  });
  var candySales = rdm.relation('candySales', {
  kid: rdm.attribute('kid', rdm.integer),
  candy: rdm.attribute('candy', rdm.integer),
  date: rdm.attribute('date', rdm.string)
  });
  var v = rdb.validate('CandyDB', 1.0, [kids, candy, candySales],
  true).onsuccess = function(db) {
  // new database has been created, or existing database has
 been
  _validated_
 
  var i = db.transaction(function(tx) {
  [
  {id: 1, name: 'Anna'},
  {id: 2, name: 'Betty'},
  {id: 3, name: 'Christine'}
  ].forEach(function(k) {
  tx.insert(kids, k).onsuccess = function(t, id) {
  document.getElementById('display').textContent +=
  '\tSaved record for ' + k.name + ' with id '
 +
  id + '\n';
  };
  });
  [
  {id: 1, name: 'toffee-apple'},
  {id: 2, name: 'bonbon'}
  ].forEach(function(c) {
  tx.insert(candy, c).onsuccess = function(t, id) {
  document.getElementById('display').textContent +=
  '\tSaved record for ' + c.name + ' with id '
 +
  id + '\n';
  };
  });
  [
  {kid: 1, candy: 1, date: '1/1/2010'},
  {kid: 1, candy: 2, date: '2/1/2010'},
  {kid: 2, candy: 2, date: '2/1/2010'},
  {kid: 3, candy: 1, date: '1/1/2010'},
  {kid: 3, candy: 1, date: '2/1/2010'},
  {kid: 3, candy: 1, date: '3/1/2010'}
  ].forEach(function(s) {
  tx.insert(candySales, s).onsuccess = function(t, id)
 {
  document.getElementById('display').textContent +=
  '\tSaved record for ' + s.kid + '/' + s.candy
 +
  ' with id ' + id + '\n';
  };
  });
  });
  i.onsuccess = function() {
  var q1 = db.transaction(function(tx) {
  tx.query(kids.project(kids.attributes.name)).onsuccess
 =
  function(t, names) {
  names.forEach(function(name) {
 
  document.getElementById('kidList').textContent
  += '\t' + name + '\n

Re: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-11 Thread Keean Schupke
The other thing you could do is specify that when you get a wrap (i.e. someone
inserts a key of MAXINT - 1) you auto-compact the table. If you really have
run out of indexes there is not a lot you can do.

The other thing to consider is that because JS uses signed arithmetic, it's
really a 63-bit number... unless you want negative indexes appearing? (And
how would that affect ordering and sorting)?


Cheers,
Keean.


On 12 November 2010 07:36, Jeremy Orlow jor...@chromium.org wrote:

 On Fri, Nov 12, 2010 at 10:08 AM, Jonas Sicking jo...@sicking.cc wrote:

 On Thu, Nov 11, 2010 at 9:22 PM, Jeremy Orlow jor...@chromium.org
 wrote:
  On Fri, Nov 12, 2010 at 12:32 AM, Jonas Sicking jo...@sicking.cc
 wrote:
 
  On Thu, Nov 11, 2010 at 11:41 AM, Jeremy Orlow jor...@chromium.org
  wrote:
   On Thu, Nov 11, 2010 at 6:41 PM, Tab Atkins Jr. 
 jackalm...@gmail.com
   wrote:
  
   On Thu, Nov 11, 2010 at 4:20 AM, Jeremy Orlow jor...@chromium.org
   wrote:
What would we do if what they provided was not an integer?
  
   The behavior isn't very important; throwing would be fine here.  In
   mySQL, you can only put AUTO_INCREMENT on columns in the integer
   family.
  
  
What happens if
the number they insert is so big that the next one causes
 overflow?
  
   The same thing that happens if you do ++ on a variable holding a
   number that's too large.  Or, more directly, the same thing that
   happens if you somehow fill up a table to the integer limit
 (probably
   deleting rows along the way to free up space), and then try to add a
   new row.
  
  
What is
the use case for this?  Do we really think that most of the time
users
do
this it'll be intentional and not just a mistake?
  
   A big one is importing some data into a live table.  Many smaller
 ones
   are related to implicit data constraints that exist in the
 application
   but aren't directly expressed in the table.  I've had several times
   when I could normally just rely on auto-numbering for something, but
   occasionally, due to other data I was inserting elsewhere, had to
   specify a particular id.
  
   This assumes that your autonumbers aren't going to overlap and is
 going
   to
   behave really badly when they do.
   Honestly, I don't care too much about this, but I'm skeptical we're
   doing
   the right thing here.
 
  Pablo did bring up a good use case, which is wanting to migrate
  existing data to a new object store, for example with a new schema.
  And every database examined so far has some ability to specify
  autonumbered columns.
 
  overlaps aren't a problem in practice since 64bit integers are really
  really big. So unless someone maliciously sets a number close to the
  upper bound of that then overlaps won't be a problem.
 
  Yes, but we'd need to spec this, implement it, and test it because
 someone
  will try to do this maliciously.

 I'd say it's fine to treat the range of IDs as a hardware limitation.
 I.e. similarly to how we don't specify how much data a webpage is
 allowed to put into DOMStrings, at some point every implementation is
 going to run out of memory and effectively limit it. In practice this
 isn't a problem since the limit is high enough.

 Another would be to define that the ID is 64 bit and if you run out of
 IDs no more rows can be inserted into the objectStore. At that point
 the page is responsible for creating a new object store and compacting
 down IDs. In practice no page will run into this limitation if they
 use IDs increasing by one. Even if you generate a new ID a million
 times a second, it'll still take you over half a million years to run
 out of 64bit IDs.


 This seems reasonable.  OK, let's do it.


  And, in the email you replied right under, I brought up the point that
 this
  feature won't help someone who's trying to import data into a table that
  already has data in it because some of it might clash.  So, just to make
  sure we're all on the same page, the use case for this is restoring data
  into an _empty_ object store, right?  (Because I don't think this is a
 good
  solution for much else.)

 That's the main scenario I can think of that would require this yes.

 / Jonas





Re: [Bug 11270] New: Interaction between in-line keys and key generators

2010-11-10 Thread Keean Schupke
What do databases usually do with columns that use autoincrement but a
value is still supplied? My recollection is that that is generally
allowed?

You can normally insert with a supplied key providing it is unique.

Cheers,
Keean.



On 10 November 2010 22:07, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Nov 10, 2010 at 1:50 PM, Tab Atkins Jr. jackalm...@gmail.com
 wrote:
  On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro
  pablo.cas...@microsoft.com wrote:
 
  From: public-webapps-requ...@w3.org [mailto:
 public-webapps-requ...@w3.org] On Behalf Of bugzi...@jessica.w3.org
  Sent: Monday, November 08, 2010 5:07 PM
 
  So what happens if trying save in an object store which has the
 following
  keypath, the following value. (The generated key is 4):
 
  foo.bar
  { foo: {} }
 
  Here the resulting object is clearly { foo: { bar: 4 } }
 
  But what about
 
  foo.bar
  { foo: { bar: 10 } }
 
  Does this use the value 10 rather than generate a new key, does it
 throw an
  exception or does it store the value { foo: { bar: 4 } }?
 
  I suspect that all options are somewhat arbitrary here. I'll just
 propose that we error out to ensure that nobody has the wrong expectations
 about the implementation preserving the initial value. I would be open to
 other options except silently overwriting the initial value with a generated
 one, as that's likely to confuse folks.
 
  It's relatively common for me to need to supply a manual value for an
  id field that's automatically generated when working with databases,
  and I don't see any particular reason that my situation would change
  if using IndexedDB.  So I think that a manually-supplied key should be
  kept.

 I'm fine with either solution here. My database experience is too weak
 to have strong opinions on this matter.

 What do databases usually do with columns that use autoincrement but a
 value is still supplied? My recollection is that that is generally
 allowed?

  What happens if the property is missing several parents, such as
 
  foo.bar.baz
  { zip: {} }
 
  Does this throw or does it store { zip: {}, foo: { bar: { baz: 4 } } }
 
  We should just complete the object with all the missing parents.
 
  Agreed.

 Works for me.

  If we end up allowing array indexes in key paths (like foo[1].bar)
 what does
  the following keypath/object result in?
 
  I think we can live without array indexing in keys for this round, it's
 probably best to just leave them out and only allow paths.
 
  Agreed.

 Works for me.

 Actually, we could go even further and disallow paths entirely, and
 just allow a property name. That is what the firefox implementation
 currently does. That also sidesteps the issue of missing parents.

 / Jonas




Relational Data Model Example

2010-11-09 Thread Keean Schupke
Hi,

I have completed the first stage of the Relational Data Model prototype.
Error checking is not complete (for example aggregate functions can be
nested currently, and this should not be allowed). So it should work for
correct examples, but may not generate an error (or the correct error) for
incorrect examples.

The library (available at http://keean.fry-it.com/relational.js) only
implements the WebSQL backend at the moment, as this was the quickest to get
up and running. I plan to implement a JavaScript object backend (i.e.
relational operations in memory) and the IndexedDB backend.

There is a simple first example (available at
http://keean.fry-it.com/cuboid.html) that shows calculating the average
volume of a collection of cuboids the relational way.

Attached at the end is the JavaScript source for the cuboid example.
Comments appreciated.


Cheers,
Keean.


try {
    var rdm = new RelationalDataModel;
    var rdb = new rdm.WebSQLiteDataAdapter;

    var cuboid_id = rdm.domain('id', rdm.integer, {not_null: true});
    var dimension = rdm.domain('dimension', rdm.number, {not_null: true});

    var cuboids = rdm.relation('cuboids', {
        id: rdm.attribute('id', cuboid_id, {auto_increment: true}),
        length: rdm.attribute('length', dimension),
        width: rdm.attribute('width', dimension),
        height: rdm.attribute('height', dimension)
    });

    var v = rdb.validate('cubeoid_db', 1.0, [cuboids]);

    v.onerror = function(error) {
        alert('ValidateError: ' + error.message);
    };

    v.onsuccess = function(db) {

        var insert = db.transaction(function(tx) {
            tx.insert(cuboids, {width:10.0, length:10.0, height:10.0});
            tx.insert(cuboids, {width:13.5, length:17.2, height:10.1});
            tx.insert(cuboids, {width:23.1, length:7.9, height:9.5});
        });

        insert.onerror = function(error) {
            alert('InsertTransactionError: ' + error.message);
        };

        insert.onsuccess = function() {
            var query = db.transaction(function(tx) {

                var average_volume = cuboids.attributes.length
                    .mul(cuboids.attributes.width)
                    .mul(cuboids.attributes.height)
                    .avg();

                var q = tx.query(cuboids.project({avg_vol: average_volume}));

                q.onsuccess = function(t, results) {
                    var s = '';
                    results.forEach(function(r) {
                        s += r.avg_vol + '\n';
                    });
                    alert(s);
                };

            });

            query.onerror = function(error) {
                alert('QueryTransactionError: ' + error.message);
            };
        };
    };

} catch (e) {
    alert(e.stack);
}


Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
It would make sense if you make setting a key to undefined semantically
equivalent to deleting the value (and no error if it does not exist), and
return undefined on a get when no such key exists. That way 'undefined'
cannot exist as a value in the object store, and is a safe marker for the
key not existing in that index.


Cheers,
Keean.
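[Editor's illustration] The proposal above can be sketched with a toy in-memory store (`Store` is illustrative, not the IndexedDB API): put with an undefined value behaves as a delete, so get returning undefined always and unambiguously means the key is absent.

```javascript
// Toy model of the proposed semantics: put(key, undefined) === delete.
function Store() { this.data = {}; }

Store.prototype.put = function(key, value) {
    if (value === undefined) {
        delete this.data[key];   // no error if the key does not exist
    } else {
        this.data[key] = value;
    }
};

Store.prototype.get = function(key) {
    return this.data[key];       // undefined <=> key not present
};

var store = new Store();
store.put('mykey', undefined);   // noop: nothing stored yet
store.get('mykey');              // undefined
store.put('mykey', {a: 1});
store.get('mykey');              // {a: 1}
store.put('mykey', undefined);   // deletes the entry
store.get('mykey');              // undefined again
```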


On 8 November 2010 17:52, Tab Atkins Jr. jackalm...@gmail.com wrote:

 On Mon, Nov 8, 2010 at 8:24 AM, Jonas Sicking jo...@sicking.cc wrote:
  Hi All,
 
  One of the things we discussed at TPAC was the fact that
  IDBObjectStore.get() and IDBObjectStore.delete() currently fire an
  error event if no record with the supplied key exists.
 
  Especially for .delete() this seems suboptimal as the author wanted
  the entry with the given key removed anyway. A better alternative here
  seems to be to return (through a success event) true or false to
  indicate if a record was actually removed.
 
  For IDBObjectStore.get() it also seems like it will create an error
  event in situations which aren't unexpected at all. For example
  checking for the existence of certain information, or getting
  information if it's there, but using some type of default if it's not.
  An obvious choice here is to simply return (through a success event)
  undefined if no entry is found. The downside with this is that you
  can't tell the lack of an entry apart from an entry stored with the
  value undefined.
 
  However it seemed more rare to want to tell those apart (you can
  generally store something other than undefined), than to end up in
  situations where you'd want to get() something which possibly didn't
  exist. Additionally, you can still use openCursor() to tell the two
  apart if really desired.
 
  I've for now checked in this change [1], but please speak up if you
  think this is a bad idea for whatever reason.

 In general I'd disagree with you on get(), and point to basically all
 hash-table implementations, which all give a way of telling whether you
 got a result or not, but the fact that javascript has false, null,
 *and* undefined makes me okay with this.  I believe it's sufficient to
 use 'undefined' as the flag for "there was nothing for this key in the
 objectstore", and just tell authors "don't put undefined in an
 objectstore; use false or null instead".

 ~TJ




Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
Hi,

In code, if:

idbObjectStoreSync.put(key, undefined)  does the same as
 idbObjectStoreSync.remove(key)

then

idbObjectStoreSync.get(key) can safely return undefined for "no such key
exists".


Consider:

idbObjectStoreSync.put('mykey', undefined); // deletes the object stored
under mykey or noop.
idbObjectStoreSync.get('mykey'); // returns 'undefined'
idbObjectStoreSync.put('mykey', myobject);
idbObjectStoreSync.get('mykey'); // returns 'myobject'
idbObjectStoreSync.put('mykey', undefined); // deletes the object stored
under mykey or noop.
idbObjectStoreSync.get('mykey'); // returns 'undefined'


Cheers,
Keean.


On 8 November 2010 18:27, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 8, 2010 at 10:06 AM, Keean Schupke ke...@fry-it.com wrote:
  It would make sense if you make setting a key to undefined semantically
  equivalent to deleting the value (and no error if it does not exist), and
  return undefined on a get when no such key exists. That way 'undefined'
  cannot exist as a value in the object store, and is a safe marker for the
  key not existing in that index.

 I'm not sure I follow. There is no way to set a key on an existing
 entry in an object store. The closest thing would be
 IDBCursor.update(), but it specifically disallow changing the key at
 all.

 / Jonas



Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
Obviously I need the key and value the correct way around for 'put'...

Cheers,
Keean.


On 8 November 2010 18:41, Keean Schupke ke...@fry-it.com wrote:

 Hi,

 In code, if:

 idbObjectStoreSync.put(key, undefined)  does the same as
  idbObjectStoreSync.remove(key)

 then

 idbObjectStoreSync.get(key) can safely return undefined for no such key
 exists.


 Consider:

 idbObjectStoreSync.put('mykey', undefined); // deletes the object stored
 under mykey or noop.
 idbObjectStoreSync.get('mykey'); // returns 'undefined'
 idbObjectStoreSync.put('mykey', myobject);
 idbObjectStoreSync.get('mykey'); // returns 'myobject'
 idbObjectStoreSync.put('mykey', undefined); // deletes the object stored
 under mykey or noop.
 idbObjectStoreSync.get('mykey'); // returns 'undefined'


 Cheers,
 Keean.


 On 8 November 2010 18:27, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 8, 2010 at 10:06 AM, Keean Schupke ke...@fry-it.com wrote:
  It would make sense if you make setting a key to undefined semantically
  equivalent to deleting the value (and no error if it does not exist),
 and
  return undefined on a get when no such key exists. That way 'undefined'
  cannot exist as a value in the object store, and is a safe marker for
 the
  key not existing in that index.

 I'm not sure I follow. There is no way to set a key on an existing
 entry in an object store. The closest thing would be
 IDBCursor.update(), but it specifically disallow changing the key at
 all.

 / Jonas





Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
Hi,


 Indeed. But I think this is more unexpected and confusing than having
 .get() return the same thing if the entry exists as if it contains
 undefined.

 / Jonas


I don't understand that.

With the proposal, undefined clearly means the entry does not exist, as
there is no way to put an undefined into the object store (since
.put(undefined, key) deletes the entry).


Cheers,
Keean.


Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
I was only suggesting this as it makes the operations symmetrical, in the
sense that if get returns undefined for "key does not exist", then
put(undefined, key) should mean "make this key not exist", in a declarative
sense.

For me this is clearer than the alternatives (which may require exceptions
to deal with some cases).

Of course it's only a suggestion, and if nobody likes it, feel free to
ignore it.


Cheers,
Keean.


On 8 November 2010 18:57, Keean Schupke ke...@fry-it.com wrote:

 Hi,


 Indeed. But I think this is more unexpected and confusing than having
 .get() return the same thing if the entry exists as if it contains
 undefined.

 / Jonas


 I don't understand that.

 with the proposal, undefined clearly means the entry does not exist as
 there is no way to put an undefined into the object store (as
 .put(undefined, key) deletes the entry).


 Cheers,
 Keean.




Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
Hi,

 I don't understand that.
  with the proposal, undefined clearly means the entry does not exist as
 there is no way to put an undefined into the object store (as
 .put(undefined, key) deletes the entry).

 The confusing part is that a function called 'put' actually deletes
 something, especially since we also have a 'delete' function.


Sure, you could get rid of the delete function :-) I think the meaning of
put(undefined, key) is pretty clear.



  I would put the question this way: What problem are you trying to
  solve? If the problem is that people can't store undefined and then
  tell undefined apart from "not there", then your proposal doesn't
  solve that problem, as undefined can't be stored at all.


Precisely, the solution I am proposing is based on disallowing storing of
'undefined'. What does it mean to store 'undefined' anyway? People can still
use null.

If you disallow storing 'undefined', put(undefined, key) would need to
throw an exception. I am proposing having put(undefined, key) be the same
as remove(key) to avoid having an exception. After all, the initial concern
was avoiding having to handle exceptions.


Cheers,
Keean


Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
What is the use case for storing undefined in an object-store?


Cheers,
Keean.

On 8 November 2010 20:59, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 8, 2010 at 12:02 PM, Keean Schupke ke...@fry-it.com wrote:
  Hi,
 
   I don't understand that.
   with the proposal, undefined clearly means the entry does not exist as
   there is no way to put an undefined into the object store (as
   .put(undefined, key) deletes the entry).
 
  The confusing part is that a function called 'put' actually deletes
  something, especially since we also have a 'delete' function.
 
  Sure, you could get rid of the delete function :-) I think the meaning of
  put(undefined, key) is pretty clear.

 I guess we'll have to agree to disagree on that one :)

 My concern with something like this is that we'll see code do stuff like:

 function myStoreFunction(objectStoreName, key, value) {
   var os = db.transaction([objectStoreName]).objectStore(objectStoreName);
   if (value === undefined) {
     os.put(null, key);
   } else {
     os.put(value, key);
   }
 }

 which does not seem like a net win for anyone.

  I would put the question this way: What problem are you trying to
  solve? If the problem is that people can't store undefined and then
  tell undefined apart from "not there", then your proposal doesn't
  solve that problem, as undefined can't be stored at all.
 
  Precisely, the solution I am proposing is based on disallowing storing of
  'undefined'. What does it mean to store 'undefined' anyway? People can
 still
  use null.

 Wait, your solution doesn't solve the above described problem. The
 described problem was

 People can't store undefined and then tell undefined apart from
 not there then your proposal doesn't solve that problem as
 undefined can't be stored at all.

 Your solution doesn't solve that problem.

 / Jonas



Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
Let me put it another way. Why do you want to allow putting 'undefined' into
the object store? All that does is make the API for get ambiguous. What does
it gain you? Why do you want to make 'get' ambiguous?

I think having an unambiguous API for 'get' is worth more than being able to
'put' 'undefined' values into the object store.


Cheers,
Keean.


On 8 November 2010 23:10, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 8, 2010 at 2:39 PM, Keean Schupke ke...@fry-it.com wrote:
  The problem I am trying to solve is not knowing if get(key) ===
 undefined
  means the key does not exist or there is a key with a value of undefined.
  The solution is to disallow inserting undefined. Now there is no
 ambiguity,
  if get(key) returns undefined, it _must_ be because the key does not
 exist.
  Does this make sense so far?

 But if saying you're not allowed to insert undefined as value is an
 acceptable solution, why isn't you can't tell them apart using get() an
 acceptable solution?

 What use case does the first solution cater to that isn't solved by
 the second solution?

 / Jonas



Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
If more than one developer is working on a project, there is no way I can
know whether the other developer has put 'undefined' values into the store
(unless the specification enforces it).

So every time I am checking if a key exists (maybe to delete the key) I need
to check if it _really_ exists, or else I can run into problems. For
example:

In module A:
put(undefined, key);

In module B:
if (get(key) !== undefined) {
   remove(key);
}

So the object store will fill up with key = undefined until we run out of
memory.
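[Editor's illustration] The hazard described above can be sketched with a toy store that *permits* undefined values (all names here are illustrative): module B's get-based existence check never fires, so module A's entry is never removed.

```javascript
// Toy model of a store that allows undefined as a stored value.
var data = {};

function put(value, key) { data[key] = value; }
function get(key) { return data[key]; }
function remove(key) { delete data[key]; }

// Module A:
put(undefined, 'job-1');

// Module B:
if (get('job-1') !== undefined) {
    remove('job-1');   // never runs: get() returned undefined
}

// The entry is still physically present in the store:
Object.prototype.hasOwnProperty.call(data, 'job-1');   // true
```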


Cheers,
Keean.


On 8 November 2010 23:24, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 8, 2010 at 3:18 PM, Keean Schupke ke...@fry-it.com wrote:
  Let me put it another way. Why do you want to allow putting 'undefined'
 into
  the object store? All that does is make the API for get ambiguous. What
 does
  it gain you? Why do you want to make 'get' ambiguous?

 It seems like a lose-lose situation to prevent it. Implementors will
 have to add code to check for 'undefined' all the time, and users of
 the API can't store 'undefined' if they would like to.

  I think having an unambiguous API for 'get' is worth more than being able
 to
  'put' 'undefined' values into the object store.

 Can you describe the application that would be easier to write,
 possible to write, faster to run or have cleaner code if we forbade
 putting 'undefined' in an object store?

 / Jonas



Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist

2010-11-08 Thread Keean Schupke
Sounds good to me...

Cheers,
Keean.


On 9 November 2010 00:16, Jonas Sicking jo...@sicking.cc wrote:

 On Mon, Nov 8, 2010 at 4:04 PM, Keean Schupke ke...@fry-it.com wrote:
  Hi,
 
 
  Why do you want to check that a key exists before you delete it? Why
  not just call delete(key) always and rest assured that it's gone?
 
  because it will throw an exception if the key does not exist...

 That is no longer the case, see the first email in this thread :)

  Similar to Kris, I think worrying about 'undefined' is worrying about
  an edge case. Simplicity is better than trying to cover every possible
  edge case.
 
  I thought edge cases are precisely what a specification is supposed to
 deal
  with.

 A spec can never cover 100% of all use cases. Often covering the last
 10-20% of the use cases adds as much complexity or API surface, if not
 more, as covering the first 80-90%. The trick really is to know when
 to stop.

  Anyway, although I don't agree with the other reasons, I find the array
  case compelling. So let's ignore the proposal to disallow storing
  undefined.
  Perhaps you could add a boolean method exists(key) to IDBObjectStore to
  make it easier to tell the two apart.

 Note that you can easily do this using openCursor already. In the
 synchronous API you could easily implement exists by doing:

 IDBObjectStoreSync.prototype.exists = function(key) {
  return this.openCursor(key) !== undefined;
 }

 I think we should keep exists() in mind for v2 of the interface. It
 has other benefits over get() and openCursor() in that if the stored
 value is very big it doesn't require time to deserialize it out of the
 database. But given how close we are to finishing v1 I'd rather not
 add it now. I have added it to my "stuff we should reexamine in v2"
 list though.

 / Jonas



Re: Replacing WebSQL with a Relational Data Model.

2010-10-27 Thread Keean Schupke
Hi Nathan,

On 27 October 2010 08:58, Nathan Kitchen w...@nathankitchen.com wrote:

 The most obvious problem was that it was tied so tightly to SQLite (which I
 think everyone would be amazed if MS started shipping with IE10). They'd
 want to use Access/SQL Compact, and suddenly we'd all have different SQL
 dialects to code our offline applications to.


I am sure you are aware, but the relational API I am proposing would not
have this problem. The relational algebra is defined independently of any
SQL implementation. In fact it's not even SQL. However, a relational
database (like SQLite, MySQL, Access/SQL Compact) would make the ideal
library to use in its implementation, because of the huge amount of work
done over many years by researchers and programmers to make a decent
relational database engine, work we do not want to have to replicate in
JavaScript on top of IndexedDB.


 Which is why I agree 100% with this statement:


 *The critical point here is that we need only one standardized interface,
 not a perfectly optimized for data-model-x one, not a uses
 query-language-foo one, just something that we can all use to persist data
 from javascript, and wrap in other APIs, that way any optimizations made
 will benefit everybody - regardless of their preferred interface, data model
  query style.*


And I totally agree with this statement, which is why I think it is critical
a _relationally_complete_ API is standardised (either in this, or a later
IndexedDB spec, or another spec entirely).


Cheers,
Keean.


Re: Replacing WebSQL with a Relational Data Model.

2010-10-27 Thread Keean Schupke
Sure, the argument has more weight with real numbers. I have started working
on the relational schema model in JavaScript.

Here is a question:

What is preferred in terms of style for declaring a relation. We can have
something like:

var FarmTable = {
id: {name: 'id', domain: FarmId, type: rdm.schema.serial},
name: {name: 'name', domain: FarmName},
county: {name: 'county', domain: FarmCounty},
owner: {name: 'owner', domain: FarmerId}
};

This is concise, but little checking is done, alternatively:

var FarmTable = new Relation(
new Attribute('id', FarmId, rdm.schema.serial),
new Attribute('name', FarmName),
new Attribute('county', FarmCounty),
new Attribute('owner', FarmerId)
);

Or perhaps something else?


Cheers,
Keean.
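[Editor's illustration] One possible middle ground between the two styles asked about above (purely illustrative; this `Relation` is a hypothetical constructor, unrelated to any existing library) is to keep the concise object-literal form but run it through a constructor that checks each attribute declaration, combining the brevity of the first option with the checking of the second:

```javascript
// Hypothetical validating constructor for a relation schema.
function Relation(name, attrs) {
    this.name = name;
    this.attributes = {};
    for (var key in attrs) {
        var a = attrs[key];
        // Reject declarations that are missing a name or a domain.
        if (!a || typeof a.name !== 'string' || a.domain === undefined) {
            throw new TypeError('bad attribute declaration: ' + key);
        }
        this.attributes[key] = a;
    }
}

// Hypothetical domains, standing in for FarmId etc. above.
var FarmId = {}, FarmName = {};

var FarmTable = new Relation('farms', {
    id:   {name: 'id',   domain: FarmId},
    name: {name: 'name', domain: FarmName}
});
```

An invalid declaration such as `{y: {}}` would then fail fast with a TypeError at schema-definition time rather than surfacing later at query time.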


On 27 October 2010 09:24, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Oct 27, 2010 at 1:04 AM, Keean Schupke ke...@fry-it.com wrote:
  So, my point was that although IndexedDB is neither optimal for your
  preferred data model or mine, it does cater for us both, and everybody
 else,
  allowing us to get on and do our jobs, implement APIs, and build HTML5
  client side web applications.
 
 
   This is where we differ, as although I think it may allow it, it will
   not make it practical (from the programmer's point of view) nor usable
   (from the end user trying to use the app).
  Remember we have to perform reasonably against native iPhone / Android
 apps
  or people will not use HTML5 apps.

 I'd encourage you to do some testing, run some performance numbers,
 and report back for cases where things are too slow.

  That good performance is required in order to consider a use case
  met is hopefully obvious to everyone here. The whole point of
  IndexedDB is good performance; other than performance it doesn't
  provide anything that localStorage doesn't.

 / Jonas



Re: Replacing WebSQL with a Relational Data Model.

2010-10-27 Thread Keean Schupke
On that point, it should be possible to build an efficient text search on
top of IndexedDB. You need a word index that links each word to multiple
documents. Matching documents are found by taking the intersection of the
sets of documents found for each word in the query (for an unstructured
query). As such you would put the documents in localStorage, and build a
word index in IndexedDB, where each record contains a list of document
references.
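[Editor's illustration] A minimal in-memory sketch of that scheme (illustrative only; a real implementation would keep the word index in IndexedDB records rather than a plain object): each word maps to a set of document ids, and a query intersects the sets for each of its words.

```javascript
// Inverted word index: word -> set of document ids.
var wordIndex = {};

function indexDocument(docId, text) {
    text.toLowerCase().split(/\W+/).forEach(function(word) {
        if (!word) return;
        (wordIndex[word] = wordIndex[word] || {})[docId] = true;
    });
}

function search(query) {
    var words = query.toLowerCase().split(/\W+/).filter(Boolean);
    if (words.length === 0) return [];
    // Start from the postings of the first word, intersect with the rest.
    var result = Object.keys(wordIndex[words[0]] || {});
    for (var i = 1; i < words.length; i++) {
        var postings = wordIndex[words[i]] || {};
        result = result.filter(function(id) { return postings[id]; });
    }
    return result;
}

indexDocument('doc1', 'the quick brown fox');
indexDocument('doc2', 'the lazy brown dog');
search('brown');       // ['doc1', 'doc2']
search('quick brown'); // ['doc1']
```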


Cheers,
Keean.


On 27 October 2010 09:43, Nathan Kitchen w...@nathankitchen.com wrote:

 Sorry Keean, the main point of my post was to introduce the [featurecreep
 /], not critique your suggestions. I don't honestly care about the
 implementation of persistent browser storage, but I do care that it's
 fully-featured. nat...@webr3.org noted that we just need something to
 persist data from javascript. Although I agree with this, I think we
 additionally need native full-text search *as well as* CRUD. The Gears
 implementation of FTS (or rather, SQLite) exposed useful functionality but
 that needs
 In hindsight it was a little off-topic, but I saw it fly past and thought
 that while we were discussing offline storage features it'd be a good point
 to raise FTS.

 I'm also not sure about persisted JSON structures vs relational objects,
 but happy to see how the current spec pans out. It certainly involves
 thinking about an application's data architecture in a different way though.

 On Wed, Oct 27, 2010 at 9:10 AM, Keean Schupke ke...@fry-it.com wrote:

 Hi Nathan,

 On 27 October 2010 08:58, Nathan Kitchen w...@nathankitchen.com wrote:

 The most obvious problem was that it was tied so tightly to SQLite (which
 I think everyone would be amazed if MS started shipping with IE10). They'd
 want to use Access/SQL Compact, and suddenly we'd all have different SQL
 dialects to code our offline applications to.





 Which is why I agree 100% with this statement:


 *The critical point here is that we need only one standardized
 interface, not a perfectly optimized for data-model-x one, not a uses
 query-language-foo one, just something that we can all use to persist data
 from javascript, and wrap in other APIs, that way any optimizations made
 will benefit everybody - regardless of their preferred interface, data 
 model
  query style.*


 And I totally agree with this statement, which is why I think it is
 critical a _relationally_complete_ API is standardised (either in this, or a
 later IndexedDB spec, or another spec entirely).


 Cheers,
 Keean.





