Re: Allow ... centralized dialog up front
Something is better than nothing, and both the iPhone and Android systems are better than not asking the user at all. The principle of defence in depth is that you don't rely on a single security feature that may be flawed, but take a multi-layered approach to security. I think that giving a user control of what information is released is a necessary part of that model, and I cannot see anything in the documents you link to that contradicts it. Are you suggesting that users should be denied control of their information? What about users in sensitive environments (politicians, security workers, etc.) who understand privacy: are we suggesting we deny them the ability to control access?

For me personally, being able to see all the information an application will access, and being able to return to this information and edit my preferences at any time, seems the best way to do this. Visibility is key: you must be able to see what permissions have been granted so they can be revoked, or simply to be reassured that certain permissions are denied. Others may have different ideas; however, nearly all online services are moving to this model, so it must offer some benefits. This model is also essential in any kind of enterprise setting where the IT department will want to audit and approve apps for use.

Cheers, Keean.

On 6 February 2013 11:03, Robin Berjon ro...@w3.org wrote:

On 06/02/2013 08:36, Keean Schupke wrote: I don't think you can say either an up-front dialog or popups do not work. There are clear examples of both working, Android and iPhone respectively. Each has a different set of trade-offs and is better in some circumstances, worse in others.

If by "working" you mean that it is technically feasible and will provide developers with access to features, then sure. If however you mean that it succeeds in protecting users against agreeing to escalate privileges to malicious applications then, no, it really, really does not work at all.
Security through user prompting is sweeping the problem under the rug. Usually this is the point at which someone will say "but we have to *educate* the users!". No. We don't. Users don't want to be educated, and they shouldn't have to be. We're producing technology for *user* agents. It is *our* responsibility to ensure that users remain safe, as much as possible even against their own mistakes.

And I'm sorry to go all Godwin on you, but the prompting approach is the Java applet security model all over again. Let's just not go back there, shall we? It's not as if this debate hasn't been had time and time again. See (old and unfinished):

http://darobin.github.com/api-design-privacy/api-design-privacy.html#privacy-enhancing-api-patterns

That includes a short discussion of why the Geolocation model is wrong. All of this has been extensively discussed in the DAP WG, as well as IIRC around the Web Notifications work. There have been a few attempts to work out the details (tl;dr: they don't fly):

http://w3c-test.org/dap/proposals/request-feature/
http://dev.w3.org/2009/dap/docs/feat-perms/feat-perms.html

That's one of the reasons we have a SysApps WG today. As it happens, they're working on a security model, too. This is not to say that declaring required privileges cannot be useful. There certainly are cases in which it can integrate into a larger system. But that larger system isn't upfront prompting.

-- Robin Berjon - http://berjon.com/ - @robinberjon
Re: Allow ... centralized dialog up front
I don't think you can say either an up-front dialog or popups do not work. There are clear examples of both working: Android and iPhone respectively. Each has a different set of trade-offs and is better in some circumstances, worse in others. In my opinion an API should allow for both, so that the user experience can be in line with platform standards. It is clear that iPhone users with Safari will expect one behaviour, and Android users with Chrome will expect another.

In order to allow an up-front dialog, permissions need to be declared up front, preferably statically in markup (as you would do in a manifest file for an Android application). Browsers would still be able to delay asking the user for permission until first use. The opposite, requiring browsers that wish to use an up-front dialog to do some kind of code scanning to detect calls to services with permissions, seems unreliable and complex. Static permission declarations in markup seem a much simpler and more lightweight solution, with other benefits such as searchability, and app stores being able to read and display them.

Cheers, Keean.

On 6 Feb 2013 05:44, Charles Pritchard ch...@jumis.com wrote:

This direction of placing permissions up there in the site info expansion in Chrome feels like the right direction. That spot where they show whether an SSL cert is valid/expired. Now I can easily see cookies and flip various settings in one click as I look at site info. I've been working on a web app where I don't need any upfront permissions, but the user can elect to elevate to clipboard, XSS and a high disk quota. I've certainly felt the cost of multiple dialogs vs a one-time grant-everything prompt.

On Feb 5, 2013, at 5:09 PM, Charles McCathie Nevile cha...@yandex-team.ru wrote:

TL;DR: Being able to declare the permissions that an app asks for might be useful.
User agents are and should continue to be free to innovate in the ways they present the requests to the user, because a block dialogue isn't a universal improvement on current practice (which in turn isn't the same everywhere).

On Mon, 04 Feb 2013 01:35:43 +0100, Florian Bösch pya...@gmail.com wrote: So how exactly do you imagine this going down when an application that uses half a dozen such capabilities starts? Clicking through half a dozen allow - allow - allow - allow - allow - allow, you really think the user's gonna bother what the fifth or sixth allow is about?

Where there are multiple permissions required, the way to ensure user attention isn't as simple as a list that doesn't get read, with a single button clicked by reflex, or multiple buttons to be clicked by reflex without reading. At least that seems to be what the research shows.

You'll end up annoying the user, the developer and scaring people off a page. Somehow I can't see that as the function of new capabilities you can offer on a page. Furthermore, some capabilities (like pointerlock) actively interfere with the idea that when you need it you can click it (such as the concept of pointer-lock-drag, which requests pointerlock on mousedown and releases it on mouseup), where your click-it-when-you-need-it idea will always fail the first usage.

This may be true. But pointer-lock is an example of something that needs the entire UX to be thought through. Simply switching from one to the other without the user knowing is also poor UX, since it risks making the user think their system is broken. Add to this a user working with e.g. mousekeys, or a magnifier at a few hundred percent plus high contrast. The problems are not simple, and it is unlikely the solutions will be either.
Ian's claim that everything can be done seamlessly without making it seem like a security dialog may be over-confident, and as Robin points out the first UI developed (well, the second actually) might not be the best approach in the long run, but it is certainly the direction we should be aiming for.

So where are we? The single up-front dialogue doesn't work. We know that. Multiple contextual requests go from being effective to being counter-productive at some magic tipping point that is hard to predict.

To take an example, let's say I have a chat application that can use web-cam and geolocation. Some user agents might decide to put the permissions up front when you first load the app. And some users will be fine with that. Some will be happy to let it use geolocation when it wants, but will want to turn the camera on and off explicitly (note that Skype, one of the best-known video chat apps there is, allows this as a matter of course; I don't know of anyone who has ever complained). Some app stores might refuse to offer the service unless you have already accepted that you will let any app from the store use geolocation and camera. Others will be quite happy with a user agent that (like Skype, or Opera) puts the permissions interface in front of the user to modify at will. And there are various
Re: Allow ... centralized dialog up front
I would like the permissions to be changeable. Not a one-time dialog that appears and irrevocably commits me to my choices, but a page with enable/disable toggles where I can return, review the permissions, and change them at any time.

How about, instead of a request API, the required permissions are in tags so they can be machine-readable on page load? The browser can read the required-permissions tags at page load and create a settings page for the app where each permission can be toggled. This has the advantage that search engines etc. can include permission requirements in searches. (I want a diary app that does not use my camera...)

Cheers, Keean.

On 2 Feb 2013 09:09, Florian Bösch pya...@gmail.com wrote: I do not particularly care what research you will find to support the UI-flow that the existence of a requestAPIs API will eventually give rise to. I do say simply this: the research presented, and pretty much common sense as well, easily shows that the current course is foolhardy and ungainly for both user and developer.

On Sat, Feb 2, 2013 at 3:37 AM, Charles McCathie Nevile cha...@yandex-team.ru wrote: On Fri, 01 Feb 2013 15:29:16 +0100, Florian Bösch pya...@gmail.com wrote: Repetitive permission dialog popups at random UI-flows will not solve the permission fatigue any more than a centralized one does. However a centralized permission dialog will solve the following things fairly effectively:

- repeated popup fatigue

Sure. And that is valuable in principle.

- extension of trust towards a site regardless of what they ask for (do I trust that indie game developer? Yes! Do I trust Google? No! or vice versa)

I don't think so. As Adrienne said, and as I have experienced myself, without understanding what the permission is for, trust can be reduced as easily as increased.
- make it easy for developers not to add UI-flows into their application leading to things the user didn't want to give (Do we want a menu entry "save to local storage" if the user checked off the box to allow local storage? I think not.)

- make it easy for developers not to waste users' time by pretending to have a working application which requires things the user didn't want to give. (Do we really want to offer our geolocated, web-camera-using chat app to users who didn't want to give permission to either? I think not. Do we want to make them find that out after they have been going through our UI-flow and pressing buttons for 5 minutes? I think not.)

These are not so clear. As a user, I *do* want to have applications to which I will give, and revoke, at my discretion, certain rights. Twitter leaps to mind as something that wants access to geolocation, something I occasionally grant for specific requests but blanket-refuse in general. The hypothetical example you offer is something that, in general, it seems people are happy to offer to a user who has turned off both capabilities.

I think the ability for a page to declare permission requests in a standard way, the same as applications and extensions, is worth pursuing, because there are now a number of vendors using stuff that seems to differ only by syntax. The user agent presentation is a more complex question. I believe there is more research done and being done than you seem to credit, and as Hallvord said, I think this is an area where users evolve too. For the reasons outlined already in the thread I don't think an Android-style "here are all the requests" is as good a solution in practice as it seems, and there is a need for continued research as well as implementations we can test.
cheers Chaals On Fri, Feb 1, 2013 at 3:22 PM, Charles McCathie Nevile cha...@yandex-team.ru wrote: On Fri, 01 Feb 2013 15:16:04 +0100, Florian Bösch pya...@gmail.com wrote: On Fri, Feb 1, 2013 at 3:02 PM, Adrienne Porter Felt adriennef...@gmail.com wrote: My user research on Android found that people have a hard time connecting upfront permission requests to the application feature that needs the permission. This meant that people have no real basis by which to allow or deny the request, except for their own supposition. IMO, this implies that the better plan is to temporally tie the permission request to the feature so that the user can connect the two. In some circumstances this works, in others, it does not. Consider that not every capability has a UI-flow, and that some UI flows are fairly obscure. More often than not a page will initiate a flurry of permission dialogs up front to get it out of the way. Some of the UI-flows to use a capability happen deep inside an application activity and can be severely distracting, or crippling to the application. If a developer wants to use the blow-by-blow popup dialogs, he can still do so by simply not calling an API to get done with the business up front. But for those who know their application will not work without features X, Y, Z, A, B and C there is
Re: Allow ... centralized dialog up front
There are benefits to the user, in that it allows all permissions to be managed from one place. I am not sure I like the idea of making the popups an application thing; I think it should be decided by the browser. In any case you would still need the ...Allow callbacks, as the user may have gone to the permission review/edit page and disabled some permissions since the app started.

Cheers, Keean.

On 2 Feb 2013 10:27, Florian Bösch pya...@gmail.com wrote: On Sat, Feb 2, 2013 at 11:16 AM, Keean Schupke ke...@fry-it.com wrote: I think a static declaration is better for security, so if a permission is not there I don't think it should be allowed to request it later. Of course, how this is presented to the user is entirely separate, and the UI could defer the request until the first time the restricted feature is used, or allow all permissions that might be needed to be reviewed and enabled/disabled in one place.

That kills any benefit a developer could derive. The very idea is that you can figure out up front what your user is gonna let you do, and take appropriate steps (not adding parts of the UI, presenting a suitable message that the application won't work, etc.), as well as that if a user has agreed up front, you can rely on that API and don't need to double-check at every step and add a gazillion pointless onFeatureYaddaYaddaAllowCallback handlers.
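The static-declaration idea discussed in this thread can be sketched concretely. The meta tag name and the space-separated value format below are my own assumptions for illustration; no such declaration exists in any spec:

```javascript
// Hypothetical markup a page might carry under this proposal:
//   <meta name="app-permissions" content="geolocation camera">
// A browser, app store, or search engine could parse the declaration
// at page load, without executing any script:
function parsePermissions(content) {
  // Split the space-separated permission list into an array of names.
  return content.trim().split(/\s+/).filter(Boolean);
}
```

A user agent could then build its settings page (or an up-front dialog) from the parsed list, and a crawler could index it for queries like "diary apps that do not use the camera".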
Re: Allow ... centralized dialog up front
I didn't think of that. The app would have to maintain its own set of permission flags, updated by the callback. I am not sure that's easier than just chaining an anonymous function... but I guess that's a programming style issue.

Cheers, Keean.

On 2 Feb 2013 10:47, Florian Bösch pya...@gmail.com wrote: And you can have *the* callback (singular, centralized) as onAPIPermissionChange just fine. If you want to improve things for the user and the developer, you can't go with a solution that doesn't make it any easier for the developer. Your solution will be ignored, nay ridiculed. If you want developers to play along, you've got to give them some carrot as well.

On Sat, Feb 2, 2013 at 11:43 AM, Keean Schupke ke...@fry-it.com wrote: There are benefits to the user, in that it allows all permissions to be managed from one place. I am not sure I like the idea of making the popups an application thing; I think it should be decided by the browser. In any case you would still need the ...Allow callbacks, as the user may have gone to the permission review/edit page and disabled some permissions since the app started.

Cheers, Keean.

On 2 Feb 2013 10:27, Florian Bösch pya...@gmail.com wrote: On Sat, Feb 2, 2013 at 11:16 AM, Keean Schupke ke...@fry-it.com wrote: I think a static declaration is better for security, so if a permission is not there I don't think it should be allowed to request it later. Of course, how this is presented to the user is entirely separate, and the UI could defer the request until the first time the restricted feature is used, or allow all permissions that might be needed to be reviewed and enabled/disabled in one place.

That kills any benefit a developer could derive. The very idea is that you can figure out up front what your user is gonna let you do, and take appropriate steps (not adding parts of the UI, presenting a suitable message that the application won't work etc.)
as well as that if a user has agreed up front, that you can rely on that API and don't need to double-check at every step and add a gazillion pointless onFeatureYaddaYaddaAllowCallback handlers.
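The single centralized callback proposed here can be sketched as follows. The callback name and the flags object are hypothetical, modelling the proposal rather than any shipped API:

```javascript
// The app keeps its own permission flags, updated by one centralized
// callback instead of a separate onFeatureXAllow handler per feature.
const permissions = { geolocation: false, camera: false };

function onAPIPermissionChange(name, granted) {
  // Hypothetical: the user agent would invoke this whenever the user
  // toggles a permission on the browser's review/edit page.
  permissions[name] = granted;
}

// e.g. the user enables geolocation from the browser's settings page:
onAPIPermissionChange('geolocation', true);
```

The app then consults `permissions` before using a feature, which is the "maintain its own set of permission flags" bookkeeping discussed above.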
Re: [IndexedDB] Closing on bug 9903 (collations)
On 1 June 2011 01:37, Pablo Castro pablo.cas...@microsoft.com wrote:

-----Original Message----- From: simetri...@gmail.com [mailto:simetri...@gmail.com] On Behalf Of Aryeh Gregor Sent: Tuesday, May 31, 2011 3:49 PM

On Tue, May 31, 2011 at 6:39 PM, Pablo Castro pablo.cas...@microsoft.com wrote: No, that was poor wording on my part; I keep using "locale" in the wrong context. I meant to have the API take a proper collation identifier. The identifier can be as specific as the caller wants it to be. The implementation could choose not to honor some specific detail if it can't handle it (to the extent that doing so is allowed by the specification of collation names), or fail because it considers that not handling a particular aspect of the collation identifier would severely deviate from the caller's expectations.

I'm not sure I understand you. My personal opinion is that there should be no undefined behavior here. If authors are allowed to pass collation identifiers, the spec needs to say exactly how they're to be interpreted, so the same identifier passed to two different browsers will result in the same collation, i.e., the same strings need to sort the same cross-browser. Having only binary collation is better than having non-binary collations but not defining them, IMO.

I thought BCP 47 allowed implementations to drop subtags if needed. I just re-read the spec and it seems that it only allows you to do that in constrained cases where you can't fit the whole name in your buffer (which wouldn't apply to the context discussed here). My first instinct is that this is quite a bit to guarantee (full consistency in collation), but it seems that that's what the spec is shooting for. Given the amount of debate on this, could we at least agree that we can do binary for v1? We can then have an open item for v2 on taking collation names and sorting according to UCA, or taking callbacks and such.

I'm okay with supporting only binary to start with.

Great.
I'll still wait a bit to see what other folks think, and then update the bug one way or the other. Thanks -pablo

The discussion sounds like it is headed in the right direction. Are there any issues with non-Unicode encodings that need to be dealt with (HTTP headers default to ISO-8859-1, I think)? Would people be expected to convert on read into UTF-16 strings, or use typed arrays?

Cheers, Keean.
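For readers unfamiliar with the distinction being debated, the difference between the binary collation proposed for v1 and a locale-aware collation is easy to show in plain JavaScript (a sketch for illustration, not IndexedDB's actual API):

```javascript
// Binary collation: compare UTF-16 code units, which is what
// Array.prototype.sort does by default for strings.
const words = ['b', 'ä', 'a'];
const binary = [...words].sort(); // 'ä' (U+00E4) sorts after 'b' (U+0062)

// A locale-aware collation for contrast; the exact result depends on
// the collation data (e.g. ICU) available to the engine:
const german = [...words].sort((x, y) => x.localeCompare(y, 'de'));
```

Binary order puts 'ä' last because its code unit is greater than that of 'b', while a German collation would place it next to 'a'. Only the former can be specified exactly and cheaply cross-browser, which is what makes it attractive for a first version.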
Re: [IndexedDB] Closing on bug 9903 (collations)
On 6 May 2011 03:00, Jonas Sicking jo...@sicking.cc wrote: On Wed, May 4, 2011 at 11:12 PM, Keean Schupke ke...@fry-it.com wrote: On 5 May 2011 00:33, Aryeh Gregor simetrical+...@gmail.com wrote: On Tue, May 3, 2011 at 7:57 PM, Jonas Sicking jo...@sicking.cc wrote: I don't think we should do callbacks for the first version of JavaScript. It gets very messy since we can't rely on the script function returning stable values.

The worst that would happen if it didn't return stable values is that sorting would return unpredictable results.

Worst is an infinite loop - no return.

So the choice here really is between only supporting some form of binary sorting, or supporting a built-in set of collations. Anything else will have to wait for version 2, in my opinion.

I think it would be a mistake to try supporting a limited set of natural-language collations. Binary collation is fine for a first version. MySQL only supported binary collation up through version 4, for instance.

A good point about MySQL.

On Wed, May 4, 2011 at 3:49 AM, Keean Schupke ke...@fry-it.com wrote: I thought only the app that created the db could open it (for security reasons)... so it becomes the app's responsibility to do version control. The comparison function is not going to change by itself - someone has to go into the code and change it, and when they do that they should up the revision of the database if that change is incompatible.

Why should we let such a pitfall exist if we can just store the function and avoid the issue?

I don't see it as a pitfall; it has the advantage of transparency. There is exactly the same problem with object properties. If the app changes to expect a new property on all objects stored, then the app has to correctly deal with the update. If a requested property doesn't exist, I assume the API will fail immediately with a clear error code. It will not fail silently and mysteriously with no error code.
(Again, I haven't looked at it closely, or tried to use it.)

What if the new version uses the same property name for a different thing? For example, in V1 'Employer' is a string name, and in V2 'Employer' is a reference to another object. You may say 'you should change the column name'? Right, that's just the same as me saying you should change the DB version number when you change the collation algorithm. It's the same thing. People seem to be making a big fuss about having a non-persisted collation function defined in user code, when many, many things require the code to have the correct model of the data stored in the database to work properly. It seems illogical to make a special case for this function, and not do anything about all the other cases. IMHO either the database should have a stored schema, or it should not. If IndexedDB is going in the direction of not having a stored schema, then the designers should have the confidence in their decision to stick with it and at least produce something with a consistent approach to the problem.

2) making things easy for the user - for me a simpler, more predictable API is better for the user. Having a function stored inside the database is bad, because you cannot see what function might be stored in there...

We could let you query the stored function.

Why would you need to read it? Every time you open the database you would need to check the function is the one you expect. The code would have to contain the function so it can compare it with the one in the DB and update it if necessary. If the code contains the function there are two copies of the function, one in the database and one in the code; which one is correct? Which one is it using? So sometimes you will write the new function to the database, and sometimes you will not? More paths to test in code coverage, more complexity. It's simpler to just always set the function when opening the database.
it might be a function from a previous version of the code and cause all sorts of strange bugs (which will only affect certain users with a certain version of the function stored in their DB).

It will cause *far* fewer strange bugs than if you have one index that used two different collations, which is the alternative possibility. If the function is stored, the worst case will be that the collation function is out of date. In practice, authors will mostly want to use established collation functions like UCA and won't mind if they're out of date. They'll also only very rarely have occasion to deliberately change the function.

As I said, you will end up querying the function to see if it is the one you want to use, and if you do that you may as well set it every time. Thinking about this a bit more: if you change the collation function you need to re-sort
Re: [IndexedDB] Closing on bug 9903 (collations)
On 6 May 2011 00:22, Aryeh Gregor simetrical+...@gmail.com wrote: On Thu, May 5, 2011 at 2:12 AM, Keean Schupke ke...@fry-it.com wrote: What if the new version uses the same property name for a different thing?

Yes, obviously it's going to be possible for code changes to cause hard-to-catch bugs due to not updating the database correctly. We don't have to add more cases where that's possible than necessary, without good reason. Maybe there's good reason here, but the added potential for error can't be neglected as a cost.

I have seen many bugs in real databases due to stored procedures.

Why would you need to read it? Every time you open the database you would need to check the function is the one you expect.

Not if you never intend to change it, or don't care if it's outdated. I expect this to be the most common case.

People don't change the language setting in an application?

Consider the case of someone using CLDR-tailored UCA, and a new version comes out. You want to use the newest version for new indexes, if multiple versions are available, but there's no pressing need to automatically update existing indexes. The old version is almost certainly good enough, unless your users use obscure languages. So in my scheme, you can just update the function in your code and do nothing else. In your scheme, you'd have to either stick to the old version across the board, or include both versions in your code indefinitely and include out-of-band logic to choose between them, or write a script that rebuilds the whole index on update (which would take a long time for a large index).

At least then the logic to choose between collations is visible in the code, rather than hidden. This is all about transparency and making sure the programmer has control of what is happening, rather than locking them into limiting patterns, and giving them the ability to see exactly what the code will do by reading and code-reviewing it.
With a stored procedure, what happens when a function you call (that is not stored) changes? The only way to be sure is to run a validation check on the index (run from beginning to end, checking the order is consistent with the comparison function). That is the same whether you use stored procedures or not.

The code would have to contain the function so it can compare it with the one in the DB and update it if necessary. If the code contains the function there are two copies of the function, one in the database and one in the code; which one is correct? Which one is it using? So sometimes you will write the new function to the database, and sometimes you will not? More paths to test in code coverage, more complexity. It's simpler to just always set the function when opening the database.

If the collation function is stored in the database, then I'd expect setting the function to rebuild the index if the new and old functions differ. This could happen as a background operation, with the existing index still usable (with the old collation function) in the meantime. So if you always wanted collations up to date, in my scheme authors could just set the function every time they open the database, as with your scheme. But this could trigger a silent rebuild whenever necessary, so the author doesn't have to worry about it. In your scheme, the author has to do the rebuild himself, and if he gets it wrong, the index will be corrupted.

So as I see it, my approach is easier to use across the board. It lets you not update collations on old tables without requiring you to keep track of multiple collation function versions, and it also potentially lets you update collations on old tables to the latest versions with rebuilding done for you in the background. Critically, it does not let you change a sort function without rebuilding, since that will always cause bugs and you never want to do it (to a first approximation).
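The stored-function scheme argued for here can be simulated in a few lines. This is a toy model of the proposal, not a real API: setting the collation triggers a rebuild only when the new function differs from the stored one, so "set it on every open" is cheap in the common case.

```javascript
// Toy model of a stored collation function. A real engine would
// persist the function and rebuild the index in the background.
function setCollation(db, fn) {
  const source = fn.toString();
  if (db.storedCollation !== source) {
    db.storedCollation = source;
    db.rebuilds += 1; // rebuild only when the function actually changed
  }
}

const db = { storedCollation: null, rebuilds: 0 };
const byCodeUnit = (a, b) => (a < b ? -1 : a > b ? 1 : 0);
setCollation(db, byCodeUnit);
setCollation(db, byCodeUnit); // same source: no second rebuild
```

Comparing function source text is itself a simplification (two textually different functions can be behaviourally identical), which is part of why the thread also discusses explicit versioning.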
Of course, maybe an initial implementation wouldn't do rebuilds for you, to keep it simple. Then the collation function would be immutable after index creation, so you'd still have to do rebuilds yourself. But it would still be easier and safer: the old index will still work in the interim even if you don't have the old version of your collation function around, and you can't mess up and get a corrupted index.

Thinking about this a bit more: if you change the collation function you need to re-sort the index to make sure it will work (and avoid those strange bugs). Storing the function in the DB enables you to compare the function and only change it when you need to, thus optimising the number of re-sorts. That is the _only_ advantage to storing the function, as you still need to check the function stored is the one you expect to guarantee your code will run properly. So with a non-persisted function we need to sort every time we open to make sure the order is correct
Re: [IndexedDB] Closing on bug 9903 (collations)
On 6 May 2011 10:18, Jonas Sicking jo...@sicking.cc wrote: On Thu, May 5, 2011 at 11:36 PM, Keean Schupke ke...@fry-it.com wrote: On 6 May 2011 03:00, Jonas Sicking jo...@sicking.cc wrote: On Wed, May 4, 2011 at 11:12 PM, Keean Schupke ke...@fry-it.com wrote: On 5 May 2011 00:33, Aryeh Gregor simetrical+...@gmail.com wrote: On Tue, May 3, 2011 at 7:57 PM, Jonas Sicking jo...@sicking.cc wrote: I don't think we should do callbacks for the first version of JavaScript. It gets very messy since we can't rely on the script function returning stable values.

The worst that would happen if it didn't return stable values is that sorting would return unpredictable results.

Worst is an infinite loop - no return.

So the choice here really is between only supporting some form of binary sorting, or supporting a built-in set of collations. Anything else will have to wait for version 2, in my opinion.

I think it would be a mistake to try supporting a limited set of natural-language collations. Binary collation is fine for a first version. MySQL only supported binary collation up through version 4, for instance.

A good point about MySQL.

On Wed, May 4, 2011 at 3:49 AM, Keean Schupke ke...@fry-it.com wrote: I thought only the app that created the db could open it (for security reasons)... so it becomes the app's responsibility to do version control. The comparison function is not going to change by itself - someone has to go into the code and change it, and when they do that they should up the revision of the database if that change is incompatible.

Why should we let such a pitfall exist if we can just store the function and avoid the issue?

I don't see it as a pitfall; it has the advantage of transparency. There is exactly the same problem with object properties. If the app changes to expect a new property on all objects stored, then the app has to correctly deal with the update.
If a requested property doesn't exist, I assume the API will fail immediately with a clear error code. It will not fail silently and mysteriously with no error code. (Again, I haven't looked at it closely, or tried to use it.)

What if the new version uses the same property name for a different thing? For example, in V1 'Employer' is a string name, and in V2 'Employer' is a reference to another object. You may say 'you should change the column name'? Right, that's just the same as me saying you should change the DB version number when you change the collation algorithm. It's the same thing. People seem to be making a big fuss about having a non-persisted collation function defined in user code, when many, many things require the code to have the correct model of the data stored in the database to work properly. It seems illogical to make a special case for this function, and not do anything about all the other cases. IMHO either the database should have a stored schema, or it should not. If IndexedDB is going in the direction of not having a stored schema, then the designers should have the confidence in their decision to stick with it and at least produce something with a consistent approach to the problem.

2) making things easy for the user - for me a simpler, more predictable API is better for the user. Having a function stored inside the database is bad, because you cannot see what function might be stored in there...

We could let you query the stored function.

Why would you need to read it? Every time you open the database you would need to check the function is the one you expect. The code would have to contain the function so it can compare it with the one in the DB and update it if necessary. If the code contains the function there are two copies of the function, one in the database and one in the code; which one is correct? Which one is it using? So sometimes you will write the new function to the database, and sometimes you will not?
More paths to test in code coverage, more complexity. It's simpler to just always set the function when opening the database. It might be a function from a previous version of the code and cause all sorts of strange bugs (which will only affect certain users with a certain version of the function stored in their DB). It will cause *far* fewer strange bugs than if you have one index that used two different collations, which is the alternative possibility. If the function is stored, the worst case will be that the collation function is out of date. In practice, authors will mostly want to use established collation functions like UCA and won't mind if they're out of date. They'll also only very rarely
Re: [IndexedDB] Closing on bug 9903 (collations)
On 3 May 2011 23:59, Aryeh Gregor simetrical+...@gmail.com wrote: On Tue, May 3, 2011 at 10:56 AM, Keean Schupke ke...@fry-it.com wrote: Why does it need to be persisted? I would prefer the database to be stateless. Obviously all users of the database need to use the same function. And if they don't use exactly the same function, maybe due to a transient bug, the index is silently and permanently corrupted, until all affected rows happen to be updated again? That doesn't sound like a good idea to me. I thought only the app that created the db could open it (for security reasons)... so it becomes the app's responsibility to do version control. The comparison function is not going to change by itself - someone has to go into the code and change it, and when they do that they should bump the revision of the database if that change is incompatible. There is exactly the same problem with object properties. If the app changes to expect a new property on all objects stored, then the app has to correctly deal with the update. There are two issues here: 1) doing things correctly - there is no problem here, providing the closure works. 2) making things easy for the user - for me a simpler, more predictable API is better for the user. Having a function stored inside the database is bad, because you cannot see what function might be stored in there... it might be a function from a previous version of the code and cause all sorts of strange bugs (which will only affect certain users with a certain version of the function stored in their DB). By having the sort function in plain sight in the source code it is visible and readable. Yes, there is a risk that the code is changed and the order method differs from that in the DB, which will cause breakage, but so can a stale function hidden in the database. Of the two I would always choose to have everything clearly visible in the source code where you can check it. Cheers, Keean.
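The version-control discipline described above can be sketched as follows. This is a hypothetical stand-in (plain objects and invented names, not the IndexedDB API): the application owns the comparison function, bumps a version number whenever that function changes incompatibly, and rebuilds the index on open if the persisted version is older.

```javascript
// Hypothetical sketch: the app, not the database, owns the collation function,
// and bumps CODE_COLLATION_VERSION whenever it changes incompatibly.
const CODE_COLLATION_VERSION = 2;

function compareV2(a, b) {
  // v2: case-insensitive compare (v1 was case-sensitive)
  const x = a.toLowerCase(), y = b.toLowerCase();
  return x < y ? -1 : x > y ? 1 : 0;
}

// Toy stand-in for a persisted store that records the version it was built with.
const persisted = { collationVersion: 1, keys: ["b", "A", "c"] };

function openStore(store) {
  if (store.collationVersion !== CODE_COLLATION_VERSION) {
    // Incompatible collation change: rebuild the index under the new ordering.
    store.keys.sort(compareV2);
    store.collationVersion = CODE_COLLATION_VERSION;
  }
  return store;
}

const s = openStore(persisted);
console.log(s.keys); // rebuilt under v2 ordering: ["A", "b", "c"]
```

The same pattern maps onto IndexedDB's version-change flow: an incompatible collation change is treated exactly like an incompatible schema change.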
Re: [IndexedDB] Closing on bug 9903 (collations)
On 4 May 2011 00:57, Jonas Sicking jo...@sicking.cc wrote: On Tue, May 3, 2011 at 12:19 AM, Keean Schupke ke...@fry-it.com wrote: The more I think about it, the more I want a user-specified comparison function. Efficiency should not be an issue here - the engines should tweak the JIT compiler to fix any efficiency issues. Just let the user pass a closure (remember functions are first-class in JavaScript so this is not a callback nor an event). I don't think we should do callbacks for the first version of JavaScript. It gets very messy since we can't rely on the script function returning stable values. Additionally we'd either have to ask that the callback function is re-registered each time the database is opened, or somehow store a serialized copy of the callback function in the browser so that it's available the next time the database is opened. Neither of these things has been done in other APIs in the past, so if we hold up v1 until we solve the challenges involved I think it will delay the release of a stable spec. So the choice here really is between only supporting some form of binary sorting, or supporting a built-in set of collations. Anything else will have to wait for version 2 in my opinion. / Jonas That's fine with me, providing the other issues around collation orders are solved. If something like the Unicode Collation Algorithm is used (and if not I would want to be convinced there is a good reason for doing something different from everyone else) there is the issue of what orderings are provided by everyone (maybe DUCET + current CLDR). Then there is how often the CLDR should be updated. Should there be a live fetch / version check every time the DB is started (seems like a sensible route to me, where possible), or should the CLDR version be specified by the standard and updated with each version of the standard? Cheers, Keean.
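For reference, the "browser-supplied collations" route eventually surfaced in ECMAScript as Intl.Collator (at the time of this thread it existed only as the LocaleInfo.Collator strawman mentioned later). A small sketch of the difference between binary (code-unit) sorting and UCA/CLDR-based sorting:

```javascript
// Intl.Collator exposes UCA/CLDR-based ordering without the page shipping
// its own collation tables. Requires an ICU-enabled runtime.
const collator = new Intl.Collator("en");

const words = ["apple", "Banana", "cherry"];
console.log([...words].sort());                 // code-unit order: ["Banana", "apple", "cherry"]
console.log([...words].sort(collator.compare)); // UCA order: ["apple", "Banana", "cherry"]
```

The CLDR-versioning concern raised above still applies: the ordering such a collator produces can change as the runtime's locale data is updated.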
Re: [IndexedDB] Closing on bug 9903 (collations)
On 4 May 2011 00:57, Jonas Sicking jo...@sicking.cc wrote: On Tue, May 3, 2011 at 12:19 AM, Keean Schupke ke...@fry-it.com wrote: The more I think about it, the more I want a user-specified comparison function. Efficiency should not be an issue here - the engines should tweak the JIT compiler to fix any efficiency issues. Just let the user pass a closure (remember functions are first-class in JavaScript so this is not a callback nor an event). I don't think we should do callbacks for the first version of JavaScript. It gets very messy since we can't rely on the script function returning stable values. Garbage in = garbage out. The programmer's job is to write a correct comparison function. All functions have this problem. By this argument we had all better give up programming because there is a risk we may write a function that returns incorrect results. Additionally we'd either have to ask that the callback function is re-registered each time the database is opened, or somehow store a I still think re-registering is a non-issue. It is trivial to declare a local open function openNameIndex that calls openIndex with the correct callback and provide that as a software module - either in the main code, or in a separate JS file that can be included in each page. Modular programming is a good thing, should be encouraged, and is the traditional software engineering solution to this kind of problem. serialized copy of the callback function in the browser so that it's available the next time the database is opened. Neither of these things has been done in other APIs in the past, so if we hold up v1 until we solve the challenges involved I think it will delay the release of a stable spec. So the choice here really is between only supporting some form of binary sorting, or supporting a built-in set of collations. Anything else will have to wait for version 2 in my opinion. / Jonas Cheers, Keean.
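The modular approach described here can be sketched as below; `openIndex` is a toy stand-in for a database call that accepts a comparison callback at open time, not a real IndexedDB function:

```javascript
// Stand-in for a hypothetical API that takes a comparison callback at open time.
function openIndex(name, compare) {
  return { name, compare, keys: [] };
}

// The shared module: the one place the collation callback is defined.
// Every page calls this, so no page can forget to register the comparator.
function openNameIndex() {
  return openIndex("names", (a, b) => a.localeCompare(b));
}

const idx = openNameIndex();
console.log(idx.compare("a", "b") < 0); // true
```

The point of the pattern is that re-registration happens in exactly one audited place rather than in every page.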
Re: [IndexedDB] Closing on bug 9903 (collations)
On 4 May 2011 21:01, Jonas Sicking jo...@sicking.cc wrote: On Wed, May 4, 2011 at 1:10 AM, Keean Schupke ke...@fry-it.com wrote: On 4 May 2011 00:57, Jonas Sicking jo...@sicking.cc wrote: On Tue, May 3, 2011 at 12:19 AM, Keean Schupke ke...@fry-it.com wrote: The more I think about it, the more I want a user-specified comparison function. Efficiency should not be an issue here - the engines should tweak the JIT compiler to fix any efficiency issues. Just let the user pass a closure (remember functions are first-class in JavaScript so this is not a callback nor an event). I don't think we should do callbacks for the first version of JavaScript. It gets very messy since we can't rely on the script function returning stable values. Garbage in = garbage out. The programmer's job is to write a correct comparison function. All functions have this problem. By this argument we had all better give up programming because there is a risk we may write a function that returns incorrect results. Browsers can certainly deal with this, and ensure that the only one suffering is the author of the buggy algorithm. However this comes at a cost in that the browser sorting algorithm can't go into infinite loops or crash even in the face of the most ridiculous comparison algorithm. In other words, the browser will likely have to use a slower sorting implementation in order to be robust. Additionally, there is a significant cost involved in transitioning between the C++ code implementing the sorting algorithm and the JavaScript-implemented callback. That is on top of the cost of implementing the comparison function in JavaScript. Even in the best JITs, there is a significant overhead to both these parts. So rather than repeating myself, I'll just quote myself: So the choice here really is between only supporting some form of binary sorting, or supporting a built-in set of collations. Anything else will have to wait for version 2 in my opinion. 
:) / Jonas I gave my answer, and some follow-up questions, in a previous email, so I am not avoiding the question. My point was that any event handler (onMouseDown?) could have an infinite loop - why so fussy about this one function when so many others have the same problem? The performance point of calling into JavaScript is a valid one, but is this a problem? Perhaps it is fast enough. I have seen no evidence that it will be too slow for people to use - perhaps the bottleneck will be the disk/flash access speed for fetching the blocks and not the JavaScript comparison function. Cheers, Keean.
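On the robustness point: JavaScript engines already face this problem with Array.prototype.sort, which must terminate even when handed an inconsistent comparator. A minimal illustration (the resulting order is implementation-defined, but the array stays a permutation of its elements):

```javascript
// An inconsistent comparator: it always claims "a > b", violating the
// consistency the sort algorithm would like to assume.
const bad = () => 1;

const arr = [3, 1, 2];
arr.sort(bad); // must terminate; element order is implementation-defined

console.log(arr.length);                     // still 3: no elements lost or invented
console.log([...arr].sort((a, b) => a - b)); // still [1, 2, 3]: a permutation
```

This is exactly the cost Jonas describes: the engine's sort must be defensive against such comparators rather than assuming a well-behaved ordering.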
Re: [IndexedDB] Closing on bug 9903 (collations)
The more I think about it, the more I want a user-specified comparison function. Efficiency should not be an issue here - the engines should tweak the JIT compiler to fix any efficiency issues. Just let the user pass a closure (remember functions are first-class in JavaScript so this is not a callback nor an event). Keean. On 2 May 2011 19:57, Aryeh Gregor simetrical+...@gmail.com wrote: On Fri, Apr 29, 2011 at 3:19 PM, Keean Schupke ke...@fry-it.com wrote: As long as we have a binary mode I am happy. Something I didn't think to mention: what exactly is binary mode for DOMStrings? I guess it means you encode as big-endian UTF-16, then sort bytewise? This is kind of evil, but it matches what sort() does, so I guess it should be the required behavior. (It's kind of evil because it doesn't match code-point order, unlike if you encoded as UTF-8. E.g., U+10000 is encoded as 0xd800dc00 and U+E000 is 0xe000, so U+E000 sorts after U+10000.) Perhaps this should be spelled out more clearly in the spec.
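Aryeh's surrogate-pair example can be checked directly in JavaScript, since default string comparison is by UTF-16 code unit:

```javascript
// U+10000 is stored as the surrogate pair 0xD800 0xDC00, while U+E000 is a
// single code unit 0xE000. Code-unit comparison therefore disagrees with
// code-point order for these two strings.
const supplementary = "\u{10000}"; // two code units: 0xD800, 0xDC00
const privateUse = "\uE000";       // one code unit: 0xE000

console.log(supplementary < privateUse);  // true: 0xD800 < 0xE000 (code-unit order)
console.log(supplementary.codePointAt(0) > privateUse.codePointAt(0)); // true: 0x10000 > 0xE000
```

So under UTF-16 bytewise sorting, U+E000 lands after U+10000 even though its code point is smaller, which is the mismatch being called evil above.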
Re: [IndexedDB] Closing on bug 9903 (collations)
Why does it need to be persisted? I would prefer the database to be stateless. Obviously all users of the database need to use the same function. I would recommend modular programming - create a .js script you can include in all pages that provides 'collated' versions of the method calls by adding the collation argument. In fact, for good programming in general, make this API your model: if you were writing a shopping cart, this '.js' would provide methods like 'addToCart' and 'removeFromCart', and all collation settings would be hidden in this layer and kept out of individual pages, whilst not needing to be stored in the database at all. Cheers, Keean. On 3 May 2011 15:27, Aryeh Gregor simetrical+...@gmail.com wrote: On Tue, May 3, 2011 at 3:19 AM, Keean Schupke ke...@fry-it.com wrote: The more I think about it, the more I want a user-specified comparison function. Efficiency should not be an issue here - the engines should tweak the JIT compiler to fix any efficiency issues. Just let the user pass a closure (remember functions are first-class in JavaScript so this is not a callback nor an event). Wouldn't it be a bit more complicated than just passing a regular closure? The function has to be persisted in the database across page views, but a JavaScript closure is going to contain references to all sorts of objects (like document, or local variables) that are very specific to the current page view. It makes no sense to persist those objects in general. You'd need to serialize the function somehow, possibly putting restrictions on the sorts of variables it can access, so that it can be sensibly restored later. Is there some established way of doing this yet in JavaScript? It might be useful in other contexts too. I still agree that this is the correct direction to go in, though.
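A minimal sketch of the wrapper module suggested above, with an in-memory array standing in for the database and `localeCompare` standing in for whatever collation the layer chooses; all names are illustrative:

```javascript
// The shared layer: collation choices live here, never in page code.
const COLLATION = (a, b) => a.name.localeCompare(b.name);

const cart = [];

function addToCart(item) {
  cart.push(item);
  cart.sort(COLLATION); // the collation is applied in this one place
}

function removeFromCart(name) {
  const i = cart.findIndex((item) => item.name === name);
  if (i >= 0) cart.splice(i, 1);
}

addToCart({ name: "pear" });
addToCart({ name: "apple" });
console.log(cart.map((i) => i.name)); // ["apple", "pear"]
```

Pages call only the domain methods, so changing the collation is a one-file change and nothing needs to be persisted in the database.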
Re: [IndexedDB] Closing on bug 9903 (collations)
On Sunday, 1 May 2011, Aryeh Gregor simetrical+...@gmail.com wrote: On Fri, Apr 29, 2011 at 3:32 PM, Jonas Sicking jo...@sicking.cc wrote: I agree that we will eventually want to standardize the set of allowed collations. Similarly to how we'll want to standardize on one set of charset encodings supported. However I don't think we, in this spec community, have enough experience to come up with a good such set. So it's something that I think we should postpone for now. As I understand it there is work going on in this area in other groups, so hopefully we can lean on that work eventually. (Disclaimer: I never really tried to figure out how IndexedDB works, and I haven't seen the past discussion on this topic. However, I know a decent amount about database collations in practice from my work with MediaWiki, which included adding collation support to category pages last summer on a contract with Wikimedia. Maybe everything I'm saying has already been brought up before and/or everyone knows it and/or it's wrong, in which case I apologize in advance.) The Unicode Collation Algorithm is the standard here: http://www.unicode.org/reports/tr10/ It's pretty stable (I think), and out of the box it provides *vastly* better sorting than binary sort. Binary sort doesn't even work for English unless you normalize case and avoid punctuation marks, and it's basically useless for most non-English languages. Some type of UCA support in browsers would be the way to go here. UCA doesn't work perfectly for all locales, though, because different locales sort the same strings differently (French handling of accents, etc.). The standard database of locale-specific collations is CLDR: http://cldr.unicode.org/ CLDR tends to have several new releases per year. For instance, 1.9.1 was released this March, three versions were released last year, and five were released in 2009. Just looking at the release notes, it seems that most if not all of these releases update collation details. 
Because of how collations are actually used in databases, any change to the collation version will require rebuilding any index that uses that collation. I don't think it's a good idea for browsers to try packaging such rapidly-changing locale data. If everyone had Chrome's release and support schedule, it might work okay -- if you figured out a way to handle updates gracefully -- but in practice, authors deal with a wide range of browser ages. It's not good if every user has a different implementation of each collation. Nor if browsers just use a frozen and obsolescent collation version. I also don't know how realistic implementers would find it to ship collation support for every language CLDR supports -- the CLDR download is a few megabytes zipped, but I don't know how much of that browsers would need to ship to support all its tailorings. The general solution here would be to allow the creation of indexes based on a user-supplied function. I.e., the user-supplied function would (in SQL terms) take the row's data as input, and output some binary string. That string would be used as the key in the index, instead of any of the column values for the row. PostgreSQL allows this, or so I've heard. Then you could implement UCA (optionally with CLDR tailorings) or any other collation algorithm you liked in JavaScript. Of course, we can't expect authors to reimplement the UCA if they want to get decent sorting. It would make sense for browsers to expose some default sort functions, but I'm not familiar enough with UCA or CLDR to say which ones would be best in practice. It might make sense to expose some medium-level primitives that would allow authors to easily overlay tailoring on the basic UCA algorithm, or something. Or maybe it would really make sense to expose all of CLDR's tailored collations. I'm not familiar enough with the specs to say. But for the sake of flexibility, allowing indexes based on user-defined functions is the way to go. 
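A sketch of the sortkey-function idea, with a sorted array standing in for the index (a real engine would use a btree) and a toy normalisation standing in for a UCA sortkey; everything here is illustrative, not a real API:

```javascript
// User-supplied function mapping a row to the binary string used as its
// index key. A toy "collation": lowercase and strip punctuation, standing
// in for a real UCA sortkey generator.
function sortKey(row) {
  return row.title.toLowerCase().replace(/[^a-z0-9 ]/g, "");
}

// The index stores (key, row) pairs ordered by key, never by the raw column.
const index = [];
function insert(row) {
  index.push({ key: sortKey(row), row });
  index.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

insert({ title: "Zebra" });
insert({ title: "'aardvark'" });
console.log(index.map((e) => e.row.title)); // ["'aardvark'", "Zebra"]
```

Note that binary comparison of the keys is all the engine needs; all collation intelligence lives in the user's key function, which is exactly the flexibility argued for above.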
(They're useful for things other than collations, too.) The proposed ECMAScript LocaleInfo.Collator looks like it doesn't currently support this use-case, since it provides only sort functions and not sortkey generation functions: http://wiki.ecmascript.org/doku.php?id=strawman:i18n_api If browsers do provide sortkey generation functions based on UCA, some versioning mechanism will need to be used, particularly if it supports tailored sortkeys. FWIW, MySQL provides some built-in collation support, but MediaWiki doesn't use it, because it supports too few languages and is too inflexible. MediaWiki's stock localization has 99% support for the 500 most-used messages in 175 different languages, and the couple dozen locales that MySQL supports aren't acceptable for us. Instead, we store everything with a binary collation, and are moving to a system where we compute the UCA sortkeys ourselves and put them in
Re: [IndexedDB] Closing on bug 9903 (collations)
On Friday, 29 April 2011, Jonas Sicking jo...@sicking.cc wrote: On Fri, Apr 29, 2011 at 11:16 AM, Pablo Castro pablo.cas...@microsoft.com wrote: We've had quite a bit of debate on this but I don't think we've reached closure. At this point I would be fine with either one of a) postpone to v2 and agree that for now we'll just do binary collation everywhere or b) the last form of the proposal sent around: extra collation argument (following BCP47 plus whatever the UA wants to allow) in createObjectStore/createIndex, plus a collation property to interrogate it; no way to change the collation of a store/index once created. Given that this turned out to be a more elaborate topic than I had originally expected and that it doesn't seem to have a lot of traction right now, my preference would be to postpone to v2. Thoughts? Once we make a call I'll make sure the spec reflects it. I'd be fine with postponing it. However I don't think that the counter proposals that we've received will work, so I don't think that there is a reason to postpone. / Jonas As long as we have a binary mode I am happy. If it is to support other collations, then all browsers must support the same set of options. The question then becomes what set of collation modes to standardise on? Allowing non standard collations will result in apps that will only run correctly on one browser, and that does not seem a good idea to me. Cheers, Keean.
Re: [IndexedDB] Closing on bug 9903 (collations)
There is always something like UCA: http://www.unicode.org/reports/tr10/ which looks interesting. Cheers, Keean. On 29 April 2011 20:32, Jonas Sicking jo...@sicking.cc wrote: On Fri, Apr 29, 2011 at 12:19 PM, Keean Schupke ke...@fry-it.com wrote: On Friday, 29 April 2011, Jonas Sicking jo...@sicking.cc wrote: On Fri, Apr 29, 2011 at 11:16 AM, Pablo Castro pablo.cas...@microsoft.com wrote: We've had quite a bit of debate on this but I don't think we've reached closure. At this point I would be fine with either one of a) postpone to v2 and agree that for now we'll just do binary collation everywhere or b) the last form of the proposal sent around: extra collation argument (following BCP47 plus whatever the UA wants to allow) in createObjectStore/createIndex, plus a collation property to interrogate it; no way to change the collation of a store/index once created. Given that this turned out to be a more elaborate topic than I had originally expected and that it doesn't seem to have a lot of traction right now, my preference would be to postpone to v2. Thoughts? Once we make a call I'll make sure the spec reflects it. I'd be fine with postponing it. However I don't think that the counter proposals that we've received will work, so I don't think that there is a reason to postpone. / Jonas As long as we have a binary mode I am happy. If it is to support other collations, then all browsers must support the same set of options. The question then becomes what set of collation modes to standardise on? Allowing non standard collations will result in apps that will only run correctly on one browser, and that does not seem a good idea to me. I agree that we will eventually want to standardize the set of allowed collations. Similarly to how we'll want to standardize on one set of charset encodings supported. However I don't think we, in this spec community, have enough experience to come up with a good such set. So it's something that I think we should postpone for now. 
As I understand it there is work going on in this area in other groups, so hopefully we can lean on that work eventually. Of course, we still do need to have a standardized vocabulary for the collations though. / Jonas
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
This is ignoring the possibility that something like RelationalDB could be used, where a well-defined common subset of SQL can be used (and I use well-defined in the formal sense). This would allow a relatively thin wrapper on top of most SQL implementations and would allow SQLite (or BDB) to be used as the backend. As a seasoned C++ programmer, I could even write a Firefox plugin using XPCOM as a reference implementation using the same API as the JavaScript RelationalDB implementation on my GitHub. Although I am not keen on putting in the time to do this if nobody is interested. To me it seems this thread is going in circles. RelationalDB does not have the standardisation problem that WebSQL has, but is still a relatively thin API layer that can be implemented over the top of a fast and well tested SQL implementation. It is based on sound theory and research defining the abstraction layer, and has a relationally complete API, so there should be no need to change the core API in the development of a standard. Cheers, Keean. On 4 April 2011 14:39, Jonas Sicking jo...@sicking.cc wrote: On Saturday, April 2, 2011, Joran Greef jo...@ronomon.com wrote: I am incredibly uncomfortable with the idea of putting the responsibility of the health of the web in the hands of one project. In fact, one of the main reasons I started working at Mozilla was to prevent this. / Jonas I agree with you. All the more reason to support both WebSQL and IndexedDB. It is not a case of either/or. It would be healthy to have competing APIs. Competition might be a great thing. But it doesn't address the issue in the least. It would still be the case that some developers would choose to use WebSQL, and browser makers would still have to support it, including support for the SQL dialect it uses. Hence it would still be the case that we would be relying on the SQLite developers to maintain a stable SQL interpretation to keep a healthy and functional web. / Jonas
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
Yes, it already has well-defined set operations. Solid is a matter of testing by enough people (and if you wanted to try it and feed back that would be a start). Fast should not be a problem, as the SQL database does all the heavy lifting. In more detail, Codd's six primitive operators are project, restrict, cross-product, union and difference. Relations are an extension of sets, so intersection and difference on compatible relations behave like they would on sets. RelationalDB already implements the following 5 methods, making it relationally complete - meaning it can do anything you could possibly want to do with relations using combinations of these 5 methods:

Relation.prototype.project = function(attributes) {}; // this implements rename as well
Relation.prototype.restrict = function(exp) {};
Relation.prototype.join = function(relation, exp) {};
Relation.prototype.union = function() {};
Relation.prototype.difference = function() {};

Of course some things can be made easier, so the following methods, although they can be defined in terms of the above 5, will be provided (in future implementations) to keep user code concise and implementations thin and fast:

// derived methods
Relation.prototype.intersection = function() {};
Relation.prototype.thetajoin = function() {};
Relation.prototype.semijoin = function() {};
Relation.prototype.antijoin = function() {};
Relation.prototype.divide = function() {};
Relation.prototype.leftjoin = function() {};
Relation.prototype.rightjoin = function() {};
Relation.prototype.fulljoin = function() {};

We also hope to provide the lattice operators meet and join: http://en.wikipedia.org/wiki/Lattice_(order) Just these two operators can replace all 5 of Codd's primitives (including all set operations). With just these two you can do anything that you can with _all_ the above. 
Meet is actually the same as Codd's natural-join (unfortunately terminology in different mathematical fields is not consistent here) and Join is a generalised union operator. See: http://www.arxiv.com/pdf/cs/0501053v2 To see how Meet and Join can be used to construct each of the above operators. Cheers, Keean. On 4 April 2011 15:36, Joran Greef jo...@ronomon.com wrote: On 04 Apr 2011, at 5:26 PM, Keean Schupke wrote: This is ignoring the possibility that something like RelationalDB could be used, where a well defined common subset of SQL can be used (and I use well-defined in the formal sense). This would allow a relatively thin wrapper on top of most SQL implementations and would allow SQLite (or BDB) to be used as the backend. Yes, if an implementation of RelationalDB arrives which is solid and fast with support for set operations that would be great. The important thing is that we have two competing APIs (and preferably a strong API with a great track record).
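The claim that the derived operators are definable from the primitives is easy to check on a set-like sketch; here intersection is built from difference alone (A ∩ B = A − (A − B)), using plain arrays rather than the RelationalDB API:

```javascript
// Set difference on arrays of comparable values: a minimal stand-in for the
// relational difference primitive.
const difference = (a, b) => a.filter((x) => !b.includes(x));

// The derived operator: intersection via two applications of difference.
const intersection = (a, b) => difference(a, difference(a, b));

console.log(intersection([1, 2, 3], [2, 3, 4])); // [2, 3]
```

The other derived operators (semijoin, antijoin, and so on) fall out of the primitives in the same way, which is why providing them directly is purely a convenience and performance matter.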
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
On 4 April 2011 15:55, Keean Schupke ke...@fry-it.com wrote: Yes, it already has well-defined set operations. Solid is a matter of testing by enough people (and if you wanted to try it and feed back that would be a start). Fast should not be a problem, as the SQL database does all the heavy lifting. In more detail, Codd's six primitive operators are project, restrict, cross-product, union and difference. Relations are an extension of sets, so intersection and difference on compatible relations behave like they would on sets. I missed 'rename' from my list of Codd's operators. Our 'project' function provides both project and rename, so I overlooked it. ... Cheers, Keean.
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
Some thoughts: On 4 April 2011 16:10, Mikeal Rogers mikeal.rog...@gmail.com wrote: i've mostly stayed out of this thread because i felt like i'd just be fanning the flames but i really can't stay out anymore. databases are more than SQL, always have been. SQL is a DSL for relational database access. all implementations of SQL have a similar set of tools they implement first and layer SQL on top of. those tools tend to be a storage engine, btree, and some kind of transactional model between them. under the ugly covers, most databases look like berkeleydb and the layer you live in is just sugar on top. SQL is a standard language (or API) for talking to databases. Why should a developer need to learn a different API for each database? The W3C is about standardising APIs. SQL is just an API standardised as a DSL. It is good for all the reasons any standard is good. Add to that the sound mathematical theory of relational algebra, and it has a lot going for it. Although, like any standard, it has its problems, most of those seem to be where it has deviated away from the pure relational algebra. creating an in-browser specification/implementation on top of a given relational/SQL story is a terrible idea. it's unnecessarily limiting to a higher level api and can't be easily extended the way a simple set of tools like IndexedDB can. It's not limiting; it provides a more powerful (higher level) interface that allows developers to concentrate on what to do with the data, not how to do it. suggesting that other databases be implemented on top of SQL rather than on top of the tools in which SQL is built is just backwards to anyone who's built a database. RelationalDB is not a database; it's a relational data model. it's not very hard to write the abstraction you're talking about on top of IndexedDB, and until you do it i'm going to have a hard time taking you seriously because it's clearly doable. Surely it's the API that is important, not how it is implemented? 
You can try the API now, implemented on top of WebSQL. The API will stay the same no matter what underlying technology is used. i implemented a CouchDB-compatible datastore on top of IndexedDB, it took me less than a week at a time when there was only one implementation that was still changing and still had bugs. it would be much easier now. https://github.com/mikeal/idbcouch it needs to be updated to use the latest version of the spec which is a day of work i just haven't gotten to yet. I am not overly impressed by CouchDB. the constructs in IndexedDB are pretty low level but sufficient if you know how to implement databases. performance is definitely an issue, but making these constructs faster would be much easier than trying to tweak an off the shelf SQL implementation to your use case. I look at the amount of man-hours that have gone into developing SQLite and BDB and I think: hey, if it's so easy to write a high performance database, those guys must have been wasting a lot of time? Cheers, Keean.
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
On 4 April 2011 16:04, Tab Atkins Jr. jackalm...@gmail.com wrote: On Mon, Apr 4, 2011 at 8:07 AM, Joran Greef jo...@ronomon.com wrote: On 04 Apr 2011, at 4:39 PM, Jonas Sicking wrote: Hence it would still be the case that we would be relying on the SQLite developers to maintain a stable SQL interpretation... SQLite has a fantastic track record of maintaining backwards compatibility. IndexedDB has as yet no track record, no consistent implementations, no widespread deployment, It's new. only measurably poor performance Ironically, the poor performance is because it's using sqlite as a backing-store in the current implementation. That's being fixed by replacing sqlite. and a lukewarm indexing and querying API. Kinda the point, in that the power/complexity of SQL confuses a huge number of developers, who end up coding something which doesn't actually use the relational model in any significant way, but still pay the cost of it in syntax. (I found normalization forms and similar things completely trivial when I was learning SQL, but for some reason almost every codebase I've looked at has a horribly-structured db. As far as I can tell, the majority of developers just hack SQL into being a linear object store and do the rest in their application code. We can reduce the friction here by actually giving them a linear object store, which is what IndexedDB is.) ~TJ SQLite has seen really good use in the mobile app community on both iPhone and Android. I would have thought that if we wanted the same kind of thriving app developer community around HTML5 web-apps, taking a few leaves from the mobile developers' book would not be a bad idea? IMHO it's those kinds of developers HTML5 should be trying to attract, in addition to the current web developers. Cheers, Keean.
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
I would point out that RelationalDB is relationally complete and the api does not depend on the sqlite spec at all. Cheers Keean On Apr 1, 2011 8:58 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Apr 1, 2011 at 12:39 PM, Glenn Maynard gl...@zewt.org wrote: Lastly, some vendors have expressed unwillingness to embed SQLite for legal reasons. Embedding other peoples code definitely exposes you to risk of copyright and patent lawsuits. While I can't say that I fully agree with this reasoning, I'm also not the one that would be on the receiving end of a lawsuit. Nor am I a lawyer and so ultimately will have to defer to people that know better. In the end it doesn't really matter as if a browser won't embed SQLite then it doesn't matter why, the end result is that the same SQL dialect won't be available cross browser which is bad for the web. If SQLite was to be used as a web standard, I'd hope that it wouldn't show up in a spec as simply do what SQLite does, but as a complete spec of SQLite's behavior. Browser vendors could then, if their lawyers insisted, implement their own compatible implementation, just as they do with other web APIs. I'd expect large portions of SQLite's test suite to be adaptable as a major starting point for spec tests, too. Have you read the WebSQL spec? Creating such a spec would be a formidable task, of course. Indeed. One that the SQL community has failed in doing so far. And they have a lot more experience with SQL than we do. / Jonas
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
On 4 April 2011 22:55, Aryeh Gregor simetrical+...@gmail.com wrote: On Fri, Apr 1, 2011 at 2:39 PM, Jonas Sicking jo...@sicking.cc wrote: There are several reasons why we don't want to rely exclusively on SQLite, other than solely W3C formalia. First of all, what should we do once the SQLite team releases a new version which has some modifications in its SQL dialect? We generally always need to embed the latest version of the library since it contains critical bug fixes, however SQLite makes no guarantees that it will always support exactly the same SQL statements. . . . These are good reasons, and I have no problem with them. SQLite is designed with very different compatibility and security needs than the web platform has, and its performance goals might be different in some respects as well. There are various ways that you could address this short of making up something completely different, but I'm not sure whether it would be a good idea. Anyway, I didn't intend to reignite this whole discussion. The decision has been made, now we get to see what comes of it. On Mon, Apr 4, 2011 at 11:07 AM, Joran Greef jo...@ronomon.com wrote: SQLite has a fantastic track record of maintaining backwards compatibility. SQLite does not face anything close to the compatibility requirements that web browsers face. There are perhaps billions of independent web pages, which don't have any control over what browser versions they're being run in. These pages are expected to work in all browsers even if they were written ten years ago and no one has looked at them since, and even if they were written incompetently. Just because something has an excellent compatibility track record by the standards of application libraries doesn't mean it's compatible enough for the web. Something like RelationalDB gives you the power of a relational-db with no dependence on a specific implementation of SQL, so it would be compatible enough for the web. 
It fixes all the problems with the standardisation of WebSQL that have been talked about so far. I think there would be no technical issues blocking its standardisation. As a high-level DB API it does not need all the low-level features of IndexedDB, so its API can be much simpler and cleaner. RelationalDB can at least be provided as a library on top of IndexedDB, and it can use WebSQL where it is supported. My concern with the library approach is performance when implemented on top of IndexedDB. Cheers, Keean.
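For readers unfamiliar with the relational-algebra-as-methods style the thread attributes to RelationalDB, here is a toy sketch of the idea in plain JavaScript. The class and method names (`restrict`, `project`) are illustrative only, not the library's actual API:

```javascript
// Toy sketch of relational-algebra operators as methods on relation
// objects, independent of any SQL backend.
class Relation {
  constructor(rows) { this.rows = rows; }
  // restrict: keep only rows matching a predicate (SQL's WHERE).
  restrict(pred) { return new Relation(this.rows.filter(pred)); }
  // project: keep only the named columns (SQL's SELECT list).
  project(...cols) {
    return new Relation(this.rows.map((r) =>
      Object.fromEntries(cols.map((c) => [c, r[c]]))));
  }
}

const people = new Relation([
  { name: "Benny", age: 28 },
  { name: "Charlie", age: 8 },
]);

people.restrict((r) => r.age > 10).project("name").rows;
// -> [{ name: "Benny" }]
```

Because the operators compose as ordinary method calls, the same expressions can in principle be evaluated against WebSQL, IndexedDB, or plain arrays, which is the portability argument being made above.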
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
Pity. Anyway, RelationalDB defines its API without reference to the underlying SQL or non-SQL database... so as a candidate for replacing WebSQL, it does not suffer from that problem. Cheers, Keean. On 2 April 2011 14:56, Glenn Maynard gl...@zewt.org wrote: On Sat, Apr 2, 2011 at 5:24 AM, Keean Schupke ke...@fry-it.com wrote: In fact, now that BDB supports the SQLite 3.0 API, you can have two implementations that conform to the same API. So the original reason for abandoning WebSQL seems no longer valid. As there is now more than one implementation of the SQLite 3.0 API, it is a de facto (open) standard. Based on http://download.oracle.com/docs/cd/E17076_02/html/installation/upgrade_11gr2_51_sqlite_ver.html, it's not like an independent implementation. -- Glenn Maynard
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
Hi Shawn I would be interested in this. What would need to be done to make this a Firefox plugin? I've done XPCOM stuff before in xulrunner if that's any help. Cheers, Keean On Apr 1, 2011 6:09 PM, Shawn Wilsher sdwi...@mozilla.com wrote: On 4/1/2011 5:40 AM, Nathan Kitchen wrote: Are there any browser vendor representatives on the mailing list who would care to comment on the criteria for implementing something akin to Keean's RelationalDB (https://github.com/keean/RelationalDB) idea? What would need to be in place to start work on such an implementation? It wouldn't be terribly difficult to prototype this as an add-on for Firefox, I don't think (and I'd be happy to provide technical assistance to anyone wishing to do so). Doing this would allow web developers to install the add-on and play with it, which can give us useful feedback. I'm not saying we'd move it into the tree at that point, but it's a good first step to building a case to take it. 1. Opportunity to explore more solutions to offline data than *just* IndexedDB. There is also http://dev.w3.org/html5/spec/offline.html and http://dev.w3.org/html5/webstorage/ (even if you don't like them, they are other solutions to the offline problem). Browser vendors are not just looking at IndexedDB. 2. Many web developers have a working knowledge of SQL, so the concepts of a relational database may be more familiar. If adoption could be considered a proxy for the success of a standard, I'd suggest that aiming for something the web development community understands would be a large factor in adoption. I don't really think IndexedDB is that dissimilar to a relational database. There are a lot of one-to-one mappings of concepts from one to the other. 3. It's probably (!) easier to implement RelationalDB than IndexedDB, as it maps fairly cleanly to existing relational database technologies. This would allow vendors to implement it using SQLite, Access, etc., independent of the spec.
Given that most vendors already have working implementations of IndexedDB, I don't think this is a good argument ;) Cheers, Shawn
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
I was the one that asked for callbacks. But what do we do if those callbacks don't return consistent results? Or even do evil things like modify the stores where data is being inserted? If the callback maps all values to a sort-order of '1' there could only ever be one entry in the index... It's not hard: the callback is passed an immutable copy of the object and returns a sort-order as a binary blob. If you capture the object store in the closure then of course you could do evil things as side effects. But that is true in any non-purely-functional language; you can always do evil things with side effects. In short, I don't think we'll get much further here without a concrete proposal. Which basically means nobody working on the current implementations understands the issues, or thinks the issues are unimportant? Cheers, Keean.
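To make the side-effect point concrete, here is a minimal sketch (with hypothetical names, not from any spec) of how an engine could hand the callback an immutable copy of the object and take back a sort-order key:

```javascript
// Sketch (hypothetical API): derive an index key by passing a frozen copy
// of the stored object to a user-supplied sort-order callback, so the
// callback cannot mutate the record being indexed. (A real engine would
// deep-freeze a structured clone; this shallow version is illustrative.)
function indexKeyFor(object, sortOrderCallback) {
  const copy = Object.freeze(JSON.parse(JSON.stringify(object)));
  const key = sortOrderCallback(copy);
  if (typeof key !== "string") {
    throw new TypeError("sort-order callback must return a key string");
  }
  return key;
}

// A well-behaved callback: order records by lower-cased name.
const byName = (obj) => obj.name.toLowerCase();

// The degenerate callback from the message: mapping every value to '1'
// collapses the index to a single key - legal, but useless.
const constant = () => "1";

indexKeyFor({ name: "Keean" }, byName); // "keean"
```

Closure-captured side effects (such as writes back into the store) remain possible, which is exactly the residual risk the message concedes.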
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 08:38, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:34 AM, Jeremy Orlow wrote: We have made an effort to understand other contributions to the field. I'm not convinced that these are essential database concepts, and having personally spent quite some time working with the API in JS and implementing it, I feel pretty confident that what we have for v1 is pretty solid. There are definitely some things I wouldn't mind re-visiting or looking at closer, possibly even for v1, but they all seem reasonable to study further for v2 as well. We've spent a lot of time over the last year and a half talking about IndexedDB. But now it's shipping in Firefox 4 and soon Chrome 11. So realistically v1 is not going to change much unless we are convinced that what's there is fundamentally broken. We intentionally limited the scope of v1, which is why we know there'll be a v2. We can't solve all the problems at once, and the difficulty of speccing something is typically exponential to the size of the API. Maybe a constructive way to discuss this would be to look at what use cases will be difficult or impossible to achieve with the current design? Application-managed indices for starters. I would consider that to be essential when designing indexed key/value stores, and I would consider that to be the contribution made by almost every other indexed key/value store to date. If we have to use IDB the way FriendFeed used MySQL to achieve application-managed indices then I would argue that the API is in fact fundamentally broken and we would be better off with an embedding of SQLite by Mozilla. Regarding "the difficulty of speccing something is typically exponential to the size of the API": if people want to build a Rube Goldberg device then they must deal with the spec issues of that.
If we were provided with the primitives for an indexed key/value store with application-managed indices (as Nikunj suggested at the time), we would have been well out of the starting blocks by now, and issues such as computed indexes, indexing array values etc. would have been non-issues. Summary: 1. There's a problem. 2. It can still be fixed with a minimum of fuss. I totally agree with everything so far... 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads). I disagree that a simple API change is the answer. The problem is architectural, not just a superficial API issue. Cheers, Keean.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 12:41, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 12:52 PM, Keean Schupke wrote: I totally agree with everything so far... 3. This requires an adjustment to the putObject and deleteObject interfaces (see previous threads). I disagree that a simple API change is the answer. The problem is architectural, not just a superficial API issue. Yes, for IndexedDB to be stateless with respect to application schema, one would need to: 1. Provide the application with a first-class means to manage indexes at time of putting/deleting objects. 2. Treat objects as opaque (remove key path, structured clone mechanisms, application must provide an id and JSON value to put/delete calls, reduces serialization/deserialization overhead where application already has the object as a string). 3. Remove setVersion (redundant, application migrates objects and indexes using transactions as it needs to). 4. Remove createIndex. This would rip so much from the spec as to reduce it to a bunch of tatters, defining nothing more than an interface for index/key/value primitives in terms of well-established interfaces. Essentially, we need LocalStorage with asynchronous IO (based on Node's callback style), large quota support, and a BTree API. Failing that, a decent FileSystem API on which to build these. Stateless indexes can be provided differently to how you suggest. You can have a 'validate_index' call that checks the index exists and creates it if it does not. It is stateless in the sense that you call that to open an existing index or create one; you don't care if the database has one already or not. In fact you can make SQL stateless by providing a validate_schema call that succeeds if the schema of the database matches the passed schema, can be modified with no data loss to be the same, or needs to be created. The RelationalDB wrapper for WebSQL provides this kind of stateless approach for SQL...
you can check it out on github if you like (it's a work in progress, though): https://github.com/keean/RelationalDB Cheers, Keean.
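A sketch of what such a validate_schema call might look like. All names here are illustrative (RelationalDB's actual method is just called validate), and schemas are simplified to a map of column descriptors compared by name only:

```javascript
// Sketch (illustrative names): open succeeds if the stored schema matches
// the one compiled into the source code, upgrades automatically when every
// new column carries a default value, and errors otherwise.
function validateSchema(stored, expected) {
  if (stored === null) return { action: "create" }; // no database yet
  const added = Object.keys(expected).filter((c) => !(c in stored));
  const removed = Object.keys(stored).filter((c) => !(c in expected));
  if (added.length === 0 && removed.length === 0) return { action: "open" };
  // Adding a column with a default value can be done automatically...
  if (removed.length === 0 && added.every((c) => "default" in expected[c])) {
    return { action: "upgrade", add: added };
  }
  // ...anything else is left to the application to migrate.
  return { action: "error" };
}
```

Anything beyond this simple name matching (type changes, renames) would need the application-driven migration path described in the thread, e.g. opening one database with the old schema and one with the new.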
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
No real reason - just trying to implement a minimal framework. Date objects would be a definite must-have going forward. I was interested in trying to get something like this standardised, as I believe it has none of the issues that stopped WebSQL, since it defines a complete relational API independent of the implementation of SQL behind it. The key thing is to get the browser implementors interested in implementing it. If even one of the main browser implementors is not interested in implementing it, then it will suffer the same fate as WebSQL. Independent of standardisation (which I would like) I intend to try and implement the same API on top of WebSQL and IndexedDB as a library, so people are free to use the backend with the best performance without changing their code. It aims to be as stateless as possible, and to implement relational algebra on relations. Cheers, Keean. On 31 March 2011 15:54, Nathan Kitchen w...@nathankitchen.com wrote: That's nice, pretty much what I was thinking but somewhat more complete : ) Is there not a w3 group progressing something like this? And if not, who would need to be lobbied to get one started?! As an aside, I note you didn't implement date as a supported data type. Was that a conscious decision, and if so what was the reasoning behind it? N On 31 March 2011 16:33, Keean Schupke ke...@fry-it.com wrote: Have a look at my RelationalDB API https://github.com/keean/RelationalDB In particular examples/candy.html A lot of work went into the underlying concepts - it's work originally published by myself and others at the 2004 Haskell Workshop, and follows on from HaskellDB (which was the original inspiration behind C#'s LINQ functionality). It implements the relational-algebra operators as methods that operate on relation objects. Let me know what you think. Cheers, Keean. On 31 March 2011 15:19, Nathan Kitchen w...@nathankitchen.com wrote: Hi.
I've been watching discussions on IndexedDB for a while now, and wondered if anyone would mind spending a few moments to explain how IndexedDB is related (or not) to WebSQL. Is IndexedDB seen as replacing the functionality originally offered by WebSQL? If not, are there any plans to make a cross-platform variant of Web SQL? If (?) most web developers know SQL, is there a case to be made for abstracting SQL into JSON/JavaScript rather than moving to IndexedDB document storage? Reasons for asking this: - Many of the posts appearing to come from the dev community rather than W3C seem to expect more SQL-esque functionality from IndexedDB. If the enthusiasts who get involved enough to post to the board are expecting a SQL/query type experience, maybe there is a driver for a native database API supporting this. - Several people have noted that third-party frameworks could implement this functionality. This might be a daft question, but isn't it easier to implement an "IndexedDB-like" framework on top of WebSQL than a "WebSQL-like" framework on top of IndexedDB (overuse of quotes to indicate the general concept)? I had a ponder on how I'd like to see such a framework implemented (in both Access and SQLite :p), and came up with a stack of pseudo-code below in my lunch break. Might make an interesting discussion point. It's not really IndexedDB, it's WebSQL v2. Or maybe WebJSQL or something. I'd be really interested to understand what advantages IndexedDB has over an implementation like the one below, though.

// DATABASE
// First, open a database with the specified name. The number at the
// end denotes the version of the specification that the application
// plans to use. This allows forward-compatibility with vNext.
var db = window.openDatabase("shoppinglist", "1.0");

// MIGRATIONS
// Next, create some migrations. These are predefined structures which
// are validated by the browser database engine. A migration consists
// of two actions: one up, one down.
// Each action specifies some operations and parameters. It's up to the
// browser database to read these and perform the appropriate action, as
// defined in the spec. Other actions may include a batch add for static
// data. It could also be valid to have key and index creation and removal
// as separate actions.

// SHOPPING TRIP
var createTripTableAction = {
  action: "create-table",
  params: {
    name: "trip",
    columns: [
      { name: "id", type: "whole-number", primaryKey: true },
      { name: "name", type: "string", length: 32, regex: "[A-Z]{1,32}" } // regex: wouldn't that be nice...
    ],
    indexes: [
      { columns: [ { name: "name", type: "full-text" } ] }
      // More indexes here if required
    ]
  }
};

var removeTripTableAction = {
  action: "remove-table",
  params: { name: "shopping", cleardata: true }
};

// SHOPPING
var
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 18:17, Jeremy Orlow jor...@chromium.org wrote: On Thu, Mar 31, 2011 at 11:09 AM, Keean Schupke ke...@fry-it.com wrote: On 31 March 2011 17:41, Jonas Sicking jo...@sicking.cc wrote: On Thu, Mar 31, 2011 at 1:32 AM, Joran Greef jo...@ronomon.com wrote: On 31 Mar 2011, at 9:53 AM, Jonas Sicking wrote: I previously have asked for a detailed proposal, but so far you have not supplied one but instead keep referring to other unnamed database APIs. I have already provided an adequate interface proposal for putObject and deleteObject. That is hardly a comprehensive proposal, but rather just one small part of it. I wanted to make a few comments about these points :- I do really think the idea of not having the implementation keep track of the set of indexes for an objectStore is a really interesting one. As is the idea of not even having a fixed set of objectStores. However, there are several problems that need to be solved. In particular, how do you deal with collations? no indexes, no object stores... well I for one prefer the validate_object_store, validate_index approach, in that it can hide statefulness if necessary (like I do with RelationalDB) whilst presenting a stateless API. It also keeps the size of the put statements down. I.e. we have concluded that there are important use cases which require using different collations for different indexes and objectStores. Even for different indexes attached to the same objectStore. Additionally, if we're getting rid of setVersion, how do we expect pages to deal with the (application-managed) schema changing while the page has a connection open to the database? 1 - there is no schema; 2 - don't allow it to change whilst the database is open. In reality a schema is implicitly tied to a code version. In other words, the source code of the application assumes a certain schema. If the assumed schema and the schema in the DB do not match, things are going to go very wrong very quickly.
Schema changes _always_ accompany code changes (otherwise they are not schema changes, just data changes). As such they never happen when a DB is open. The way I handle this in RelationalDB, by validating the actual schema against the source-code schema in the db-open (actually the method is called validate), is probably the best way to handle this. If the database does not exist we create it according to the schema. If it exists we check it matches the schema. If there is a difference we see if we can 'upgrade' the database automatically (certain changes, like adding a new column with a default value, can be done automatically); if we cannot automatically upgrade, we exit with an error - as allowing the program to run would result in corruption of the data already in the database. At this point it is up to the application to figure out how to upgrade the database (by opening one database with an old schema and another with a new schema)... There is no point in ever allowing a database to be opened with the wrong schema. So pretty please, with sugar on top, please come up with a proposal for the full API rather than bits and pieces. And I should mention that I have as an absolute requirement that you should be able to specify collation by simply saying that you want to use en-US or sv-SV sorting. Using callbacks or other means is ok *in addition to this*, but callback mechanisms tend to be a lot more complex since they have to deal with the callback doing all sorts of evil things such as returning inconsistent results (think return Math.random()), or simply doing evil things like navigating the current page, deleting the database, or modifying the record that is in the process of being stored. The core API only needs to deal with sorting binary-blob sort orders. A library wrapper could provide all the collation ordering goodness that people want. For example, RelationalDB will have to deal with sorting orders; it does not need the browser to provide that functionality.
In fact browser-provided functionality may limit what can be done in libraries on top. This is difficult if not impossible to do. See previous threads on the matter. J I can find a lot of stuff on collation, but not a lot about why it could not be done in a library. Could you summarise the reasons why this needs to be core functionality for me? A library could choose to use an object store as meta-data to store the collation orders that it is using for various indexes, for example. Cheers, Keean.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
On 31 March 2011 18:36, Jeremy Orlow jor...@chromium.org wrote: On Thu, Mar 31, 2011 at 11:24 AM, Keean Schupke ke...@fry-it.com wrote: [...] I can find a lot of stuff on collation, but not a lot about why it could not be done in a library. Could you summarise the reasons why this needs to be core functionality for me? Sorry, but that stuff is paged out of my brain. Pablo, can you? A library could choose to use an object store as meta-data to store the collation orders that it is using for various indexes, for example. Thanks, that would help me understand. As long as there is a way to turn default collation off and just have a binary string sort order, that's fine for my needs. Cheers, Keean.
Re: [WebSQL] Any future plans, or has IndexedDB replaced WebSQL?
On 31 March 2011 19:08, Joran Greef jo...@ronomon.com wrote: This is painful to read. WebSQL development died because SQLite, the most widely-deployed database software in the world, was too good? That sounds like a catastrophic failure of the W3C process. -- Glenn Maynard Hear. I am starting to think that Mozilla will step up and provide an embedding of SQLite, even if it has to only think of it as such. It will have to. People would rather use a working database than something crippled albeit specced (see LocalStorage or IndexedDB). It was things like XHR in all their unspecced glory that brought the web to where it is today. Do you want to take a look at my RelationalDB library? It could form the basis of a replacement for WebSQL, and as it is based on relational algebra, not SQL, it has no user-visible dependencies on the particular SQL implementation. Have a look at: https://github.com/keean/RelationalDB/blob/master/examples/candy.html for a usage example. This should run in Chrome right now (using WebSQL as a backend). I would appreciate any thoughts, comments etc. Cheers, Keean.
Re: [IndexedDB] Design Flaws: Not Stateless, Not Treating Objects As Opaque
"Currently there are no APIs in JavaScript to compare strings using specific collations" We don't actually need this, just a mapping from a UTF-16 string to a sort-score (binary blob). It's true that downloading the collation tables might take time, so we could just provide: var blob = string_to_score('utf-16 string', 'en-US'); as a built-in function to make this efficient. I agree with the other points though. Cheers, Keean. On 31 March 2011 22:38, Pablo Castro pablo.cas...@microsoft.com wrote: From: jor...@google.com [mailto:jor...@google.com] On Behalf Of Jeremy Orlow Sent: Thursday, March 31, 2011 11:36 AM I can find a lot of stuff on collation, but not a lot about why it could not be done in a library. Could you summarise the reasons why this needs to be core functionality for me? Sorry, but that stuff is paged out of my brain. Pablo, can you? A library could choose to use an object store as meta-data to store the collation orders that it is using for various indexes, for example. - Currently there are no APIs in JavaScript to compare strings using specific collations. There are folks that are looking into this, but it will need time. - I'm far from an expert in the topic, but from talking to folks that understand this well, it seems that to actually implement this entirely in JavaScript would mean you have to download collation tables and apply them as needed in callbacks. Not only does this mean a hit in download size/time for the app, but also that callbacks have to either download stuff or inline collation rules/tables in the callback itself. - In pure practical terms, I suspect the 80% scenario can be covered by implementing this natively, having it be fast and simple to use for common cases. Not pushing back on the callback stuff, just saying that I find it valuable to have users simply say en-US and get what they wanted.
- Also from the practical perspective, simple cases that don't require the flexibility can avoid having to take care of making the callbacks perfectly consistent even as you roll out updates that may hit only some of the pages, use components written by someone else, etc. - By default we would still do binary collation (there was a question in the thread, I forget exactly where). Thanks -pablo
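The string_to_score idea above can be illustrated with a deliberately naive sketch. Real collation needs full Unicode collation tables (which is Pablo's point about download size); this toy version only folds case and strips accents, and the function name is taken from the message, not from any shipping API:

```javascript
// Toy sketch: map a string to a sort-score so that plain binary comparison
// of the scores approximates an 'en-US'-style ordering. Not real collation.
function string_to_score(s, locale) {
  if (locale !== "en-US") throw new Error("only en-US sketched here");
  return s
    .normalize("NFD")                 // split off combining accents...
    .replace(/[\u0300-\u036f]/g, "")  // ...and drop them
    .toLowerCase();                   // fold case
}

// Binary comparison of scores now gives the intended order:
["Émile", "adam", "Zoe"].sort((a, b) =>
  string_to_score(a, "en-US") < string_to_score(b, "en-US") ? -1 : 1
);
// -> ["adam", "Émile", "Zoe"]
```

This is exactly the split the message proposes: the engine sorts opaque scores with binary comparison, and the locale knowledge lives in the score function.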
Re: [Bug 12321] New: Add compound keys to IndexedDB
I like BDB's solution. You have one primary key you cannot mess with (say an integer, for fast comparisons); you can then add any number of secondary indexes. With a secondary index there is a callback to generate a binary blob that is used for indexing. The callback has access to all the fields of the object plus any info in the closure, and can use that to generate the index data any way it likes. This has the advantage of supporting any indexing schemes the user may wish to implement (by writing a custom callback), whilst allowing a few common options to be provided for the user (say a hash of all fields, or a field name, international char set, and direction captured in a closure). The user gets the power, the core implementation is simple, and common cases can be implemented in an easy-to-use way. var lex_order = function(field, charset, direction) { return function(object) { /* map indexed 'field' to a blob in the required order */ return key; }; }; Then create a new index: object_store.validate_index(1, lex_order('name', 'us', 'ascending')).on_done(function(status) { /* status ok or error */ }); validate_index checks if the requested secondary index (1) exists; if it does not, it creates the index and calls the done callback (with a status code indicating successful creation); if it does, and it passes some validation checks, it also calls the done callback (with a status code indicating successful validation). If anything goes wrong with either the creation or validation of the secondary index it would call the done callback with an error status code. Cheers, Keean. On 18 March 2011 02:03, Jeremy Orlow jor...@chromium.org wrote: Here's one ugliness with A: There's no way to specify ascending or descending for the individual components of the key. So there's no way for me to open a cursor that looks at one field ascending and the other field descending. In addition, I can't think of any easy/good ways to hack around this. Any thoughts on how we could address this use case?
J On Wed, Mar 16, 2011 at 4:50 PM, bugzi...@jessica.w3.org wrote: http://www.w3.org/Bugs/Public/show_bug.cgi?id=12321 Summary: Add compound keys to IndexedDB Product: WebAppsWG Version: unspecified Platform: PC OS/Version: All Status: NEW Severity: normal Priority: P2 Component: Indexed Database API AssignedTo: dave.n...@w3.org ReportedBy: jor...@chromium.org QAContact: member-webapi-...@w3.org CC: m...@w3.org, public-webapps@w3.org From the thread [IndexedDB] Compound and multiple keys by Jonas Sicking, we're going to go with both options A and B. = Hi IndexedDB fans (yay!!), Problem description: One of the current shortcomings of IndexedDB is that it doesn't support compound indexes, i.e. indexing on more than one value. For example it's impossible to index on, and therefore efficiently search for, firstname and lastname in an objectStore which stores people. Or index on to-address and date sent in an objectStore holding emails. The way this is traditionally done is that multiple values are used as the key for each individual entry in an index or objectStore. For example the CREATE INDEX statement in SQL can list multiple columns, and the CREATE TABLE statement can list several columns as PRIMARY KEY. There have been a couple of suggestions for how to do this in IndexedDB. Option A) When specifying a key path in createObjectStore and createIndex, allow an array of key paths to be specified. Such as store = db.createObjectStore("mystore", ["firstName", "lastName"]); store.add({firstName: "Benny", lastName: "Zysk", age: 28}); store.add({firstName: "Benny", lastName: "Andersson", age: 63}); store.add({firstName: "Charlie", lastName: "Brown", age: 8}); The records are stored in the following order: Benny, Andersson; Benny, Zysk; Charlie, Brown. Similarly, createIndex accepts the same syntax: store.createIndex("myindex", ["lastName", "age"]); Option B) Allowing arrays as an additional data type for keys.
store = db.createObjectStore("mystore", "fullName"); store.add({fullName: ["Benny", "Zysk"], age: 28}); store.add({fullName: ["Benny", "Andersson"], age: 63}); store.add({fullName: ["Charlie", "Brown"], age: 8}); Also allows out-of-line keys using: store = db.createObjectStore("mystore"); store.add({age: 28}, ["Benny", "Zysk"]); store.add({age: 63}, ["Benny", "Andersson"]); store.add({age: 8}, ["Charlie", "Brown"]); (the sort order here is the same as in option A). Similarly, if an index used a keyPath which points to an array, this would create an entry in the index which used a compound key consisting of the values in the array. There are of course advantages and disadvantages with both options. Option A advantages: * Ensures that at objectStore/index creation time the number of keys is known. This allows the implementation to create and optimize the index using this
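Either way, compound keys compare element by element, which is what produces the "Benny, Andersson; Benny, Zysk; Charlie, Brown" order quoted above. A minimal sketch of that comparison, assuming string-array keys only (the real IndexedDB key order also covers numbers, dates and mixed types):

```javascript
// Toy comparator for string-array compound keys: compare component by
// component; a shorter key that is a prefix of a longer one sorts first.
function compareKeys(a, b) {
  for (var i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

var keys = [["Benny", "Zysk"], ["Benny", "Andersson"], ["Charlie", "Brown"]];
keys.sort(compareKeys);
// keys is now [["Benny","Andersson"], ["Benny","Zysk"], ["Charlie","Brown"]],
// matching the order given in the bug report.
```

This also makes concrete Jeremy's complaint above: with a single element-wise comparison there is no per-component ascending/descending control.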
Re: [IndexedDB] Spec changes for international language support
See my proposal in another thread. The basic idea is to copy BDB. Have a primary index that is based on an integer, something primitive and fast. Allow secondary indexes which use a callback to generate a binary index key. IDB shifts the complexity out into a library. Common use cases can be provided (a hash of all fields in the object, internationalised bidirectional lexicographic etc...), but the user is free to write their own for less usual cases (for example indexing by the last word in a name string to order by surname). Cheers, Keean. On 18 March 2011 02:19, Jonas Sicking jo...@sicking.cc wrote: 2011/3/17 Pablo Castro pablo.cas...@microsoft.com: From: Jonas Sicking [mailto:jo...@sicking.cc] Sent: Tuesday, March 08, 2011 1:11 PM All in all, is there anything preventing adding the API Pablo suggests in this thread to the IndexedDB spec drafts? I wanted to propose a couple of specific tweaks to the initial proposal and then, unless I hear pushback, start editing this into the spec. From reading the details on this thread I'm starting to realize that per-database collations won't do it. What did it for me was the example that has a fuzzier matching mode (case/accent insensitive). This is exactly the kind of index I would want to sort people's names in my address book, but most likely not the index I'll want to use for my primary key. Refactoring the API to accommodate this would mean moving the setCollation() method and the collation property to the object store and index objects. If we were willing to live without the ability to change them we could take collation as one of the optional parameters to createObjectStore()/createIndex() and reduce a bit of surface area... Unfortunately I think you bring up good use cases for per-objectStore/index collations. It's definitely tempting to just add it as an optional parameter to createObjectStore/createIndex. The downside is obviously pushing more complexity onto web developers.
Complexity which will be duplicated across sites. However there is another problem to consider here. Can switching collation on an objectStore or a unique index affect its validity? I.e. if you switch from a case sensitive to a case insensitive collation, does that mean that if you have two entries with the primary keys "Sweden" and "sweden" they collide, and thus the change of collation must result in an error (or aborted transaction)? I do seem to recall that there are ways to do at least case sensitivity such that you generally don't take case into account when sorting, unless two entries are otherwise exactly the same, in which case you do look at casing to differentiate them. However I don't really know a whole lot about this and so defer to people that know internationalization better. I don't have a strong preference there. In any case both would use BCP47 names as discussed in this thread (as Jonas pointed out, implementations can also do their own thing as long as they don't interfere with BCP47). Another piece of feedback I heard consistently as I discussed this with various folks at Microsoft is the need to be able to pick up what the UA would consider the collation that's most appropriate for the user environment (derived from settings, page language or whatever). We could support this by introducing a special value that you can pass to setCollation that means "pick whatever is right for the environment's language right now". Given that there is no other way for people to discover the user preference on this, I think this is pretty important. I would be fine with this as long as it's an explicit opt-in. There is definitely a risk that people will do this and then only do testing in one language, but it seems to me like a useful use case to support, and I don't see a way of supporting this while completely avoiding the risk of internationalization bugs. / Jonas
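For what it's worth, modern JavaScript exposes exactly this kind of BCP47-named, sensitivity-tunable collation via Intl.Collator (ECMA-402, which postdates this thread). A small sketch of the "Sweden"/"sweden" collision Jonas worries about for unique indexes:

```javascript
// Case/accent-insensitive collator: treats "Sweden" and "sweden" as equal,
// which is precisely the unique-index collision discussed above.
var insensitive = new Intl.Collator('en', { sensitivity: 'base' });

// Fully sensitive collator: case still differentiates otherwise-equal keys.
var sensitive = new Intl.Collator('en', { sensitivity: 'variant' });

insensitive.compare('Sweden', 'sweden'); // 0: would collide in a unique index
sensitive.compare('Sweden', 'sweden');   // non-zero: keys stay distinct
```

The 'variant' behaviour also matches the scheme Jonas half-remembers: case is ignored for primary ordering and only used to break ties between otherwise identical strings.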
Re: [IndexedDB] Spec changes for international language support
On 18 March 2011 19:29, Pablo Castro pablo.cas...@microsoft.com wrote: From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On Behalf Of Keean Schupke Sent: Friday, March 18, 2011 1:53 AM See my proposal in another thread. The basic idea is to copy BDB. Have a primary index that is based on an integer, something primitive and fast. Allow secondary indexes which use a callback to generate a binary index key. IDB shifts the complexity out into a library. Common use cases can be provided (a hash of all fields in the object, internationalised bidirectional lexicographic etc...), but the user is free to write their own for less usual cases (for example indexing by the last word in a name string to order by surname). I agree with Jeremy's comments on the other thread for this. Having the callback mechanism definitely sounds interesting, but there are a ton of common cases that we can solve by just taking a language identifier; I'm not sure we want to make people work hard to get something that's already supported in most systems. The idea of having a callback to compute the index value feels incremental to this, so we could take it on later without disrupting the explicit international collation stuff. The idea would be to provide pre-defined implementations of the callback for common use cases; then it is just as simple to register a callback as to set any other option. All this means to the API is you pass a function instead of a string. It is also better for modularity, as all the code relating to the sort order is kept in the callback functions. The difference comes down to something like: index.set_order_lexicographic('us'); vs index.set_order_method(order_lexicographic('us')); So rather than just setting a property as in the first case, where presumably all the ordering code is mixed in with the indexing code, the second case encapsulates all the ordering code in the function returned from the execution of order_lexicographic('us').
This function would represent a mapping from the object being indexed to a binary blob that is the actual stored index data. So doing it this way does not necessarily make things harder, and it improves the encapsulation, type-safety, and flexibility of the API. Cheers, Keean.
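The contrast can be sketched concretely. set_order_method and order_lexicographic are the hypothetical names from this exchange, not IndexedDB API; the toy index below just stores whatever callback it is given and uses it to order records, and a custom callback for the "less usual case" mentioned earlier (ordering by surname) plugs into the same slot:

```javascript
// Hypothetical factory from the email: the locale would be captured in the
// closure; the returned function maps an object to a sortable key (a real
// version would emit a binary blob whose byte order matches the collation).
function order_lexicographic(locale) {
  return function (object) { return object.name.toLocaleLowerCase(locale); };
}

// Custom callback for the unusual case: key on the last word of the name
// string, so a cursor over the index orders people by surname.
function order_by_surname(object) {
  var words = object.name.trim().split(/\s+/);
  return words[words.length - 1];
}

// Toy index: all ordering logic is encapsulated in the registered callback.
var index = {
  set_order_method: function (keyFn) { this.keyFn = keyFn; },
  sorted: function (objects) {
    var keyFn = this.keyFn;
    return objects.slice().sort(function (a, b) {
      var ka = keyFn(a), kb = keyFn(b);
      return ka < kb ? -1 : ka > kb ? 1 : 0;
    });
  }
};

var people = [{ name: "Benny Zysk" }, { name: "Charlie Brown" }];
index.set_order_method(order_by_surname);
// index.sorted(people) now orders Brown before Zysk
```

Swapping `order_by_surname` for `order_lexicographic('en')` changes the ordering without touching any indexing code, which is the modularity argument being made.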
Re: [IndexedDB] Compound and multiple keys
Getting pgsql people involved sounds a great idea. Having some more people to argue for formalised and standardised database APIs like SQL, and with experience of relational operations and optimisation, would be good (that is an assumption on my part, but then they are writing PostgreSQL not CouchDB). Do you know some people you could invite? More generally though, I think BerkeleyDB would make a much better target for IDB. I don't think it should be trying to be PostgreSQL or MySQL. I think it should implement a good low-level API like BerkeleyDB's, with enough functionality to allow SQL to be implemented over the top. The problem with trying to implement IDB on top of PostgreSQL is that IDB has a very narrow interface that does not support any of the powerful features of pgsql. It would give you the worst of both. BDB would make a much better implementation target. Far more sensible would be to target the feature set of BDB for IDB; then PostgreSQL could be re-implemented in JavaScript on top (a massive and impractical task, but I am trying to express the relationship between high-level and low-level database APIs). If we wanted to go fully relational, and avoid the correctness problems with string processing SQL commands, take a look at my relational library, currently implemented on top of WebSQL but with an IDB version in the works: https://github.com/keean/RelationalDB Cheers, Keean. On 9 March 2011 04:10, Charles Pritchard ch...@jumis.com wrote: On 3/8/2011 6:12 PM, Jeremy Orlow wrote: On Tue, Mar 8, 2011 at 5:55 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto:public-webapps-requ...@w3.org] On Behalf Of Keean Schupke Sent: Tuesday, March 08, 2011 3:03 PM No objections here. Keean.
On 8 March 2011 21:14, Jonas Sicking jo...@sicking.cc wrote: On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow jor...@chromium.org wrote: After thinking about it a bunch and talking to others, I'm actually leaning towards both option A and B. Although this will be a little harder for implementors, it seems like there are solid reasons why some users would want to use A and solid reasons why others would want to use B. Any objections to us going that route? Not from me. If I don't hear objections I'll write up a spec draft and attach it here before committing to the spec. Option A is pretty well understood, I like that one. For option B, at some point we had a debate on whether, when indexing an array value, we should consider it a single key value or we should unfold it into multiple index records. The first option makes it very similar to A in that an array is just a composite value (it is quite a bit more painful to implement...); the second option is interesting in that it allows for new scenarios such as objects with an array of tags, where you want to look up by tag (even after doing options A and B as currently defined, in order to support multiple tags you'd need a second store that keeps the tags + keys for the objects you want to tag). Is there any interest in that scenario? Yes. Once we're settled on this, I'm going to send an email on that. :-) Option B won't get in the way of my proposal. J At some point, I really would like to get people from the PostgreSQL project involved with IndexedDB. They have a wealth of experience to bring to the discussion. For the moment, like many server-side packages, they're at quite a distance from the W3C. FWIW, pgsql is a perfectly valid 'host' for idb calls.
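Pablo's "unfold" option can be sketched as follows: indexing an array-valued property creates one index record per element, so an object with several tags is findable under each tag. (IndexedDB did later standardise this behaviour as the multiEntry flag on createIndex; this toy just builds the records by hand to show the shape.)

```javascript
// Unfold an array-valued property into one index record per element.
// records: array of objects; the array position stands in for the
// primary key in this toy.
function unfoldIndex(records, keyPath) {
  var entries = [];
  records.forEach(function (record, primaryKey) {
    record[keyPath].forEach(function (value) {
      entries.push({ key: value, primaryKey: primaryKey });
    });
  });
  return entries;
}

var entries = unfoldIndex([{ tags: ['w3c', 'idb'] }, { tags: ['idb'] }], 'tags');
// entries holds one record per (tag, primaryKey) pair, so a lookup on
// 'idb' finds both objects -- the tags scenario described above.
```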
Re: [IndexedDB] Compound and multiple keys
I have already said I have no specific concerns regarding this change. It's difficult to predict the problems that will emerge when people actually try to use an API; that's why there are so many bad APIs out there. One way to mitigate this risk is to look at well-used existing APIs (in languages like 'C') to see what works well. Many people often write different APIs for the same task, and the best win. I would look to existing winners (like BDB) for guidance on the total API, as due to the standardisation process (and the nature of web browsers) there is no opportunity for competition to choose the best API. It would be nice if Node.js was more advanced; then there might be many database API implementations in JavaScript we could look at to see which are preferred and use as a starting point. Looking at the requirements for IDB, BerkeleyDB would seem to be an ideal candidate to port the API from: it's popular, widely used, has stood the test of time, and is easy to use, and would be even easier in JavaScript with garbage collection. Cheers, Keean. On 9 March 2011 09:41, Jeremy Orlow jor...@chromium.org wrote: Keean/Charles: I definitely think the more people involved the better, but let's not get too hung up on the specifics of PostgreSQL, BDB, etc. Our goal here should be to make a great API for web developers while balancing practical considerations like how difficult it'll be to implement and/or use efficiently. That said, I'm not understanding what your comments have to do with this proposal. Do you have specific concerns? J On Wed, Mar 9, 2011 at 12:55 AM, Keean Schupke ke...@fry-it.com wrote: Getting pgsql people involved sounds a great idea. Having some more people to argue for formalised and standardised database APIs like SQL, and with experience of relational operations and optimisation, would be good (that is an assumption on my part, but then they are writing PostgreSQL not CouchDB). Do you know some people you could invite?
More generally though, I think BerkeleyDB would make a much better target for IDB. I don't think it should be trying to be PostgreSQL or MySQL. I think it should implement a good low-level API like BerkeleyDB's, with enough functionality to allow SQL to be implemented over the top. The problem with trying to implement IDB on top of PostgreSQL is that IDB has a very narrow interface that does not support any of the powerful features of pgsql. It would give you the worst of both. BDB would make a much better implementation target. Far more sensible would be to target the feature set of BDB for IDB; then PostgreSQL could be re-implemented in JavaScript on top (a massive and impractical task, but I am trying to express the relationship between high-level and low-level database APIs). If we wanted to go fully relational, and avoid the correctness problems with string processing SQL commands, take a look at my relational library, currently implemented on top of WebSQL but with an IDB version in the works: https://github.com/keean/RelationalDB Cheers, Keean. On 9 March 2011 04:10, Charles Pritchard ch...@jumis.com wrote: On 3/8/2011 6:12 PM, Jeremy Orlow wrote: On Tue, Mar 8, 2011 at 5:55 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto: public-webapps-requ...@w3.org] On Behalf Of Keean Schupke Sent: Tuesday, March 08, 2011 3:03 PM No objections here. Keean. On 8 March 2011 21:14, Jonas Sicking jo...@sicking.cc wrote: On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow jor...@chromium.org wrote: After thinking about it a bunch and talking to others, I'm actually leaning towards both option A and B. Although this will be a little harder for implementors, it seems like there are solid reasons why some users would want to use A and solid reasons why others would want to use B. Any objections to us going that route? Not from me.
If I don't hear objections I'll write up a spec draft and attach it here before committing to the spec. Option A is pretty well understood, I like that one. For option B, at some point we had a debate on whether, when indexing an array value, we should consider it a single key value or we should unfold it into multiple index records. The first option makes it very similar to A in that an array is just a composite value (it is quite a bit more painful to implement...); the second option is interesting in that it allows for new scenarios such as objects with an array of tags, where you want to look up by tag (even after doing options A and B as currently defined, in order to support multiple tags you'd need a second store that keeps the tags + keys for the objects you want to tag). Is there any interest in that scenario? Yes. Once we're settled on this, I'm going to send an email on that. :-) Option B won't get in the way of my
Re: [IndexedDB] Two Real World Use-Cases
On 8 March 2011 06:33, Joran Greef jo...@ronomon.com wrote: On 08 Mar 2011, at 7:23 AM, Dean Landolt wrote: This doesn't seem right. Assuming your WebSQL implementation had all the same indexes, isn't it doing pretty much the same things as using separate objectStores in IDB? Why would it be an order of magnitude slower? I'm sure whatever implementation you're using hasn't seen much optimization, but you seem to be implying there's something more fundamental? The only thing I can think of to blame would be the fat in the objectStore interface -- like, for instance, the index building facilities. It seems to me your proposed solution is to add yet more fat to the interface (more complex indexing), but wouldn't it be just as suitable to instead strip down objectStores to their bare essentials to make them more suitable to act as indexes? Then the indexing functionality and all the hard decisions could be punted to libraries where they'd be free to innovate. Exactly. It's not what one would expect, and an indication of the poor state of the IDB implementation (which is essentially a wrapper around SQLite anyway). If someone is advising that object stores be used to handle indexes then may I be the first to raise a red flag and say that IDB is failing us (and it would have been better for the spec team to provide a locking mechanism for LocalStorage so it could be used in that way). The whole point of IDB as far as I can see is to provide transactional indexed access to a key value store. Why? You wouldn't necessarily have to store the whole object in each index, just the index key, a value and some pointer to the original source object. Something to resolve this pointer to the source would need to be spec'd (a la couchdb's include_docs), but that's simple.
Even better, say it were possible to define a link relation on an object store that can resolve to its source object -- you could define a source link relation and the property to use -- and this would have the added bonus of being more broadly applicable than just linking an index record to its source instance. Think of the object creation and JSON serialization/deserialization overhead for putting 50 indexes and you have got more than enough waste there already. We can fix all of this right now very simply: 1. Enable objectStore.put and objectStore.delete to accept a setIndexes option and an unsetIndexes option. The value passed for either option would be an array (string list) of index references. This would only work for indexes that are arrays of strings, right? Things can get much more complicated than that, and when they do you'd have to use an objectStore to do your indexing anyway, right? No, it would work for pretty much anything. The application would be free to determine the indexes, and also to convert query parameters into indexes when querying. It's essentially computed indexes without the hassles of IDB trying to do it (there was an interesting thread last year on the challenges of storing an index computing function in IDB). Why is it more theoretically performant than using objectStores in the raw? It's a more direct interface. Think about it for a second. Using objectStores in the raw is interpolating O(n) complexity with multiple function calls, to give just one reason. If IDB can receive a list of indexes to add and remove an object to and from, then it can also do things like perform a set difference first to save unnecessary IO. I have written a database or two with this technique and it's certainly faster. I don't necessarily understand the stateful vs. stateless distinction here. I don't see how your proposed solution removes the requirement for IDB to enforce constraints when certain indexes are present.
Developers would already be able to use IDB statefully (with predefined schemas) -- they'd just use a library that has a schema mechanism. I doubt such a library for IDB already exists, but it'd be quite easy to port perstore, for instance, which is derived from the IDB API and already has this functionality using json-schema. There will no doubt be many ORM-like libraries that will pop up as soon as IDB starts to stabilize (or as soon as it gets a node.js implementation). The trouble is you always think a database would be quite easy until you actually try to do it yourself. At first when I dug into IDB I didn't think there would be any problems that could not be handled in some way. I have actually switched back to WebSQL now and will encourage my users to use Safari or Chrome as long as these browsers support WebSQL (and I hope Chrome will at least finish up by adding a quota interface for WebSQL). IDB right now is like a completely neutered slower SQLite without any of the benefits to be expected of a transactional indexed KV store. It's really sad. For examples of stateless databases see the interfaces for Redis (the best example, and a perfect
Re: [IndexedDB] Two Real World Use-Cases
Actually I am not sure if SQLite uses BDB (they might be moving to it though). However BDB definitely has an SQLite-3.0 compatible API now and supports better concurrency, as well as AES encryption. So at the moment it looks like I'm moving to using BDB instead of SQLite (apart from when the size of the app package file is an issue and SQLite is provided as part of the platform). Cheers, Keean. On 8 March 2011 17:54, Dean Landolt d...@deanlandolt.com wrote: On Tue, Mar 8, 2011 at 1:33 AM, Joran Greef jo...@ronomon.com wrote: On 08 Mar 2011, at 7:23 AM, Dean Landolt wrote: This doesn't seem right. Assuming your WebSQL implementation had all the same indexes, isn't it doing pretty much the same things as using separate objectStores in IDB? Why would it be an order of magnitude slower? I'm sure whatever implementation you're using hasn't seen much optimization, but you seem to be implying there's something more fundamental? The only thing I can think of to blame would be the fat in the objectStore interface -- like, for instance, the index building facilities. It seems to me your proposed solution is to add yet more fat to the interface (more complex indexing), but wouldn't it be just as suitable to instead strip down objectStores to their bare essentials to make them more suitable to act as indexes? Then the indexing functionality and all the hard decisions could be punted to libraries where they'd be free to innovate. Exactly. It's not what one would expect, and an indication of the poor state of the IDB implementation (which is essentially a wrapper around SQLite anyway). Which implementation? Why do you think it's a wrapper around SQLite? I doubt it could be implemented efficiently this way (due to its schema-free nature), so that would explain your benchmarks. But why would you judge the spec on one poor implementation?
If someone is advising that object stores be used to handle indexes then may I be the first to raise a red flag and say that IDB is failing us (and it would have been better for the spec team to provide a locking mechanism for LocalStorage so it could be used in that way). This is hyperbole. The critical feature IDB gives us is efficient range retrieval -- try that with LocalStorage. The whole point of IDB as far as I can see is to provide transactional indexed access to a key value store. You say indexed, I say ordered. An objectStore is more than a kv store -- the keys are stored and traversed in order. This is the win, what makes IDB objectStores so special. This also makes them look an awful lot like indexes too! (Which reminds me: last time I checked collation is still up in the air -- this could be very problematic for interop. Anyone know of any plans to correct this in the first version?) Why? You wouldn't necessarily have to store the whole object in each index, just the index key, a value and some pointer to the original source object. Something to resolve this pointer to the source would need to be spec'd (a la couchdb's include_docs), but that's simple. Even better, say it were possible to define a link relation on an object store that can resolve to its source object -- you could define a source link relation and the property to use -- and this would have the added bonus of being more broadly applicable than just linking an index record to its source instance. Think of the object creation and JSON serialization/deserialization overhead for putting 50 indexes and you have got more than enough waste there already. How does your proposal avoid this? We can fix all of this right now very simply: 1. Enable objectStore.put and objectStore.delete to accept a setIndexes option and an unsetIndexes option. The value passed for either option would be an array (string list) of index references. This would only work for indexes that are arrays of strings, right?
Things can get much more complicated than that, and when they do you'd have to use an objectStore to do your indexing anyway, right? No, it would work for pretty much anything. The application would be free to determine the indexes, and also to convert query parameters into indexes when querying. It's essentially computed indexes without the hassles of IDB trying to do it (there was an interesting thread last year on the challenges of storing an index computing function in IDB). Why is it more theoretically performant than using objectStores in the raw? It's a more direct interface. Think about it for a second. Using objectStores in the raw is interpolating O(n) complexity with multiple function calls, to give just one reason. Huh? If an objectStore is backed by something like a BDB btree, as is implied by the design of the spec, retrieval ought to be O(log base_n) where base_n is the average page size. Writing would have O(n) complexity where n is the number of indexes, but the same is true for your proposal, right? If
Re: [IndexedDB] Compound and multiple keys
No objections here. Keean. On 8 March 2011 21:14, Jonas Sicking jo...@sicking.cc wrote: On Mon, Mar 7, 2011 at 10:43 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Jan 21, 2011 at 1:41 AM, Jeremy Orlow jor...@chromium.org wrote: On Thu, Jan 20, 2011 at 6:29 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Thu, Jan 20, 2011 at 10:12 AM, Keean Schupke ke...@fry-it.com wrote: Compound primary keys are commonly used afaik. Indeed. It's one of the common themes in the debate between natural and synthetic keys. Fair enough. Should we allow explicit compound keys? I.e myOS.put({...}, ['first name', 'last name'])? I feel pretty strongly that if we do, we should require this be specified up-front when creating the objectStore. I.e. add some additional parameter to the optional options object. Otherwise, we'll force implementations to handle variable compound keys for just this one case, which seems kind of silly. The other option is to just disallow them. After thinking about it a bunch and talking to others, I'm actually leaning towards both option A and B. Although this will be a little harder for implementors, it seems like there are solid reasons why some users would want to use A and solid reasons why others would want to use B. Any objections to us going that route? Not from me. If I don't hear objections I'll write up a spec draft and attach it here before committing to the spec. / Jonas
Re: [IndexedDB] Two Real World Use-Cases
On 3 March 2011 09:15, Joran Greef jo...@ronomon.com wrote: Hi Jonas I have been trying out your suggestion of using a separate object store to do manual indexing (and so support compound indexes or index object properties with arrays as values). There are some problems with this approach: 1. It's far too slow. To put an object and insert 50 index records (typical when updating an inverted index) this way takes 100ms using IDB versus 10ms using WebSQL (with a separate indexes table and compound primary key on index name and object key). For instance, my application has a real requirement to replicate 4,000,000 emails between client and server and I would not be prepared to accept latencies of 100ms to store each object. That's more than the network latency. 2. It's a waste of space. Using a separate object store to do manual indexing may work in theory but it does not work in practice. I do not think it can even be remotely suggested as a panacea, however temporary it may be. We can fix all of this right now very simply: 1. Enable objectStore.put and objectStore.delete to accept a setIndexes option and an unsetIndexes option. The value passed for either option would be an array (string list) of index references. 2. The object would first be removed as a member from any indexes referenced by the unsetIndexes option. Any referenced indexes which would be empty thereafter would be removed. 3. The object would then be added as a member to any indexes referenced by the setIndexes option. Any referenced indexes which do not yet exist would be created. This would provide the much-needed indexing capabilities presently lacking in IDB without sacrificing performance. 
It would also enable developers to use IDB statefully (MySQL-like pre-defined schemas with the DB taking on the complexities of schema migration and data migration) or statelessly (see Berkeley DB, with the application responsible for the complexities of data maintenance) rather than enforcing an assumption at such an early stage. Regards Joran Greef Why would this be faster? Surely most of the time in inserting the 50 indexes is the search time of the index, and the JavaScript function call overhead would be minimal (it's only 50 calls)? Cheers, Keean.
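A toy model of the setIndexes/unsetIndexes proposal enumerated above. The option names and the create-on-demand/drop-when-empty semantics are Joran's proposal from this thread, not IndexedDB API; the in-memory Store just makes the three steps concrete:

```javascript
// Toy in-memory store implementing the proposed put semantics:
// 1. accept setIndexes/unsetIndexes options,
// 2. remove the object from unset indexes, dropping any left empty,
// 3. add the object to set indexes, creating any that do not yet exist.
function Store() {
  this.objects = {};  // key -> object
  this.indexes = {};  // index name -> Set of member keys
}
Store.prototype.put = function (key, object, opts) {
  this.objects[key] = object;
  var self = this;
  (opts.unsetIndexes || []).forEach(function (name) {
    var idx = self.indexes[name];
    if (!idx) return;
    idx.delete(key);                                // step 2: remove membership
    if (idx.size === 0) delete self.indexes[name];  // step 2: drop empty index
  });
  (opts.setIndexes || []).forEach(function (name) {
    if (!self.indexes[name]) self.indexes[name] = new Set(); // step 3: create
    self.indexes[name].add(key);                             // step 3: add
  });
  // A real engine could first diff setIndexes against the object's current
  // memberships to skip redundant IO, as suggested earlier in the thread.
};
```

Usage: `s.put('msg1', email, { setIndexes: ['to:a@x', 'tag:inbox'] })`, then a later put with `unsetIndexes: ['tag:inbox']` retags it without touching the other index.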
Re: [IndexedDB] Two Real World Use-Cases
If you are operating on indexes then you do not have a 'join' language, as you are operating on sets. To have a join you need to be operating on relations. A relation is commonly visualised as a row in a table in a relational database. With IDB, this would be the union of all the property-sets of the objects in the index. A complete set of relational operators would be: project, restrict, rename, join, union, difference. In most useful syntaxes you don't need rename, as the other methods handle renaming attributes already. Join is traditionally a Cartesian product, but a natural join can be substituted without losing completeness. Intersection is not included as it is easily derived from union and (symmetric) difference. Cheers, Keean. On 2 March 2011 06:35, Joran Greef jo...@ronomon.com wrote: On 01 Mar 2011, at 7:27 PM, Jeremy Orlow wrote: 1. Be able to put an object and pass an array of index names which must reference the object. This may remove the need for a complicated indexing spec (perhaps the reason why this issue has been pushed into the future) and give developers all the flexibility they need. You're talking about having multiple entries in a single index that point towards the same primary key? If so, then I strongly agree, and I think others agree as well. It's mostly a question of syntax. A while ago we brainstormed a couple possibilities. I'll try to send out a proposal this week. I think this + compound keys should probably be our last v1 features though. (Though they almost certainly won't make Chrome 11 or Firefox 4, unfortunately; hopefully they'll be done in the next version of each, and hopefully that release will be fairly soon after for both.) Yes, for example this user object { name: Joran Greef, emails: [ jo...@ronomon.com, jorangr...@gmail.com] } with indexes on the emails property, would be found in the jo...@ronomon.com index as well as in the jorangr...@gmail.com index.
What I've been thinking though is that the problem even with formally specifying indexes in advance of object put calls, is that this pushes too much application model logic into the database layer, making the database enforce a schema (at least in terms of indexes). Of course IDB facilitates migrations in the form of setVersion, but most schema migrations are also coupled with changes to the data itself, and this would still have to be done by the application in any event. So at the moment IDB takes too much responsibility on behalf of the application (computing indexes, pre-defined indexes, pseudo migrations) and not enough responsibility for pure database operations (index intersections and index unions). I would argue that things like migrations and schemas are best handled by the application, even if this is more work for the application, as most people will write wrappers for IDB in any event and IDB is supposed to be a core-level API. The acid test must be that the database is oblivious to schemas or anything pre-defined or application-specific (i.e. stateless). Otherwise IDB risks being a database for newbies who wouldn't use it, and a database that others would treat as a KV store anyway (see MySQL at FriendFeed). A suggested interface then for putting or deleting objects would be: objectStore.put(object, [indexname1, indexname2, indexname3]) and then IDB would need to ensure that the object would be referenced by the given index names. When removing the object, the application would need to provide the indexes again (or IDB could keep track of the indexes associated with an object). Using a function to compute indexes would not work, as this would entrap application-specific schema knowledge within the function (which would need to be persisted) and these may subsequently change in the application, which would then need a way to modify the function again. The key is that these things must be stateless.
The objects must be opaque to IDB (no need for serialization/deserialization overhead at the DB layer). Things like key-paths etc. could be removed and the object id just passed in to put or delete calls. 2. Be able to intersect and union indexes. This covers a tremendous amount of ground in terms of authorization and filtering. Our plan was to punt some sort of join language to v2. Could you give a more concrete proposal for what we'd add? It'd make it easier to see if it's something realistic for v1 or not. If you can perform intersect or union operations (and combinations of these) on indexes (which are essentially sets or sorted sets), then this would be the join language. It has the benefit that the interface would then be described in terms of operations on data structures (set operations on sets) rather than a custom language which would take longer to spec out. I've written databases over append-only files, S3, WebSQL and even LocalStorage (!) and from what I've found with my own applications, you could handle
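The intersect/union primitive asked for above can be sketched directly, assuming each index is materialised as a sorted array of primary keys. The function names here are illustrative assumptions, not proposed IDB API:

```javascript
// Intersection of two sorted key arrays via a linear merge.
function intersect(a, b) {
  var out = [], i = 0, j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) { out.push(a[i]); i++; j++; }
    else if (a[i] < b[j]) { i++; }
    else { j++; }
  }
  return out;
}

// Union of two sorted key arrays, also a linear merge, with duplicates
// collapsed so the result is again a sorted set of keys.
function union(a, b) {
  var out = [], i = 0, j = 0;
  while (i < a.length || j < b.length) {
    if (j >= b.length || (i < a.length && a[i] < b[j])) { out.push(a[i++]); }
    else if (i >= a.length || b[j] < a[i]) { out.push(b[j++]); }
    else { out.push(a[i]); i++; j++; }
  }
  return out;
}
```

Combinations of these two merges over sorted key sets are what would constitute the "join language" described in the message above.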
Re: [IndexedDB] Two Real World Use-Cases
On 2 March 2011 11:31, Jonas Sicking jo...@sicking.cc wrote: On Tue, Mar 1, 2011 at 10:35 PM, Joran Greef jo...@ronomon.com wrote: On 01 Mar 2011, at 7:27 PM, Jeremy Orlow wrote: 1. Be able to put an object and pass an array of index names which must reference the object. This may remove the need for a complicated indexing spec (perhaps the reason why this issue has been pushed into the future) and give developers all the flexibility they need. You're talking about having multiple entries in a single index that point towards the same primary key? If so, then I strongly agree, and I think others agree as well. It's mostly a question of syntax. A while ago we brainstormed a couple possibilities. I'll try to send out a proposal this week. I think this + compound keys should probably be our last v1 features though. (Though they almost certainly won't make Chrome 11 or Firefox 4, unfortunately; hopefully they'll be done in the next version of each, and hopefully that release will be fairly soon after for both.) Yes, for example this user object { name: Joran Greef, emails: [ jo...@ronomon.com, jorangr...@gmail.com] } with indexes on the emails property, would be found in the jo...@ronomon.com index as well as in the jorangr...@gmail.com index. What I've been thinking though is that the problem, even with formally specifying indexes in advance of object put calls, is that this pushes too much application model logic into the database layer, making the database enforce a schema (at least in terms of indexes). Of course IDB facilitates migrations in the form of setVersion, but most schema migrations are also coupled with changes to the data itself, and this would still have to be done by the application in any event. So at the moment IDB takes too much responsibility on behalf of the application (computing indexes, pre-defined indexes, pseudo migrations) and not enough responsibility for pure database operations (index intersections and index unions). 
I would argue that things like migrations and schemas are best handled by the application, even if this is more work for the application, as most people will write wrappers for IDB in any event and IDB is supposed to be a core-level API. The acid test must be that the database is oblivious to schemas or anything pre-defined or application-specific (i.e. stateless). Otherwise IDB risks being a database for newbies who wouldn't use it, and a database that others would treat as a KV store anyway (see MySQL at FriendFeed). A suggested interface then for putting or deleting objects would be: objectStore.put(object, [indexname1, indexname2, indexname3]) and then IDB would need to ensure that the object would be referenced by the given index names. When removing the object, the application would need to provide the indexes again (or IDB could keep track of the indexes associated with an object). Using a function to compute indexes would not work, as this would entrap application-specific schema knowledge within the function (which would need to be persisted), and these may subsequently change in the application, which would then need a way to modify the function again. The key is that these things must be stateless. The objects must be opaque to IDB (no need for serialization/deserialization overhead at the DB layer). Things like key-paths etc. could be removed and the object id just passed in to put or delete calls. I agree that we are currently enforcing a bit of schema due to the way indexes work. However I think it's a good approach for an initial version of this API as it covers the simplest use cases. Note that the more complex use cases are still very possible by simply using a separate objectStore as an index and manually adding/removing things there. I still believe that using a function, which is persisted in the database, is very doable. 
And yes, the function needs to be stateless and it needs to be possible to change the set of functions which manage the set of indexes associated with a given objectStore (probably by simply allowing indexes to be created and removed, which is already the case). / Jonas I would recommend against storing functions in the database (not saying it should not be possible, but stored procedures obscure functionality and cause surprises, both of which are bad things IMHO). For this kind of thing I would create a master index from object-id to object, and then create multiple secondary indexes from property to object-id. Removing an object is simply removing it from the master index. You would avoid the slow scan of the secondary indexes (slow because you have to visit each object to delete by value) by simply leaving the entries there; they would be filtered out of any results because the object-id is no longer in the master index (a fast lookup). You would then occasionally do a scan of the secondary indexes to remove several dead references in one
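The master-index/secondary-index scheme with lazy cleanup of dead references described above can be sketched with plain Maps and Sets standing in for IDB object stores. All names here are illustrative assumptions, not spec API:

```javascript
// master: id -> object; secondary: property value -> Set of ids.
function makeStore() {
  return { master: new Map(), secondary: new Map() };
}

function put(store, id, value) {
  store.master.set(id, value);
  if (!store.secondary.has(value)) store.secondary.set(value, new Set());
  store.secondary.get(value).add(id);
}

function remove(store, id) {
  // Only the master entry is removed; secondary entries become "dead".
  store.master.delete(id);
}

function lookupByValue(store, value) {
  // Dead secondary references are filtered out by the fast master lookup.
  var ids = store.secondary.get(value) || new Set();
  return [...ids].filter(function (id) { return store.master.has(id); });
}

function compactSecondary(store) {
  // Occasional scan that drops several dead references in one pass.
  store.secondary.forEach(function (ids, value) {
    ids.forEach(function (id) { if (!store.master.has(id)) ids.delete(id); });
    if (ids.size === 0) store.secondary.delete(value);
  });
}
```

Deletes stay cheap (one master-index removal) at the cost of stale secondary entries, which the periodic compaction reclaims.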
Re: [IndexedDB] Two Real World Use-Cases
On 2 March 2011 12:09, Joran Greef jo...@ronomon.com wrote: On 02 Mar 2011, at 1:31 PM, Jonas Sicking wrote: I agree that we are currently enforcing a bit of schema due to the way indexes work. However I think it's a good approach for an initial version of this API as it covers the simplest use cases. Note that the more complex use cases are still very possible by simply using a separate objectStore as an index and manually adding/removing things there. I still believe that using a function, which is persisted in the database, is very doable. And yes, the function needs to be stateless and it needs to be possible to change the set of functions which manage the set of indexes associated with a given objectStore (probably by simply allowing indexes to be created and removed, which is already the case). / Jonas Thank you Jonas, I'm using your multi objectStore trick at the moment to store indexes. It just seems that the most direct way of doing all of this would just be to let the application pass in the relevant index references when it makes put or delete calls. IDB is almost becoming a Rube Goldberg device trying to find other ways of doing this. The reason I bring it up is because I just made this same change with my server database, which used to require schema knowledge, so it could compute indexes etc., and then I realized this could all be eliminated completely by just passing indexes per put and delete call. I really don't think IDB should try and dip its toes into application state in the first place, let alone try and keep up with application state thereafter. What is the motivation for doing that? It's not absolutely necessary. It's an assumption that is bloating almost every part of the spec. It's not the killer feature of IDB, and it's getting in the way of things that could be, such as indexing and querying. If version 1 is done right, there will be no need for version 2. 
There's been a tremendous amount of discussion regarding IDB, and people like yourself and Jeremy have certainly contributed massively, but I do get the feeling (as may you) that version 2 is becoming a stopover for things that have not been thought through completely, for which a solution is not yet clear; something's not right. I only say this from recently re-writing a database after making the same mistake. Personally I think allowing multiple index entries for a single object breaks referential transparency. I would have one index where objects are indexed by a unique object ID, and another index where object-ids are indexed by email-address. I suspect this is what you are doing now? To improve on this situation (and keep referential transparency) would require multiple indexes on a single object (so you can have a unique primary key (object-id), and a secondary index on email-address), but as I said earlier you are then well on the way to re-inventing a relational database. IMHO you are then better off implementing relations properly, rather than producing something not entirely unlike a relational database. Cheers, Keean.
Re: IndexedDB: updates through cursors on indexes that change the key
Surely the cursor should be atomic, representing the instant in time the query executed. Any updates or deletes etc would not be visible to the cursor, only to later queries. Then you can allow any modifications including to keys and indexes. Cheers, Keean On 2 Feb 2011 00:05, Jeremy Orlow jor...@chromium.org wrote: On Tue, Feb 1, 2011 at 2:56 PM, Jonas Sicking jo...@sicking.cc wrote: On Tue, Feb 1, 2011 at 11:44 AM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Feb 1, 2... Good points (against having it remove the original key if it changes). After some more thought: The original idea behind cursor.delete() and cursor.update() was that they would basically just be aliases for objectStore.delete() and objectStore.put(). Maybe calling .update() with a changed primary key should simply have the same behavior as .put(). Thus the value corresponding to the original key would be left unmodified and the new key would then correspond to the new value. I can't think of any examples where the current behavior would get in someone's way though. So I guess maybe we should just leave it as is. But I still hate the idea of it being subtly different from being a straight up alias to put. J
Re: IndexedDB: updates through cursors on indexes that change the key
That seems to be different from accepted practice in databases. I On 2 Feb 2011 00:39, ben turner bent.mozi...@gmail.com wrote: No, that idea was rejected a while ago. IndexedDB cursors are live, so any change made during the cursor is visible to the cursor as well as to later queries. -Ben Turner On Tue, Feb 1, 2011 at 4:35 PM, Keean Schupke ke...@fry-it.com wrote: Surely the cursor should ...
Re: IndexedDB: updates through cursors on indexes that change the key
Sorry, sent that before I was finished. Seems prone to problems in environments with multiple parallel accesses to the same database. I guess I would need to do an atomic copy of the elements to a separate object store to iterate through? Is there a way of atomically copying a set of objects? Cheers, Keean. On 2 Feb 2011 00:41, Keean Schupke ke...@fry-it.com wrote: That seems to be different from accepted practice in databases. I On 2 Feb 2011 00:39, ben turner bent.mozi...@gmail.com wrote: No, that idea was rejecte... On Tue, Feb 1, 2011 at 4:35 PM, Keean Schupke ke...@fry-it.com wrote: Surely the cursor should ...
Re: IndexedDB: updates through cursors on indexes that change the key
So what's the benefit of allowing a cursor to modify the data under it? Cheers, Keean. On 2 February 2011 01:17, Jonas Sicking jo...@sicking.cc wrote: On Tue, Feb 1, 2011 at 4:48 PM, Keean Schupke ke...@fry-it.com wrote: Sorry, sent that before I was finished. Seems prone to problems in environments with multiple parallel accesses to the same database. As long as you're inside a transaction, no other environments (be they separate tabs running in a separate process, workers running in a separate thread, or separate components running in the same page) will be able to mutate the data under you. / Jonas
Re: IndexedDB: updates through cursors on indexes that change the key
I see. I suppose for the relational stuff that I am doing I will have to copy all the data in the cursor, otherwise it will mess up updates and inserts with nested selects. Cheers, Keean. On 2 Feb 2011 01:32, Jeremy Orlow jor...@chromium.org wrote: Please look at the mail archives. IIRC, it seemed confusing that you could be looking at old data. Iterating on live data seems more consistent with run to completion semantics. J On Tue, Feb 1, 2011 at 5:26 PM, Keean Schupke ke...@fry-it.com wrote: So whats the benefit o...
Re: [IndexedDB] Compound and multiple keys
Out of line keys (B) for me. You can have a key that is not an object property that way... and you can include the key in the object optionally. There is also no need to give the key fields in advance. These two things together make this the best option IMHO. Keean On 20 Jan 2011 10:52, Jeremy Orlow jor...@chromium.org wrote: Ok. So what's the resolution? Let's bug it! On Fri, Dec 10, 2010 at 12:34 PM, Jeremy Orlow jor...@chromium.org wrote: Any other thoughts on this issue? On Thu, Dec 2, 2010 at 7:19 AM, Keean Schupke ke...@fry-it.com wrote: I think I prefer A. Declaring the keys in advance is starting to sound a little like a schema, and when you go down that route you end up at SQL schemas (which is a good thing in my opinion). I understand however that some people are not so comfortable with the idea of a schema, and these people seem to be the kind of people that like IndexedDB. So, although I prefer A for me, I would have to say B for IndexedDB. So in conclusion: I think B is the better choice for IndexedDB, as it is more consistent with the design of IDB. As for the cons of B, sorting an array is just like sorting a string, and it already supports string types. Surely there is also option C: store.add({firstName: Benny, lastName: Zysk, age: 28}, [firstName, lastName]); store.add({firstName: Benny, lastName: Andersson, age: 63}, [firstName, lastName]); Like A, but listing the properties to include in the composite index with each add, therefore avoiding the schema... As for layering the Relational API over the top, it doesn't make any difference, but I would prefer whichever has the best performance. Cheers, Keean. On 2 December 2010 00:57, Jonas Sicking jo...@sicking.cc wrote: Hi IndexedDB fans (yay!!), Problem description: One of the current shortcomings of IndexedDB is that it doesn't support compound indexes. I.e. indexing on more than one value. 
For example it's impossible to index on, and therefore efficiently search for, firstname and lastname in an objectStore which stores people. Or index on to-address and date sent in an objectStore holding emails. The way this is traditionally done is that multiple values are used as key for each individual entry in an index or objectStore. For example the CREATE INDEX statement in SQL can list multiple columns, and the CREATE TABLE statement can list several columns as PRIMARY KEY. There have been a couple of suggestions for how to do this in IndexedDB Option A) When specifying a key path in createObjectStore and createIndex, allow an array of key-paths to be specified. Such as store = db.createObjectStore(mystore, [firstName, lastName]); store.add({firstName: Benny, lastName: Zysk, age: 28}); store.add({firstName: Benny, lastName: Andersson, age: 63}); store.add({firstName: Charlie, lastName: Brown, age: 8}); The records are stored in the following order Benny, Andersson Benny, Zysk Charlie, Brown Similarly, createIndex accepts the same syntax: store.createIndex(myindex, [lastName, age]); Option B) Allowing arrays as an additional data type for keys. store = db.createObjectStore(mystore, fullName); store.add({fullName: [Benny, Zysk], age: 28}); store.add({fullName: [Benny, Andersson], age: 63}); store.add({fullName: [Charlie, Brown], age: 8}); Also allows out-of-line keys using: store = db.createObjectStore(mystore); store.add({age: 28}, [Benny, Zysk]); store.add({age: 63}, [Benny, Andersson]); store.add({age: 8}, [Charlie, Brown]); (the sort order here is the same as in option A). Similarly, if an index used a keyPath which points to an array, this would create an entry in the index which used a compound key consisting of the values in the array. There are of course advantages and disadvantages with both options. Option A advantages: * Ensures that at objectStore/index creation time the number of keys is known. 
This allows the implementation to create and optimize the index using this information. This is especially useful in situations when the indexedDB implementation is backed by a SQL database which uses columns as a way to represent multiple keys. * Easy to use when key values appear as separate properties on the stored object. * Obvious how to sort entries. Option A disadvantages: * Doesn't allow compound out-of-line keys. * Requires multiple properties to be added to stored objects if the components of the key aren't available there (for example if it's out-of-line or stored in an array). Option B advantages: * Allows compound out-of-line keys. * Easy to use when the key values are handled as an array by other code. Both when using in-line and out-of-line keys. * Maximum flexibility since you can combine single-value keys and compound keys in one objectStore, as well as arrays of different length (we couldn't come up with use cases for this though). Option B disadvantages: * Requires defining sorting between single values
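The sort order option B relies on can be sketched as an element-wise array comparator. This is an illustrative reading of the ordering shown in the examples above (Benny, Andersson before Benny, Zysk before Charlie, Brown), not the spec's normative definition:

```javascript
// Compare two compound keys (arrays of strings) element by element;
// when one key is a prefix of the other, the shorter key sorts first.
function compareKeys(a, b) {
  var n = Math.min(a.length, b.length);
  for (var i = 0; i < n; i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

// Sorting the option B example keys reproduces the stated order.
var keys = [['Benny', 'Zysk'], ['Charlie', 'Brown'], ['Benny', 'Andersson']];
keys.sort(compareKeys);
```

The open question the message ends on (defining sorting between single values and arrays) would need an extra rule layered on top of this comparator, e.g. ranking key types before comparing within a type.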
Re: [chromium-html5] LocalStorage inside Worker
I think the idea is that JavaScript should not do unexpected things. The suggestion to only make local storage accessible from inside callbacks seems the best suggestion so far. Cheers, Keean. On 11 January 2011 06:20, Felix Halim felix.ha...@gmail.com wrote: On Tue, Jan 11, 2011 at 1:02 PM, Glenn Maynard gl...@zewt.org wrote: localStorage should focus on simplicity and performance and ignore thread safety since, IMHO, localStorage is used for UI purposes or preferences settings (not data itself). If you open two tabs and you change settings in one tab, you can just refresh the other tab and I believe both of them will have the same UI state again. It's used for data storage, too, particularly since it's widely available in production; IndexedDB is not. Then, why not introduce a new storage, like localStorageNTS (NTS = non thread safe), and allow this storage to be used everywhere... Felix Halim
Re: [chromium-html5] LocalStorage inside Worker
I think I already came to the same conclusion... JavaScript has no control over effects, which devalues STM. In the absence of effect control, apparent serialisation (of transactions) is the best you can do. What we need is a purely functional JavaScript; it makes threading so much easier ;-) Cheers, Keean. On 10 January 2011 23:42, Robert O'Callahan rob...@ocallahan.org wrote: STM is not a panacea. Read http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx if you haven't already. In Haskell, where you have powerful control over effects, it may work well, but Javascript isn't anything like that. Rob -- Now the Bereans were of more noble character than the Thessalonians, for they received the message with great eagerness and examined the Scriptures every day to see if what Paul said was true. [Acts 17:11]
Re: [IndexedDB] Events and requests
Comments inline: On 11 January 2011 07:11, Axel Rauschmayer a...@rauschma.de wrote: Coming back to the initial message in this thread (at the very bottom): = General rule of thumb: clearly separate input data and output data. Using JavaScript's dynamic nature, things could look as follows: indexedDB.open('AddressBook', 'Address Book', { success: function(evt) { }, error: function(evt) { } }); Personally I prefer a single callback passed an object. indexedDB.open('AddressBook', 'Address Book', function(event) { switch(event.status) { case EVENT_SUCCESS: break; case EVENT_ERROR: break; } }); As it allows callbacks to be composed more easily. - The last argument is thus the request and clearly input. - If multiple success handlers are needed, success could be an array of functions (same for error handlers). multiple handlers can be passed using a composition function: // can be defined in the library var all = function(flist) { return function(event) { for (var i = 0; i < flist.length; i++) { flist[i](event); } }; }; indexedDB.open('AddressBook', 'Address Book', all([fn1, fn2, fn3])); Cheers, Keean. - I would eliminate readyState and move abort() to IDBEvent (=output and an interface to the DB client). - With subclasses of IDBEvent one has the choice of eliminating them by making their fields additional parameters of success() and error(). event.result is a prime candidate for this! - This above way eliminates the need of manipulating the request *after* (a reference to) it has been placed in the event queue. Questions: - Is it really necessary to make IDBEvent a subclass of Event and thus drag the DOM (which seems to be universally hated) into IndexedDB? - Are there any other asynchronous DB APIs for dynamic languages that one could learn from (especially from mistakes that they have made)? They must have design principles and rationales one might be able to use. WebDatabase (minus schema plus cursor) looks nice. 
On Jan 10, 2011, at 23:40 , Keean Schupke wrote: Hi, I did say it was for fun! If you think it should be suggested somewhere I am happy to do so. Note that I renamed 'onsuccess' to 'bind' to show how it works as a monad; there is no need to do this (although I prefer it to explicitly show it is a Monad). The definition of unit is simply: var unit = function(v) { return { onsuccess: function(f) {f(v);} }; }; And then you can compose callbacks using 'onsuccess'... you might like to keep onsuccess, and use result instead of unit... So simply using the above definition you can compose callbacks: var y = db.transaction([foo]).objectStore(foo).getM(mykey1).onsuccess(function(result1) { db.transaction([foo]).objectStore(foo).getM(mykey2).onsuccess(function(result2) { result(result1 + result2); }); }); Cheers, Keean. On 10 January 2011 22:31, Jonas Sicking jo...@sicking.cc wrote: This seems like something better suggested to the lists at ECMA where javascript (or rather ECMAScript) is being standardized. I hardly think that a database API like indexedDB is the place to redefine how javascript should handle asynchronous programming. / Jonas On Mon, Jan 10, 2011 at 2:26 PM, Keean Schupke ke...@fry-it.com wrote: Just to correct my cut and paste error, that was of course supposed to be: var y = do { result1 - db.transaction([foo]).objectStore(foo).getM(mykey1); result2 - db.transaction([foo]).objectStore(foo).getM(mykey2); unit(result1 + result2); } Cheers, Keean. On 10 January 2011 22:24, Keean Schupke ke...@fry-it.com wrote: Okay, sorry, the original change seemed sensible, I guess I didn't see how you got from there to promises. Here's some fun to think about as an alternative though: Interestingly the pattern of multiple callbacks, providing each callback is passed zero or one parameter, forms a Monad. So for example if 'unit' is the constructor for the object returned from get, then onsuccess is 'bind', and I can show that these obey the 3 monad laws. 
Allowing composability of callbacks. So you effectively have: var x = db.transaction([foo]).objectStore(foo).getM(mykey); var y = db.transaction([foo]).objectStore(foo).getM(mykey1).bind(function(result1) { db.transaction([foo]).objectStore(foo).getM(mykey2).bind(function(result2) { unit(result1 + result2); }); }); The two objects returned, x and y, are both the same kind of object. y represents the sum or concatenation of the results of the lookups mykey1 and mykey2. You would use it identically to using the result of a single lookup: x.bind(function(result) {... display the result of a single lookup ...}); y.bind(function(result) {... display the result of both lookups ...}); If we could then have some syntactic
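A runnable version of the unit/bind pattern sketched above, with a synchronous getM standing in for the asynchronous IndexedDB lookup. The backing store and getM are assumptions for illustration; a real getM would fire its callback later:

```javascript
// unit wraps a value in the callback-composition shape described above.
function unit(v) {
  return { bind: function (f) { f(v); } };
}

// Synchronous stand-in for db.transaction(...).objectStore(...).getM(key).
var store = { mykey1: 1, mykey2: 2 };
function getM(key) { return unit(store[key]); }

// Composing two lookups yields an object of the same shape as a single
// lookup: bind chains the callbacks, unit re-wraps the combined result.
var y = {
  bind: function (f) {
    getM('mykey1').bind(function (result1) {
      getM('mykey2').bind(function (result2) {
        unit(result1 + result2).bind(f);
      });
    });
  }
};
```

Used identically to a single lookup: `y.bind(function (result) { /* display */ })` receives the combined result, which is what makes the composition "monadic" in the sense described.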
Re: [IndexedDB] Events and requests
If one handler changes the state who knows what will happen. I guess the order in which handlers are called is significant. That's one advantage to using a function like all to compose callbacks - it's very clear what order they get called in. You could call it 'sequence' to make it even clearer (that they are called one at a time left to right, not in parallel). You could make the callback an optional parameter, and use it if supplied, and return an object (for the existing API if none is supplied). Cheers, Keean. On 11 January 2011 09:31, Axel Rauschmayer a...@rauschma.de wrote: Looks great, I just tried to stay as close to the current API as possible. A single handler should definitely be enough. Can, say, a cursor be read multiple times (if there are several success handlers)? Doesn’t that make things more complicated? On Jan 11, 2011, at 10:22 , Keean Schupke wrote: Comments inline: On 11 January 2011 07:11, Axel Rauschmayer a...@rauschma.de wrote: Coming back to the initial message in this thread (at the very bottom): = General rule of thumb: clearly separate input data and output data. Using JavaScript's dynamic nature, things could look as follows: indexedDB.open('AddressBook', 'Address Book', { success: function(evt) { }, error: function(evt) { } }); Personally I prefer a single callback passed an object. indexedDB.open('AddressBook', 'Address Book', function(event) { switch(event.status) { case EVENT_SUCCESS: break; case EVENT_ERROR: break; } }); As it allows callbacks to be composed more easily. - The last argument is thus the request and clearly input. - If multiple success handlers are needed, success could be an array of functions (same for error handlers). multiple handlers can be passed using a composition function: // can be defined in the library var all = function(flist) { return function(event) { for (var i = 0; i < flist.length; i++) { flist[i](event); } }; }; indexedDB.open('AddressBook', 'Address Book', all([fn1, fn2, fn3])); Cheers, Keean. 
- I would eliminate readyState and move abort() to IDBEvent (=output and an interface to the DB client). - With subclasses of IDBEvent one has the choice of eliminating them by making their fields additional parameters of success() and error(). event.result is a prime candidate for this! - This above way eliminates the need of manipulating the request *after* (a reference to) it has been placed in the event queue. Questions: - Is it really necessary to make IDBEvent a subclass of Event and thus drag the DOM (which seems to be universally hated) into IndexedDB? - Are there any other asynchronous DB APIs for dynamic languages that one could learn from (especially from mistakes that they have made)? They must have design principles and rationales one might be able to use. WebDatabase (minus schema plus cursor) looks nice. On Jan 10, 2011, at 23:40 , Keean Schupke wrote: Hi, I did say it was for fun! If you think it should be suggested somewhere I am happy to do so. Note that I renamed 'onsuccess' to 'bind' to show how it works as a monad; there is no need to do this (although I prefer it to explicitly show it is a Monad). The definition of unit is simply: var unit = function(v) { return { onsuccess: function(f) {f(v);} }; }; And then you can compose callbacks using 'onsuccess'... you might like to keep onsuccess, and use result instead of unit... So simply using the above definition you can compose callbacks: var y = db.transaction([foo]).objectStore(foo).getM(mykey1).onsuccess(function(result1) { db.transaction([foo]).objectStore(foo).getM(mykey2).onsuccess(function(result2) { result(result1 + result2); }); }); Cheers, Keean. On 10 January 2011 22:31, Jonas Sicking jo...@sicking.cc wrote: This seems like something better suggested to the lists at ECMA where javascript (or rather ECMAScript) is being standardized. I hardly think that a database API like indexedDB is the place to redefine how javascript should handle asynchronous programming. 
/ Jonas On Mon, Jan 10, 2011 at 2:26 PM, Keean Schupke ke...@fry-it.com wrote: Just to correct my cut and paste error, that was of course supposed to be: var y = do { result1 - db.transaction([foo]).objectStore(foo).getM(mykey1); result2 - db.transaction([foo]).objectStore(foo).getM(mykey2); unit(result1 + result2); } Cheers, Keean. On 10 January 2011 22:24, Keean Schupke ke...@fry-it.com wrote: Okay, sorry, the original change seemed sensible, I guess I didn't see how you got from there to promises. Here's some fun to think about as an alternative though: Interestingly the pattern of multiple callbacks, providing each callback is passed zero or one parameter forms a Monad. So for example
Re: [chromium-html5] LocalStorage inside Worker
Would each 'name' storage have its own thread to improve parallelism? would: withNamedStorage('x', function(store) {...}); make more sense from a naming point of view? Cheers, Keean. On 11 January 2011 20:58, Jonas Sicking jo...@sicking.cc wrote: With localStorage being the way it is, I personally don't think we can ever allow localStorage access in workers. However I do think we can and should provide access to a separate storage area (or several named storage areas) which can only be accessed from callbacks. On the main thread those callbacks would be asynchronous. In workers those callbacks can be either synchronous or asynchronous. Here is the API I'm proposing: getNamedStorage(in DOMString name, in Function callback); getNamedStorageSync(in DOMString name, in Function callback); The latter is only available in workers. The former is available in both workers and in windows. When the callback is called it's given a reference to the Storage object which has the exact same API as localStorage does. Also, you're not allowed to nest getNamedStorageSync and/or IDBDatabaseSync.transaction calls. This has the added advantage that it's much more implementable without threading hazards than localStorage already is. / Jonas On Tue, Jan 11, 2011 at 6:40 AM, Jeremy Orlow jor...@chromium.org wrote: So what's the plan for localStorage in workers? J On Tue, Jan 11, 2011 at 9:10 AM, Keean Schupke ke...@fry-it.com wrote: I think I already came to the same conclusion... JavaScript has no control over effects, which devalues STM. In the absence of effect control, apparent serialisation (of transactions) is the best you can do. What we need is a purely functional JavaScript, it makes threading so much easier ;-) Cheers, Keean. On 10 January 2011 23:42, Robert O'Callahan rob...@ocallahan.org wrote: STM is not a panacea. Read http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx if you haven't already. 
In Haskell, where you have powerful control over effects, it may work well, but Javascript isn't anything like that. Rob -- Now the Bereans were of more noble character than the Thessalonians, for they received the message with great eagerness and examined the Scriptures every day to see if what Paul said was true. [Acts 17:11]
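A minimal in-memory sketch of the getNamedStorage shape Jonas proposes, assuming a Storage-like object with localStorage's getItem/setItem/removeItem methods. The backing Maps and the synchronous callback invocation are simplifications of the proposed asynchronous main-thread behaviour:

```javascript
// Named storage areas, each only reachable from inside a callback.
var areas = new Map();

function getNamedStorage(name, callback) {
  if (!areas.has(name)) areas.set(name, new Map());
  var area = areas.get(name);
  // Storage-like facade mirroring the localStorage API surface.
  var storage = {
    getItem: function (k) { return area.has(k) ? area.get(k) : null; },
    setItem: function (k, v) { area.set(k, String(v)); },
    removeItem: function (k) { area.delete(k); }
  };
  // The real proposal would invoke this asynchronously on the main
  // thread (and synchronously in workers); calling it directly here
  // keeps the sketch simple.
  callback(storage);
}
```

Confining access to the callback is what gives the implementation room to serialise access to each named area without the threading hazards localStorage has.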
Re: [IndexedDB] Events and requests
Okay, sorry, the original change seemed sensible, I guess I didn't see how you got from there to promises. Here's some fun to think about as an alternative though: Interestingly the pattern of multiple callbacks, providing each callback is passed zero or one parameter, forms a Monad. So for example if 'unit' is the constructor for the object returned from get, then onsuccess is 'bind', and I can show that these obey the 3 monad laws. Allowing composability of callbacks. So you effectively have: var x = db.transaction([foo]).objectStore(foo).getM(mykey); var y = db.transaction([foo]).objectStore(foo).getM(mykey1).bind(function(result1) { db.transaction([foo]).objectStore(foo).getM(mykey2).bind(function(result2) { unit(result1 + result2); }); }); The two objects returned, x and y, are both the same kind of object. y represents the sum or concatenation of the results of the lookups mykey1 and mykey2. You would use it identically to using the result of a single lookup: x.bind(function(result) {... display the result of a single lookup ...}); y.bind(function(result) {... display the result of both lookups ...}); If we could then have some syntactic sugar for this like Haskell's do notation we could write: var y = do { db.transaction([foo]).objectStore(foo).getM(mykey1); result1 - db.transaction([foo]).objectStore(foo).getM(mykey2); result2 - db.transaction([foo]).objectStore(foo).getM(mykey2); unit(result1 + result2); } Which would be a very neat way of chaining callbacks... Cheers, Keean. On 10 January 2011 22:00, Keean Schupke ke...@fry-it.com wrote: Whats wrong with callbacks? To me this seems an unnecessary complication. Presumably you would do: var promise = db.transaction([foo]).objectStore(foo).get(mykey); var result = promise.get(); if (!result) { promise.onsuccess(function(res) {...X...}); } else { ...Y... } So you end up having to duplicate code at X and Y to do the same thing directly or in the context of a callback. 
Or you define a function to process the result: var f = function(res) {...X...}; var promise = db.transaction([foo]).objectStore(foo).get(mykey); var result = promise.get(); if (!result) { promise.onsuccess(f); } else { f(result) }; But in that case, what advantage does all this extra clutter offer over: db.transaction([foo]).objectStore(foo).get(mykey).onsuccess(function(res) {...X...}); I am just wondering whether the change is worth the added complexity? Cheers, Keean. On 10 January 2011 21:31, Jonas Sicking jo...@sicking.cc wrote: I did some outreach to developers and while I didn't get a lot of feedback, what I got was positive to this change. The basic use-case that was brought up was implementing promises, which, as I understand it, work similarly to the request model I'm proposing. I.e. you build up these promise objects which represent a result which may or may not have arrived yet. At some point you can either read the value out, or if it hasn't arrived yet, register a callback for when the value arrives. It was pointed out that this is still possible with how the spec is now, but it will probably result in developers coming up with conventions to set the result on the request themselves. This wouldn't be terribly bad, but it also seems nice if we can help them. / Jonas On Mon, Jan 10, 2011 at 8:13 AM, ben turner bent.mozi...@gmail.com wrote: FWIW Jonas' proposed changes have been implemented and will be included in Firefox 4 Beta 9, due out in a few days. -Ben On Fri, Dec 10, 2010 at 12:47 PM, Jonas Sicking jo...@sicking.cc wrote: I've been reaching out to get feedback, but no success yet. Will re-poke. / Jonas On Fri, Dec 10, 2010 at 4:33 AM, Jeremy Orlow jor...@chromium.org wrote: Any additional thoughts on this? If no one else cares, then we can go with Jonas' proposal (and we should file a bug). 
J On Thu, Nov 11, 2010 at 12:06 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Nov 9, 2010 at 11:35 AM, Jonas Sicking jo...@sicking.cc wrote: Hi All, One of the things we briefly discussed at the summit was that we should make IDBErrorEvents have a .transaction. This is because we are allowing you to place new requests from within error handlers, but we currently provide no way to get from an error handler to any useful objects. Instead developers will have to use closures to get to the transaction or other object stores. Another thing that is somewhat strange is that we only make the result available through the success event. There is no way after that to get it from the request. So instead we use special event interfaces which supply access to source, transaction and result. Compare this to how XMLHttpRequests work. Here the result and error code is available on the request object itself. The 'load' event, which is equivalent to our 'success' event, didn't supply any
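The unit/bind pattern described in the thread above can be sketched in plain JavaScript. This is an illustrative stand-in for the idea, not the IndexedDB API: `getM` here reads synchronously from a plain object, and the explicit `return` statements (which the email's pseudocode omits) are needed for the chain to compose.

```javascript
// Illustrative sketch only: a synchronous stand-in for the getM/bind
// pattern. `unit` wraps a value; `bind` feeds it to a callback that
// must return another wrapped value, keeping the chain composable.
function unit(value) {
  return {
    bind: function (f) { return f(value); }
  };
}

// Hypothetical getM: looks a key up in a plain object, synchronously.
function getM(store, key) {
  return {
    bind: function (f) { return f(store[key]); }
  };
}

// Chaining two lookups, as in the email's `y` example:
var store = { mykey1: 1, mykey2: 2 };
var y = getM(store, 'mykey1').bind(function (result1) {
  return getM(store, 'mykey2').bind(function (result2) {
    return unit(result1 + result2);
  });
});

var sum;
y.bind(function (result) { sum = result; return unit(result); });
// sum is now 3 (1 + 2)
```

Because everything here is synchronous, `bind` fires immediately; a real request object would instead store the callback and invoke it when the success event arrives.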
Re: [IndexedDB] Events and requests
Just to correct my cut and paste error, that was of course supposed to be: var y = do { result1 <- db.transaction([foo]).objectStore(foo).getM(mykey1); result2 <- db.transaction([foo]).objectStore(foo).getM(mykey2); unit(result1 + result2); } Cheers, Keean. On 10 January 2011 22:24, Keean Schupke ke...@fry-it.com wrote: Okay, sorry, the original change seemed sensible, I guess I didn't see how you got from there to promises. Here's some fun to think about as an alternative though: Interestingly, the pattern of multiple callbacks, providing each callback is passed zero or one parameter, forms a monad. So for example if 'unit' is the constructor for the object returned from get, then onsuccess is 'bind', and I can show that these obey the 3 monad laws, allowing composability of callbacks. So you effectively have: var x = db.transaction([foo]).objectStore(foo).getM(mykey); var y = db.transaction([foo]).objectStore(foo).getM(mykey1).bind(function(result1) { db.transaction([foo]).objectStore(foo).getM(mykey2).bind(function(result2) { unit(result1 + result2); }); }); The two objects returned, x and y, are both the same kind of object. y represents the sum or concatenation of the results of the lookups mykey1 and mykey2. You would use it identically to using the result of a single lookup: x.bind(function(result) {... display the result of a single lookup ...}); y.bind(function(result) {... display the result of both lookups ...}); If we could then have some syntactic sugar for this like Haskell's do notation we could write: var y = do { db.transaction([foo]).objectStore(foo).getM(mykey1); result1 <- db.transaction([foo]).objectStore(foo).getM(mykey2); result2 <- db.transaction([foo]).objectStore(foo).getM(mykey2); unit(result1 + result2); } Which would be a very neat way of chaining callbacks... Cheers, Keean. On 10 January 2011 22:00, Keean Schupke ke...@fry-it.com wrote: What's wrong with callbacks? To me this seems an unnecessary complication. 
Presumably you would do: var promise = db.transaction([foo]).objectStore(foo).get(mykey); var result = promise.get(); if (!result) { promise.onsuccess(function(res) {...X...}); } else { ...Y... } So you end up having to duplicate code at X and Y to do the same thing directly or in the context of a callback. Or you define a function to process the result: var f = function(res) {...X...}; var promise = db.transaction([foo]).objectStore(foo).get(mykey); var result = promise.get(); if (!result) { promise.onsuccess(f); } else { f(result) }; But in that case, what advantage does all this extra clutter offer over: db.transaction([foo]).objectStore(foo).get(mykey).onsuccess(function(res) {...X...}); I am just wondering whether the change is worth the added complexity? Cheers, Keean. On 10 January 2011 21:31, Jonas Sicking jo...@sicking.cc wrote: I did some outreach to developers and while I didn't get a lot of feedback, what I got was positive to this change. The basic use-case that was brought up was implementing promises, which, as I understand it, work similarly to the request model I'm proposing. I.e. you build up these promise objects which represent a result which may or may not have arrived yet. At some point you can either read the value out, or if it hasn't arrived yet, register a callback for when the value arrives. It was pointed out that this is still possible with how the spec is now, but it will probably result in developers coming up with conventions to set the result on the request themselves. This wouldn't be terribly bad, but it also seems nice if we can help them. / Jonas On Mon, Jan 10, 2011 at 8:13 AM, ben turner bent.mozi...@gmail.com wrote: FWIW Jonas' proposed changes have been implemented and will be included in Firefox 4 Beta 9, due out in a few days. -Ben On Fri, Dec 10, 2010 at 12:47 PM, Jonas Sicking jo...@sicking.cc wrote: I've been reaching out to get feedback, but no success yet. Will re-poke. 
/ Jonas On Fri, Dec 10, 2010 at 4:33 AM, Jeremy Orlow jor...@chromium.org wrote: Any additional thoughts on this? If no one else cares, then we can go with Jonas' proposal (and we should file a bug). J On Thu, Nov 11, 2010 at 12:06 PM, Jeremy Orlow jor...@chromium.org wrote: On Tue, Nov 9, 2010 at 11:35 AM, Jonas Sicking jo...@sicking.cc wrote: Hi All, One of the things we briefly discussed at the summit was that we should make IDBErrorEvents have a .transaction. This is because we are allowing you to place new requests from within error handlers, but we currently provide no way to get from an error handler to any useful objects. Instead developers will have to use closures to get to the transaction or other object stores. Another thing that is somewhat strange is that we only make the result available through the success
Re: [chromium-html5] LocalStorage inside Worker
On 8 January 2011 00:57, Glenn Maynard gl...@zewt.org wrote: On Thu, Jan 6, 2011 at 6:06 PM, Charles Pritchard ch...@jumis.com wrote: I don't think localStorage should be (to web workers), but sessionStorage seems a reasonable request. It's not arbitrary: the names local and session convey some meaning. localStorage works well enough, out in the wild. sessionStorage is not in wide use. I don't think it's restrictive, it just creates a wider implementation divide between session and local. What I meant was: you said that you don't think localStorage should be available to workers, but I don't understand why. Why should sessionStorage be available, but localStorage not? -- Glenn Maynard There is also the issue that current localStorage implementations may be broken by multiple tabs/windows. To say it works well enough in the wild seems to ignore this brokenness. If access had to be from inside an atomic block (a callback from a single storage thread) then this would fix access from multiple tabs/windows as well as from worker threads. This could be implemented as a single-threaded callback serialising access to the storage, but implementers could choose to use Software Transactional Memory techniques to give their browser a speed advantage. Cheers, Keean.
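The "atomic block" access style suggested above can be sketched as a callback queue that grants each job exclusive access to the backing store, one at a time. All names here (`makeSerializedStorage`, `atomic`) are hypothetical, for illustration only; a real implementation would dispatch the callbacks across threads rather than synchronously.

```javascript
// Sketch: serialise all storage access through queued callbacks. Each
// callback runs to completion with exclusive access to `backing`, so
// two contexts can never interleave inside a transaction.
function makeSerializedStorage(backing) {
  var queue = [];
  var running = false;
  function runNext() {
    while (!running && queue.length > 0) {
      running = true;
      var job = queue.shift();
      job(backing); // exclusive access until the callback returns
      running = false;
    }
  }
  return {
    atomic: function (callback) {
      queue.push(callback); // re-entrant calls are queued, not nested
      runNext();
    }
  };
}

var storage = makeSerializedStorage({});
storage.atomic(function (store) { store.counter = (store.counter || 0) + 1; });
storage.atomic(function (store) { store.counter += 1; });
var result;
storage.atomic(function (store) { result = store.counter; });
// result is 2: the increments were applied in order, never interleaved
```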
Re: Limited DOM in Web Workers
Hi, Sorry for this small aside, but it is (slightly) relevant. What do you suggest people use instead of e4x in general? For example: var x = <table><tr><td>something</td></tr></table>; Is a lot more elegant than: var x2 = document.createTextNode('something'); var x1 = document.createElement('td'); x1.appendChild(x2); var x0 = document.createElement('tr'); x0.appendChild(x1); var x = document.createElement('table'); x.appendChild(x0); The only thing I can think of is having the table attached to the document but hidden, and then copying the HTML fragment: var x = document.getElementById('hiddentable').cloneNode(true); But how do you ensure the renderer and DOM traversal ignores the hidden node? In an HTML5 app with multiple UI elements that need to be on screen at different times it could slow things down a lot. Cheers, Keean. On 8 January 2011 09:09, Jonas Sicking jo...@sicking.cc wrote: On Fri, Jan 7, 2011 at 7:34 PM, Boris Zbarsky bzbar...@mit.edu wrote: On 1/7/11 2:29 PM, Jack Coulter wrote: I'm not talking about allowing Workers to manipulate the main DOM tree of the page, but rather, exposing DOMParser, and XMLHttpRequest.responseXML, and a few other objects to workers, to allow the manipulation of DOM trees which are never actually rendered to the page. Whether they're rendered doesn't necessarily matter if the DOM implementation is not threadsafe (which it's not, in today's UAs). That said... This would allow developers to parse and manipulate XML in workers, freeing the main thread of a page to perform other tasks. ... An example of a use-case: I'd like to hack on the Strophe.js XMPP implementation to allow it to run in a worker thread; currently this is impossible without writing my own XML parser, which would undoubtedly be slower than the native DOMParser. If you think you could do this with your own XML parser, is there a reason you can't do it with e4x (I never thought I'd say that, but this seems like an actually good use case for something like e4x)? 
That should work fine in workers in Gecko-based browsers that support it, and doesn't drag in the entire DOM implementation. That leaves the problem of convincing developers of those ECMAScript implementations that don't support e4x to support it, of course; while things like http://code.google.com/p/v8/issues/detail?id=235#c42 don't necessarily fill me with hope in that regard, it may still be simpler than convincing all browsers to rewrite their DOMs to be threadsafe in the way that would be needed to support exposing an actual DOM in workers. I would strongly advise against using e4x. It seems unlikely to be picked up by other browsers, and I'm still hoping that we'll remove support from Gecko before long. My question is instead: what part of the DOM is it that you want? One of the most important features of the DOM is modifying what is being displayed to the user. Obviously that isn't the feature requested here. Another important feature is simply holding a tree structure. However, plain JavaScript objects do that very well (better than the DOM in many ways). Other features of the DOM include form handling, parsing attribute values in the form of integers, floats, comma-separated lists, etc., URL resolving and more. Much of this doesn't seem very interesting to do in workers, or at least important to have the browser provide an implementation for in workers. Hence I'm asking: why specifically would you like to access a DOM from workers? / Jonas
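Much of E4X's brevity in the example above can be recovered with a small helper function, without engine support. `el` is a hypothetical convenience, not a DOM API: it takes a document, a tag name, and children (strings become text nodes). The stub document below is an illustrative stand-in so the sketch runs outside a browser; in a page you would pass the real `document`.

```javascript
// Hypothetical helper: el(doc, tag, ...children) builds an element and
// appends each child, wrapping plain strings as text nodes.
function el(doc, tag) {
  var node = doc.createElement(tag);
  for (var i = 2; i < arguments.length; i++) {
    var child = arguments[i];
    node.appendChild(
      typeof child === 'string' ? doc.createTextNode(child) : child
    );
  }
  return node;
}

// Minimal stand-in for `document`, for illustration only:
var doc = {
  createElement: function (t) {
    return {
      tag: t,
      children: [],
      appendChild: function (c) { this.children.push(c); }
    };
  },
  createTextNode: function (s) { return { text: s }; }
};

// Equivalent to the verbose createElement chain in the email:
var x = el(doc, 'table', el(doc, 'tr', el(doc, 'td', 'something')));
```

The nesting of the calls mirrors the nesting of the markup, which is most of what made the E4X version readable.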
Re: [chromium-html5] LocalStorage inside Worker
On 8 January 2011 10:00, Glenn Maynard gl...@zewt.org wrote: On Sat, Jan 8, 2011 at 4:06 AM, Keean Schupke ke...@fry-it.com wrote: If access had to be from inside an atomic block (a callback from a single storage-thread) then this would fix access from multiple tabs/windows as well as from worker threads. Your suggestion and Jonas's are very similar. I think the difference is that you're suggesting an API that would permit non-serialized access to the objects, by using transactional methods, where Jonas's completely serializes access. Jonas's is much simpler; I don't think the complexity of this type of transactional access is needed, or appropriate for simple Storage objects. -- Glenn Maynard I am suggesting that, as the semantics are the same, people can think of this as serialised access, but implementers can use STMs to make their browser faster than the competition (if they want). To the user it will look the same. Cheers, Keean.
Re: [chromium-html5] LocalStorage inside Worker
So long as you only allow asynchronous access, the implementation can ensure that a worker and the main thread don't have access to the storage at the same time. Then it is safe to allow everyone to modify the storage area. / Jonas This is true; serialising access would have the same semantics as STM. In fact you could consider STM to be a performance enhancement over sequential access, by optimistically allowing concurrent modifications and only doing something special if there is a collision (a read from a location written by another thread during the transaction). In which case STM works like a database and rolls back the transaction. It is really putting a thread-local log between the user and the storage. The main storage is then only locked during the log commit, reducing resource contention. A rollback is simply discarding the log. But this would behave identically (apart from the extra features in STM like guards and retry) to serialisation of requests. A simple (non-STM) implementation would be to have a single thread associated with the localStorage and require all accesses to be executed by that thread (in callbacks). You could use the main UI thread, but it would make worker threads wait for storage access during DOM processing in callbacks etc... Cheers, Keean.
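The log-commit-rollback scheme described above can be sketched as follows. The version counter is a crude stand-in for real collision detection, and all names (`transaction`, `view`, the `store` shape) are illustrative assumptions, not a real STM: in single-threaded JavaScript nothing can actually interleave, so this only models the bookkeeping.

```javascript
// Sketch of a thread-local transaction log. Reads record the store
// version they saw; writes are buffered in the log. Commit applies the
// log only if no read was invalidated; otherwise the log is discarded
// (rollback) and the function reports failure.
function transaction(store, body) {
  var reads = {};
  var writes = {};
  var view = {
    get: function (k) {
      if (k in writes) return writes[k]; // read our own buffered write
      reads[k] = store.version;          // remember the version read under
      return store.data[k];
    },
    set: function (k, v) { writes[k] = v; } // goes to the log, not the store
  };
  body(view);
  // Commit: fail (rollback) if anything we read has since changed.
  for (var r in reads) {
    if (reads[r] !== store.version) return false;
  }
  for (var w in writes) store.data[w] = writes[w];
  store.version += 1;
  return true;
}

var store = { data: { x: 1 }, version: 0 };
var committed = transaction(store, function (t) { t.set('x', t.get('x') + 1); });
// committed === true, store.data.x === 2
```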
Re: [chromium-html5] LocalStorage inside Worker
Race conditions still happen if you (jarringly) forget to wrap your shared object inside an atomic block :P. So, maybe it's a good idea to only allow localStorage to be accessed inside an atomic block (even in workers)? Yes, that was in my original suggestion. atomic(function(shared) {...}); The callback-scoped variable shared is the only way to access the shared namespace. Cheers, Keean.
Re: [chromium-html5] LocalStorage inside Worker
There is always Software Transactional Memory, which provides a safe model for memory shared between threads. http://en.wikipedia.org/wiki/Software_transactional_memory This has been used very successfully in Haskell for overcoming threading / state issues. Combined with Haskell's Channels (message queues) it provides for very elegant multi-threading. Cheers, Keean. On 6 January 2011 22:44, Jonas Sicking jo...@sicking.cc wrote: On Thu, Jan 6, 2011 at 2:25 PM, João Eiras joao.ei...@gmail.com wrote: On , Jonas Sicking jo...@sicking.cc wrote: On Thu, Jan 6, 2011 at 12:01 PM, Jeremy Orlow jor...@chromium.org wrote: public-webapps is probably the better place for this email On Sat, Jan 1, 2011 at 4:22 AM, Felix Halim felix.ha...@gmail.com wrote: I know this has been discussed 1 year ago: http://www.mail-archive.com/whatwg@lists.whatwg.org/msg14087.html I couldn't find the follow up, so I guess localStorage is still inaccessible from Workers? Yes. I have one other option aside from what was mentioned by Jeremy: http://www.mail-archive.com/whatwg@lists.whatwg.org/msg14075.html 5: Why not make localStorage accessible from the Workers as read only? The use case is as follows: First, the user in the main window page (who has read/write access to localStorage) dumps a large amount of data to localStorage. Once all the data has been set, the main page spawns Workers. These workers read the data from localStorage, process it, and return via message passing (as they cannot alter the localStorage value). What are the benefits? 1. No lock, no deadlock, no data race, fast, and efficient (see #2 below). 2. You only set the data once, read by many Worker threads (as opposed to giving the big data again and again from the main page to each of the Workers via messages). 3. It is very easy to use compared to using IndexedDB (I'm a big proponent of localStorage). Note: I was not following the discussion on the spec, and I don't know if my proposal has been discussed before? 
or is too late to change now? I don't think it's too late or has had much discussion any time recently. It's probably worth re-exploring. Unfortunately this is not possible. Since localStorage is synchronously accessed, if we allowed workers to access it that would mean that we no longer have a shared-nothing-message-passing threading model. Instead we'd have a shared-memory threading model which would require locks, mutexes, etc. Making it readonly unfortunately doesn't help. Consider worker code like: var x = 0; if (localStorage.foo < 10) { x += localStorage.foo; } would you ever expect x to be something other than 0 or 1? Not different from two different tabs/windows running the same code. So the same solution for that case would work for Workers. Making the API async would make it harder to use, which is, I believe, one of the design goals of localStorage: to be simple. Exposing the web platform to shared-memory multithreading is the exact opposite of simple. If two consecutive reads of the same localStorage value can yield different values, then that's something that developers have to cope with. If they write code that is sensitive to that issue, then they can take a snapshot of the storage object, and apply it back later. Multithreaded shared-memory programming is extremely complex. Multithreaded shared-memory programming without the use of locks is beyond what I'd ever want to expose anyone to. Much less web developers. We've been down this discussion before. Please read the threads on why workers were designed as a shared-nothing message-passing model rather than a pthreads or similar model. / Jonas
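The check-then-act hazard Jonas is pointing at can be made concrete: if another tab or thread could write between the comparison and the read, the two accesses see different values. The getter below simulates such an interleaving; `fakeStorage` is an illustrative stand-in, not real localStorage.

```javascript
// Simulate a value that changes between two consecutive reads, as a
// concurrent writer could cause. Each property access hits the getter.
var calls = 0;
var fakeStorage = {
  get foo() {
    calls += 1;
    return calls === 1 ? 5 : 50; // "another tab" wrote 50 in between
  }
};

var x = 0;
if (fakeStorage.foo < 10) { // first read sees 5, so the check passes
  x += fakeStorage.foo;     // second read sees 50
}
// x is now 50, despite the guard that the value be below 10
```

With shared mutable storage, even this two-line pattern needs a lock or a transaction to be correct, which is the argument for the shared-nothing worker model.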
Re: [chromium-html5] LocalStorage inside Worker
Did you see section 7 in the link I posted? 7 Implementations 7.1 C/C++ 7.2 C# 7.3 Common Lisp 7.4 Haskell 7.5 Java 7.6 OCaml 7.7 Perl 7.8 Python 7.9 Scala 7.10 Smalltalk JavaScript as a functional language (first-class functions, closures, anonymous functions) has a lot in common with Haskell and other functional languages (Lisp)... Although as you can see there are plenty of OO implementations too. Cheers, Keean. 2011/1/6 Jonas Sicking jo...@sicking.cc 2011/1/6 Keean Schupke ke...@fry-it.com: There is always Software Transactional Memory that provides a safe model for memory shared between threads. http://en.wikipedia.org/wiki/Software_transactional_memory This has been used very successfully in Haskell for overcoming threading / state issues. Combined with Haskell's Channels (message queues) it provides for very elegant multi-threading. Can you provide a link to the Haskell API which you think has been working well for Haskell? Or even better, considering that Haskell is a vastly different language from JavaScript, could you propose a JavaScript API based on Software Transactional Memory? / Jonas
Re: [chromium-html5] LocalStorage inside Worker
Here's a link to some papers on STM: http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/ A simple example: http://www.haskell.org/haskellwiki/Simple_STM_example Here's a tutorial: http://book.realworldhaskell.org/read/software-transactional-memory.html Here's a link to the docs: http://hackage.haskell.org/package/stm Cheers, Keean. 2011/1/6 Keean Schupke ke...@fry-it.com Did you see section 7 in the link I posted? 7 Implementations 7.1 C/C++ 7.2 C# 7.3 Common Lisp 7.4 Haskell 7.5 Java 7.6 OCaml 7.7 Perl 7.8 Python 7.9 Scala 7.10 Smalltalk JavaScript as a functional language (first class functions, closures, anonymous functions) has a lot in common with Haskell and other functional languages (Lisp)... Although as you can see there are plenty of OO implementations too. Cheers, Keean. 2011/1/6 Jonas Sicking jo...@sicking.cc 2011/1/6 Keean Schupke ke...@fry-it.com: There is always Software Transactional Memory that provides a safe model for memory shared between threads. http://en.wikipedia.org/wiki/Software_transactional_memory This has been used very successfully in Haskell for overcoming threading / state issues. Combined with Haskells Channels (message queues) it provides for very elegant multi-threading. Can you provide a link to the Haskell API which you think has been working well for haskell. Or even better, considering that haskell is a vastly different language from javascript, could you propose a javascript API based on Software Transactional Memory. / Jonas
Re: [chromium-html5] LocalStorage inside Worker
Applying this to JavaScript (ignoring local storage and just implementing an STM) would come up with something like: 1) Objects from one thread should not be visible to another. A global variable 'test' defined in the UI or any worker thread should not be in scope in any other worker thread. 2) Shared objects could be accessed only through the atomic method (implemented natively). atomic(function(shared) { shared.x += 1; shared.y -= 2; }); Here, the callback is the transaction, and shared is the shared namespace... That's all you need for a basic implementation. The clever stuff is all hidden from the user. We could implement retry by returning true... the guard could just be a boolean function too: atomic(function(shared) { if (queueSize > 0) { // remove item from queue and use it return false; // no retry } else { return true; // retry } }); That's pretty much the entire user-visible API that would be needed. Of course the implementation behind the scenes is more complex. Cheers, Keean. 2011/1/6 Keean Schupke ke...@fry-it.com Here's a link to some papers on STM: http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/ A simple example: http://www.haskell.org/haskellwiki/Simple_STM_example Here's a tutorial: http://book.realworldhaskell.org/read/software-transactional-memory.html Here's a link to the docs: http://hackage.haskell.org/package/stm Cheers, Keean. 2011/1/6 Keean Schupke ke...@fry-it.com Did you see section 7 in the link I posted? 7 Implementations 7.1 C/C++ 7.2 C# 7.3 Common Lisp 7.4 Haskell 7.5 Java 7.6 OCaml 7.7 Perl 7.8 Python 7.9 Scala 7.10 Smalltalk JavaScript as a functional language (first-class functions, closures, anonymous functions) has a lot in common with Haskell and other functional languages (Lisp)... Although as you can see there are plenty of OO implementations too. Cheers, Keean. 
2011/1/6 Jonas Sicking jo...@sicking.cc 2011/1/6 Keean Schupke ke...@fry-it.com: There is always Software Transactional Memory that provides a safe model for memory shared between threads. http://en.wikipedia.org/wiki/Software_transactional_memory This has been used very successfully in Haskell for overcoming threading / state issues. Combined with Haskells Channels (message queues) it provides for very elegant multi-threading. Can you provide a link to the Haskell API which you think has been working well for haskell. Or even better, considering that haskell is a vastly different language from javascript, could you propose a javascript API based on Software Transactional Memory. / Jonas
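The atomic-with-retry proposal above can be sketched in plain JavaScript. In a real engine, a retry would suspend the transaction until another thread changes the shared state; the bounded loop below is a stand-in for that machinery, and all names (`atomic`, `shared`, `sharedState`) follow the proposal for illustration only.

```javascript
// Sketch of the proposed API: the callback is the transaction, `shared`
// is the shared namespace, and returning true requests a retry.
var sharedState = { queue: [3, 4] };

function atomic(body) {
  for (var attempts = 0; attempts < 100; attempts++) {
    var retry = body(sharedState);
    if (!retry) return; // transaction completed
    // A real implementation would block here until `shared` changes,
    // rather than spinning; the bound just keeps the sketch finite.
  }
  throw new Error('atomic: gave up after too many retries');
}

var item;
atomic(function (shared) {
  if (shared.queue.length > 0) {
    item = shared.queue.shift(); // consume one item inside the transaction
    return false; // no retry
  }
  return true; // retry: queue was empty
});
// item is 3 and one element remains in the queue
```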
Re: [IndexedDB] Why rely on run-to-completion?
This is very similar to: window.indexedDB.open(..., { onsuccess: function(event) { ... } }); Except it requires an extra level of indenting for the callback definitions. In both this and the current implementation there is the additional overhead of an object creation for every call, when compared to simply having plain function arguments: window.indexedDB.open(..., function(event) { ... }); However, just passing the callback as a function argument does not make it as clear what is happening when reading the code. There may be a point in making this more generic. As there is only a single thread, there is no way any callback code can be executed before the current function returns. Consider: var f = function() { setTimeout(function g() {...}, 1000); while(true) {}; }; In this code 'g' will never get called... how can it, when the single thread is busy in the while loop? Technically this could be possible if the interpreter implemented interrupts and continuations, so that the timeout stops the JS interpreter, which saves a continuation allowing it to resume later, and then executes the callback in a fresh context. However, interpreter-level continuations are not a feature of standard JavaScript. If interpreter continuations were implemented they would break the current API... This could be fixed by deferring the execution of the initial function like so: var request = window.indexedDB.open(...); // request object stores parameters (maybe some pre-computation is done). request.onsuccess = function(event) { ... }; // set callback request.run(); // execute the part of the open function that can cause the callback. So this would keep the current style, but also be more generic. Cheers, Keean. On 30 December 2010 08:45, Axel Rauschmayer a...@rauschma.de wrote: Right. But is there anything one loses by not relying on it, by making the API more generic? 
On Dec 30, 2010, at 7:58 , Jonas Sicking wrote: On Wed, Dec 29, 2010 at 2:44 PM, Axel Rauschmayer a...@rauschma.de wrote: Can someone explain a bit more about the motivation behind the current design of the async API? var request = window.indexedDB.open(...); request.onsuccess = function(event) { ... }; The pattern of assigning the success continuation after invoking the operation seems to be too closely tied to JavaScript’s current run-to-completion event handling. But what about future JavaScript environments, e.g. a multi-threaded Node.js with IndexedDB built in, or Rhino with IndexedDB running in parallel? Wouldn’t a reliance on run-to-completion unnecessarily limit future developments? Maybe it is just me, but I would like it better if the last argument was an object with the error and the success continuations (they could also be individual arguments). That is also how current JavaScript RPC APIs are designed, resulting in a familiar look. Are there any arguments *against* this approach? Whatever the reasoning behind the design, I think it should be explained in the spec, because the current API is a bit tricky to understand for newbies. Note that almost everyone relies on this anyway. I bet that almost all code out there depends on the code in, for example, onload handlers for XHR requests running after the current thread of execution has fully finished. Asynchronous events aren't something specific to JavaScript. / Jonas -- Dr. Axel Rauschmayer axel.rauschma...@ifi.lmu.de http://hypergraphs.de/ ### Hyena: organize your ideas, free at hypergraphs.de/hyena/
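The run-to-completion guarantee being debated here can be modelled with an explicit job queue: the success "event" is queued as a job, so it cannot run until the current job, which assigns `onsuccess` after calling open, has finished. `fakeOpen`, `postTask`, and `runEventLoop` are illustrative stand-ins, not the IndexedDB API.

```javascript
// Model the event loop as an explicit queue of jobs run one at a time.
var jobs = [];
function postTask(f) { jobs.push(f); }
function runEventLoop() { while (jobs.length > 0) jobs.shift()(); }

// Stand-in for indexedDB.open: queues the success delivery as a job
// instead of firing it synchronously.
function fakeOpen() {
  var request = { onsuccess: null };
  postTask(function () {
    // Runs only after the current job finishes, so by now the caller
    // has had the chance to assign onsuccess.
    if (request.onsuccess) request.onsuccess({ target: request });
  });
  return request;
}

// Current job: open first, assign the handler afterwards.
var got = null;
var request = fakeOpen();
request.onsuccess = function (event) { got = event.target; };

// Only now does the "event loop" get control back:
runEventLoop();
// got === request: the handler was in place before the event fired
```

This is exactly why assigning `onsuccess` after calling `open` is safe in the current model, and why a preemptive environment (where the queued job could run mid-function) would break it.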
Re: [IndexedDB] Why rely on run-to-completion?
The JavaScript engine we have implemented has interpreter continuations. So at bytecode boundaries it is able to process pending events. (not saying it currently does this, but it may in the future). This is not multi-threading, there is only one thread per engine which maintains an interpreter environment and communicates with other engines by message passing (we already have a worker API, although non-standard). This could cause a problem with the current API. The fix for this is to make sure the callbacks are defined before the function using the callbacks is called. I think keeping away from multi-threading in JS is sensible (perhaps Erlang style multi-processing would be good though). However interrupting the interpreter to process callbacks is just a single thread and causes no problems providing the callbacks are initialised before the call that initialises the background process that will generate the asynchronous event. Cheers, Keean. On 30 December 2010 20:44, Jonas Sicking jo...@sicking.cc wrote: Even if we decide to make the environment in which we run webpage script multithreaded the current API will work fine. Generally speaking in multithreaded environments you do callbacks on the same thread as which the initial function is called. Alternatively you'd want to pass in the thread on which you want callbacks, along with the callbacks you want called. But in that case using EventTargets doesn't make sense as you don't know if a callback has already happened by the time you call addEventListener. Likewise, the readyState property also would need to be removed as by the time you check it it can already be out of date. In short, a complete revamping of the API would be needed, the small modification you are proposing would be nowhere near enough. However most of all I'm not terribly worried that we'll make the browser scripting environment multitheaded. Multithreading is extremely complicated. 
To this day research is still happening on how to implement even the simplest data structures, such as queues and hash tables, effectively in a multithreaded environment. See the discussions on the WhatWG list which took place when we designed the workers API. I find it much more likely that we'll stick with the approach that workers have introduced of having separate environments which run on different threads and with no shared state. Communication between threads happens through message passing. This is similar to languages such as Google's Go and Mozilla's Rust. / Jonas On Thu, Dec 30, 2010 at 12:45 AM, Axel Rauschmayer a...@rauschma.de wrote: Right. But is there anything one loses by not relying on it, by making the API more generic? On Dec 30, 2010, at 7:58 , Jonas Sicking wrote: On Wed, Dec 29, 2010 at 2:44 PM, Axel Rauschmayer a...@rauschma.de wrote: Can someone explain a bit more about the motivation behind the current design of the async API? var request = window.indexedDB.open(...); request.onsuccess = function(event) { ... }; The pattern of assigning the success continuation after invoking the operation seems to be too closely tied to JavaScript’s current run-to-completion event handling. But what about future JavaScript environments, e.g. a multi-threaded Node.js with IndexedDB built in, or Rhino with IndexedDB running in parallel? Wouldn’t a reliance on run-to-completion unnecessarily limit future developments? Maybe it is just me, but I would like it better if the last argument was an object with the error and the success continuations (they could also be individual arguments). That is also how current JavaScript RPC APIs are designed, resulting in a familiar look. Are there any arguments *against* this approach? Whatever the reasoning behind the design, I think it should be explained in the spec, because the current API is a bit tricky to understand for newbies. Note that almost everyone relies on this anyway. 
I bet that almost all code out there depends on that the code in for example onload handlers for XHR requests run after the current thread of execution has fully finished. Asynchronous events isn't something specific to javascript. / Jonas -- Dr. Axel Rauschmayer axel.rauschma...@ifi.lmu.de http://hypergraphs.de/ ### Hyena: organize your ideas, free at hypergraphs.de/hyena/
Re: [IndexedDB] Why rely on run-to-completion?
On 30 December 2010 23:08, Jonas Sicking jo...@sicking.cc wrote: On Thu, Dec 30, 2010 at 2:19 PM, Keean Schupke ke...@fry-it.com wrote: The JavaScript engine we have implemented has interpreter continuations. So at bytecode boundaries it is able to process pending events. (not saying it currently does this, but it may in the future). This is not multi-threading, there is only one thread per engine which maintains an interpreter environment and communicates with other engines by message passing (we already have a worker API, although non-standard). This could cause a problem with the current API. The fix for this is to make sure the callbacks are defined before the function using the callbacks is called. I think keeping away from multi-threading in JS is sensible (perhaps Erlang-style multi-processing would be good though). However interrupting the interpreter to process callbacks is just a single thread and causes no problems providing the callbacks are initialised before the call that initialises the background process that will generate the asynchronous event. If you are interrupting at arbitrary points in the execution and running other script contexts which can synchronously call into the first javascript context, then you are implementing multithreading. This is in fact exactly how multithreading works on single-core CPUs. It means that you are exposing race conditions and all other threading hazards to webpages. / Jonas That makes complete sense to me, although not all threading hazards would be exposed, as partial writes will not be a problem, all variable accesses will automatically be atomic. But yes, race conditions would be a problem with interrupts so I agree it's a bad idea. (The interpreter continuations are currently used to store interpreter state when executing a blocking IO action, so that other engines can carry on running when running in a single threaded environment). In that case I can't see any limitations to the current API. 
As for the aesthetic considerations: since JavaScript works by events, it makes more sense to expose the API as events rather than callbacks, as callbacks give the false impression that they can happen at any time. Cheers, Keean.
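The share-nothing, message-passing model discussed in this thread can be sketched as two toy "engines" that own their state and communicate only by copied messages. `Engine`, `send` and `drain` are illustrative names, not the workers API; the JSON round-trip stands in for structured cloning.

```javascript
// Each engine owns its own state; the only communication channel is a
// posted, deep-copied message, so no race conditions on shared variables.
class Engine {
  constructor(name) {
    this.name = name;
    this.inbox = [];
    this.onmessage = null;
  }
  send(target, data) {
    // Deep-copy the payload so no live reference crosses engine boundaries.
    target.inbox.push(JSON.parse(JSON.stringify(data)));
  }
  drain() {
    // Each engine processes its inbox on its own "thread of control".
    while (this.inbox.length) {
      const msg = this.inbox.shift();
      if (this.onmessage) this.onmessage(msg);
    }
  }
}

const a = new Engine("a");
const b = new Engine("b");
let received = null;
b.onmessage = (msg) => { received = msg; };

const payload = { counter: 1 };
a.send(b, payload);
payload.counter = 99; // mutating after the send is invisible to b
b.drain();
console.log(received.counter); // 1 — b got a copy taken at send time
```

Because the receiver only ever sees a copy, interleaving the two engines' execution cannot expose partial writes or shared-state races, which is the property the thread credits to the workers model.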
Re: FileAPI use case: making an image downloading app
On 18 December 2010 17:38, Charles Pritchard ch...@jumis.com wrote: On 12/17/2010 5:03 PM, Gregg Tavares (wrk) wrote: On Fri, Dec 17, 2010 at 4:16 PM, Charles Pritchard ch...@jumis.com wrote: We're actively developing such functionality. The limit per directory is for the sake of the os file system. If you want to create a data store, use indexedDB or use standard file system practices (create a subdirectory tree). I think you're missing the point. If I have a folder with 6000 files on some server and I want to mirror that through the FileAPI to the user's local machine I'm screwed. I can't **mirror** it if it's not a mirror. I'm not missing the point. I'm actively developing an app that downloads images from photo sites. A strict Mirror (let's use a capital M) is something you're not going to pull off with the File System API at this time. You can't set meta data, like permissions/flags and modification/creation dates. Developing a Mirror is not feasible with the current API. You can't create a direct Mirror, one which would work seamlessly with rsync. You can mirror the data on a remote server, and check that the data already exists on your file system, by using a subdirectory system, much like the two level directory structure that many cache apps use (like Squid). Disk space availability (quota) is an issue no matter what happens. When downloading 1000 images, you'll still only be doing so x at a time. I don't see the point you're trying to make here. I don't know the size of the images beforehand. Many internet APIs make getting the sizes prohibitively expensive. One REST call per file. So before I can download a single file I'd have to issue 1000 REST XHRs to find the sizes of the files (waiting the several minutes for that to complete) before I can ask the user for the space needed. That's not a good user experience. 
If on the other hand the user can give me permission for unlimited space then I can just start downloading the files without having to find out their ultimate size. I suppose I can just request 1 terabyte up front. Or query how much space is free and ask for all of it. Yes, that's correct, you'd hope the space is available. When you run out of space, you can use a limited amount of RAM while waiting. There are few resource management APIs available for memory/bandwidth hungry applications. My point was that these questions you're bringing up are common to all cross-platform applications. Regarding your issue of a ray tracing program: you'd also want to either create subdirectories, or simply create a large file with its own methods. These issues are inherent in the design of any application of scale. At this point, the file system API does work for the use case you're describing. It'd be nice to see Blob storage better integrated with Web Storage APIs. Ian has already spoken to this, but no followers yet (afaik). -Charles On Dec 17, 2010, at 3:34 PM, Gregg Tavares (wrk) g...@google.com wrote: Sorry if this has been covered before. I've been wanting to write an app to download images from photo sites and I'm wondering if this use case has been considered for the FileAPI wrt Directories and System. If I understand the current spec it seems like there are some possible issues. #1) The spec says there is a 5000 file limit per directory. #2) The spec requires an app to specify how much storage it needs. I understand the desire for the limits. What about an app being able to request unlimited storage and unlimited files? The UA can decide how to present something to the user to grant permission if they want. Arguments against leaving it as is: The 5000 file limit seems arbitrary. Any app that hits that limit will probably require serious re-engineering to work around it. 
It will not only have to somehow describe a mapping between files on a server that may not have that limit, it also has the issue that the user might have something organized that way and will require the user to re-organize. I realize that 5000 is a large number. I'm sure the author of csh thought 1700 entries in a glob was a reasonable limit as well. We all know how well that turned out :-( It's easy to imagine a video editing app that edits and composites still images. If there are a few layers and 1 image per layer it could easily add up to more than 5000 files in a single folder. The size limit also has issues. For many apps the size limit will be no problem but for others... Example: You make a ray tracing program; it traces each frame and saves the files to disk. When it runs out of room, what are its options? (1) fail. (2) request a file system of a larger size. The problem is that (2) will require user input. Imagine you start your render before you leave work expecting it to finish by the time you get in only to find that 2 hours after you left the UA popped up a confirmation this app
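The two-level cache-directory scheme mentioned above (the one Squid uses) can be sketched in a few lines: hash each file name into a short path like `ab/cd/name`, so no single directory ever holds too many entries. `hashPath` is a hypothetical helper for illustration, not part of the File System API.

```javascript
// Spread many files across a two-level subdirectory tree so a
// per-directory file limit is never hit in any one directory.
function hashPath(name) {
  let h = 5381; // djb2 string hash; any stable hash would do
  for (const ch of name) h = ((h * 33) ^ ch.charCodeAt(0)) >>> 0;
  const hex = h.toString(16).padStart(8, "0");
  // Two hex chars per level gives 256 * 256 = 65536 leaf directories.
  return `${hex.slice(0, 2)}/${hex.slice(2, 4)}/${name}`;
}

console.log(hashPath("photo-0001.jpg")); // a path like "xx/yy/photo-0001.jpg"
```

With 65536 leaf directories, even a million cached images averages out to only a handful of files per directory, well under a 5000-entry cap — though, as the thread notes, this is a cache layout, not a Mirror of the server's own structure.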
Re: [cors] 27 July 2010 CORS feedback
Is this spec the place to fix cross site vulnerabilities? Would it not be better to restrict cookies to only be sent when the domain of the page you are navigating away from matches the cookie domain, as well as the page you are navigating to? Cheers, Keean. On 22 November 2010 09:10, Mark Nottingham m...@mnot.net wrote: On 22/11/2010, at 7:53 PM, Jonas Sicking wrote: Practically speaking, the only constraint on form submission request entities is that they contain a '='. Using text/plain encoded forms you can submit any content with that restriction. Further, I believe that flash allows cross site POST submission with arbitrary data, i.e. even data without a '='. But I haven't looked into that in more detail. Perhaps. I still don't think it's great for the W3C to standardise yet another method of sending cross-site POSTs without permission. 3) When a server changes the headers in a response based upon the value of the incoming Origin header (as outlined in sections 5.1 and 5.2), it must insert Vary: Origin into *all* responses for that resource; otherwise, downstream caches will incorrectly store it. Be aware that doing so will cause many versions of IE not to cache those responses at all. Another option would be to disallow varying the response based upon the Origin header. Disallowing varying by origin seems like a bigger problem than IE not caching. Either way, it needs to be addressed. You mean by adding a note in the spec? Are you adding a similar note to http-bis about the Vary header? RFC2616 already defines Vary: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44 ... and bis refines it: http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-12#section-3.5 5) Using a preflight check in combination with a cache exposes sites to DNS rebinding, man-in-the-middle, and potentially other attacks that did not exist before. This should be noted. The DNS rebinding issue is a quality of implementation issue. 
It's no problem simply rerequesting the preflight if the DNS resolves to a different IP between the preflight and the actual request. I agree that noting this in the spec is a good idea. Could you describe the new man-in-the-middle attacks which did not exist before with cross-origin communications? I suspect we're quibbling over the definition of 'new' here, but can agree that CORS is going to be another tool to attack sites with (which to be fair isn't really its fault; it's just that we should give people fair warning). I'm not sure adding ominous "There might be ways that this spec can be used for cross site attacks. Try to take precautions" notes to the spec are much more useful than the "There is a general threat, but we don't have any more specific information at this time. People should be aware of their surroundings." alerts that the Department of Homeland Security sends out :-) That's the second straw man you've used, Jonas. Please stop. 6) Requiring a preflight check per-URI is not an efficient use of network resources for applications that use a large number of URIs, as is becoming more prevalent; effectively, it introduces another round-trip for each unsafe request. Handling OPTIONS is also somewhat specialised on many servers. It's also awkward to handle OPTIONS per-URI on many servers. I've raised this several times before, and am still not convinced that the underlying requirement (#8) justifies such a convoluted and ill-conceived design, or indeed is effectively met by this design. Allowing a site to define a 'map' of where cross-origin requests are allowed to go would be more efficient in most cases, would be vastly simpler to implement for servers, and would be similar to many other site-wide policy mechanisms on the Web. We had a design in place which allowed preflights to apply to multiple URIs. However there were too many issues with servers resolving URIs in weird ways which made us drop it. 
One concrete example was that some versions of IIS UTF8 decoded URIs and then ignored bits above the lower 8 bits. This made it treat URIs as if they contained .. when the browser had no idea of this. In short, CORS felt like the wrong spec to start relying on servers not to do strange URI handling. I'm not sure what you're referring to, but there are clean ways to do this without resorting to depending on how servers interpret URIs. I'm sure proposals are welcome. I'm pretty sure they're not, based upon past experience. We've been through this a few times already. Regards, -- Mark Nottingham http://www.mnot.net/
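The `Vary: Origin` point in the thread above can be made concrete with a small sketch. The allow-list and the `corsHeaders` helper are hypothetical, for illustration only; the rule they demonstrate is the one Mark describes: a server that varies its response on the `Origin` header should always emit `Vary: Origin`, so a shared cache never serves one origin's response to another.

```javascript
// Compute the response headers for a resource whose CORS response
// varies by Origin. "Vary: Origin" is sent on *every* response for the
// resource, even when the request carried no Origin header.
const allowed = new Set(["https://app.example.com"]);

function corsHeaders(origin) {
  const headers = { "Vary": "Origin" };
  if (origin && allowed.has(origin)) {
    // Echo back only an allow-listed origin; never "*" with credentials.
    headers["Access-Control-Allow-Origin"] = origin;
  }
  return headers;
}

console.log(corsHeaders("https://app.example.com"));
console.log(corsHeaders("https://evil.example")); // no ACAO header at all
```

Omitting `Vary: Origin` on the responses that lack the `Access-Control-Allow-Origin` header is exactly the mistake the thread warns about: a downstream cache could then store that headerless response and replay it to the allowed origin.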
Re: [cors] 27 July 2010 CORS feedback
I guess I didn't put that very well. This is more a general comment on the discussion rather than a reply to a specific post. If I can forge requests to web sites (for example using curl), then any site that does not impose security checks on its input values is asking for trouble. Making security depend on headers which can be forged seems like false security to me. So I see no problem allowing POSTs to any URL, providing the POST cannot assume any authority on the part of the user. That way a POST can do no more harm than a script using curl. The solution to the POST authority problem would seem to be to apply the origin rule for cookies (which seems sufficient to me) consistently (perhaps it already is?). So CORS seems to me to be about permitting exceptions to a security policy, not about fixing that underlying security policy. I don't think any changes to the CORS spec can fix a broken underlying security policy. Perhaps I am missing something though? Cheers, Keean. On 22 November 2010 09:28, Keean Schupke ke...@fry-it.com wrote: Is this spec the place to fix cross site vulnerabilities? Would it not be better to restrict cookies to only be sent when the domain of the page you are navigating away from matches the cookie domain, as well as the page you are navigating to? Cheers, Keean. On 22 November 2010 09:10, Mark Nottingham m...@mnot.net wrote: On 22/11/2010, at 7:53 PM, Jonas Sicking wrote: Practically speaking, the only constraint on form submission request entities is that they contain a '='. Using text/plain encoded forms you can submit any content with that restriction. Further, I believe that flash allows cross site POST submission with arbitrary data, i.e. even data without a '='. But I haven't looked into that in more detail. Perhaps. I still don't think it's great for the W3C to standardise yet another method of sending cross-site POSTs without permission. 
Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?
Just a thought: just because the spec does not limit the key size does not mean the implementation has to index on huge keys. For example you may choose to index only the first 1000 characters of string keys, and then link the values of key collisions together in the storage node. This way things are kept fast and compact for the more normal key sizes, and there is a sensible limit. As long as the implementation behaves as if it admits arbitrary key sizes, it can actually implement things how it likes. Another example would be one index for keys less than size X, and a separate oversize key index for keys of size greater than X. These could use a different internal structure and disk layout. Cheers, Keean. On 20 November 2010 04:13, Bjoern Hoehrmann derhoe...@gmx.net wrote: * Jonas Sicking wrote: The question is in part where the limit for ridiculous goes. 1K keys are sort of ridiculous, though I'm sure it happens. By ridiculous I mean that common systems would run out of memory. That is different among systems, and I would expect developers to consider it up to an order of magnitude, but not beyond that. Clearly, to me, a DB system should not fail because I want to store 100 keys à 100 KB. Note that, since JavaScript does not offer key-value dictionaries for complex keys, and now that JSON.stringify is widely implemented, it's quite common for people to emulate proper dictionaries by using that to work around this particular JavaScript limitation. Which would likely extend to more persistent forms of storage. I don't understand what you mean here. I am saying that it's quite natural to want to have string keys that are much, much longer than someone might envision the length of string keys, mainly because their notion of string keys is different from the key length you might get from serializing arbitrary objects. 
-- Björn Höhrmann · mailto:bjo...@hoehrmann.de · http://bjoern.hoehrmann.de Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
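Keean's prefix-index idea above can be sketched directly: index only the first N characters of each string key, and chain entries whose full keys collide on that prefix. `PrefixIndex` is an illustrative in-memory model, not how any IndexedDB backend is actually implemented.

```javascript
// Index on a fixed-length key prefix; colliding full keys share a bucket
// and are disambiguated by comparing the complete key.
const PREFIX_LEN = 1000;

class PrefixIndex {
  constructor() { this.buckets = new Map(); }
  put(key, value) {
    const prefix = key.slice(0, PREFIX_LEN);
    if (!this.buckets.has(prefix)) this.buckets.set(prefix, []);
    const bucket = this.buckets.get(prefix);
    const hit = bucket.find((e) => e.key === key); // resolve collisions
    if (hit) hit.value = value;
    else bucket.push({ key, value });
  }
  get(key) {
    const bucket = this.buckets.get(key.slice(0, PREFIX_LEN)) || [];
    const hit = bucket.find((e) => e.key === key);
    return hit ? hit.value : undefined;
  }
}
```

To the caller this behaves exactly as if arbitrary-length keys were fully indexed; only keys that agree on their first 1000 characters pay the cost of the collision chain, which is the "normal keys stay fast and compact" property argued for in the thread.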
Re: [Bug 11270] New: Interaction between in-line keys and key generators
Why not return the full 64bit ID in an opaque object? Maths and comparing IDs is meaningless anyway. Cheers, Keean. On 12 November 2010 21:05, Jeremy Orlow jor...@chromium.org wrote: On Fri, Nov 12, 2010 at 10:09 PM, Jonas Sicking jo...@sicking.cc wrote: On Fri, Nov 12, 2010 at 12:36 AM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Nov 12, 2010 at 11:27 AM, Keean Schupke ke...@fry-it.com wrote: You can do it in SQL because tables that hold a reference to an ID can declare the reference in the schema. I guess without the meta-data to do this it cannot be done. Even in SQL, I'd be very hesitant to do this. Why not get the auto-increment to wrap and skip collisions? What about signed numbers? Exactly. If we're going to support this, let's keep it super simple. As Jonas mentioned, it's very unlikely that anyone would hit the 64bit limit in legitimate usage, so it's not worth trying to gracefully handle such a situation and adding a lot of surface area. Indeed. I'd prefer to fail fatally to trying to do something complicated and clever here. I'd be surprised if anyone ever ran into this issue unintentionally (i.e. when not explicitly testing to see what happens). One way to look at it is that before we run into 2^64 limit, we'll run into the limit that javascript can't represent all integers above 2^53. So once IDs get above that you basically won't be able to use the object store anyway. Good point. Actually we probably need to spec the limit to be 2^52ish so that the auto number is never anything greater than what javascript can address. J
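The 2^53 point made above is easy to demonstrate: JavaScript numbers are IEEE-754 doubles, so above `Number.MAX_SAFE_INTEGER` adjacent 64-bit IDs stop being distinguishable.

```javascript
// Why the 2^53 ceiling matters for numeric IDs: beyond it, incrementing
// is silently lost and two distinct IDs compare equal.
const max = Number.MAX_SAFE_INTEGER;  // 2^53 - 1
console.log(max + 1 === max + 2);     // true — two "different" IDs collide
console.log(2 ** 53 + 1 === 2 ** 53); // true — the increment is lost
```

This is exactly why the thread settles on capping generated keys around 2^53 rather than the full 64-bit range: past that point an object store keyed by numbers is unusable from script anyway.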
Re: [Bug 11270] New: Interaction between in-line keys and key generators
Hi, On 13 November 2010 08:33, Jonas Sicking jo...@sicking.cc wrote: On Fri, Nov 12, 2010 at 11:59 PM, Keean Schupke ke...@fry-it.com wrote: Why not return the full 64bit ID in an opaque object? Maths and comparing IDs is meaningless anyway. Then we'd have to overload both the structured clone algorithm and the == javascript operator. Is that a problem? I can't see performance being an issue; it has to determine which type of '==' to use anyway, and JavaScript does not appear to support unboxing or unboxed types. I accept that 2^53 is enough. To me though there is an advantage in not having the ID as an integer type. Basically the ID is an unordered sequence type. The only valid operators are '==' and '!='. Ordered comparisons (greater, less) and maths mean nothing. I would think it better to use an opaque type so that people do not mistakenly think they can use these operators. It also allows implementers much more flexibility (and optimisation potential) in how they actually implement the IDs. Cheers, Keean.
Re: [Bug 11270] New: Interaction between in-line keys and key generators
Having said that, if it's an opaque type, you could not supply values yourself, which is where this all started... and I think that is a good idea (for example when importing data). So whilst I think all the points I made in favour of an opaque type are true for this kind of thing in general, for this case I think the need to supply a value is more important. Personally I would like to see support for infinite-precision integers like Python has in JavaScript, but proper integer support would be a start. Cheers, Keean.
Re: [Bug 11270] New: Interaction between in-line keys and key generators
You can do it in SQL because tables that hold a reference to an ID can declare the reference in the schema. I guess without the meta-data to do this it cannot be done. Why not get the auto-increment to wrap and skip collisions? What about signed numbers? Cheers, Keean. On 12 November 2010 08:23, Jeremy Orlow jor...@chromium.org wrote: We can't compact because the developer may be expecting to look items up by ID with IDs in another table, on the server, in memory, etc. There's no way to do it. J On Fri, Nov 12, 2010 at 10:56 AM, Keean Schupke ke...@fry-it.com wrote: The other thing you could do is specify that when you get a wrap (i.e. someone inserts a key of MAXINT - 1) you auto-compact the table. If you really have run out of indexes there is not a lot you can do. The other thing to consider is that because JS uses signed arithmetic, it's really a 63-bit number... unless you want negative indexes appearing? (And how would that affect ordering and sorting?) Cheers, Keean. On 12 November 2010 07:36, Jeremy Orlow jor...@chromium.org wrote: On Fri, Nov 12, 2010 at 10:08 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Nov 11, 2010 at 9:22 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Nov 12, 2010 at 12:32 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Nov 11, 2010 at 11:41 AM, Jeremy Orlow jor...@chromium.org wrote: On Thu, Nov 11, 2010 at 6:41 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Thu, Nov 11, 2010 at 4:20 AM, Jeremy Orlow jor...@chromium.org wrote: What would we do if what they provided was not an integer? The behavior isn't very important; throwing would be fine here. In mySQL, you can only put AUTO_INCREMENT on columns in the integer family. What happens if the number they insert is so big that the next one causes overflow? The same thing that happens if you do ++ on a variable holding a number that's too large. 
Or, more directly, the same thing that happens if you somehow fill up a table to the integer limit (probably deleting rows along the way to free up space), and then try to add a new row. What is the use case for this? Do we really think that most of the time users do this it'll be intentional and not just a mistake? A big one is importing some data into a live table. Many smaller ones are related to implicit data constraints that exist in the application but aren't directly expressed in the table. I've had several times when I could normally just rely on auto-numbering for something, but occasionally, due to other data I was inserting elsewhere, had to specify a particular id. This assumes that your autonumbers aren't going to overlap and is going to behave really badly when they do. Honestly, I don't care too much about this, but I'm skeptical we're doing the right thing here. Pablo did bring up a good use case, which is wanting to migrate existing data to a new object store, for example with a new schema. And every database examined so far has some ability to specify autonumbered columns. overlaps aren't a problem in practice since 64bit integers are really really big. So unless someone maliciously sets a number close to the upper bound of that then overlaps won't be a problem. Yes, but we'd need to spec this, implement it, and test it because someone will try to do this maliciously. I'd say it's fine to treat the range of IDs as a hardware limitation. I.e. similarly to how we don't specify how much data a webpage is allowed to put into DOMStrings, at some point every implementation is going to run out of memory and effectively limit it. In practice this isn't a problem since the limit is high enough. Another would be to define that the ID is 64 bit and if you run out of IDs no more rows can be inserted into the objectStore. At that point the page is responsible for creating a new object store and compacting down IDs. 
In practice no page will run into this limitation if they use IDs increasing by one. Even if you generate a new ID a million times a second, it'll still take you over half a million years to run out of 64bit IDs. This seems reasonable. OK, let's do it. And, in the email you replied right under, I brought up the point that this feature won't help someone who's trying to import data into a table that already has data in it because some of it might clash. So, just to make sure we're all on the same page, the use case for this is restoring data into an _empty_ object store, right? (Because I don't think this is a good solution for much else.) That's the main scenario I can think of that would require this yes. / Jonas
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
Yes, I prefer it due to the symmetry, and agree that it's a judgment call. I guess the advantage of allowing it is that libraries can disallow it if they like. The reverse is not true: if you disallow it, a library cannot allow it. Cheers, Keean. On 12 Nov 2010 09:00, Jeremy Orlow jor...@chromium.org wrote: On Fri, Nov 12, 2010 at 12:06 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Nov 11, 2010 at 11:44 AM, Jeremy Orlow jor...@chromium.org wrote: The email I responded to: It would make sense if you make setting a key to undefined semantically equivalent to deleting the value (and no error if it does not exist), and return undefined on a get when no such key exists. That way 'undefined' cannot exist as a value in the object store, and is a safe marker for the key not existing in that index. undefined should be symmetric. If something not existing returns undefined then passing in undefined should make it not exist. Overloading the meaning of a get returning undefined is ugly. And simply disallowing a value also seems a bit odd. But I think this is pretty elegant semantically. As I've asked previously in the thread: what problem are you trying to solve? Can you describe the type of application that gets easier to write/possible to write/has cleaner code/runs faster if we make this change? It seems like deleting on .put(undefined) creates a very unexpected behavior just to try to cover a rare edge case, wanting to both store undefined, This is not correct. The proposal was trying to remove an asymmetry within the API. and tell it apart from the lack of value. In fact, the proposal doesn't even solve that edge case since it no longer is possible to store undefined. Which brings me back to the question above of what problem you are trying to solve. ...this is trying to solve an asymmetry within the API. I know this is something I've gone back and forth on, but you'll remember that both Pablo and I (and maybe Andrei?) were not very excited about the asymmetry to begin with. 
Anyway, I'll defer to you since I think these (along with several other of the issues I've raised) are mostly judgement calls rather than issues with a clearly technically superior solution, and you have been doing most of the hard spec work lately. J
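The "symmetric undefined" proposal debated above can be sketched in a few lines. This is illustrative only — `SymmetricStore` models the proposal, not what the IndexedDB spec actually adopted.

```javascript
// The proposal: storing undefined under a key is the same as deleting it,
// so get() returning undefined always and unambiguously means "no value".
class SymmetricStore {
  constructor() { this.map = new Map(); }
  put(key, value) {
    if (value === undefined) this.map.delete(key); // symmetric with get()
    else this.map.set(key, value);
  }
  get(key) { return this.map.has(key) ? this.map.get(key) : undefined; }
}
```

The trade-off both sides of the thread identify is visible here: the symmetry makes `undefined` a safe "missing" marker, but only by making it impossible to ever store `undefined` as a real value.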
Re: [Bug 11270] New: Interaction between in-line keys and key generators
Integers can be big: 8 bytes is common. It is generally assumed that the auto-increment counter will be big enough, overflow would wrap, and if the ID already exists there would be an error. In my experience auto-increment columns must be integers. Cheers, Keean. On 11 November 2010 12:20, Jeremy Orlow jor...@chromium.org wrote: On Thu, Nov 11, 2010 at 2:37 AM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Nov 10, 2010 at 3:15 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Wed, Nov 10, 2010 at 2:07 PM, Jonas Sicking jo...@sicking.cc wrote: On Wed, Nov 10, 2010 at 1:50 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto: public-webapps-requ...@w3.org] On Behalf Of bugzi...@jessica.w3.org Sent: Monday, November 08, 2010 5:07 PM So what happens if, when trying to save in an object store which has the following keypath, the following value (the generated key is 4): foo.bar { foo: {} } Here the resulting object is clearly { foo: { bar: 4 } } But what about foo.bar { foo: { bar: 10 } } Does this use the value 10 rather than generate a new key, does it throw an exception, or does it store the value { foo: { bar: 4 } }? I suspect that all options are somewhat arbitrary here. I'll just propose that we error out to ensure that nobody has the wrong expectations about the implementation preserving the initial value. I would be open to other options except silently overwriting the initial value with a generated one, as that's likely to confuse folks. It's relatively common for me to need to supply a manual value for an id field that's automatically generated when working with databases, and I don't see any particular reason that my situation would change if using IndexedDB. So I think that a manually-supplied key should be kept. I'm fine with either solution here. My database experience is too weak to have strong opinions on this matter. 
What do databases usually do with columns that use autoincrement but a value is still supplied? My recollection is that that is generally allowed? I can only speak from my experience with mySQL, which is generally very permissive, but which has very sensible behavior here imo. You are allowed to insert values manually into an AUTO_INCREMENT column. The supplied value is stored as normal. If the value was larger than the current autoincrement value, the value is increased so that the next auto-numbered row will have an id one higher than the row you just inserted. That is, given the following inserts: insert row(val) values (1); insert row(id,val) values (5,2); insert row(val) values (3); The table will contain [{id:1, val:1}, {id:5, val:2}, {id:6, val:3}]. If you have uniqueness constraints on the field, of course, those are also used. Basically, AUTO_INCREMENT just alters your INSERT before it hits the db if there's a missing value; otherwise the query is treated exactly as normal. This is how sqlite works too. It'd be great if we could make this required behavior. What would we do if what they provided was not an integer? What happens if the number they insert is so big that the next one causes overflow? What is the use case for this? Do we really think that most of the time users do this it'll be intentional and not just a mistake? J
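The AUTO_INCREMENT behaviour Tab describes (a supplied id is stored as-is, and the counter jumps past it so the next generated id is one higher) can be sketched in plain JavaScript. The ToyTable class below is illustrative only, not any real database or IndexedDB API:

```javascript
// Sketch of the AUTO_INCREMENT rule described above: a supplied id is
// stored as normal, and if it is >= the counter, the counter jumps past
// it so the next generated id is one higher. Duplicate supplied keys
// are rejected, as a uniqueness constraint would do.
class ToyTable {
  constructor() {
    this.rows = new Map(); // id -> row
    this.next = 1;         // auto-increment counter
  }
  insert(row) {
    let id = row.id;
    if (id === undefined) {
      id = this.next;                            // generate a key
    } else if (this.rows.has(id)) {
      throw new Error('duplicate key ' + id);    // uniqueness constraint
    }
    if (id >= this.next) this.next = id + 1;     // counter jumps past supplied id
    this.rows.set(id, Object.assign({}, row, { id: id }));
    return id;
  }
}

const t = new ToyTable();
const a = t.insert({ val: 1 });        // generated: 1
const b = t.insert({ id: 5, val: 2 }); // supplied: 5
const c = t.insert({ val: 3 });        // generated: 6
```

This reproduces the table from the email: [{id:1, val:1}, {id:5, val:2}, {id:6, val:3}].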
Re: Relational Data Model Example
Hi, Here are the Mozilla IndexedDB examples converted to use the relational data model. Points to note:
- The database is validated (that is, the schema in the JavaScript is either used to create the database if it does not exist, or to make sure that the database conforms to the schema if it does exist. Currently we require an exact match for validation to succeed, however the final version will use nullable and default values to allow attributes to be added to existing relations, or attributes ignored providing the required pre-conditions are met).
- The 'true' at the end of the validate function tells it to drop the existing relations, so we always start with an empty database.
- We add more data than the original insert example so there are some results from the join query.
- There is no single-value-per-group test yet for project. But effectively when grouping by a unique attribute (like id) any attribute in the same (pre-join) relation is acceptable, as well as the attribute joined to in the other relation, but no other attribute if the joined-to column is not unique (the case in the example). 
var rdm = new RelationalDataModel;
var rdb = new rdm.WebSQLiteDataAdapter;

var kids = rdm.relation('kids', {
  id: rdm.attribute('id', rdm.integer, {auto_increment: true}),
  name: rdm.attribute('name', rdm.string)
});
var candy = rdm.relation('candy', {
  id: rdm.attribute('id', rdm.integer, {auto_increment: true}),
  name: rdm.attribute('name', rdm.string)
});
var candySales = rdm.relation('candySales', {
  kid: rdm.attribute('kid', rdm.integer),
  candy: rdm.attribute('candy', rdm.integer),
  date: rdm.attribute('date', rdm.string)
});

var v = rdb.validate('CandyDB', 1.0, [kids, candy, candySales], true).onsuccess = function(db) {
  // new database has been created, or existing database has been _validated_
  var i = db.transaction(function(tx) {
    [
      {id: 1, name: 'Anna'},
      {id: 2, name: 'Betty'},
      {id: 3, name: 'Christine'}
    ].forEach(function(k) {
      tx.insert(kids, k).onsuccess = function(t, id) {
        document.getElementById('display').textContent +=
          '\tSaved record for ' + k.name + ' with id ' + id + '\n';
      };
    });
    [
      {id: 1, name: 'toffee-apple'},
      {id: 2, name: 'bonbon'}
    ].forEach(function(c) {
      tx.insert(candy, c).onsuccess = function(t, id) {
        document.getElementById('display').textContent +=
          '\tSaved record for ' + c.name + ' with id ' + id + '\n';
      };
    });
    [
      {kid: 1, candy: 1, date: '1/1/2010'},
      {kid: 1, candy: 2, date: '2/1/2010'},
      {kid: 2, candy: 2, date: '2/1/2010'},
      {kid: 3, candy: 1, date: '1/1/2010'},
      {kid: 3, candy: 1, date: '2/1/2010'},
      {kid: 3, candy: 1, date: '3/1/2010'}
    ].forEach(function(s) {
      tx.insert(candySales, s).onsuccess = function(t, id) {
        document.getElementById('display').textContent +=
          '\tSaved record for ' + s.kid + '/' + s.candy + ' with id ' + id + '\n';
      };
    });
  });
  i.onsuccess = function() {
    var q1 = db.transaction(function(tx) {
      tx.query(kids.project(kids.attributes.name)).onsuccess = function(t, names) {
        names.forEach(function(name) {
          document.getElementById('kidList').textContent += '\t' + name + '\n';
        });
      };
    });
    q1.onsuccess = function() {
      var q2 = db.transaction(function(tx) {
        tx.query(
          kids.join(candySales, kids.attributes.id.eq(candySales.attributes.kid))
              .group(candySales.attributes.kid)
              .project({name: kids.attributes.name, count: kids.attributes.name.count()})
        ).onsuccess = function(t, results) {
          var display = document.getElementById('purchaseList');
          results.forEach(function(item) {
            display.textContent += '\t' + item.name + ' bought ' + item.count + ' pieces\n';
          });
        };
      });
    };
  };
}

Cheers, Keean. On 9 November 2010 17:13, Keean Schupke ke...@fry
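For readers less familiar with the relational operators, the q2 query above (join kids to candySales on id = kid, group by kid, project name and a count) is equivalent to roughly this plain-JavaScript sketch over the same example data:

```javascript
// Plain-array equivalent of the q2 query: join kids to candySales on
// kids.id = candySales.kid, group by kid, and count purchases per kid.
const kidRows = [
  { id: 1, name: 'Anna' }, { id: 2, name: 'Betty' }, { id: 3, name: 'Christine' }
];
const saleRows = [
  { kid: 1, candy: 1 }, { kid: 1, candy: 2 }, { kid: 2, candy: 2 },
  { kid: 3, candy: 1 }, { kid: 3, candy: 1 }, { kid: 3, candy: 1 }
];

// join on kids.id = candySales.kid
const joined = [];
kidRows.forEach(function (k) {
  saleRows.forEach(function (s) {
    if (k.id === s.kid) joined.push({ name: k.name, kid: s.kid });
  });
});

// group by kid, project {name, count}
const groups = {};
joined.forEach(function (row) {
  if (!groups[row.kid]) groups[row.kid] = { name: row.name, count: 0 };
  groups[row.kid].count += 1;
});
const results = Object.keys(groups).map(function (k) { return groups[k]; });
// Anna bought 2, Betty bought 1, Christine bought 3
```

Note that projecting `name` here is safe only because each group corresponds to exactly one kid, which is the single-value-per-group condition the email mentions is not yet checked.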
Re: Relational Data Model Example
Well, the implementation is not running on IndexedDB yet... however I can see no fundamental problems that will stop the implementation. I am sure once I get into the details there will be issues - but I expect these to be performance related. The plan is to continue to refine the common abstraction part of the prototype - I want to complete the relational data model - then start the IndexedDB backend. I'll let you know when I have something on IndexedDB. Cheers, Keean. On 11 November 2010 17:35, Jonas Sicking jo...@sicking.cc wrote: Hi Keean, This is awesome stuff! Very excited to see libraries that can run both on top of IndexedDB and on top of WebSQL. Would love to hear more about your experience working against the IndexedDB API. / Jonas On Thu, Nov 11, 2010 at 5:42 AM, Keean Schupke ke...@fry-it.com wrote: Hi, Here are the Mozilla IndexedDB examples converted to use the relational data model. Points to note: - The database is validated (that is, the schema in the JavaScript is either used to create the database if it does not exist, or to make sure that the database conforms to the schema if it does exist. Currently we require an exact match for validation to succeed, however the final version will use nullable and default values to allow attributes to be added to existing relations, or attributes ignored providing the required pre-conditions are met). - The 'true' at the end of the validate function tells it to drop the existing relations, so we always start with an empty database. - We add more data than the original insert example so there are some results from the join query. - There is no single-value-per-group test yet for project. But effectively when grouping by a unique attribute (like id) any attribute in the same (pre-join) relation is acceptable, as well as the attribute joined to in the other relation, but no other attribute if the joined-to column is not unique (the case in the example). 
var rdm = new RelationalDataModel; var rdb = new rdm.WebSQLiteDataAdapter; var kids = rdm.relation('kids', { id: rdm.attribute('id', rdm.integer, {auto_increment: true}), name: rdm.attribute('name', rdm.string) }); var candy = rdm.relation('candy', { id: rdm.attribute('id', rdm.integer, {auto_increment: true}), name: rdm.attribute('name', rdm.string) }); var candySales = rdm.relation('candySales', { kid: rdm.attribute('kid', rdm.integer), candy: rdm.attribute('candy', rdm.integer), date: rdm.attribute('date', rdm.string) }); var v = rdb.validate('CandyDB', 1.0, [kids, candy, candySales], true).onsuccess = function(db) { // new database has been created, or existing database has been _validated_ var i = db.transaction(function(tx) { [ {id: 1, name: 'Anna'}, {id: 2, name: 'Betty'}, {id: 3, name: 'Christine'} ].forEach(function(k) { tx.insert(kids, k).onsuccess = function(t, id) { document.getElementById('display').textContent += '\tSaved record for ' + k.name + ' with id ' + id + '\n'; }; }); [ {id: 1, name: 'toffee-apple'}, {id: 2, name: 'bonbon'} ].forEach(function(c) { tx.insert(candy, c).onsuccess = function(t, id) { document.getElementById('display').textContent += '\tSaved record for ' + c.name + ' with id ' + id + '\n'; }; }); [ {kid: 1, candy: 1, date: '1/1/2010'}, {kid: 1, candy: 2, date: '2/1/2010'}, {kid: 2, candy: 2, date: '2/1/2010'}, {kid: 3, candy: 1, date: '1/1/2010'}, {kid: 3, candy: 1, date: '2/1/2010'}, {kid: 3, candy: 1, date: '3/1/2010'} ].forEach(function(s) { tx.insert(candySales, s).onsuccess = function(t, id) { document.getElementById('display').textContent += '\tSaved record for ' + s.kid + '/' + s.candy + ' with id ' + id + '\n'; }; }); }); i.onsuccess = function() { var q1 = db.transaction(function(tx) { tx.query(kids.project(kids.attributes.name)).onsuccess = function(t, names) { names.forEach(function(name) { document.getElementById('kidList').textContent += '\t' + name + '\n
Re: [Bug 11270] New: Interaction between in-line keys and key generators
The other thing you could do is specify that when you get a wrap (i.e. someone inserts a key of MAXINT - 1) you auto-compact the table. If you really have run out of indexes there is not a lot you can do. The other thing to consider is that because JS uses signed arithmetic, it's really a 63-bit number... unless you want negative indexes appearing? (And how would that affect ordering and sorting?) Cheers, Keean. On 12 November 2010 07:36, Jeremy Orlow jor...@chromium.org wrote: On Fri, Nov 12, 2010 at 10:08 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Nov 11, 2010 at 9:22 PM, Jeremy Orlow jor...@chromium.org wrote: On Fri, Nov 12, 2010 at 12:32 AM, Jonas Sicking jo...@sicking.cc wrote: On Thu, Nov 11, 2010 at 11:41 AM, Jeremy Orlow jor...@chromium.org wrote: On Thu, Nov 11, 2010 at 6:41 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Thu, Nov 11, 2010 at 4:20 AM, Jeremy Orlow jor...@chromium.org wrote: What would we do if what they provided was not an integer? The behavior isn't very important; throwing would be fine here. In mySQL, you can only put AUTO_INCREMENT on columns in the integer family. What happens if the number they insert is so big that the next one causes overflow? The same thing that happens if you do ++ on a variable holding a number that's too large. Or, more directly, the same thing that happens if you somehow fill up a table to the integer limit (probably deleting rows along the way to free up space), and then try to add a new row. What is the use case for this? Do we really think that most of the time users do this it'll be intentional and not just a mistake? A big one is importing some data into a live table. Many smaller ones are related to implicit data constraints that exist in the application but aren't directly expressed in the table. I've had several times when I could normally just rely on auto-numbering for something, but occasionally, due to other data I was inserting elsewhere, had to specify a particular id. 
This assumes that your autonumbers aren't going to overlap and is going to behave really badly when they do. Honestly, I don't care too much about this, but I'm skeptical we're doing the right thing here. Pablo did bring up a good use case, which is wanting to migrate existing data to a new object store, for example with a new schema. And every database examined so far has some ability to specify autonumbered columns. overlaps aren't a problem in practice since 64bit integers are really really big. So unless someone maliciously sets a number close to the upper bound of that then overlaps won't be a problem. Yes, but we'd need to spec this, implement it, and test it because someone will try to do this maliciously. I'd say it's fine to treat the range of IDs as a hardware limitation. I.e. similarly to how we don't specify how much data a webpage is allowed to put into DOMStrings, at some point every implementation is going to run out of memory and effectively limit it. In practice this isn't a problem since the limit is high enough. Another would be to define that the ID is 64 bit and if you run out of IDs no more rows can be inserted into the objectStore. At that point the page is responsible for creating a new object store and compacting down IDs. In practice no page will run into this limitation if they use IDs increasing by one. Even if you generate a new ID a million times a second, it'll still take you over half a million years to run out of 64bit IDs. This seems reasonable. OK, let's do it. And, in the email you replied right under, I brought up the point that this feature won't help someone who's trying to import data into a table that already has data in it because some of it might clash. So, just to make sure we're all on the same page, the use case for this is restoring data into an _empty_ object store, right? (Because I don't think this is a good solution for much else.) That's the main scenario I can think of that would require this yes. / Jonas
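Jonas's "over half a million years" figure is easy to check; the numbers below are a back-of-envelope estimate assuming a constant one million new IDs per second, for both the unsigned 64-bit range and the signed 63-bit range Keean mentions:

```javascript
// Back-of-envelope check of the exhaustion argument: at one million new
// IDs per second, how long until a 64-bit (or signed-63-bit) counter
// runs out?
const idsPerSecond = 1e6;
const secondsPerYear = 3600 * 24 * 365;

const years64 = Math.pow(2, 64) / idsPerSecond / secondsPerYear;
const years63 = Math.pow(2, 63) / idsPerSecond / secondsPerYear;

console.log(Math.round(years64)); // ~584,942 years: "over half a million"
console.log(Math.round(years63)); // ~292,471 years for the signed range
```

Either way, the "treat it as a hardware limitation" position holds: no honest counter incrementing by one will get anywhere near the limit.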
Re: [Bug 11270] New: Interaction between in-line keys and key generators
What do databases usually do with columns that use autoincrement but a value is still supplied? My recollection is that that is generally allowed? You can normally insert with a supplied key providing it is unique. Cheers, Keean. On 10 November 2010 22:07, Jonas Sicking jo...@sicking.cc wrote: On Wed, Nov 10, 2010 at 1:50 PM, Tab Atkins Jr. jackalm...@gmail.com wrote: On Wed, Nov 10, 2010 at 1:43 PM, Pablo Castro pablo.cas...@microsoft.com wrote: From: public-webapps-requ...@w3.org [mailto: public-webapps-requ...@w3.org] On Behalf Of bugzi...@jessica.w3.org Sent: Monday, November 08, 2010 5:07 PM So what happens if trying save in an object store which has the following keypath, the following value. (The generated key is 4): foo.bar { foo: {} } Here the resulting object is clearly { foo: { bar: 4 } } But what about foo.bar { foo: { bar: 10 } } Does this use the value 10 rather than generate a new key, does it throw an exception or does it store the value { foo: { bar: 4 } }? I suspect that all options are somewhat arbitrary here. I'll just propose that we error out to ensure that nobody has the wrong expectations about the implementation preserving the initial value. I would be open to other options except silently overwriting the initial value with a generated one, as that's likely to confuse folks. It's relatively common for me to need to supply a manual value for an id field that's automatically generated when working with databases, and I don't see any particular reason that my situation would change if using IndexedDB. So I think that a manually-supplied key should be kept. I'm fine with either solution here. My database experience is too weak to have strong opinions on this matter. What do databases usually do with columns that use autoincrement but a value is still supplied? My recollection is that that is generally allowed? 
What happens if the property is missing several parents, such as foo.bar.baz { zip: {} } Does this throw or does it store { zip: {}, foo: { bar: { baz: 4 } } } We should just complete the object with all the missing parents. Agreed. Works for me. If we end up allowing array indexes in key paths (like foo[1].bar) what does the following keypath/object result in? I think we can live without array indexing in keys for this round, it's probably best to just leave them out and only allow paths. Agreed. Works for me. Actually, we could go even further and disallow paths entirely, and just allow a property name. That is what the firefox implementation currently does. That also sidesteps the issue of missing parents. / Jonas
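The "complete the object with all the missing parents" rule agreed above can be sketched as follows; setKeyPath is a hypothetical helper for illustration, not part of the IndexedDB API:

```javascript
// Sketch of "complete the object with all the missing parents": walk a
// dotted key path, creating empty objects for any missing parent, then
// set the generated key at the leaf.
function setKeyPath(obj, keyPath, generatedKey) {
  const parts = keyPath.split('.');
  let node = obj;
  for (let i = 0; i < parts.length - 1; i++) {
    if (node[parts[i]] === undefined) node[parts[i]] = {}; // create missing parent
    node = node[parts[i]];
  }
  node[parts[parts.length - 1]] = generatedKey;
  return obj;
}

const value = { zip: {} };
setKeyPath(value, 'foo.bar.baz', 4);
// value is now { zip: {}, foo: { bar: { baz: 4 } } }
```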
Relational Data Model Example
Hi, I have completed the first stage of the Relational Data Model prototype. Error checking is not complete (for example aggregate functions can be nested currently, and this should not be allowed). So it should work for correct examples, but may not generate an error (or the correct error) for incorrect examples. The library (available at http://keean.fry-it.com/relational.js) only implements the WebSQL backend at the moment, as this was the quickest to get up and running. I plan to implement a JavaScript Object backend (i.e. relational operations in memory) and the IndexedDB backend. There is a simple first example (available at http://keean.fry-it.com/cuboid.html) that shows calculating the average volume of a collection of cuboids the relational way. Attached at the end is the JavaScript source for the cuboid example. Comments appreciated. Cheers, Keean.

try {
  var rdm = new RelationalDataModel;
  var rdb = new rdm.WebSQLiteDataAdapter;

  var cuboid_id = rdm.domain('id', rdm.integer, {not_null: true});
  var dimension = rdm.domain('dimension', rdm.number, {not_null: true});

  var cuboids = rdm.relation('cuboids', {
    id: rdm.attribute('id', cuboid_id, {auto_increment: true}),
    length: rdm.attribute('length', dimension),
    width: rdm.attribute('width', dimension),
    height: rdm.attribute('height', dimension)
  });

  var v = rdb.validate('cubeoid_db', 1.0, [cuboids]);
  v.onerror = function(error) { alert('ValidateError: ' + error.message); };
  v.onsuccess = function(db) {
    var insert = db.transaction(function(tx) {
      tx.insert(cuboids, {width:10.0, length:10.0, height:10.0});
      tx.insert(cuboids, {width:13.5, length:17.2, height:10.1});
      tx.insert(cuboids, {width:23.1, length:7.9, height:9.5});
    });
    insert.onerror = function(error) { alert('InsertTransactionError: ' + error.message); };
    insert.onsuccess = function() {
      var query = db.transaction(function(tx) {
        var average_volume = cuboids.attributes.length
          .mul(cuboids.attributes.width)
          .mul(cuboids.attributes.height)
          .avg();
        var q = tx.query(cuboids.project({avg_vol: average_volume}));
        q.onsuccess = function(t, results) {
          var s = '';
          results.forEach(function(r) { s += r.avg_vol + '\n'; });
          alert(s);
        };
      });
      query.onerror = function(error) { alert('QueryTransactionError: ' + error.message); };
    };
  };
} catch (e) { alert(e.stack); }
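As a cross-check of the example above, the average-volume aggregate (avg of length * width * height) amounts to the following in plain JavaScript, using the same three cuboids:

```javascript
// Plain-JavaScript equivalent of the average-volume query above:
// avg(length * width * height) over the three inserted cuboids.
const cuboidRows = [
  { width: 10.0, length: 10.0, height: 10.0 },
  { width: 13.5, length: 17.2, height: 10.1 },
  { width: 23.1, length: 7.9, height: 9.5 }
];
const totalVolume = cuboidRows.reduce(function (sum, c) {
  return sum + c.length * c.width * c.height;
}, 0);
const avgVolume = totalVolume / cuboidRows.length; // ≈ 1692.96
```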
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
It would make sense if you make setting a key to undefined semantically equivalent to deleting the value (and no error if it does not exist), and return undefined on a get when no such key exists. That way 'undefined' cannot exist as a value in the object store, and is a safe marker for the key not existing in that index. Cheers, Keean. On 8 November 2010 17:52, Tab Atkins Jr. jackalm...@gmail.com wrote: On Mon, Nov 8, 2010 at 8:24 AM, Jonas Sicking jo...@sicking.cc wrote: Hi All, One of the things we discussed at TPAC was the fact that IDBObjectStore.get() and IDBObjectStore.delete() currently fire an error event if no record with the supplied key exists. Especially for .delete() this seems suboptimal as the author wanted the entry with the given key removed anyway. A better alternative here seems to be to return (through a success event) true or false to indicate if a record was actually removed. For IDBObjectStore.get() it also seems like it will create an error event in situations which aren't unexpected at all. For example checking for the existence of certain information, or getting information if it's there, but using some type of default if it's not. An obvious choice here is to simply return (through a success event) undefined if no entry is found. The downside with this is that you can't tell the lack of an entry apart from an entry stored with the value undefined. However it seemed more rare to want to tell those apart (you can generally store something other than undefined), than to end up in situations where you'd want to get() something which possibly didn't exist. Additionally, you can still use openCursor() to tell the two apart if really desired. I've for now checked in this change [1], but please speak up if you think this is a bad idea for whatever reason. 
In general I'd disagree with you on get(), and point to basically all hash-table implementations which all give a way of telling whether you got a result or not, but the fact that javascript has false, null, *and* undefined makes me okay with this. I believe it's sufficient to use 'undefined' as the flag for there was nothing for this key in the objectstore, and just tell authors don't put undefined in an objectstore; use false or null instead. ~TJ
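The ambiguity being weighed here can be demonstrated with a plain Map standing in for an object store; Map.has plays the role of an "exists" check, analogous to the openCursor workaround mentioned in the thread:

```javascript
// A plain Map standing in for an object store: get() returns undefined
// both for a missing key and for a key stored with the value undefined;
// has() (an existence check) is what tells the two apart.
const store = new Map();
store.set('a', undefined); // an entry whose stored value is undefined
// 'b' was never stored

const sameResult = store.get('a') === store.get('b');      // both undefined
const distinguishable = store.has('a') !== store.has('b'); // true vs false
```

This is exactly why Tab's advice ("don't put undefined in an objectstore; use false or null instead") makes the undefined-means-absent convention workable.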
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
Hi, In code, if:

  idbObjectStoreSync.put(key, undefined)

does the same as

  idbObjectStoreSync.remove(key)

then idbObjectStoreSync.get(key) can safely return undefined for 'no such key exists'. Consider:

  idbObjectStoreSync.put('mykey', undefined); // deletes the object stored under mykey, or no-op
  idbObjectStoreSync.get('mykey');            // returns 'undefined'
  idbObjectStoreSync.put('mykey', myobject);
  idbObjectStoreSync.get('mykey');            // returns 'myobject'
  idbObjectStoreSync.put('mykey', undefined); // deletes the object stored under mykey, or no-op
  idbObjectStoreSync.get('mykey');            // returns 'undefined'

Cheers, Keean. On 8 November 2010 18:27, Jonas Sicking jo...@sicking.cc wrote: On Mon, Nov 8, 2010 at 10:06 AM, Keean Schupke ke...@fry-it.com wrote: It would make sense if you make setting a key to undefined semantically equivalent to deleting the value (and no error if it does not exist), and return undefined on a get when no such key exists. That way 'undefined' cannot exist as a value in the object store, and is a safe marker for the key not existing in that index. I'm not sure I follow. There is no way to set a key on an existing entry in an object store. The closest thing would be IDBCursor.update(), but it specifically disallow changing the key at all. / Jonas
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
Obviously I need to put the key and value the correct way around for 'put'... Cheers, Keean. On 8 November 2010 18:41, Keean Schupke ke...@fry-it.com wrote: Hi, In code, if: idbObjectStoreSync.put(key, undefined) does the same as idbObjectStoreSync.remove(key) then idbObjectStoreSync.get(key) can safely return undefined for no such key exists. Consider: idbObjectStoreSync.put('mykey', undefined); // deletes the object stored under mykey or noop. idbObjectStoreSync.get('mykey'); // returns 'undefined' idbObjectStoreSync.put('mykey', myobject); idbObjectStoreSync.get('mykey'); // returns 'myobject' idbObjectStoreSync.put('mykey', undefined); // deletes the object stored under mykey or noop. idbObjectStoreSync.get('mykey'); // returns 'undefined' Cheers, Keean. On 8 November 2010 18:27, Jonas Sicking jo...@sicking.cc wrote: On Mon, Nov 8, 2010 at 10:06 AM, Keean Schupke ke...@fry-it.com wrote: It would make sense if you make setting a key to undefined semantically equivalent to deleting the value (and no error if it does not exist), and return undefined on a get when no such key exists. That way 'undefined' cannot exist as a value in the object store, and is a safe marker for the key not existing in that index. I'm not sure I follow. There is no way to set a key on an existing entry in an object store. The closest thing would be IDBCursor.update(), but it specifically disallow changing the key at all. / Jonas
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
Hi, Indeed. But I think this is more unexpected and confusing than having .get() return the same thing if the entry exists as if it contains undefined. / Jonas I don't understand that. With the proposal, undefined clearly means the entry does not exist, as there is no way to put an undefined into the object store (as .put(undefined, key) deletes the entry). Cheers, Keean.
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
I was only suggesting this as it makes the operations symmetrical, in the sense that if get returns undefined for 'key does not exist', put(undefined, key) should mean 'make this key not exist', in a declarative sense. For me this is clearer than the alternatives (which may require exceptions to deal with some cases). Of course it's only a suggestion, and if nobody likes it, feel free to ignore it. Cheers, Keean. On 8 November 2010 18:57, Keean Schupke ke...@fry-it.com wrote: Hi, Indeed. But I think this is more unexpected and confusing than having .get() return the same thing if the entry exists as if it contains undefined. / Jonas I don't understand that. with the proposal, undefined clearly means the entry does not exist as there is no way to put an undefined into the object store (as .put(undefined, key) deletes the entry). Cheers, Keean.
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
Hi, I don't understand that. with the proposal, undefined clearly means the entry does not exist as there is no way to put an undefined into the object store (as .put(undefined, key) deletes the entry). The confusing part is that a function called 'put' actually deletes something, especially since we also have a 'delete' function. Sure, you could get rid of the delete function :-) I think the meaning of put(undefined, key) is pretty clear. I would put the question this way: What problem are you trying to solve? If the problem is that people can't store undefined and then tell undefined apart from not there then your proposal doesn't solve that problem as undefined can't be stored at all. Precisely, the solution I am proposing is based on disallowing storing of 'undefined'. What does it mean to store 'undefined' anyway? People can still use null. If you disallow storing 'undefined', put(undefined, key) would need to throw an exception. I am proposing having put(undefined, key) be the same as remove(key) to avoid having an exception. After all the initial concern was avoiding having to handle exceptions. Cheers, Keean
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
What is the use case for storing undefined in an object-store? Cheers, Keean. On 8 November 2010 20:59, Jonas Sicking jo...@sicking.cc wrote: On Mon, Nov 8, 2010 at 12:02 PM, Keean Schupke ke...@fry-it.com wrote: Hi, I don't understand that. with the proposal, undefined clearly means the entry does not exist as there is no way to put an undefined into the object store (as .put(undefined, key) deletes the entry). The confusing part is that a function called 'put' actually deletes something, especially since we also have a 'delete' function. Sure, you could get rid of the delete function :-) I think the meaning of put(undefined, key) is pretty clear. I guess we'll have to agree to disagree on that one :) My concern with something like this is that we'll see code do stuff like: function myStoreFunction(objectStoreName, key, value) { os = db.transaction([objectStoreName]).objectStore(objectStoreName); if (value === undefined) { os.put(null, key); } else { os.put(value, key); } } which does not seem like a net win for anyone. I would put the question this way: What problem are you trying to solve? If the problem is that people can't store undefined and then tell undefined apart from not there then your proposal doesn't solve that problem as undefined can't be stored at all. Precisely, the solution I am proposing is based on disallowing storing of 'undefined'. What does it mean to store 'undefined' anyway? People can still use null. Wait, your solution doesn't solve the above described problem. The described problem was People can't store undefined and then tell undefined apart from not there then your proposal doesn't solve that problem as undefined can't be stored at all. Your solution doesn't solve that problem. / Jonas
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
Let me put it another way. Why do you want to allow putting 'undefined' into the object store? All that does is make the API for get ambiguous. What does it gain you? Why do you want to make 'get' ambiguous? I think having an unambiguous API for 'get' is worth more than being able to 'put' 'undefined' values into the object store. Cheers, Keean. On 8 November 2010 23:10, Jonas Sicking jo...@sicking.cc wrote: On Mon, Nov 8, 2010 at 2:39 PM, Keean Schupke ke...@fry-it.com wrote: The problem I am trying to solve is not knowing if get(key) === undefined means the key does not exist or there is a key with a value of undefined. The solution is to disallow inserting undefined. Now there is no ambiguity, if get(key) returns undefined, it _must_ be because the key does not exist. Does this make sense so far? But if saying you're not allowed to insert undefined as value is an acceptable solution, why isn't you can't tell them apart using get() an acceptable solution? What use case does the first solution cater to that isn't solved by the second solution? / Jonas
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
If more than one developer is working on a project, there is no way I can know if the other developer has put 'undefined' objects into the store (unless the specification enforces it). So every time I am checking if a key exists (maybe to delete the key) I need to check if it _really_ exists, or else I can run into problems. For example: In module A: put(undefined, key); In module B: if (get(key) !== undefined) { remove(key); } So the object store will fill up with key = undefined until we run out of memory. Cheers, Keean. On 8 November 2010 23:24, Jonas Sicking jo...@sicking.cc wrote: On Mon, Nov 8, 2010 at 3:18 PM, Keean Schupke ke...@fry-it.com wrote: Let me put it another way. Why do you want to allow putting 'undefined' into the object store? All that does is make the API for get ambiguous. What does it gain you? Why do you want to make 'get' ambiguous? It seems like a lose-lose situation to prevent it. Implementors will have to add code to check for 'undefined' all the time, and users of the API can't store 'undefined' if they would like to. I think having an unambiguous API for 'get' is worth more than being able to 'put' 'undefined' values into the object store. Can you describe the application that would be easier to write, possible to write, faster to run or have cleaner code if we forbade putting 'undefined' in an object store? / Jonas
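The module A / module B scenario described above can be made concrete with a Map standing in for the store; as described, module B's guarded delete never fires and the undefined-valued entry leaks:

```javascript
// Runnable sketch of the module A / module B scenario, with a Map as
// the store: module B's existence check sees undefined and never
// deletes, so entries stored with the value undefined accumulate.
const db = new Map();

function moduleA(key) { db.set(key, undefined); } // stores undefined
function moduleB(key) {                           // guarded delete
  if (db.get(key) !== undefined) db.delete(key);
}

moduleA('k1');
moduleB('k1');          // the check fails, nothing is deleted
const leaked = db.size; // 1: the entry is still there
```

An unconditional delete (or an explicit has()-style existence check) avoids the leak, which is the direction the thread eventually takes.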
Re: [IndexedDB] Behavior of IDBObjectStore.get() and IDBObjectStore.delete() when record doesn't exist
Sounds good to me... Cheers, Keean. On 9 November 2010 00:16, Jonas Sicking jo...@sicking.cc wrote: On Mon, Nov 8, 2010 at 4:04 PM, Keean Schupke ke...@fry-it.com wrote: Hi, Why do you want to check that a key exists before you delete it? Why not just call delete(key) always and rest assured that it's gone? Because it will throw an exception if the key does not exist... That is no longer the case, see the first email in this thread :) Similar to Kris, I think worrying about 'undefined' is worrying about an edge case. Simplicity is better than trying to cover every possible edge case. I thought edge cases are precisely what a specification is supposed to deal with. A spec can never cover 100% of all use cases. Often covering the last 10-20% of the use cases adds as much complexity or API surface, if not more, as covering the first 80-90%. The trick really is to know when to stop. Anyway, although I don't agree with the other reasons, I find the array case compelling. So let's ignore the proposal to disallow storing undefined. Perhaps you could add a boolean method exists(key) to IDBObjectStore to make it easier to tell the two apart. Note that you can easily do this using openCursor already. In the synchronous API you could easily implement exists by doing: IDBObjectStoreSync.prototype.exists = function(key) { return this.openCursor(key) !== undefined; } I think we should keep exists() in mind for v2 of the interface. It has other benefits over get() and openCursor() in that if the stored value is very big it doesn't require time to deserialize it out of the database. But given how close we are to finishing v1 I'd rather not add it now. I have added it to my stuff we should reexamine in v2 list though. / Jonas
Re: Replacing WebSQL with a Relational Data Model.
Hi Nathan,

On 27 October 2010 08:58, Nathan Kitchen w...@nathankitchen.com wrote: The most obvious problem was that it was tied so tightly to SQLite (which I think everyone would be amazed if MS started shipping with IE10). They'd want to use Access/SQL Compact, and suddenly we'd all have different SQL dialects to code our offline applications to.

I am sure you are aware, but the relational API I am proposing would not have this problem. The relational algebra is defined independently of any SQL implementation; in fact, it's not even SQL. However, a relational database (like SQLite, MySQL, or Access/SQL Compact) would make the ideal library to use in its implementation, because of the huge amount of work done over many years by researchers and programmers to make a decent relational database engine, work we do not want to have to replicate in JavaScript on top of IndexedDB.

Which is why I agree 100% with this statement: *The critical point here is that we need only one standardized interface, not a perfectly optimized for data-model-x one, not a uses query-language-foo one, just something that we can all use to persist data from javascript, and wrap in other APIs, that way any optimizations made will benefit everybody - regardless of their preferred interface, data model, query style.* And because I agree with it, I think it is critical that a _relationally_complete_ API is standardised (either in this or a later IndexedDB spec, or another spec entirely).

Cheers, Keean.
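To illustrate what "relational but not SQL" can mean, the core algebra operators can be expressed directly as functions over arrays of records, with no query dialect involved (all names and data below are invented for illustration, not the proposed API):

```javascript
// Illustrative only: three relational-algebra operators as plain functions
// over arrays of record objects. No SQL dialect is involved.
function select(relation, predicate) {        // restriction (sigma)
  return relation.filter(predicate);
}
function project(relation, attrs) {           // projection (pi)
  return relation.map(row => Object.fromEntries(attrs.map(a => [a, row[a]])));
}
function join(r, s, attr) {                   // natural join on one shared attribute
  const out = [];
  for (const a of r) for (const b of s)
    if (a[attr] === b[attr]) out.push({ ...a, ...b });
  return out;
}

// Invented sample data, echoing the farm schema discussed in this thread.
const farms = [{ id: 1, name: 'Hill Farm', owner: 7 },
               { id: 2, name: 'Dale Farm', owner: 8 }];
const farmers = [{ owner: 7, farmer: 'Jones' }];

const q = project(select(join(farms, farmers, 'owner'), r => r.id === 1),
                  ['name', 'farmer']);
console.log(q); // [ { name: 'Hill Farm', farmer: 'Jones' } ]
```

A browser implementation would of course compile such expressions down to an efficient engine rather than nested loops; the point is only that the interface can be algebraic rather than string-based.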
Re: Replacing WebSQL with a Relational Data Model.
Sure, the argument has more weight with real numbers. I have started working on the relational schema model in JavaScript. Here is a question: what is preferred in terms of style for declaring a relation? We could have something like:

var FarmTable = {
    id: {name: 'id', domain: FarmId, type: rdm.schema.serial},
    name: {name: 'name', domain: FarmName},
    county: {name: 'county', domain: FarmCounty},
    owner: {name: 'owner', domain: FarmerId}
};

This is concise, but little checking is done. Alternatively:

var FarmTable = new Relation(
    new Attribute('id', FarmId, rdm.schema.serial),
    new Attribute('name', FarmName),
    new Attribute('county', FarmCounty),
    new Attribute('owner', FarmerId)
);

Or perhaps something else?

Cheers, Keean.

On 27 October 2010 09:24, Jonas Sicking jo...@sicking.cc wrote: On Wed, Oct 27, 2010 at 1:04 AM, Keean Schupke ke...@fry-it.com wrote: So, my point was that although IndexedDB is neither optimal for your preferred data model nor mine, it does cater for us both, and everybody else, allowing us to get on and do our jobs, implement APIs, and build HTML5 client-side web applications. This is where we differ: while I think it may allow it, it will not make it practical (from the programmer's point of view) nor usable (for the end user trying to use the app). Remember we have to perform reasonably against native iPhone / Android apps or people will not use HTML5 apps.

I'd encourage you to do some testing, run some performance numbers, and report back for cases where things are too slow. That good performance is required in order to consider a use case met is hopefully obvious to everyone here. The whole point of IndexedDB is good performance; other than performance it doesn't provide anything that localStorage doesn't. / Jonas
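To make the trade-off concrete, here is a rough sketch of what the second, checked style could look like. The constructor bodies and the string domains are invented for illustration (the real FarmId, FarmName etc. would presumably be domain objects); this is not a settled API:

```javascript
// Sketch of the checked declaration style: constructors can validate
// attribute definitions at declaration time, which the plain
// object-literal style cannot.
function Attribute(name, domain, type) {
  if (typeof name !== 'string' || name.length === 0)
    throw new TypeError('attribute name must be a non-empty string');
  if (domain === undefined)
    throw new TypeError('attribute needs a domain');
  this.name = name;
  this.domain = domain;
  this.type = type; // e.g. a serial marker; optional
}

function Relation() {
  this.attributes = {};
  for (let i = 0; i < arguments.length; i++) {
    const attr = arguments[i];
    if (!(attr instanceof Attribute)) throw new TypeError('expected an Attribute');
    if (attr.name in this.attributes) throw new Error('duplicate attribute: ' + attr.name);
    this.attributes[attr.name] = attr;
  }
}

// Strings stand in for the domain objects here, purely for illustration.
const FarmTable = new Relation(
  new Attribute('id', 'FarmId', 'serial'),
  new Attribute('name', 'FarmName'),
  new Attribute('county', 'FarmCounty'),
  new Attribute('owner', 'FarmerId')
);
console.log(Object.keys(FarmTable.attributes)); // [ 'id', 'name', 'county', 'owner' ]
```

The constructor style also removes the duplication of the attribute name as both key and `name` field that the literal style requires.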
Re: Replacing WebSQL with a Relational Data Model.
On that point: it should be possible to build an efficient text search on top of IndexedDB. You need a word index that links each word to multiple documents. Matching documents are found by taking the intersection of the sets of documents found for each word in the query (for an unstructured query). You would put the documents themselves in localStorage, and build a word index in IndexedDB, where each record contains a list of document references.

Cheers, Keean.

On 27 October 2010 09:43, Nathan Kitchen w...@nathankitchen.com wrote: Sorry Keean, the main point of my post was to introduce the [featurecreep /], not critique your suggestions. I don't honestly care about the implementation of persistent browser storage, but I do care that it's fully featured. nat...@webr3.org noted that we just need something to persist data from javascript. Although I agree with this, I think we additionally need native full-text search *as well as* CRUD. The Gears implementation of FTS (or rather, SQLite's) exposed useful functionality, but that needs ... In hindsight it was a little off-topic, but I saw it fly past and thought that while we were discussing offline storage features it'd be a good point to raise FTS. I'm also not sure about persisted JSON structures vs relational objects, but I'm happy to see how the current spec pans out. It certainly involves thinking about an application's data architecture in a different way, though.

On Wed, Oct 27, 2010 at 9:10 AM, Keean Schupke ke...@fry-it.com wrote: Hi Nathan, On 27 October 2010 08:58, Nathan Kitchen w...@nathankitchen.com wrote: The most obvious problem was that it was tied so tightly to SQLite (which I think everyone would be amazed if MS started shipping with IE10). They'd want to use Access/SQL Compact, and suddenly we'd all have different SQL dialects to code our offline applications to.
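The word-index idea described at the top of this message can be sketched in a few lines. An in-memory Map stands in for the IndexedDB object store here; in a real implementation each index record would hold the document-reference list:

```javascript
// Sketch of the word index: each word maps to a set of document ids, and
// an unstructured query is answered by intersecting those sets.
const wordIndex = new Map(); // word -> Set of document ids

function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

function indexDocument(id, text) {
  for (const word of tokenize(text)) {
    if (!wordIndex.has(word)) wordIndex.set(word, new Set());
    wordIndex.get(word).add(id);
  }
}

function search(query) {
  const sets = tokenize(query).map(w => wordIndex.get(w) || new Set());
  if (sets.length === 0) return [];
  // Intersect, starting from the smallest set for efficiency.
  sets.sort((a, b) => a.size - b.size);
  return [...sets[0]].filter(id => sets.every(s => s.has(id)));
}

indexDocument(1, 'farm in the county');
indexDocument(2, 'the farm owner');
console.log(search('the farm'));   // documents 1 and 2 contain both words
console.log(search('farm owner')); // only document 2 contains both words
```

With the index in IndexedDB, each `wordIndex` lookup becomes a keyed get against an object store, so only the index records for the query words are ever read.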