Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2011-02-07 Thread Glenn Maynard
On Mon, Feb 7, 2011 at 2:38 AM, Jonas Sicking jo...@sicking.cc wrote:

 One problem with putting a limit is that it basically forces
 implementations to use a specific encoding, or pay a hefty price. For
 example if we choose a 64K limit, is that of UTF8 data or of UTF16
 data? If it is of UTF8 data, and the implementation uses something
 else to store the data, you risk having to convert the data just to
 measure the size. Possibly this would be different if we measured size
 using UTF16 as javascript more or less enforces that the source string
 is UTF16 which means that you can measure utf16 size on the cheap,
 even if the stored data uses a different format.


Is that a safe assumption to design around?  The API might later be bound to
other languages fortunate enough not to be stuck in UTF-16.

-- 
Glenn Maynard


Re: [Bug 11348] New: [IndexedDB] Overhaul of the event model

2011-02-07 Thread Simon Pieters

On Wed, 02 Feb 2011 23:28:56 +0100, Jonas Sicking jo...@sicking.cc wrote:


On Wed, Feb 2, 2011 at 2:10 PM, Jeremy Orlow jor...@chromium.org wrote:
Just to confirm, we don't want the events to propagate to the window
itself, right?


Correct. Sort of. Here's what we did in gecko:

The event propagation path is request → transaction → database. This
goes for both success and error events. However success doesn't
bubble, so normal event handlers don't fire on the transaction or
database for success. But if you really want, you can attach a
capturing listener using .addEventListener and listen to them there.
This matches events fired on nodes.

For abort events the propagation path is just transaction → database,
since the target of abort events is the transaction.

So far this matches what you said.

However, we also wanted to integrate the window.onerror feature in
HTML5. So after we've fired an error event, if .preventDefault() was
never called on the event, we fire an error event on the window (can't
remember if this happens before or after we abort the transaction).
This is a separate event, which for example means that even if you
attach a capturing error handler on window, you won't see any events
unless an error really went unhandled. And you also can't call
.preventDefault on the error event fired on the window in order to
prevent the transaction from being aborted. It's purely there for
error reporting and distinctly different from the event propagating to
the window.
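
In code this looks roughly like the following (a sketch; vendor
prefixes are omitted, the exact mode constant may differ, and the store
name is illustrative):

  var tx = db.transaction(["books"], IDBTransaction.READ_WRITE);
  var req = tx.objectStore("books").put(book);

  // success doesn't bubble, but a capturing listener on the database
  // still sees it, just like capturing listeners on DOM nodes:
  db.addEventListener("success", function (e) { /* ... */ }, true);

  // calling preventDefault() marks the error as handled, so no error
  // event is reported on window:
  req.onerror = function (e) { e.preventDefault(); };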


Hmm. I'm not sure what to think of IndexedDB using window.onerror.  
window.onerror is used for catching JS syntax errors and uncaught  
exceptions in scripts. Also, window.onerror is invoked directly without  
firing an actual event. What's the use case for firing an error event on  
window for IndexedDB?




This is similar to how error events are handled in workers.


Not really. Workers have their own onerror handler in the worker script  
itself, and if the error is still not handled, an error event is fired  
on the worker object, but it stops there; an error event is never fired on  
window.


--
Simon Pieters
Opera Software



Re: Quota API to query/request quota for offline storages (e.g. IndexedDB, FileSystem)

2011-02-07 Thread Kinuko Yasuda
On Sat, Feb 5, 2011 at 7:29 AM, Glenn Maynard gl...@zewt.org wrote:
 On Fri, Feb 4, 2011 at 12:07 AM, Kinuko Yasuda kin...@chromium.org wrote:

 If we want to make the quota API treat each API differently this would
 make a lot of sense, but I'm not fully convinced by the idea.
 Putting aside localStorage for now, do you still see significant
 issues in having a single shared quota?  Also let me note that
 this API does not and should not guarantee that the app can actually
 *write* that amount of data into the storage, even after the quota is
 granted, and the UA could still stop an API to write further even if
 it's within the quota.

 I suppose that even the 2-3x difference--requesting 256 MB and actually
 getting 512 MB over different APIs--is acceptable, since to users,
 requesting storage is an order-of-magnitude question more than a precise
 number.  As long as implementations are still allowed to implement separate
 quotas if they want, it's probably acceptable for this API to not reflect
 them precisely and to be biased towards a shared quota.

If we think that most users/developers wouldn't be interested in
specifying 'give X bytes to storage A' and 'give Y bytes to
storage B' while both storage A and B eat into the user's
disk, then probably UAs should evolve in that direction.  That's my
assumption and why the proposed API looks like that (i.e. biased
towards a shared quota).

 2011/2/4 Ian Fette (イアンフェッティ) ife...@google.com
 For instance, if a user has been using a site for months, uses it
 frequently, and the site hits its 5GB limit but there's still 300GB free on
 the drive, perhaps we just give the site another 5GB and give the user a
 passive indication that we've done so, and let them do something if they
 actually care.

 That's interesting; reducing the amount users are nagged about things that
 they probably don't care about is important.  It would also need to suppress
 prompting from calls to requestQuota if the quota increase would have been
 allowed automatically.

The proposed API itself doesn't specify the frequency of
notifications or the behavior in out-of-quota scenarios, but it
might be worth adding a note that calling 'requestQuota()' does
not (and should not) always need to result in the UA prompting, and
that prompting the user too frequently must generally be prohibited.

The bottom line on whether we should prompt or not is, I suppose: if
the UA asks for the user's permission to store some data in the
storage, the UA shouldn't delete the data without the user's
permission.
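
To make the intended shape concrete, a hypothetical usage sketch (the
interface, method, and constant names here are placeholders; the
proposal's exact IDL may differ):

  // Ask for 256 MB of persistent storage.  The granted amount is an
  // upper bound, not a write guarantee: individual writes can still
  // fail within it, as noted above.
  storageInfo.requestQuota(storageInfo.PERSISTENT, 256 * 1024 * 1024,
    function (grantedBytes) {
      // proceed; keep usage under grantedBytes, but handle write errors
    });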

 --
 Glenn Maynard




Re: [widgets] New Widget Update Types: Kill Switch and Patch

2011-02-07 Thread Scott Wilson
I really like the Kill Switch/EOL idea and having a type attribute to specify 
it, but I'm concerned that the Patch type could be a bit more problematic to 
get consistently implemented.

On 6 Feb 2011, at 17:15, Marcos Caceres wrote:

 Opera would like to discuss adding the following attribute to the update-info 
 element of the widget Updates specification: type.
 
 Details below...
 
 == The type attribute ==
 
 The type attribute serves to inform the user of the type of update that will 
 potentially be performed on a widget. The type ranges over update, patch, 
 and eol (end of life/kill switch). For backwards compatibility, when the 
 attribute is missing or in error, the default behavior is to behave as an 
 update - like we currently do today (see Update below).
 
 <update-info xmlns="http://www.w3.org/ns/widgets"
              type="update|patch|eol"/>
 
 
 === Update ===
 An update is a completely new version of the widget, where all the files of 
 the widget are replaced with the files contained in update. Effectively, an 
 update causes all the files in an installed widget to be deleted, and a new 
 widget to be installed in its place. Only the widget's id and Storage data 
 remain from one version to the next. This is the current and default behavior.
 
 Requirement: when the type attribute is missing, the user agent assumes this 
 is an update. Updates are always applied when the mime type of an update is 
 application/widget.
 
 Example:
 <update-info xmlns   = "http://www.w3.org/ns/widgets"
              src     = "https://w.example.com/2.1/RC/app.wgt"
              version = "2.0"
              type    = "update">
   <details>
     Totally awesome new version!
   </details>
 </update-info>
 
 === Patch ===
 A patch is a partial update to only some files in a widget. Consider the 
 use case below.
 
 Patch Use Case: I have a cookbook extension that contains a bunch of videos, 
 audio, and graphics inside the widget (~500MB). I've updated the javascript 
 in only one file (say ~5KB worth of changes) and added/updated localized 
 content. As a developer, I only want to patch the affected file without 
 having to send the whole widget package as an update. A patch would only 
 add or replace files already contained in the widget package.
 
 Requirements:
 1. Must work with the digital signing scheme for widgets. If the update is 
 patching a digitally signed widget, then the patch must contain a new 
 signature over every file in the widget that is equivalent to the widget 
 having been updated.
 
 Question: Do we need a new mime type for this? (e.g., 
 application/widget-patch).
 
 Example:
 <update-info xmlns   = "http://www.w3.org/ns/widgets"
              src     = "https://w.example.com/2.1/RC/app.wgt"
              version = "2.1"
              type    = "patch">
   <details>Fixed bugs and localized some content</details>
 </update-info>
 
 
 === End of Life - Kill Switch ===
 The eol (end of life) update allows developers to indicate that they are no 
 longer maintaining a widget or provides a means for developers and web site 
 owners to warn users of malicious widgets (or widgets that may have some 
 other issue). In any case, it serves as a kind of kill switch.
 
 Use case - end of life: As a developer, I create widget X for user Y that 
 allows them to access temporary service Z. Service Z is only around for 24 
 hours and widget X is useless without service Z. When widget X updates itself 
 after 24 hours, I send an eol update informing the user that the widget's 
 usefulness has run out. The user can then uninstall the widget.
 
 Use case - kill switch: As someone that runs a catalog, I discover that 
 widget X is malware. Because widget X is served from my catalog and gets its 
 updates from my repo, I can mark the next update to be eol. I also include 
 a description for the author informing them about what issues were found.
 
 Example:
 <update-info xmlns   = "http://www.w3.org/ns/widgets"
              version = "2.0"
              type    = "eol">
   <details>
     A serious security issue was found in this widget.
     It is highly recommended you uninstall it.
   </details>
 </update-info>
 
 
 -- 
 Marcos Caceres
 Opera Software
 





Re: [widgets] New Widget Update Types: Kill Switch and Patch

2011-02-07 Thread Marcos Caceres
On Mon, Feb 7, 2011 at 1:46 PM, Scott Wilson
scott.bradley.wil...@gmail.com wrote:
 I really like the Kill Switch/EOL idea and having a type attribute to 
 specify it, but I'm concerned that the Patch type could be a bit more 
 problematic to get consistently implemented.


Understood. What concerns do you have, or what interop issues do you foresee?



-- 
Marcos Caceres
Opera Software ASA, http://www.opera.com/
http://datadriven.com.au



Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2011-02-07 Thread Shawn Wilsher

On 2/7/2011 12:32 AM, Glenn Maynard wrote:

Is that a safe assumption to design around?  The API might later be bound to
other languages fortunate enough not to be stuck in UTF-16.

As I recall, we've already made design decisions based on the fact that 
the primary consumer of this API is going to be JavaScript on the web. 
(What those decisions were about, I don't recall offhand, however.)


Cheers,

Shawn





Re: [widgets] New Widget Update Types: Kill Switch and Patch

2011-02-07 Thread Scott Wilson

On 7 Feb 2011, at 14:22, Marcos Caceres wrote:

 On Mon, Feb 7, 2011 at 1:46 PM, Scott Wilson
 scott.bradley.wil...@gmail.com wrote:
 I really like the Kill Switch/EOL idea and having a type attribute to 
 specify it, but I'm concerned that the Patch type could be a bit more 
 problematic to get consistently implemented.
 
 
 Understood. What concerns do you have, or what interop issues do you 
 foresee?

Principally the handling of the various update states, rollbacks after failing 
to apply patches, problems with multiple-version-spanning patch updates, that 
kind of thing.

Also, when we unpack a widget and ready it, it's no longer exactly the same as 
the input .wgt, so we'd have to apply the patch against the originally imported 
package rather than the actual installed instance and then load it again, or 
the patch won't take - so we may as well update the whole package anyway.

It's not a bad idea in principle, but potentially a lot of code to save a few 
KB of downloading.

 -- 
 Marcos Caceres
 Opera Software ASA, http://www.opera.com/
 http://datadriven.com.au





Re: [Bug 11348] New: [IndexedDB] Overhaul of the event model

2011-02-07 Thread Jonas Sicking
On Mon, Feb 7, 2011 at 2:22 AM, Simon Pieters sim...@opera.com wrote:
 On Wed, 02 Feb 2011 23:28:56 +0100, Jonas Sicking jo...@sicking.cc wrote:

 On Wed, Feb 2, 2011 at 2:10 PM, Jeremy Orlow jor...@chromium.org wrote:

 Just to confirm, we don't want the events to propagate to the window
 itself, right?

 Correct. Sort of. Here's what we did in gecko:

 The event propagation path is request → transaction → database. This
 goes for both success and error events. However success doesn't
 bubble, so normal event handlers don't fire on the transaction or
 database for success. But if you really want, you can attach a
 capturing listener using .addEventListener and listen to them there.
 This matches events fired on nodes.

 For abort events the propagation path is just transaction → database,
 since the target of abort events is the transaction.

 So far this matches what you said.

 However, we also wanted to integrate the window.onerror feature in
 HTML5. So after we've fired an error event, if .preventDefault() was
 never called on the event, we fire an error event on the window (can't
 remember if this happens before or after we abort the transaction).
 This is a separate event, which for example means that even if you
 attach a capturing error handler on window, you won't see any events
 unless an error really went unhandled. And you also can't call
 .preventDefault on the error event fired on the window in order to
 prevent the transaction from being aborted. It's purely there for
 error reporting and distinctly different from the event propagating to
 the window.

 Hmm. I'm not sure what to think of IndexedDB using window.onerror.
 window.onerror is used for catching JS syntax errors and uncaught exceptions
 in scripts. Also, window.onerror is invoked directly without firing an
 actual event.

Not just syntax errors. At least in Firefox it also fires for uncaught
exceptions.

So basically we fire it for all javascript errors which go unhandled by
the page (there is no way to handle syntax errors, so they always go
unhandled). That is very much the case here; however, since the error
reporting must be asynchronous, we report it using an event rather than
an exception.

 What's the use case for firing an error event on window for IndexedDB?

What is the use case for error events? I've always thought of
window.onerror as a choke point where pages can catch JS errors and
either display them to the user or report them back to the server for
debugging. If that is the case, then this is just another case where
errors can arise.
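
I.e. something like (a sketch; the reporting helper is hypothetical):

  window.onerror = function (message, source, lineno) {
    sendErrorToServer(message, source, lineno); // hypothetical helper
    return true; // returning true suppresses the default reporting
  };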

Do you have another use case in mind?

 This is similar to how error events are handled in workers.

 Not really. Workers have their own onerror handler in the worker script
 itself, and if the error is still not handled, an error event is fired on
 the worker object, but it stops there; an error event is never fired on
 window.

That's not the case in the gecko implementation. But I see the spec
doesn't call for this yet. I'll file a bug on the spec.

/ Jonas



Re: [widgets] New Widget Update Types: Kill Switch and Patch

2011-02-07 Thread Marcos Caceres



On 2/7/11 4:43 PM, Scott Wilson wrote:


On 7 Feb 2011, at 14:22, Marcos Caceres wrote:


On Mon, Feb 7, 2011 at 1:46 PM, Scott Wilson
scott.bradley.wil...@gmail.com  wrote:

I really like the Kill Switch/EOL idea and having a type
attribute to specify it, but I'm concerned that the Patch type
could be a bit more problematic to get consistently implemented.



Understood. What concerns do you have, or what interop issues do
you foresee?


Principally the handling of the various update states, rollbacks
after failing to apply patches, problems with
multiple-version-spanning patch updates, that kind of thing.

Also, when we unpack a widget and ready it, it's no longer exactly the
same as the input .wgt, so we'd have to apply the patch against the
originally imported package rather than the actual installed instance
and then load it again, or the patch won't take - so we may as well
update the whole package anyway.


Both excellent issues.


It's not a bad idea in principle, but potentially a lot of code to
save a few KB of downloading.


I agree. For small widgets this is not an issue. It's for big widgets 
where it becomes a problem. It might be that the user agent could do 
negotiation (e.g., I don't support patches and have plenty of 
bandwidth, just send me the whole thing).


--
Marcos Caceres
Opera Software



[Bug 11669] Section: Parsing an event stream It looks like there is a mistake in this ABNF description, end-of-line is repeated: stream = [ bom ] *event event = *( comment / fiel

2011-02-07 Thread bugzilla
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11669

Ian 'Hixie' Hickson i...@hixie.ch changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Status    |NEW                         |RESOLVED
             CC    |                            |i...@hixie.ch
     Resolution    |                            |WONTFIX

--- Comment #1 from Ian 'Hixie' Hickson i...@hixie.ch 2011-02-07 22:37:11 UTC 
---
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the tracker issue; or you may create a tracker issue
yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: It's repeated because events are separated from each other by a
blank line. Your proposed BNF puts all the comments and fields on one line,
which is unparseable.
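
For illustration, each field line is terminated by its own end-of-line,
and the extra end-of-line in the event production is the blank line
that ends each event (a minimal two-event sample stream):

  data: first event

  data: second event
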

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are on the CC list for the bug.



Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2011-02-07 Thread Jonas Sicking
On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow jor...@chromium.org wrote:
 On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher sdwi...@mozilla.com
  wrote:
 
  On 2/6/2011 12:42 PM, Jeremy Orlow wrote:
 
   My current thinking is that we should have some relatively large
   limit...maybe on the order of 64k?  It seems like it'd be very
   difficult to hit such a limit with any sort of legitimate use case,
   and the chances of some subtle data-dependent error would be much
   less.  But a 1GB key is just not going to work well in any
   implementation (if it doesn't simply oom the process!).  So despite
   what I said earlier, I guess I think we should have some limit...but
   keep it an order of magnitude or two larger than what we expect any
   legitimate usage to hit just to keep the system as flexible as
   possible.
  
   Does that sound reasonable to people?
 
   Are we thinking about making this a MUST requirement, or a SHOULD?  I'm
   hesitant to spec an exact size as a MUST given how technology has a way
   of changing in unexpected ways that makes old constraints obsolete.
   But then, I may just be overly concerned about this too.
 
   If we put a limit, it'd be a MUST for sure.  Otherwise people would
   develop against one of the implementations that don't place a limit
   and then their app would break on the others.
   The reason that I suggested 64K is that it seems outrageously big for
   the data types that we're looking at.  But it's too small to do much
   with base64 encoding binary blobs into it or anything else like that
   that I could see becoming rather large.  So it seems like a limit
   that'd avoid major abuses (where someone is probably approaching the
   problem wrong) but would not come close to limiting any practical use
   I can imagine.
   With our architecture in Chrome, we will probably need to have some
   limit.  We haven't decided what that is yet, but since I remember
   others saying similar things when we talked about this at TPAC, it
   seems like it might be best to standardize it--even though it does
   feel a bit dirty.

 One problem with putting a limit is that it basically forces
 implementations to use a specific encoding, or pay a hefty price. For
 example if we choose a 64K limit, is that of UTF8 data or of UTF16
 data? If it is of UTF8 data, and the implementation uses something
 else to store the data, you risk having to convert the data just to
 measure the size. Possibly this would be different if we measured size
 using UTF16 as javascript more or less enforces that the source string
 is UTF16 which means that you can measure utf16 size on the cheap,
 even if the stored data uses a different format.

 That's a very good point.  What's your suggestion then?  Spec unlimited
 storage and have non-normative text saying that most implementations will
 likely have some limit?  Maybe we can at least spec a minimum limit in terms
 of a particular character encoding?  (Implementations could translate this
 into the worst case size for their own native encoding and then ensure their
 limit is higher.)

I'm fine with relying on UTF16 encoding size and specifying a 64K
limit. Like Shawn points out, this API is fairly geared towards
JavaScript anyway (and I personally don't think that's a bad thing).
One thing that I just thought of is that even if implementations use
other encodings, you can in the vast majority of cases do a worst-case
estimate and easily see that the key that is used is below 64K.
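
For example, such a check could look like this (a sketch; assumes the
limit is defined over UTF-16 code units, which is what a JS string's
.length already counts):

  var MAX_KEY_CODE_UNITS = 64 * 1024;
  function keyStringWithinLimit(key) {
    // For a store using UTF-8 internally, key.length * 3 is a safe
    // worst-case byte estimate (each UTF-16 code unit encodes to at
    // most 3 UTF-8 bytes), so most keys pass without transcoding.
    return key.length <= MAX_KEY_CODE_UNITS;
  }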

That said, does having a 64K limit really help anyone? In SQLite we
can easily store vastly more than that, enough that we don't have to
specify a limit. And my understanding is that in the Microsoft
implementation, the limits for what they can store without resorting
to various tricks, is much lower. So since that implementation will
have to implement special handling of long keys anyway, is there a
difference between saying a 64K limit vs. saying unlimited?

Pablo: Would love to get your input on the above.

/ Jonas



Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2011-02-07 Thread Jeremy Orlow
On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow jor...@chromium.org
 wrote:
   On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher sdwi...@mozilla.com
   wrote:
  
   On 2/6/2011 12:42 PM, Jeremy Orlow wrote:
  
    My current thinking is that we should have some relatively large
    limit...maybe on the order of 64k?  It seems like it'd be very
    difficult to hit such a limit with any sort of legitimate use case,
    and the chances of some subtle data-dependent error would be much
    less.  But a 1GB key is just not going to work well in any
    implementation (if it doesn't simply oom the process!).  So despite
    what I said earlier, I guess I think we should have some limit...but
    keep it an order of magnitude or two larger than what we expect any
    legitimate usage to hit just to keep the system as flexible as
    possible.
   
    Does that sound reasonable to people?
  
    Are we thinking about making this a MUST requirement, or a SHOULD?
    I'm hesitant to spec an exact size as a MUST given how technology
    has a way of changing in unexpected ways that makes old constraints
    obsolete.  But then, I may just be overly concerned about this too.
  
    If we put a limit, it'd be a MUST for sure.  Otherwise people would
    develop against one of the implementations that don't place a limit
    and then their app would break on the others.
    The reason that I suggested 64K is that it seems outrageously big
    for the data types that we're looking at.  But it's too small to do
    much with base64 encoding binary blobs into it or anything else
    like that that I could see becoming rather large.  So it seems like
    a limit that'd avoid major abuses (where someone is probably
    approaching the problem wrong) but would not come close to limiting
    any practical use I can imagine.
    With our architecture in Chrome, we will probably need to have some
    limit.  We haven't decided what that is yet, but since I remember
    others saying similar things when we talked about this at TPAC, it
    seems like it might be best to standardize it--even though it does
    feel a bit dirty.
 
  One problem with putting a limit is that it basically forces
  implementations to use a specific encoding, or pay a hefty price. For
  example if we choose a 64K limit, is that of UTF8 data or of UTF16
  data? If it is of UTF8 data, and the implementation uses something
  else to store the data, you risk having to convert the data just to
  measure the size. Possibly this would be different if we measured size
  using UTF16 as javascript more or less enforces that the source string
  is UTF16 which means that you can measure utf16 size on the cheap,
  even if the stored data uses a different format.
 
  That's a very good point.  What's your suggestion then?  Spec unlimited
  storage and have non-normative text saying that most implementations
  will likely have some limit?  Maybe we can at least spec a minimum
  limit in terms of a particular character encoding?  (Implementations
  could translate this into the worst case size for their own native
  encoding and then ensure their limit is higher.)

 I'm fine with relying on UTF16 encoding size and specifying a 64K
 limit. Like Shawn points out, this API is fairly geared towards
 JavaScript anyway (and I personally don't think that's a bad thing).
 One thing that I just thought of is that even if implementations use
 other encodings, you can in the vast majority of cases do a worst-case
 estimate and easily see that the key that is used is below 64K.

 That said, does having a 64K limit really help anyone? In SQLite we
 can easily store vastly more than that, enough that we don't have to
 specify a limit. And my understanding is that in the Microsoft
 implementation, the limits for what they can store without resorting
 to various tricks, is much lower. So since that implementation will
 have to implement special handling of long keys anyway, is there a
 difference between saying a 64K limit vs. saying unlimited?


As I explained earlier: The reason that I suggested 64K is that it seems
outrageously big for the data types that we're looking at.  But it's too
small to do much with base64 encoding binary blobs into it or anything else
like that that I could see becoming rather large.  So it seems like a limit
that'd avoid major abuses (where someone is probably approaching the problem
wrong) but would not come close to limiting any practical use I can imagine.


Since Chrome sandboxes the rendering process, if a web page allocates tons
of memory and OOMs the process, you just get a sad tab or two.  But since
IndexedDB is partially in the browser process, I need to make sure a large
key is not going to OOM that 

Re: [Bug 11351] New: [IndexedDB] Should we have a maximum key size (or something like that)?

2011-02-07 Thread Jonas Sicking
On Mon, Feb 7, 2011 at 3:07 PM, Jeremy Orlow jor...@chromium.org wrote:
 On Mon, Feb 7, 2011 at 2:49 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Sun, Feb 6, 2011 at 11:41 PM, Jeremy Orlow jor...@chromium.org wrote:
  On Sun, Feb 6, 2011 at 11:38 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Sun, Feb 6, 2011 at 2:31 PM, Jeremy Orlow jor...@chromium.org
  wrote:
   On Sun, Feb 6, 2011 at 2:03 PM, Shawn Wilsher sdwi...@mozilla.com
   wrote:
  
   On 2/6/2011 12:42 PM, Jeremy Orlow wrote:
  
    My current thinking is that we should have some relatively large
    limit...maybe on the order of 64k?  It seems like it'd be very
    difficult to hit such a limit with any sort of legitimate use case,
    and the chances of some subtle data-dependent error would be much
    less.  But a 1GB key is just not going to work well in any
    implementation (if it doesn't simply oom the process!).  So despite
    what I said earlier, I guess I think we should have some limit...but
    keep it an order of magnitude or two larger than what we expect any
    legitimate usage to hit just to keep the system as flexible as
    possible.
   
    Does that sound reasonable to people?
  
    Are we thinking about making this a MUST requirement, or a SHOULD?
    I'm hesitant to spec an exact size as a MUST given how technology
    has a way of changing in unexpected ways that makes old constraints
    obsolete.  But then, I may just be overly concerned about this too.
  
    If we put a limit, it'd be a MUST for sure.  Otherwise people would
    develop against one of the implementations that don't place a limit
    and then their app would break on the others.
    The reason that I suggested 64K is that it seems outrageously big
    for the data types that we're looking at.  But it's too small to do
    much with base64 encoding binary blobs into it or anything else
    like that that I could see becoming rather large.  So it seems like
    a limit that'd avoid major abuses (where someone is probably
    approaching the problem wrong) but would not come close to limiting
    any practical use I can imagine.
    With our architecture in Chrome, we will probably need to have some
    limit.  We haven't decided what that is yet, but since I remember
    others saying similar things when we talked about this at TPAC, it
    seems like it might be best to standardize it--even though it does
    feel a bit dirty.
 
  One problem with putting a limit is that it basically forces
  implementations to use a specific encoding, or pay a hefty price. For
  example if we choose a 64K limit, is that of UTF8 data or of UTF16
  data? If it is of UTF8 data, and the implementation uses something
   else to store the data, you risk having to convert the data just to
  measure the size. Possibly this would be different if we measured size
  using UTF16 as javascript more or less enforces that the source string
  is UTF16 which means that you can measure utf16 size on the cheap,
  even if the stored data uses a different format.
 
  That's a very good point.  What's your suggestion then?  Spec unlimited
  storage and have non-normative text saying that most implementations
  will likely have some limit?  Maybe we can at least spec a minimum
  limit in terms of a particular character encoding?  (Implementations
  could translate this into the worst case size for their own native
  encoding and then ensure their limit is higher.)

 I'm fine with relying on UTF16 encoding size and specifying a 64K
 limit. Like Shawn points out, this API is fairly geared towards
 JavaScript anyway (and I personally don't think that's a bad thing).
 One thing that I just thought of is that even if implementations use
 other encodings, you can in the vast majority of cases do a worst-case
 estimate and easily see that the key that is used is below 64K.

 That said, does having a 64K limit really help anyone? In SQLite we
 can easily store vastly more than that, enough that we don't have to
 specify a limit. And my understanding is that in the Microsoft
 implementation, the limits for what they can store without resorting
 to various tricks, is much lower. So since that implementation will
 have to implement special handling of long keys anyway, is there a
 difference between saying a 64K limit vs. saying unlimited?

 As I explained earlier: The reason that I suggested 64K is that it seems
 outrageously big for the data types that we're looking at.  But it's too
 small to do much with base64 encoding binary blobs into it or anything else
 like that that I could see becoming rather large.  So it seems like a limit
 that'd avoid major abuses (where someone is probably approaching the problem
 wrong) but would not come close to limiting any practical use I can
 imagine.
 Since Chrome sandboxes the rendering process, if a web page allocates tons
 of memory and OOMs the process, you just get a sad tab or two.  But 

Re: [Bug 11948] New: index.openCursor's cursor should have a way to access the index's value (in addition to the index's key and objectStore's value)

2011-02-07 Thread Jonas Sicking
On Sat, Feb 5, 2011 at 11:02 AM, Jeremy Orlow jor...@chromium.org wrote:
 On Fri, Feb 4, 2011 at 11:50 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Feb 4, 2011 at 3:30 PM, Jeremy Orlow jor...@chromium.org wrote:
  We haven't used the term primary key too much in the spec, but I
  think a lot might actually be more clear if we used it more.  And I
  think it'd also make a good name here.  So I'm OK with that being the
  name we choose.
  Here's another question: what do we set primaryKey to for cursors
  opened via index.openKeyCursor and objectStore.openCursor?  It seems
  as though setting them to null/undefined could be confusing.  One
  possibility is to have .value and .primaryKey be the same thing for
  the former and .key and .primaryKey be the same for the latter, but
  that too could be confusing.  (I think we have this problem no matter
  what we name it, but if there were some name that was more clear in
  these contexts, then that'd be a good reason to consider it instead.)
  J
 
  For objectStore.openCursor, if we went with primaryKey, then would we
  set both key and primaryKey to be the same thing?  Leaving it
  undefined/null seems odd.

 I've been pondering the same questions but so far no answer seems
 obviously best.

 One way to think about it is that it's good if you can use the same
 code to iterate over an index cursor as an objectStore cursor. For
 example to display a list of results in a table. This would indicate
 that for objectStore cursors .key and .primaryKey should have the same
 value. This sort of makes sense too since it means that an objectStore
 cursor is just a special case of an index cursor where the iterated
 index just happens to be the primary index.

 This would leave the index key-cursor. Here it would actually make
 sense to me to let .key be the index key, .primaryKey be the key in
 the objectStore, and .value be empty. This means that index cursors
 and index key-cursors work the same, with just .value being empty for
 the latter.

 So in summary

 objectStore.openCursor:
 .key = entry key
 .primaryKey = entry key
 .value = entry value

 index.openCursor:
 .key = index key
 .primaryKey = entry key
 .value = entry value

 index.openKeyCursor:
 .key = index key
 .primaryKey = entry key
 .value = undefined


 There are two bad things with this:
 1. for an objectStore cursor .key and .primaryKey are the same. This
 does seem unnecessary, but I doubt it'll be a source of bugs or
 require people to write more code. I'm less worried about confusion
 since both properties are in fact keys.

 As long as we're breaking backwards compatibility in the name of clarity, we
 might as well change key to indexKey and keep it null/undefined for
 objectStore.openCursor I think.  This would eliminate the confusion.
 If we do break compat, is it possible for FF4 to include these changes?  If
 not, then I would actually lean towards leaving .key and .value as is and
 having .primaryKey duplicate info for index.openKeyCursor and
 objectStore.openCursor.

Actually, I quite like the idea of having objectStore-cursors just be
a special case of index-cursors. Which also allows us to keep the nice
and short name "key" for the key that you are iterating over (be that
a primary key or an index key).
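
For example, the same display code then works for both cursor flavors
(a rough sketch; the table helper is hypothetical, and older engines
may need cursor["continue"]() since continue is a reserved word):

  function handler(e) {
    var cursor = e.target.result;
    if (!cursor)
      return;
    addTableRow(cursor.primaryKey, cursor.value); // hypothetical helper
    cursor.continue();
  }
  objectStore.openCursor().onsuccess = handler;
  index.openCursor().onsuccess = handler;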

 2. You can't use the same code to iterate over a key-cursor and a
 normal cursor and display the result in a table. However I suspect
 that in most cases key cursors will be used for different things, such
 as joins, rather than reusing code that would normally use values.

 I'm not super worried about this.  I think it's more important to be
 clear than make it easy to share code between the different types of
 cursors.
 On the other hand, it would be nice if there were some way for code to be
 able to figure out what type of cursor they're working with.  Since values
 can be undefined, they won't be able to just look at .key, .primaryKey, and
 .value to figure it out though.  Maybe we need some attribute that says what
 type of cursor it is?

You can always tell objectStore cursors apart by looking at the
.source property which we've discussed adding to cursors. One solution
for telling index-cursors from index-key-cursors is to make the latter
simply not have a .value property (rather than having one that returns
undefined).

It's not the most convenient way of telling the cursor types apart,
but I'm also not sure the use case is important to make terribly easy.
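
Something like this would work, though (a sketch, assuming the proposed
.source property and a missing .value property on key-cursors):

  function cursorKind(cursor) {
    if (!("value" in cursor)) return "index key-cursor";
    return (cursor.source instanceof IDBIndex) ? "index cursor"
                                               : "objectStore cursor";
  }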

I'll have to look into how much of this, if any, we can do for FF4.

/ Jonas



Re: [Bug 11948] New: index.openCursor's cursor should have a way to access the index's value (in addition to the index's key and objectStore's value)

2011-02-07 Thread Jeremy Orlow
On Mon, Feb 7, 2011 at 3:47 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Sat, Feb 5, 2011 at 11:02 AM, Jeremy Orlow jor...@chromium.org wrote:
  On Fri, Feb 4, 2011 at 11:50 PM, Jonas Sicking jo...@sicking.cc wrote:
 
  On Fri, Feb 4, 2011 at 3:30 PM, Jeremy Orlow jor...@chromium.org
 wrote:
   We haven't used the term primary key too much in the spec, but I
   think a lot might actually be more clear if we used it more.  And I
   think it'd also make a good name here.  So I'm OK with that being
   the name we choose.
   Here's another question: what do we set primaryKey to for cursors
   opened via index.openKeyCursor and objectStore.openCursor?  It seems
   as though setting them to null/undefined could be confusing.  One
   possibility is to have .value and .primaryKey be the same thing for
   the former and .key and .primaryKey be the same for the latter, but
   that too could be confusing.  (I think we have this problem no
   matter what we name it, but if there were some name that was more
   clear in these contexts, then that'd be a good reason to consider it
   instead.)
   J
  
   For objectStore.openCursor, if we went with primaryKey, then would
   we set both key and primaryKey to be the same thing?  Leaving it
   undefined/null seems odd.
 
  I've been pondering the same questions but so far no answer seems
  obviously best.
 
  One way to think about it is that it's good if you can use the same
  code to iterate over an index cursor as an objectStore cursor. For
  example to display a list of results in a table. This would indicate
  that for objectStore cursors .key and .primaryKey should have the same
  value. This sort of makes sense too since it means that an objectStore
  cursor is just a special case of an index cursor where the iterated
  index just happens to be the primary index.
 
  This would leave the index key-cursor. Here it would actually make
  sense to me to let .key be the index key, .primaryKey be the key in
  the objectStore, and .value be empty. This means that index cursors
  and index key-cursors work the same, with just .value being empty for
  the latter.
 
  So in summary
 
  objectStore.openCursor:
  .key = entry key
  .primaryKey = entry key
  .value = entry value
 
  index.openCursor:
  .key = index key
  .primaryKey = entry key
  .value = entry value
 
  index.openKeyCursor:
  .key = index key
  .primaryKey = entry key
  .value = undefined
 
 
  There are two bad things with this:
  1. for an objectStore cursor .key and .primaryKey are the same. This
  does seem unnecessary, but I doubt it'll be a source of bugs or
  require people to write more code. I'm less worried about confusion
  since both properties are in fact keys.
 
  As long as we're breaking backwards compatibility in the name of
  clarity, we might as well change key to indexKey and keep it
  null/undefined for objectStore.openCursor I think.  This would
  eliminate the confusion.
  If we do break compat, is it possible for FF4 to include these
  changes?  If not, then I would actually lean towards leaving .key and
  .value as is and having .primaryKey duplicate info for
  index.openKeyCursor and objectStore.openCursor.

 Actually, I quite like the idea of having objectStore-cursors just be
 a special case of index-cursors. Which also allows us to keep the nice
 and short name "key" for the key that you are iterating over (be that
 a primary key or an index key).


Can you explain further?  I don't fully understand you.

Here's another proposal (which is maybe what you meant?):

objectStore.openCursor:
.key = entry key
.value = entry value

index.openCursor:
.indexKey = index key
.key = entry key
.value = entry value

index.openKeyCursor:
.indexKey = index key
.key = entry key

Note that I'm thinking we should probably sub-class IDBCursor for each type
so that attributes don't show up if we're not going to populate them.

Which we maybe should do for IDBRequest as well?

J


Re: [IndexedDB] Reason for aborting transactions

2011-02-07 Thread Jonas Sicking
On Fri, Jan 28, 2011 at 4:33 PM, Jeremy Orlow jor...@chromium.org wrote:
 We do that as well.
 What's the best way to do it API wise?  Do we need to add an
 IDBTransactionError object with error codes and such?

I don't actually know. I can't think of a precedent. Usually you use
different error codes for different errors, but here we want to
distinguish a particular type of error (aborts) into several
subcategories.

To make this more complicated, I actually think we're going to end up
having to change a lot of error handling when things are all said and
done. Error handling right now is sort of a mess since DOM exceptions
are vastly different from JavaScript exceptions. Also DOM exceptions
have a messy situation of error codes overlapping, making it very easy
to confuse an IDBDatabaseException with a DOMException with an
overlapping error code.

For details, see
http://lists.w3.org/Archives/Public/public-script-coord/2010OctDec/0112.html

So my gut feeling is that we'll have to revamp exceptions quite a bit
before we unprefix our implementation. This is very unfortunate, but
shouldn't be as big of a deal as many other changes, as most of
the time people don't have error handling code. Or at least not error
handling code that differentiates the various errors.

Unfortunately we can't make any changes to the spec here until WebIDL
prescribes what the new exceptions should look like :(

So to loop back to your original question, I think that the best way
to expose the different types of aborts is by adding a .reason (or
better named) property which returns a string or enum which describes
the reason for the abort.
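
For example (a sketch; the property name and the reason values are
illustrative only):

  transaction.onabort = function (e) {
    switch (transaction.reason) {
      case "explicit": /* script called abort() */ break;
      case "quota":    /* ran out of quota */      break;
      case "internal": /* UA internal error */     break;
    }
  };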

/ Jonas



Re: [IndexedDB] Reason for aborting transactions

2011-02-07 Thread Jeremy Orlow
On Mon, Feb 7, 2011 at 7:36 PM, Jonas Sicking jo...@sicking.cc wrote:

 On Fri, Jan 28, 2011 at 4:33 PM, Jeremy Orlow jor...@chromium.org wrote:
  We do that as well.
  What's the best way to do it API wise?  Do we need to add an
  IDBTransactionError object with error codes and such?

 I don't actually know. I can't think of a precedent. Usually you use
 different error codes for different errors, but here we want to
 distinguish a particular type of error (aborts) into several
 subcategories.


I don't see how that's any different than what we're doing with the onerror
error codes though?


 To make this more complicated, I actually think we're going to end up
 having to change a lot of error handling when things are all said and
 done. Error handling right now is sort of a mess since DOM exceptions
 are vastly different from JavaScript exceptions. Also DOM exceptions
 have a messy situation of error codes overlapping, making it very easy
 to confuse an IDBDatabaseException with a DOMException with an
 overlapping error code.

 For details, see

 http://lists.w3.org/Archives/Public/public-script-coord/2010OctDec/0112.html

 So my gut feeling is that we'll have to revamp exceptions quite a bit
 before we unprefix our implementation. This is very unfortunate, but
 shouldn't be as big of a deal as many other changes, as most of
 the time people don't have error handling code. Or at least not error
 handling code that differentiates the various errors.

 Unfortunately we can't make any changes to the spec here until WebIDL
 prescribes what the new exceptions should look like :(

 So to loop back to your original question, I think that the best way
 to expose the different types of aborts is by adding a .reason (or
 better named) property which returns a string or enum which describes
 the reason for the abort.


Could we just add .abortCode, .abortReason, and constants for each code to
IDBTransaction?  And maybe revisit in the future?

J


[IndexedDB] setVersion blocked on uncollected garbage IDBDatabases

2011-02-07 Thread Jeremy Orlow
We're currently implementing the onblocked/setVersion semantics and ran into
an interesting problem: if you don't call .close() on a database and simply
expect it to be collected, then whether you're ever able to run a setVersion
transaction is at the mercy of the garbage collector doing a collection.
Otherwise implementations will assume the database is still open...right?
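
For concreteness, the problematic pattern (a sketch; vendor prefixes
omitted):

  indexedDB.open("mydb").onsuccess = function (e) {
    var db = e.target.result;
    // ... use db, but never call db.close() ...
    // a setVersion() elsewhere now stays blocked until this handle
    // happens to be garbage collected
  };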

If so, this seems bad.  But I can't think of any way to get around it.
 Thoughts?

J