[WebSQLDatabase] Adding a vacuum() call

2010-03-04 Thread Dumitru Daniliuc
Hi,

We (Chromium) would like to add a vacuum() call on the Database object. The
idea is to allow the SQL engine to vacuum/compress/clean-up/do some
potentially expensive maintenance work on the respective database. In
SQLite's case, in particular, the vacuum() call would issue a VACUUM command
(which cannot run in a transaction, hence the web app developers do not have
access to it currently), which would make SQLite compact the database by
reusing the empty space on fragmented pages. We think this call is
necessary, because fragmented databases could lead to apps prematurely
reaching their quota limits.

Letting the UAs deal with this problem is not always an option. For example,
big apps like Gmail could always be "on" (loaded as soon as the browser
comes up and closed only when the browser shuts down), which means the UAs
would never have an opportunity to clean-up/compact the databases of those
apps without blocking them. In these cases, we believe the apps should be
responsible (and should have the opportunity) to clean up their own
databases when they feel that it is the best time to do so.

So we were wondering if it makes sense to add this call to the
WebSQLDatabases spec too.

The call would look like this:

interface Database {
  // the methods and properties currently in the spec
  void vacuum(in optional SQLVacuumErrorCallback errorCallback, in optional
SQLVoidCallback successCallback);
};

[Callback=FunctionOnly, NoInterfaceObject]
interface SQLVacuumErrorCallback {
  void handleEvent(in SQLError error);
};

And the steps would be the following:

   1. If vacuum() is called inside a transaction on the same database, jump
   to the "in case of error" steps below.
   2. Queue a task to vacuum, compress, clean-up or do any other necessary
   maintenance work on the given database.
   3. If the operation succeeds and a successCallback is provided, queue a
   task to invoke the successCallback.
   4. If the clean-up operation fails in any way, jump to the "in case of
   error" steps below.

In case of error:

   1. In an errorCallback is provided, queue a task to invoke the
   errorCallback.


Thanks,
Dumi


[WebSQLDatabase] Adding a vacuum() call

2010-03-05 Thread Dumitru Daniliuc
(reposting from the right email account)

hi joao,

thanks for your comments!


> I would argue about having something a bit more generic for naming like
> "defragment()".
>

that's totally fine with me.


> I don't see how the callbacks are useful though. Vacuum works
> transparently, its effects are not visible, and what should the page do in
> case of error ?
>

i was thinking of something like:

db.defragment(errorCallback, successCallback);
showAPrettyImageAndAskTheUserToWait();

function errorCallback(error) {}
function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
}

just like you, i'm not sure if the error callback is useful at all, but i
thought i'd add it to make the defragment() call look more like a
transaction. maybe we don't need it.


> Given that an operation like vacuum would be disk IO intensive, would it be
> good to give the webpage control over a feature that can have significant
> performance concerns ?
>

first, i think it's better to give big, always-on apps access to a
potentially expensive clean-up call, than not give them any way at all to
defragment their databases. second, the app which called defragment() is the
only one that should notice the performance overhead (in practice, other web
apps might notice it too because of slower IO, but this is no different from
having an external, non-web app doing some intensive IO work and slowing
down all web apps). and third, most apps will probably never need to use
this call; it is mostly intended for apps with big databases that are always
open on a user's PC. those apps would usually know when they can call
defragment() without slowing down the user (for example, they could do it
when the app is idle).


> While computers benefit from good file IO performance, that's not quite
> true in many mobile devices. So, the API would be more like an hint ? How
> can the webpage know the level of fragmentation of the data file ?
>

on mobile devices i'm imagining the quotas and DBs are pretty small, in
which case defragmenting databases probably wouldn't save much space anyway.
so maybe those platforms should decide that they never need to do any kind
of maintenance on their DBs and make defragment() a no-op?

i don't think a webapp can figure out the level of fragmentation (maybe we
need an API to expose some quota/DB size information to the webapp?).
however, it could use other criteria to decide when to call defragment():
once every couple of weeks, when the system is idle, etc.


> Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so
> this kind of feature should be left for sqlite to handle for itself.
>

in addition to what jeremy said, having auto-vacuum on is going to add some
performance overhead to every transaction. this is not acceptable to some
apps.

thanks,
dumi


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-04 Thread João Eiras
On Fri, 05 Mar 2010 03:22:00 +0100, Dumitru Daniliuc   
wrote:



Hi,

We (Chromium) would like to add a vacuum() call on the Database object.


[...]

I would argue about having something a bit more generic for naming like  
"defragment()".


I don't see how the callbacks are useful though. Vacuum works  
transparently, its effects are not visible, and what should the page do in  
case of error ?


Given that an operation like vacuum would be disk IO intensive, would it  
be good to give the webpage control over a feature that can have  
significant performance concerns ?
While computers benefit from good file IO performance, that's not quite  
true in many mobile devices. So, the API would be more like an hint ? How  
can the webpage know the level of fragmentation of the data file ?


Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so  
this kind of feature should be left for sqlite to handle for itself.




Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-05 Thread Jeremy Orlow
On Fri, Mar 5, 2010 at 2:32 AM, João Eiras  wrote:

> On Fri, 05 Mar 2010 03:22:00 +0100, Dumitru Daniliuc 
> wrote:
>
>  Hi,
>>
>> We (Chromium) would like to add a vacuum() call on the Database object.
>>
>
> [...]
>
> I would argue about having something a bit more generic for naming like
> "defragment()".
>
> I don't see how the callbacks are useful though. Vacuum works
> transparently, its effects are not visible, and what should the page do in
> case of error ?
>
> Given that an operation like vacuum would be disk IO intensive, would it be
> good to give the webpage control over a feature that can have significant
> performance concerns ?
>

I'm sure a UA could always ignore it and/or vacuum whenever it wants...but
the drawback is that if you start vacuuming and then the app tries to open
the database, you might get some serious jank in the web app.


> While computers benefit from good file IO performance, that's not quite
> true in many mobile devices. So, the API would be more like an hint ? How
> can the webpage know the level of fragmentation of the data file ?
>
> Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so
> this kind of feature should be left for sqlite to handle for itself.
>

>From that page:
"""
Note, however, that auto-vacuum only truncates the freelist pages from the
file. Auto-vacuum does not defragment the database nor repack individual
database pages the way that the VACUUM  command does. In
fact, because it moves pages around within the file, auto-vacuum can
actually make fragmentation worse.
"""


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-05 Thread Dumitru Daniliuc
hi joao,

thanks for your comments!


> I would argue about having something a bit more generic for naming like
> "defragment()".
>

that's totally fine with me.


> I don't see how the callbacks are useful though. Vacuum works
> transparently, its effects are not visible, and what should the page do in
> case of error ?
>

i was thinking of something like:

db.defragment(errorCallback, successCallback);
showAPrettyImageAndAskTheUserToWait();

function errorCallback(error) {}
function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
}

just like you, i'm not sure if the error callback is useful at all, but i
thought i'd add it to make the defragment() call look more like a
transaction. maybe we don't need it.


> Given that an operation like vacuum would be disk IO intensive, would it be
> good to give the webpage control over a feature that can have significant
> performance concerns ?
>

first, i think it's better to give big, always-on apps access to a
potentially expensive clean-up call, than not give them any way at all to
defragment their databases. second, the app which called defragment() is the
only one that should notice the performance overhead (in practice, other web
apps might notice it too because of slower IO, but this is no different from
having an external, non-web app doing some intensive IO work and slowing
down all web apps). and third, most apps will probably never need to use
this call; it is mostly intended for apps with big databases that are always
open on a user's PC. those apps would usually know when they can call
defragment() without slowing down the user (for example, they could do it
when the app is idle).


> While computers benefit from good file IO performance, that's not quite
> true in many mobile devices. So, the API would be more like an hint ? How
> can the webpage know the level of fragmentation of the data file ?
>

on mobile devices i'm imagining the quotas and DBs are pretty small, in
which case defragmenting databases probably wouldn't save much space anyway.
so maybe those platforms should decide that they never need to do any kind
of maintenance on their DBs and make defragment() a no-op?

i don't think a webapp can figure out the level of fragmentation (maybe we
need an API to expose some quota/DB size information to the webapp?).
however, it could use other criteria to decide when to call defragment():
once every couple of weeks, when the system is idle, etc.


> Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so
> this kind of feature should be left for sqlite to handle for itself.
>

in addition to what jeremy said, having auto-vacuum on is going to add some
performance overhead to every transaction. this is not acceptable to some
apps.

thanks,
dumi


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-08 Thread João Eiras



I don't see how the callbacks are useful though. Vacuum works
transparently, its effects are not visible, and what should the page do  
in

case of error ?



i was thinking of something like:

db.defragment(errorCallback, successCallback);
showAPrettyImageAndAskTheUserToWait();

function errorCallback(error) {}
function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
}

just like you, i'm not sure if the error callback is useful at all, but i
thought i'd add it to make the defragment() call look more like a
transaction. maybe we don't need it.



True, but this is a kind of operation that could very well just run on the  
background, with a single optional callback when it's done (the webpage  
can't do anything if an error is detected anyway). The user agent would  
need to queue any subsequent transactions if a vacuum is running. I would  
consider it as an hint, and after all webpages that own references to the  
underlying data files are closed, would do a vacuum. So, if you have many  
tabs on gmail, and that a single gmail instance tries to do multiple  
vacuums, it would equiv to one single vacuum operation.



Sqlite supports incremental vacuum http://www.sqlite.org/pragma.html so
this kind of feature should be left for sqlite to handle for itself.



in addition to what jeremy said, having auto-vacuum on is going to add  
some

performance overhead to every transaction. this is not acceptable to some
apps.



I perused through the docs, and unfortunately auto-vaccum seems not like  
the ideal solution.



--

João Eiras
Core Developer, Opera Software ASA, http://www.opera.com/



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-08 Thread Dumitru Daniliuc
On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:

>
>  I don't see how the callbacks are useful though. Vacuum works
>>> transparently, its effects are not visible, and what should the page do
>>> in
>>> case of error ?
>>>
>>>
>> i was thinking of something like:
>>
>> db.defragment(errorCallback, successCallback);
>> showAPrettyImageAndAskTheUserToWait();
>>
>> function errorCallback(error) {}
>> function successCallback() {
>>  getRidOfThePrettyImageAndRestartTheApp();
>> }
>>
>> just like you, i'm not sure if the error callback is useful at all, but i
>> thought i'd add it to make the defragment() call look more like a
>> transaction. maybe we don't need it.
>>
>>
> True, but this is a kind of operation that could very well just run on the
> background, with a single optional callback when it's done (the webpage
> can't do anything if an error is detected anyway).


ok, so let's drop the errorCallback: vacuum([optional] successCallback);


> The user agent would need to queue any subsequent transactions if a vacuum
> is running. I would consider it as an hint, and after all webpages that own
> references to the underlying data files are closed, would do a vacuum. So,
> if you have many tabs on gmail, and that a single gmail instance tries to do
> multiple vacuums, it would equiv to one single vacuum operation.


what do we do if some databases are opened for the entire life of the
browser? for example, i open my browser which has myfavoriteapp.com set as
its homepage. myfavoriteapp.com immediately opens a DB, and i only close
that app when i close the browser. when would the browser vacuum
myfavoriteapp's DBs in this case?

i think it's ok for the UA to vacuum some DBs automatically when it thinks
it's a good time to do so; however, if a platform supports the vacuum/defrag
call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call
coming from the app should be immediately scheduled (yes, the subsequent
transactions would have to wait for the vacuuming to finish running). in
some cases, the apps know better than the UA when to vacuum their DBs.

by the way, we should probably agree on a name for this call. which one do
you prefer? vacuum, defragment, defrag, something else? i don't have a
strong opinion.

dumi


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-09 Thread Jeremy Orlow
On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc  wrote:

>
>
> On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
>
>>
>>  I don't see how the callbacks are useful though. Vacuum works
 transparently, its effects are not visible, and what should the page do
 in
 case of error ?


>>> i was thinking of something like:
>>>
>>> db.defragment(errorCallback, successCallback);
>>> showAPrettyImageAndAskTheUserToWait();
>>>
>>> function errorCallback(error) {}
>>> function successCallback() {
>>>  getRidOfThePrettyImageAndRestartTheApp();
>>> }
>>>
>>> just like you, i'm not sure if the error callback is useful at all, but i
>>> thought i'd add it to make the defragment() call look more like a
>>> transaction. maybe we don't need it.
>>>
>>>
>> True, but this is a kind of operation that could very well just run on the
>> background, with a single optional callback when it's done (the webpage
>> can't do anything if an error is detected anyway).
>
>
> ok, so let's drop the errorCallback: vacuum([optional] successCallback);
>
>
>> The user agent would need to queue any subsequent transactions if a vacuum
>> is running. I would consider it as an hint, and after all webpages that own
>> references to the underlying data files are closed, would do a vacuum. So,
>> if you have many tabs on gmail, and that a single gmail instance tries to do
>> multiple vacuums, it would equiv to one single vacuum operation.
>
>
> what do we do if some databases are opened for the entire life of the
> browser? for example, i open my browser which has myfavoriteapp.com set as
> its homepage. myfavoriteapp.com immediately opens a DB, and i only close
> that app when i close the browser. when would the browser vacuum
> myfavoriteapp's DBs in this case?
>
> i think it's ok for the UA to vacuum some DBs automatically when it thinks
> it's a good time to do so; however, if a platform supports the vacuum/defrag
> call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call
> coming from the app should be immediately scheduled (yes, the subsequent
> transactions would have to wait for the vacuuming to finish running). in
> some cases, the apps know better than the UA when to vacuum their DBs.
>
> by the way, we should probably agree on a name for this call. which one do
> you prefer? vacuum, defragment, defrag, something else? i don't have a
> strong opinion.
>

I think vacuum is fine since the spec is already tied to the SQLite SQL
dialect.

collectGarbage() is another possibility

Go with whatever you think is most clear and accurate though.

J


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
joao,

it looks like we mostly agree on this feature, so i was wondering if we
could formally agree on a spec. here's what i propose:

1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
implementations use SQLite, and in SQLite the command is called VACUUM. so
it seems to me that we might as well call the new function vacuum(). what do
you think?

2. spec: no need for an error callback.

interface Database {
  // the methods and properties currently in the spec
  void vacuum(in optional SQLVoidCallback successCallback);
};

3. what the call should do: the purpose of this call is to allow apps to
vacuum/compact/defragment/clean up their databases whenever they see fit. a
call to vacuum() could take a considerable amount of time (especially on
larger or highly fragmented databases); therefore, it is not recommended for
web apps to call this method during periods of high activity.

how to process a vacuum() call:

   1. if the UA does not support this call (mobile browsers?), jump to step
   3.
   2. queue up a task to vacuum/compact/defragment/clean up the database.
   3. if the task succeeds, and a success callback is provided, queue up a
   task to invoke the success callback; in all other cases (the task failed, or
   no success callback was provided), do nothing: proceed to the next task in
   the queue.

does this seem acceptable? we (google engineers interested in this) feel
that UAs should either not implement the vacuum() call, or they should
respect it (rather than taking it as a hint). it is ok for UAs to keep track
of things like system idleness or databases closing to do more vacuuming
that the apps asked for, if they want to. however, we feel that a vacuum()
request by an app should not be postponed, simply because sometimes apps
know better than UAs when the best time to vacuum is (it might be nice to
give apps more information on how fragmented their databases are, but that's
a separate discussion).

thanks,
dumi


2010/3/9 Jeremy Orlow 

> On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc  wrote:
>
>>
>>
>> On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
>>
>>>
>>>  I don't see how the callbacks are useful though. Vacuum works
> transparently, its effects are not visible, and what should the page do
> in
> case of error ?
>
>
 i was thinking of something like:

 db.defragment(errorCallback, successCallback);
 showAPrettyImageAndAskTheUserToWait();

 function errorCallback(error) {}
 function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
 }

 just like you, i'm not sure if the error callback is useful at all, but
 i
 thought i'd add it to make the defragment() call look more like a
 transaction. maybe we don't need it.


>>> True, but this is a kind of operation that could very well just run on
>>> the background, with a single optional callback when it's done (the webpage
>>> can't do anything if an error is detected anyway).
>>
>>
>> ok, so let's drop the errorCallback: vacuum([optional] successCallback);
>>
>>
>>> The user agent would need to queue any subsequent transactions if a
>>> vacuum is running. I would consider it as an hint, and after all webpages
>>> that own references to the underlying data files are closed, would do a
>>> vacuum. So, if you have many tabs on gmail, and that a single gmail instance
>>> tries to do multiple vacuums, it would equiv to one single vacuum operation.
>>
>>
>> what do we do if some databases are opened for the entire life of the
>> browser? for example, i open my browser which has myfavoriteapp.com set
>> as its homepage. myfavoriteapp.com immediately opens a DB, and i only
>> close that app when i close the browser. when would the browser vacuum
>> myfavoriteapp's DBs in this case?
>>
>> i think it's ok for the UA to vacuum some DBs automatically when it thinks
>> it's a good time to do so; however, if a platform supports the vacuum/defrag
>> call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call
>> coming from the app should be immediately scheduled (yes, the subsequent
>> transactions would have to wait for the vacuuming to finish running). in
>> some cases, the apps know better than the UA when to vacuum their DBs.
>>
>> by the way, we should probably agree on a name for this call. which one do
>> you prefer? vacuum, defragment, defrag, something else? i don't have a
>> strong opinion.
>>
>
> I think vacuum is fine since the spec is already tied to the SQLite SQL
> dialect.
>
> collectGarbage() is another possibility
>
> Go with whatever you think is most clear and accurate though.
>
> J
>


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread João Eiras
On Fri, 12 Mar 2010 01:08:41 +0100, Dumitru Daniliuc   
wrote:



joao,

it looks like we mostly agree on this feature, so i was wondering if we
could formally agree on a spec. here's what i propose:

1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
implementations use SQLite, and in SQLite the command is called VACUUM.  
so
it seems to me that we might as well call the new function vacuum().  
what do

you think?


I'm fine with it but...



2. spec: no need for an error callback.

interface Database {
  // the methods and properties currently in the spec
  void vacuum(in optional SQLVoidCallback successCallback);
};

3. what the call should do: the purpose of this call is to allow apps to
vacuum/compact/defragment/clean up their databases whenever they see  
fit. a

call to vacuum() could take a considerable amount of time (especially on
larger or highly fragmented databases); therefore, it is not recommended  
for

web apps to call this method during periods of high activity.

how to process a vacuum() call:

   1. if the UA does not support this call (mobile browsers?), jump to  
step

   3.
   2. queue up a task to vacuum/compact/defragment/clean up the database.
   3. if the task succeeds, and a success callback is provided, queue up  
a
   task to invoke the success callback; in all other cases (the task  
failed, or
   no success callback was provided), do nothing: proceed to the next  
task in

   the queue.

does this seem acceptable? we (google engineers interested in this) feel
that UAs should either not implement the vacuum() call, or they should
respect it (rather than taking it as a hint). it is ok for UAs to keep  
track

of things like system idleness or databases closing to do more vacuuming
that the apps asked for, if they want to. however, we feel that a  
vacuum()

request by an app should not be postponed, simply because sometimes apps
know better than UAs when the best time to vacuum is (it might be nice to
give apps more information on how fragmented their databases are, but  
that's

a separate discussion).



I unfortunately don't agree with this last sentence :) Web pages do not  
have a way to know the level of fragmentation, and the last thing user  
wants is a web page calling vacuum() before each transaction, *just to be  
safe*, because of potential fragmentation.


But the API is good I think.


thanks,
dumi






Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Michael Nordman
Instead of calling back on success only, maybe call back on completion
regardless of success or failure. This way the caller would know when the
potentially lengthy operation was done, regardless of the outcome.

2010/3/11 Dumitru Daniliuc 

> joao,
>
> it looks like we mostly agree on this feature, so i was wondering if we
> could formally agree on a spec. here's what i propose:
>
> 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
> implementations use SQLite, and in SQLite the command is called VACUUM. so
> it seems to me that we might as well call the new function vacuum(). what do
> you think?
>
> 2. spec: no need for an error callback.
>
>
> interface Database {
>   // the methods and properties currently in the spec
>   void vacuum(in optional SQLVoidCallback successCallback);
> };
>
> 3. what the call should do: the purpose of this call is to allow apps to
> vacuum/compact/defragment/clean up their databases whenever they see fit. a
> call to vacuum() could take a considerable amount of time (especially on
> larger or highly fragmented databases); therefore, it is not recommended for
> web apps to call this method during periods of high activity.
>
> how to process a vacuum() call:
>
>1. if the UA does not support this call (mobile browsers?), jump to
>step 3.
>2. queue up a task to vacuum/compact/defragment/clean up the database.
>3. if the task succeeds, and a success callback is provided, queue up a
>task to invoke the success callback; in all other cases (the task failed, 
> or
>no success callback was provided), do nothing: proceed to the next task in
>the queue.
>
> does this seem acceptable? we (google engineers interested in this) feel
> that UAs should either not implement the vacuum() call, or they should
> respect it (rather than taking it as a hint). it is ok for UAs to keep track
> of things like system idleness or databases closing to do more vacuuming
> that the apps asked for, if they want to. however, we feel that a vacuum()
> request by an app should not be postponed, simply because sometimes apps
> know better than UAs when the best time to vacuum is (it might be nice to
> give apps more information on how fragmented their databases are, but that's
> a separate discussion).
>
> thanks,
> dumi
>
>
> 2010/3/9 Jeremy Orlow 
>
> On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc  wrote:
>>
>>>
>>>
>>> On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
>>>

  I don't see how the callbacks are useful though. Vacuum works
>> transparently, its effects are not visible, and what should the page
>> do in
>> case of error ?
>>
>>
> i was thinking of something like:
>
> db.defragment(errorCallback, successCallback);
> showAPrettyImageAndAskTheUserToWait();
>
> function errorCallback(error) {}
> function successCallback() {
>  getRidOfThePrettyImageAndRestartTheApp();
> }
>
> just like you, i'm not sure if the error callback is useful at all, but
> i
> thought i'd add it to make the defragment() call look more like a
> transaction. maybe we don't need it.
>
>
 True, but this is a kind of operation that could very well just run on
 the background, with a single optional callback when it's done (the webpage
 can't do anything if an error is detected anyway).
>>>
>>>
>>> ok, so let's drop the errorCallback: vacuum([optional] successCallback);
>>>
>>>
 The user agent would need to queue any subsequent transactions if a
 vacuum is running. I would consider it as an hint, and after all webpages
 that own references to the underlying data files are closed, would do a
 vacuum. So, if you have many tabs on gmail, and that a single gmail 
 instance
 tries to do multiple vacuums, it would equiv to one single vacuum 
 operation.
>>>
>>>
>>> what do we do if some databases are opened for the entire life of the
>>> browser? for example, i open my browser which has myfavoriteapp.com set
>>> as its homepage. myfavoriteapp.com immediately opens a DB, and i only
>>> close that app when i close the browser. when would the browser vacuum
>>> myfavoriteapp's DBs in this case?
>>>
>>> i think it's ok for the UA to vacuum some DBs automatically when it
>>> thinks it's a good time to do so; however, if a platform supports the
>>> vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a
>>> vacuum call coming from the app should be immediately scheduled (yes, the
>>> subsequent transactions would have to wait for the vacuuming to finish
>>> running). in some cases, the apps know better than the UA when to vacuum
>>> their DBs.
>>>
>>> by the way, we should probably agree on a name for this call. which one
>>> do you prefer? vacuum, defragment, defrag, something else? i don't have a
>>> strong opinion.
>>>
>>
>> I think vacuum is fine since the spec is already tied to the SQLite SQL
>> dialect.
>>
>> collectGarbage() is 

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
sounds good to me:

interface Database {
 // the methods and properties currently in the spec
 void vacuum(in optional SQLVoidCallback completionCallback);
};

"... upon completion, queue up a task to invoke the completionCallback, if
one was provided."

dumi


On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman wrote:

> Instead of calling back on success only, maybe call back on completion
> regardless of success or failure. This way the caller would know when the
> potentially lengthy operation was done, regardless of the outcome.
>
> 2010/3/11 Dumitru Daniliuc 
>
> joao,
>>
>> it looks like we mostly agree on this feature, so i was wondering if we
>> could formally agree on a spec. here's what i propose:
>>
>> 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
>> implementations use SQLite, and in SQLite the command is called VACUUM. so
>> it seems to me that we might as well call the new function vacuum(). what do
>> you think?
>>
>> 2. spec: no need for an error callback.
>>
>>
>> interface Database {
>>   // the methods and properties currently in the spec
>>   void vacuum(in optional SQLVoidCallback successCallback);
>> };
>>
>> 3. what the call should do: the purpose of this call is to allow apps to
>> vacuum/compact/defragment/clean up their databases whenever they see fit. a
>> call to vacuum() could take a considerable amount of time (especially on
>> larger or highly fragmented databases); therefore, it is not recommended for
>> web apps to call this method during periods of high activity.
>>
>> how to process a vacuum() call:
>>
>>1. if the UA does not support this call (mobile browsers?), jump to
>>step 3.
>>2. queue up a task to vacuum/compact/defragment/clean up the database.
>>3. if the task succeeds, and a success callback is provided, queue up
>>a task to invoke the success callback; in all other cases (the task 
>> failed,
>>or no success callback was provided), do nothing: proceed to the next task
>>in the queue.
>>
>> does this seem acceptable? we (google engineers interested in this) feel
>> that UAs should either not implement the vacuum() call, or they should
>> respect it (rather than taking it as a hint). it is ok for UAs to keep track
>> of things like system idleness or databases closing to do more vacuuming
>> that the apps asked for, if they want to. however, we feel that a vacuum()
>> request by an app should not be postponed, simply because sometimes apps
>> know better than UAs when the best time to vacuum is (it might be nice to
>> give apps more information on how fragmented their databases are, but that's
>> a separate discussion).
>>
>> thanks,
>> dumi
>>
>>
>> 2010/3/9 Jeremy Orlow 
>>
>> On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc  wrote:
>>>


 On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:

>
>  I don't see how the callbacks are useful though. Vacuum works
>>> transparently, its effects are not visible, and what should the page
>>> do in
>>> case of error ?
>>>
>>>
>> i was thinking of something like:
>>
>> db.defragment(errorCallback, successCallback);
>> showAPrettyImageAndAskTheUserToWait();
>>
>> function errorCallback(error) {}
>> function successCallback() {
>>  getRidOfThePrettyImageAndRestartTheApp();
>> }
>>
>> just like you, i'm not sure if the error callback is useful at all,
>> but i
>> thought i'd add it to make the defragment() call look more like a
>> transaction. maybe we don't need it.
>>
>>
> True, but this is a kind of operation that could very well just run on
> the background, with a single optional callback when it's done (the 
> webpage
> can't do anything if an error is detected anyway).


 ok, so let's drop the errorCallback: vacuum([optional] successCallback);


> The user agent would need to queue any subsequent transactions if a
> vacuum is running. I would consider it as an hint, and after all webpages
> that own references to the underlying data files are closed, would do a
> vacuum. So, if you have many tabs on gmail, and that a single gmail 
> instance
> tries to do multiple vacuums, it would equiv to one single vacuum 
> operation.


 what do we do if some databases are opened for the entire life of the
 browser? for example, i open my browser which has myfavoriteapp.com set
 as its homepage. myfavoriteapp.com immediately opens a DB, and i only
 close that app when i close the browser. when would the browser vacuum
 myfavoriteapp's DBs in this case?

 i think it's ok for the UA to vacuum some DBs automatically when it
 thinks it's a good time to do so; however, if a platform supports the
 vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a
 vacuum call coming from the app should be immediately scheduled (yes, the
 subsequent transac

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
joao,

if i understand correctly, you basically want to have an automated system
implemented in the browser that decides when to vacuum databases (or at
least make sure it doesn't happen too often). and the vacuum() calls would
be just one of the parameters that the system takes into account. i think
having such a system is fine, but doesn't need to be spec'ed. at the same
time, i think it's worth having a spec'ed vacuum() call that gives the app
some guarantees. in particular, i would like the completionCallback to be a
signal that the database is in a good shape and ready for more work. so how
about this:


   1. if the UA doesn't support the vacuum() call, go to step 3.
   2. queue up a task to vacuum the database. if the UA decides that
   vacuuming is not needed at this time, the vacuuming task scheduled in this
   step could be a no-op.
   3. when the vacuuming task completes, queue up a task to invoke the
   completion callback, if one was specified.

i think this spec should allow you to treat the vacuum() call as a hint,
while also guaranteeing that:
1. the UA believes the database is in a good shape when the completion
callback is called.
2. the completion callback is called as soon as UA believes the DB is in a
good shape.

what do you think?

thanks,
dumi


On Thu, Mar 11, 2010 at 4:13 PM, João Eiras  wrote:

> On Fri, 12 Mar 2010 01:08:41 +0100, Dumitru Daniliuc 
> wrote:
>
>  joao,
>>
>> it looks like we mostly agree on this feature, so i was wondering if we
>> could formally agree on a spec. here's what i propose:
>>
>> 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
>> implementations use SQLite, and in SQLite the command is called VACUUM. so
>> it seems to me that we might as well call the new function vacuum(). what
>> do
>> you think?
>>
>
> I'm fine with it but...
>
>
>> 2. spec: no need for an error callback.
>>
>> interface Database {
>>  // the methods and properties currently in the spec
>>  void vacuum(in optional SQLVoidCallback successCallback);
>> };
>>
>> 3. what the call should do: the purpose of this call is to allow apps to
>> vacuum/compact/defragment/clean up their databases whenever they see fit.
>> a
>> call to vacuum() could take a considerable amount of time (especially on
>> larger or highly fragmented databases); therefore, it is not recommended
>> for
>> web apps to call this method during periods of high activity.
>>
>> how to process a vacuum() call:
>>
>>   1. if the UA does not support this call (mobile browsers?), jump to step
>>   3.
>>   2. queue up a task to vacuum/compact/defragment/clean up the database.
>>   3. if the task succeeds, and a success callback is provided, queue up a
>>
>>   task to invoke the success callback; in all other cases (the task
>> failed, or
>>   no success callback was provided), do nothing: proceed to the next task
>> in
>>   the queue.
>>
>> does this seem acceptable? we (google engineers interested in this) feel
>> that UAs should either not implement the vacuum() call, or they should
>> respect it (rather than taking it as a hint). it is ok for UAs to keep
>> track
>> of things like system idleness or databases closing to do more vacuuming
>> that the apps asked for, if they want to. however, we feel that a vacuum()
>> request by an app should not be postponed, simply because sometimes apps
>> know better than UAs when the best time to vacuum is (it might be nice to
>> give apps more information on how fragmented their databases are, but
>> that's
>> a separate discussion).
>>
>>
> I unfortunately don't agree with this last sentence :) Web pages do not
> have a way to know the level of fragmentation, and the last thing user wants
> is a web page calling vacuum() before each transaction, *just to be safe*,
> because of potential fragmentation.
>
> But the API is good I think.
>
>  thanks,
>> dumi
>>
>>
>>


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread João Eiras
On Fri, 12 Mar 2010 04:07:21 +0100, Dumitru Daniliuc   
wrote:



joao,

if i understand correctly, you basically want to have an automated system
implemented in the browser that decides when to vacuum databases (or at
least make sure it doesn't happen too often). and the vacuum() calls  
would

be just one of the parameters that the system takes into account. i think
having such a system is fine, but doesn't need to be spec'ed. at the same
time, i think it's worth having a spec'ed vacuum() call that gives the  
app
some guarantees. in particular, i would like the completionCallback to  
be a
signal that the database is in a good shape and ready for more work. so  
how

about this:


   1. if the UA doesn't support the vacuum() call, go to step 3.
   2. queue up a task to vacuum the database. if the UA decides that
   vacuuming is not needed at this time, the vacuuming task scheduled in  
this

   step could be a no-op.
   3. when the vacuuming task completes, queue up a task to invoke the
   completion callback, if one was specified.

i think this spec should allow you to treat the vacuum() call as a hint,
while also guaranteeing that:
1. the UA believes the database is in a good shape when the completion
callback is called.
2. the completion callback is called as soon as UA believes the DB is in  
a

good shape.

what do you think?


Looks better, and more flexible.

I wonder if sqlite has a way to query the number of pages in the freelist ?
Probably something like 10% of pages in the freelist would be a good  
threshold to allow a vacuum.




thanks,
dumi




Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread João Eiras



Looks better, and more flexible.

I wonder if sqlite has a way to query the number of pages in the  
freelist ?
Probably something like 10% of pages in the freelist would be a good  
threshold to allow a vacuum.




Oh and btw, read only transactions should disallow vacuum, I think/hope :)



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-11 Thread Dumitru Daniliuc
On Thu, Mar 11, 2010 at 7:17 PM, João Eiras  wrote:

> On Fri, 12 Mar 2010 04:07:21 +0100, Dumitru Daniliuc 
> wrote:
>
>  joao,
>>
>> if i understand correctly, you basically want to have an automated system
>> implemented in the browser that decides when to vacuum databases (or at
>> least make sure it doesn't happen too often). and the vacuum() calls would
>> be just one of the parameters that the system takes into account. i think
>> having such a system is fine, but doesn't need to be spec'ed. at the same
>> time, i think it's worth having a spec'ed vacuum() call that gives the app
>> some guarantees. in particular, i would like the completionCallback to be
>> a
>> signal that the database is in a good shape and ready for more work. so
>> how
>> about this:
>>
>>
>>   1. if the UA doesn't support the vacuum() call, go to step 3.
>>   2. queue up a task to vacuum the database. if the UA decides that
>>
>>   vacuuming is not needed at this time, the vacuuming task scheduled in
>> this
>>   step could be a no-op.
>>   3. when the vacuuming task completes, queue up a task to invoke the
>>
>>   completion callback, if one was specified.
>>
>> i think this spec should allow you to treat the vacuum() call as a hint,
>> while also guaranteeing that:
>> 1. the UA believes the database is in a good shape when the completion
>> callback is called.
>> 2. the completion callback is called as soon as UA believes the DB is in a
>> good shape.
>>
>> what do you think?
>>
>
> Looks better, and more flexible.
>

thanks. is there anything else you'd like to change? i'm pushing for this so
hard because i'd like to implement this feature, but i don't want to start
before all interested parties agree on a spec.


> I wonder if sqlite has a way to query the number of pages in the freelist ?
> Probably something like 10% of pages in the freelist would be a good
> threshold to allow a vacuum.
>

you can get the number of free pages with PRAGMA freelist_count (
http://www.sqlite.org/pragma.html). unfortunately, looking only at this
number would result in a lot of false negatives. for example, in theory, you
could have a DB with 1000 2KB pages, and each page could have only one
useful byte written on it. so PRAGMA freelist_count would return 0, even
though doing a vacuuming would shrink your DB from ~2MB to ~1KB. but i agree
that it could be one of the parameters to take into account.


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Dimitri Glazkov
I like the completion callback idea.

Also like the notion of some sort of protection from "over-eager
vacuum-calling syndrome".

:DG<

On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman  wrote:
> Instead of calling back on success only, maybe call back on completion
> regardless of success or failure. This way the caller would know when the
> potentially lengthy operation was done, regardless of the outcome.
>
> 2010/3/11 Dumitru Daniliuc 
>>
>> joao,
>>
>> it looks like we mostly agree on this feature, so i was wondering if we
>> could formally agree on a spec. here's what i propose:
>>
>> 1. name: vacuum. to the best of my knowledge, all current WebSQLDatabases
>> implementations use SQLite, and in SQLite the command is called VACUUM. so
>> it seems to me that we might as well call the new function vacuum(). what do
>> you think?
>>
>> 2. spec: no need for an error callback.
>>
>> interface Database {
>>   // the methods and properties currently in the spec
>>   void vacuum(in optional SQLVoidCallback successCallback);
>> };
>>
>> 3. what the call should do: the purpose of this call is to allow apps to
>> vacuum/compact/defragment/clean up their databases whenever they see fit.. a
>> call to vacuum() could take a considerable amount of time (especially on
>> larger or highly fragmented databases); therefore, it is not recommended for
>> web apps to call this method during periods of high activity.
>>
>> how to process a vacuum() call:
>>
>> if the UA does not support this call (mobile browsers?), jump to step 3.
>> queue up a task to vacuum/compact/defragment/clean up the database.
>> if the task succeeds, and a success callback is provided, queue up a task
>> to invoke the success callback; in all other cases (the task failed, or no
>> success callback was provided), do nothing: proceed to the next task in the
>> queue.
>>
>> does this seem acceptable? we (google engineers interested in this) feel
>> that UAs should either not implement the vacuum() call, or they should
>> respect it (rather than taking it as a hint). it is ok for UAs to keep track
>> of things like system idleness or databases closing to do more vacuuming
>> that the apps asked for, if they want to. however, we feel that a vacuum()
>> request by an app should not be postponed, simply because sometimes apps
>> know better than UAs when the best time to vacuum is (it might be nice to
>> give apps more information on how fragmented their databases are, but that's
>> a separate discussion).
>>
>> thanks,
>> dumi
>>
>>
>> 2010/3/9 Jeremy Orlow 
>>>
>>> On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc  wrote:


 On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
>
>>> I don't see how the callbacks are useful though. Vacuum works
>>> transparently, its effects are not visible, and what should the page
>>> do in
>>> case of error ?
>>>
>>
>> i was thinking of something like:
>>
>> db.defragment(errorCallback, successCallback);
>> showAPrettyImageAndAskTheUserToWait();
>>
>> function errorCallback(error) {}
>> function successCallback() {
>>  getRidOfThePrettyImageAndRestartTheApp();
>> }
>>
>> just like you, i'm not sure if the error callback is useful at all,
>> but i
>> thought i'd add it to make the defragment() call look more like a
>> transaction. maybe we don't need it.
>>
>
> True, but this is a kind of operation that could very well just run on
> the background, with a single optional callback when it's done (the 
> webpage
> can't do anything if an error is detected anyway).

 ok, so let's drop the errorCallback: vacuum([optional] successCallback);

>
> The user agent would need to queue any subsequent transactions if a
> vacuum is running. I would consider it as an hint, and after all webpages
> that own references to the underlying data files are closed, would do a
> vacuum. So, if you have many tabs on gmail, and that a single gmail 
> instance
> tries to do multiple vacuums, it would equiv to one single vacuum 
> operation.

 what do we do if some databases are opened for the entire life of the
 browser? for example, i open my browser which has myfavoriteapp.com set as
 its homepage. myfavoriteapp.com immediately opens a DB, and i only close
 that app when i close the browser. when would the browser vacuum
 myfavoriteapp's DBs in this case?

 i think it's ok for the UA to vacuum some DBs automatically when it
 thinks it's a good time to do so; however, if a platform supports the
 vacuum/defrag call (i.e. if it doesn't treat it is a no-op), then i think a
 vacuum call coming from the app should be immediately scheduled (yes, the
 subsequent transactions would have to wait for the vacuuming to finish
 running). in some cases, the apps know better than the UA when to vacuum
 their DBs.

 by the way, we should p

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Shane Harrelson
Some things to consider:

(1) VACUUM is never "necessary", though can sometimes reduce the size of the
database file and provide a performance improvement if the database is large
and fragmented.

(2) Use PRAGMA freelist_count to estimate how much VACUUM will shrink a
database.

(3) VACUUM requires O(N) time where N is the size of the database file, and
can require up to 2N temporary space while processing.

Our recommendation is that the use of VACUUM be discouraged.   The
performance improvement only comes up with very large databases that are
highly fragmented.  Where space reclamation is an issue, AUTO_VACUUM would
be more appropriate.

-Shane


On Thu, Mar 11, 2010 at 7:33 PM, Dimitri Glazkov wrote:

> I like the completion callback idea.
>
> Also like the notion of some sort of protection from "over-eager
> vacuum-calling syndrome".
>
> :DG<
>
> On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman 
> wrote:
> > Instead of calling back on success only, maybe call back on completion
> > regardless of success or failure. This way the caller would know when the
> > potentially lengthy operation was done, regardless of the outcome.
> >
> > 2010/3/11 Dumitru Daniliuc 
> >>
> >> joao,
> >>
> >> it looks like we mostly agree on this feature, so i was wondering if we
> >> could formally agree on a spec. here's what i propose:
> >>
> >> 1. name: vacuum. to the best of my knowledge, all current
> WebSQLDatabases
> >> implementations use SQLite, and in SQLite the command is called VACUUM..
> so
> >> it seems to me that we might as well call the new function vacuum().
> what do
> >> you think?
> >>
> >> 2. spec: no need for an error callback.
> >>
> >> interface Database {
> >>   // the methods and properties currently in the spec
> >>   void vacuum(in optional SQLVoidCallback successCallback);
> >> };
> >>
> >> 3. what the call should do: the purpose of this call is to allow apps to
> >> vacuum/compact/defragment/clean up their databases whenever they see
> fit.. a
> >> call to vacuum() could take a considerable amount of time (especially on
> >> larger or highly fragmented databases); therefore, it is not recommended
> for
> >> web apps to call this method during periods of high activity.
> >>
> >> how to process a vacuum() call:
> >>
> >> if the UA does not support this call (mobile browsers?), jump to step 3.
> >> queue up a task to vacuum/compact/defragment/clean up the database.
> >> if the task succeeds, and a success callback is provided, queue up a
> task
> >> to invoke the success callback; in all other cases (the task failed, or
> no
> >> success callback was provided), do nothing: proceed to the next task in
> the
> >> queue.
> >>
> >> does this seem acceptable? we (google engineers interested in this) feel
> >> that UAs should either not implement the vacuum() call, or they should
> >> respect it (rather than taking it as a hint). it is ok for UAs to keep
> track
> >> of things like system idleness or databases closing to do more vacuuming
> >> that the apps asked for, if they want to. however, we feel that a
> vacuum()
> >> request by an app should not be postponed, simply because sometimes apps
> >> know better than UAs when the best time to vacuum is (it might be nice
> to
> >> give apps more information on how fragmented their databases are, but
> that's
> >> a separate discussion).
> >>
> >> thanks,
> >> dumi
> >>
> >>
> >> 2010/3/9 Jeremy Orlow 
> >>>
> >>> On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc 
> wrote:
> 
> 
>  On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
> >
> >>> I don't see how the callbacks are useful though. Vacuum works
> >>> transparently, its effects are not visible, and what should the
> page
> >>> do in
> >>> case of error ?
> >>>
> >>
> >> i was thinking of something like:
> >>
> >> db.defragment(errorCallback, successCallback);
> >> showAPrettyImageAndAskTheUserToWait();
> >>
> >> function errorCallback(error) {}
> >> function successCallback() {
> >>  getRidOfThePrettyImageAndRestartTheApp();
> >> }
> >>
> >> just like you, i'm not sure if the error callback is useful at all,
> >> but i
> >> thought i'd add it to make the defragment() call look more like a
> >> transaction. maybe we don't need it.
> >>
> >
> > True, but this is a kind of operation that could very well just run
> on
> > the background, with a single optional callback when it's done (the
> webpage
> > can't do anything if an error is detected anyway).
> 
>  ok, so let's drop the errorCallback: vacuum([optional]
> successCallback);
> 
> >
> > The user agent would need to queue any subsequent transactions if a
> > vacuum is running. I would consider it as an hint, and after all
> webpages
> > that own references to the underlying data files are closed, would do
> a
> > vacuum. So, if you have many tabs on gmail, and that a single gmail
> instance
> > trie

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Dumitru Daniliuc
thanks for your comments, shane. a few questions:


> (1) VACUUM is never "necessary", though can sometimes reduce the size of
> the database file and provide a performance improvement if the database is
> large and fragmented.
>

large and fragmented databases is exactly the use case that made us think
about a vacuum() call.


> (2) Use PRAGMA freelist_count to estimate how much VACUUM will shrink a
> database.
>

if i understand the docs correctly, PRAGMA freelist_count will only return
the number of empty pages. so if we had many pages with very little data on
each one, vacuuming the database would considerably reduce its size, but
PRAGMA freelist_node would return 0. is this correct? is there a better way
to estimate the amount of space we'd get back if we vacuumed the database?
in particular, is there a way to figure out how fragmented each page is?


> (3) VACUUM requires O(N) time where N is the size of the database file, and
> can require up to 2N temporary space while processing.
>

i'm not so worried about the space: hard drives are much much bigger than
any DB a web app will have. the time it takes to vacuum a DB though is
something that UAs would probably want to take into account. do you have any
data on how long it takes to vacuum a fragmented database of 1MB? 10MB?
100MB? 1GB? i'm not looking for a benchmark, just some approximate data
(even though it would be great if you had a benchmark that you could share
with us).


> Our recommendation is that the use of VACUUM be discouraged.   The
> performance improvement only comes up with very large databases that are
> highly fragmented.  Where space reclamation is an issue, AUTO_VACUUM would
> be more appropriate.
>

the spec i proposed in my previous email allows UAs to ignore vacuum() calls
whenever they feel like it, so the UAs could just decide that it's not worth
vacuuming small databases. in cases of apps like offline gmail though, the
databases will be large, and i would guess that after a while they'd be
pretty fragmented, due to constantly deleting older emails and inserting
newer ones, so some kind of vacuuming would be needed.

AUTO_VACUUM: if i understand the docs correctly, auto-vacuum does not go
inside pages. it only automatically removes the pages that are entirely
free. is that correct? if so, then it seems to me that auto-vacuuming
doesn't really solve the fragmentation problem. also, what's the performance
overhead of having auto-vacuum on? and do we have any control over when it
kicks in? going back to my offline gmail example, i think it would be much
better to have the app call vacuum() when the user writes an email and the
app is otherwise idle, for example, then to have auto-vacuuming cause
user-visible delays when the app is syncing with the server or doing some
other intensive work.

thanks,
dumi



> -Shane
>
>
>
> On Thu, Mar 11, 2010 at 7:33 PM, Dimitri Glazkov wrote:
>
>> I like the completion callback idea.
>>
>> Also like the notion of some sort of protection from "over-eager
>> vacuum-calling syndrome".
>>
>> :DG<
>>
>> On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman 
>> wrote:
>> > Instead of calling back on success only, maybe call back on completion
>> > regardless of success or failure. This way the caller would know when
>> the
>> > potentially lengthy operation was done, regardless of the outcome.
>> >
>> > 2010/3/11 Dumitru Daniliuc 
>> >>
>> >> joao,
>> >>
>> >> it looks like we mostly agree on this feature, so i was wondering if we
>> >> could formally agree on a spec. here's what i propose:
>> >>
>> >> 1. name: vacuum. to the best of my knowledge, all current
>> WebSQLDatabases
>> >> implementations use SQLite, and in SQLite the command is called VACUUM.
>> so
>> >> it seems to me that we might as well call the new function vacuum().
>> what do
>> >> you think?
>> >>
>> >> 2. spec: no need for an error callback.
>> >>
>> >> interface Database {
>> >>   // the methods and properties currently in the spec
>> >>   void vacuum(in optional SQLVoidCallback successCallback);
>> >> };
>> >>
>> >> 3. what the call should do: the purpose of this call is to allow apps
>> to
>> >> vacuum/compact/defragment/clean up their databases whenever they see
>> fit.. a
>> >> call to vacuum() could take a considerable amount of time (especially
>> on
>> >> larger or highly fragmented databases); therefore, it is not
>> recommended for
>> >> web apps to call this method during periods of high activity.
>> >>
>> >> how to process a vacuum() call:
>> >>
>> >> if the UA does not support this call (mobile browsers?), jump to step
>> 3.
>> >> queue up a task to vacuum/compact/defragment/clean up the database..
>> >> if the task succeeds, and a success callback is provided, queue up a
>> task
>> >> to invoke the success callback; in all other cases (the task failed, or
>> no
>> >> success callback was provided), do nothing: proceed to the next task in
>> the
>> >> queue.
>> >>
>> >> does this seem acceptable? we (google engineers i

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Dirk Pranke
I admit to not being super familiar with the spec as it currently
stands, but I find the idea
that we would add something like this fairly unappealing. I'm not
familiar with any other
database API that asks the application programmer to some sort of GC
as part of the
application. I almost feel like if you're going to add this, you
should drop any pretense of
calling this a generic SQL interface, and just call it the "WebSQLLite spec".

-- Dirk

2010/3/9 Jeremy Orlow :
> On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc  wrote:
>>
>>
>> On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
>>>
> I don't see how the callbacks are useful though. Vacuum works
> transparently, its effects are not visible, and what should the page do
> in
> case of error ?
>

 i was thinking of something like:

 db.defragment(errorCallback, successCallback);
 showAPrettyImageAndAskTheUserToWait();

 function errorCallback(error) {}
 function successCallback() {
  getRidOfThePrettyImageAndRestartTheApp();
 }

 just like you, i'm not sure if the error callback is useful at all, but
 i
 thought i'd add it to make the defragment() call look more like a
 transaction. maybe we don't need it.

>>>
>>> True, but this is a kind of operation that could very well just run on
>>> the background, with a single optional callback when it's done (the webpage
>>> can't do anything if an error is detected anyway).
>>
>> ok, so let's drop the errorCallback: vacuum([optional] successCallback);
>>
>>>
>>> The user agent would need to queue any subsequent transactions if a
>>> vacuum is running. I would consider it as an hint, and after all webpages
>>> that own references to the underlying data files are closed, would do a
>>> vacuum. So, if you have many tabs on gmail, and that a single gmail instance
>>> tries to do multiple vacuums, it would equiv to one single vacuum operation.
>>
>> what do we do if some databases are opened for the entire life of the
>> browser? for example, i open my browser which has myfavoriteapp.com set as
>> its homepage. myfavoriteapp.com immediately opens a DB, and i only close
>> that app when i close the browser. when would the browser vacuum
>> myfavoriteapp's DBs in this case?
>>
>> i think it's ok for the UA to vacuum some DBs automatically when it thinks
>> it's a good time to do so; however, if a platform supports the vacuum/defrag
>> call (i.e. if it doesn't treat it is a no-op), then i think a vacuum call
>> coming from the app should be immediately scheduled (yes, the subsequent
>> transactions would have to wait for the vacuuming to finish running). in
>> some cases, the apps know better than the UA when to vacuum their DBs.
>>
>> by the way, we should probably agree on a name for this call. which one do
>> you prefer? vacuum, defragment, defrag, something else? i don't have a
>> strong opinion.
>
> I think vacuum is fine since the spec is already tied to the SQLite SQL
> dialect.
> collectGarbage() is another possibility
> Go with whatever you think is most clear and accurate though.
> J



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Shane Harrelson
This is not an issue unique to SQLite.

MySQL has several options for this, including  "OPTIMIZE TABLE foo".
SQL Server has a SHRINK database option among others.
Oracle has several options including "ALTER TABLE foo SHRINK SPACE".

I think regardless of the underlying storage engine chosen, there will be
extreme situations which stress any kind of automatic space and performance
management algorithms and you'll find users asking for just such interfaces
-- be it called vacuum(), optimize(), defragment(), shrink(), etc.

-Shane


On Fri, Mar 12, 2010 at 4:31 PM, Dirk Pranke  wrote:

> I admit to not being super familiar with the spec as it currently
> stands, but I find the idea
> that we would add something like this fairly unappealing. I'm not
> familiar with any other
> database API that asks the application programmer to some sort of GC
> as part of the
> application. I almost feel like if you're going to add this, you
> should drop any pretense of
> calling this a generic SQL interface, and just call it the "WebSQLLite
> spec".
>
> -- Dirk
>
> 2010/3/9 Jeremy Orlow :
> > On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc 
> wrote:
> >>
> >>
> >> On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
> >>>
> > I don't see how the callbacks are useful though. Vacuum works
> > transparently, its effects are not visible, and what should the page
> do
> > in
> > case of error ?
> >
> 
>  i was thinking of something like:
> 
>  db.defragment(errorCallback, successCallback);
>  showAPrettyImageAndAskTheUserToWait();
> 
>  function errorCallback(error) {}
>  function successCallback() {
>   getRidOfThePrettyImageAndRestartTheApp();
>  }
> 
>  just like you, i'm not sure if the error callback is useful at all,
> but
>  i
>  thought i'd add it to make the defragment() call look more like a
>  transaction. maybe we don't need it.
> 
> >>>
> >>> True, but this is a kind of operation that could very well just run on
> >>> the background, with a single optional callback when it's done (the
> webpage
> >>> can't do anything if an error is detected anyway).
> >>
> >> ok, so let's drop the errorCallback: vacuum([optional] successCallback);
> >>
> >>>
> >>> The user agent would need to queue any subsequent transactions if a
> >>> vacuum is running. I would consider it as an hint, and after all
> webpages
> >>> that own references to the underlying data files are closed, would do a
> >>> vacuum. So, if you have many tabs on gmail, and that a single gmail
> instance
> >>> tries to do multiple vacuums, it would equiv to one single vacuum
> operation.
> >>
> >> what do we do if some databases are opened for the entire life of the
> >> browser? for example, i open my browser which has myfavoriteapp.com set
> as
> >> its homepage. myfavoriteapp.com immediately opens a DB, and i only
> close
> >> that app when i close the browser. when would the browser vacuum
> >> myfavoriteapp's DBs in this case?
> >>
> >> i think it's ok for the UA to vacuum some DBs automatically when it
> thinks
> >> it's a good time to do so; however, if a platform supports the
> vacuum/defrag
> >> call (i.e. if it doesn't treat it is a no-op), then i think a vacuum
> call
> >> coming from the app should be immediately scheduled (yes, the subsequent
> >> transactions would have to wait for the vacuuming to finish running). in
> >> some cases, the apps know better than the UA when to vacuum their DBs.
> >>
> >> by the way, we should probably agree on a name for this call. which one
> do
> >> you prefer? vacuum, defragment, defrag, something else? i don't have a
> >> strong opinion.
> >
> > I think vacuum is fine since the spec is already tied to the SQLite SQL
> > dialect.
> > collectGarbage() is another possibility
> > Go with whatever you think is most clear and accurate though.
> > J
>
>


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Charles McCathieNevile
On Fri, 12 Mar 2010 22:31:15 +0100, Dirk Pranke   
wrote:



I admit to not being super familiar with the spec as it currently
stands, but I find the idea that we would add something like this
fairly unappealing. I'm not familiar with any other database API
that asks the application programmer to some sort of GC as part of
the application. I almost feel like if you're going to add this,
you should drop any pretense of calling this a generic SQL
interface, and just call it the "WebSQLLite spec".


It isn't a generic SQL spec, it is already a "WebSQLite" spec.

But I'm still not sure what the rationale is for this functionality. In  
particular, because the sense I got from the TPAC meeting was that nobody  
except Apple is especially keen on this spec as a long-term solution. So  
working on such low-level functionality doesn't seem very worthwhile -  
although if people are seriously going to implement it then knowing that  
will at least allow ongoing interoperability for those who choose to  
implement it.


cheers

Chaals


-- Dirk

2010/3/9 Jeremy Orlow :
On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc   
wrote:



On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:



I don't see how the callbacks are useful though. Vacuum works
transparently, its effects are not visible, and what should the  
page do

in
case of error ?



i was thinking of something like:

db.defragment(errorCallback, successCallback);
showAPrettyImageAndAskTheUserToWait();

function errorCallback(error) {}
function successCallback() {
 getRidOfThePrettyImageAndRestartTheApp();
}

just like you, i'm not sure if the error callback is useful at all,  
but

i
thought i'd add it to make the defragment() call look more like a
transaction. maybe we don't need it.



True, but this is a kind of operation that could very well just run on
the background, with a single optional callback when it's done (the  
webpage

can't do anything if an error is detected anyway).


ok, so let's drop the errorCallback: vacuum([optional]  
successCallback);




The user agent would need to queue any subsequent transactions if a
vacuum is running. I would consider it as an hint, and after all  
webpages
that own references to the underlying data files are closed, would do  
a
vacuum. So, if you have many tabs on gmail, and that a single gmail  
instance
tries to do multiple vacuums, it would equiv to one single vacuum  
operation.


what do we do if some databases are opened for the entire life of the
browser? for example, i open my browser which has myfavoriteapp.com  
set as
its homepage. myfavoriteapp.com immediately opens a DB, and i only  
close

that app when i close the browser. when would the browser vacuum
myfavoriteapp's DBs in this case?

i think it's ok for the UA to vacuum some DBs automatically when it  
thinks
it's a good time to do so; however, if a platform supports the  
vacuum/defrag
call (i.e. if it doesn't treat it is a no-op), then i think a vacuum  
call
coming from the app should be immediately scheduled (yes, the  
subsequent
transactions would have to wait for the vacuuming to finish running).  
in

some cases, the apps know better than the UA when to vacuum their DBs.

by the way, we should probably agree on a name for this call. which  
one do

you prefer? vacuum, defragment, defrag, something else? i don't have a
strong opinion.


I think vacuum is fine since the spec is already tied to the SQLite SQL
dialect.
collectGarbage() is another possibility
Go with whatever you think is most clear and accurate though.
J





--
Charles McCathieNevile  Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals   Try Opera: http://www.opera.com



Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-12 Thread Shane Harrelson
On Fri, Mar 12, 2010 at 4:15 PM, Dumitru Daniliuc  wrote:

> thanks for your comments, shane. a few questions:
>
>
>> (1) VACUUM is never "necessary", though can sometimes reduce the size of
>> the database file and provide a performance improvement if the database is
>> large and fragmented.
>>
>
> large and fragmented databases is exactly the use case that made us think
> about a vacuum() call.
>
>
>> (2) Use PRAGMA freelist_count to estimate how much VACUUM will shrink a
>> database.
>>
>
> if i understand the docs correctly, PRAGMA freelist_count will only return
> the number of empty pages. so if we had many pages with very little data on
> each one, vacuuming the database would considerably reduce its size, but
> PRAGMA freelist_node would return 0. is this correct? is there a better way
> to estimate the amount of space we'd get back if we vacuumed the database?
> in particular, is there a way to figure out how fragmented each page is?
>
>

Without looking at the specific page structure on the database, there's not
a good way to figure this out.   However, the B-Tree balancing algorithm
used by SQLite will attempt to merge pages with neighbors when there space
utilization drops below certain thresholds. Minimum average fill for intkey
leaves is 50%. For other pages I think it's 66%.



> (3) VACUUM requires O(N) time where N is the size of the database file, and
>> can require up to 2N temporary space while processing.
>>
>
> i'm not so worried about the space: hard drives are much much bigger than
> any DB a web app will have. the time it takes to vacuum a DB though is
> something that UAs would probably want to take into account. do you have any
> data on how long it takes to vacuum a fragmented database of 1MB? 10MB?
> 100MB? 1GB? i'm not looking for a benchmark, just some approximate data
> (even though it would be great if you had a benchmark that you could share
> with us).
>
>

VACUUM is largely io-bound and so the speed greatly depends on speed of the
underlying filesystem.  We generally use the rule of thumb of 1 second per
MB on basic systems.



> Our recommendation is that the use of VACUUM be discouraged.   The
>> performance improvement only comes up with very large databases that are
>> highly fragmented.  Where space reclamation is an issue, AUTO_VACUUM would
>> be more appropriate.
>>
>
> the spec i proposed in my previous email allows UAs to ignore vacuum()
> calls whenever they feel like it, so the UAs could just decide that it's not
> worth vacuuming small databases. in cases of apps like offline gmail though,
> the databases will be large, and i would guess that after a while they'd be
> pretty fragmented, due to constantly deleting older emails and inserting
> newer ones, so some kind of vacuuming would be needed.
>
> AUTO_VACUUM: if i understand the docs correctly, auto-vacuum does not go
> inside pages. it only automatically removes the pages that are entirely
> free. is that correct? if so, then it seems to me that auto-vacuuming
> doesn't really solve the fragmentation problem. also, what's the performance
> overhead of having auto-vacuum on? and do we have any control over when it
> kicks in? going back to my offline gmail example, i think it would be much
> better to have the app call vacuum() when the user writes an email and the
> app is otherwise idle, for example, then to have auto-vacuuming cause
> user-visible delays when the app is syncing with the server or doing some
> other intensive work.
>
>
When combined with the basic page merging mentioned above, AUTO_VACUUM does
a very good job, and the performance overhead is minimal.   There's no
control over when it kicks in.



> thanks,
> dumi
>
>
>
>> -Shane
>>
>>
>>
>> On Thu, Mar 11, 2010 at 7:33 PM, Dimitri Glazkov 
>> wrote:
>>
>>> I like the completion callback idea.
>>>
>>> Also like the notion of some sort of protection from "over-eager
>>> vacuum-calling syndrome".
>>>
>>> :DG<
>>>
>>> On Thu, Mar 11, 2010 at 4:20 PM, Michael Nordman 
>>> wrote:
>>> > Instead of calling back on success only, maybe call back on completion
>>> > regardless of success or failure. This way the caller would know when
>>> the
>>> > potentially lengthy operation was done, regardless of the outcome.
>>> >
>>> > 2010/3/11 Dumitru Daniliuc 
>>> >>
>>> >> joao,
>>> >>
>>> >> it looks like we mostly agree on this feature, so i was wondering if
>>> we
>>> >> could formally agree on a spec. here's what i propose:
>>> >>
>>> >> 1. name: vacuum. to the best of my knowledge, all current
>>> WebSQLDatabases
>>> >> implementations use SQLite, and in SQLite the command is called
>>> VACUUM. so
>>> >> it seems to me that we might as well call the new function vacuum().
>>> what do
>>> >> you think?
>>> >>
>>> >> 2. spec: no need for an error callback.
>>> >>
>>> >> interface Database {
>>> >>   // the methods and properties currently in the spec
>>> >>   void vacuum(in optional SQLVoidCallback successCallback);
>>> >> };
>>> >>
>>> >> 3. what the

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-13 Thread Dirk Pranke
You are of course correct that every database has administrative
commands like this; I probably overreacted in my wording, and I
apologize for that.

However, this is exactly the sort of request that makes people unhappy
with this spec. While other implementations have SQL DDL for doing
things like this, they don't typically expose a single-purpose
function, and so you'd be trying to cram other implementations
similar-but-not-identical functionality into this name.

What do we do? I think there is general consensus that you can't hope
to standardize on a SQL dialect as part of this spec. That seems to
leave you with three options: expose only a limited set of
functionality, expose a generic DBC API and accept that you will write
implementation-specific code using the generic APIs (although it would
be possible to write more generic APIs on top of that), or accept that
this is a SQLite-specific spec and call it that accordingly, to keep
from confusing people.

You all are probably correct that, especially in SQLite's case, you'll
need application-level access to the advanced features (and I say this
not because SQLite is somehow inferior but because AFAIK it doesn't
have background threads and the self-tuning more complex DBs tend to
have), so option #1 probably won't make anyone happy. So, I suggest
either you do #2 or #3. Don't just try to gradually had specific hooks
onto #1 and pretend this API is something it's not.

-- Dirk

On Fri, Mar 12, 2010 at 7:52 PM, Shane Harrelson  wrote:
> This is not an issue unique to SQLite.
>
> MySQL has several options for this, including  "OPTIMIZE TABLE foo".
> SQL Server has a SHRINK database option among others.
> Oracle has several options including "ALTER TABLE foo SHRINK SPACE".
>
> I think regardless of the underlying storage engine chosen, there will be
> extreme situations which stress any kind of automatic space and performance
> management algorithms and you'll find users asking for just such interfaces
> -- be it called vacuum(), optimize(), defragment(), shrink(), etc.
>
> -Shane
>
>
> On Fri, Mar 12, 2010 at 4:31 PM, Dirk Pranke  wrote:
>>
>> I admit to not being super familiar with the spec as it currently
>> stands, but I find the idea
>> that we would add something like this fairly unappealing. I'm not
>> familiar with any other
>> database API that asks the application programmer to some sort of GC
>> as part of the
>> application. I almost feel like if you're going to add this, you
>> should drop any pretense of
>> calling this a generic SQL interface, and just call it the "WebSQLLite
>> spec".
>>
>> -- Dirk
>>
>> 2010/3/9 Jeremy Orlow :
>> > On Mon, Mar 8, 2010 at 8:47 PM, Dumitru Daniliuc 
>> > wrote:
>> >>
>> >>
>> >> On Mon, Mar 8, 2010 at 3:39 AM, João Eiras  wrote:
>> >>>
>> > I don't see how the callbacks are useful though. Vacuum works
>> > transparently, its effects are not visible, and what should the page
>> > do
>> > in
>> > case of error ?
>> >
>> 
>>  i was thinking of something like:
>> 
>>  db.defragment(errorCallback, successCallback);
>>  showAPrettyImageAndAskTheUserToWait();
>> 
>>  function errorCallback(error) {}
>>  function successCallback() {
>>   getRidOfThePrettyImageAndRestartTheApp();
>>  }
>> 
>>  just like you, i'm not sure if the error callback is useful at all,
>>  but
>>  i
>>  thought i'd add it to make the defragment() call look more like a
>>  transaction. maybe we don't need it.
>> 
>> >>>
>> >>> True, but this is a kind of operation that could very well just run on
>> >>> the background, with a single optional callback when it's done (the
>> >>> webpage
>> >>> can't do anything if an error is detected anyway).
>> >>
>> >> ok, so let's drop the errorCallback: vacuum([optional]
>> >> successCallback);
>> >>
>> >>>
>> >>> The user agent would need to queue any subsequent transactions if a
>> >>> vacuum is running. I would consider it as an hint, and after all
>> >>> webpages
>> >>> that own references to the underlying data files are closed, would do
>> >>> a
>> >>> vacuum. So, if you have many tabs on gmail, and that a single gmail
>> >>> instance
>> >>> tries to do multiple vacuums, it would equiv to one single vacuum
>> >>> operation.
>> >>
>> >> what do we do if some databases are opened for the entire life of the
>> >> browser? for example, i open my browser which has myfavoriteapp.com set
>> >> as
>> >> its homepage. myfavoriteapp.com immediately opens a DB, and i only
>> >> close
>> >> that app when i close the browser. when would the browser vacuum
>> >> myfavoriteapp's DBs in this case?
>> >>
>> >> i think it's ok for the UA to vacuum some DBs automatically when it
>> >> thinks
>> >> it's a good time to do so; however, if a platform supports the
>> >> vacuum/defrag
>> >> call (i.e. if it doesn't treat it is a no-op), then i think a vacuum
>> >> call
>> >> coming from the app should be immediately scheduled (yes, the

Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-16 Thread Dumitru Daniliuc
shane, i was hoping you could clarify a few things about AUTO_VACUUM:


> However, the B-Tree balancing algorithm used by SQLite will attempt to
> merge pages with neighbors when there space utilization drops below certain
> thresholds. Minimum average fill for intkey leaves is 50%. For other pages I
> think it's 66%.
>

according to http://www.sqlite.org/pragma.html#pragma_auto_vacuum:
"Auto-vacuum does not defragment the database nor repack individual database
pages the way that the VACUUM command does."

so the way i understand these statements is:
1. sqlite always attempts to merge mostly-empty pages. the (auto-)vacuum
settings have no effect on that.
2. AUTO_VACUUM only moves pages around and deletes the empty ones. it does
not try to repack individual pages. however, because of #1, page repacking
happens anyway (to some extent), when AUTO_VACUUM is on.

am i right?

and one more question:

"Auto-vacuuming is only possible if the database stores some additional
information that allows each database page to be traced backwards to its
referer. Therefore, auto-vacuuming must be turned on before any tables are
created. It is not possible to enable or disable auto-vacuum after a table
has been created."

what happens if we create a database without AUTO_VACUUM on, insert some
data, save it to a file, then turn on AUTO_VACUUM and try to open that
database again? will sqlite add the missing information? will AUTO_VACUUM be
silently turned off for that database? will we get an error when we try to
open/read from/write to that database? anything else that we need to be
aware of in this case?

thanks,
dumi


Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-16 Thread João Eiras
On Wed, 17 Mar 2010 01:50:00 +0100, Dumitru Daniliuc   
wrote:



shane, i was hoping you could clarify a few things about AUTO_VACUUM:



Perhaps this is a bit out of scope of this mailing list ?




Re: [WebSQLDatabase] Adding a vacuum() call

2010-03-17 Thread Dumitru Daniliuc
After getting more details on how AUTO_VACUUM and page repacking work in
SQLite, we (Chromium/WebKit) have decided take the AUTO_VACUUM route and not
push for adding a vacuum() call to the WebSQLDatabases spec at this time.
Many thanks to everybody who contributed to the discussion.

dumi


On Tue, Mar 16, 2010 at 5:03 PM, João Eiras  wrote:

> On Wed, 17 Mar 2010 01:50:00 +0100, Dumitru Daniliuc 
> wrote:
>
>  shane, i was hoping you could clarify a few things about AUTO_VACUUM:
>>
>>
> Perhaps this is a bit out of scope of this mailing list ?
>
>
>