Re: [FileAPI] BlobBuilder.getBlob should clear the BlobBuilder

2011-04-13 Thread Jonas Sicking
On Tue, Apr 12, 2011 at 5:33 PM, Eric Uhrhane er...@google.com wrote:
 On Tue, Apr 12, 2011 at 3:38 PM, Kyle Huey m...@kylehuey.com wrote:
 Hello All,

 In the current FileAPI Writer spec a BlobBuilder can be used to build a
 series of blobs like so:

   var bb = new BlobBuilder();
   bb.append("foo");
   var foo = bb.getBlob();
   bb.append("bar");
   var bar = bb.getBlob();
   foo.size; // == 3
   bar.size; // == 6

 My concern with this pattern is that it seems that one of the primary use
 cases is to keep a BlobBuilder around for a while to build up a blob over
 time.  A BlobBuilder left around could potentially entrain large amounts of
 memory.  I propose that BlobBuilder.getBlob() clears the BlobBuilder,
 returning it to an empty state.  The current behavior also doesn't seem
 terribly useful to me (though I'm happy to be convinced otherwise) and can be
 easily replicated on top of the proposed behavior (immediately reappending
 the Blob that was just retrieved.)

 Thoughts/comments?

 - Kyle

 If you don't have a use for the current behavior, you can always just
 drop the BlobBuilder as soon as you're done with it, and it'll get
 collected.  I think that's simpler and more intuitive than having it
 clear itself, which is a surprise in an operation that looks
 read-only.  In the other case, where you actually want the append
 behavior, it's faster and simpler not to have to re-append a blob
 you've just pulled out of it.

The problem is that this optimizes for the rare case when you're
creating several blobs which are prefixes of each other.

It's not at all rare for pages to inadvertently hold on to objects
longer than they need. This bogs down both the user's machine and the
webpage. Yes, pages can fix this by dropping all the references to
an object and waiting for GC, but it's an all too common mistake not to
do this.

If we think that people will use BlobBuilder to create large blobs,
then it's better to have an explicit API for dropping that memory rather than
relying on GC. Here we additionally have the advantage that we
wouldn't risk people forgetting to use the explicit API since that is
the same API as dropping the data.

Another advantage of dropping the memory automatically is that you
don't need to copy any data into the Blob. Instead you can just make
the Blob take ownership of whatever memory buffers you've built up
during the various calls to .append. You could technically implement
some sort of copy-on-write scheme, but that introduces complexity.
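
A minimal sketch of the difference described above, as a toy model rather
than any browser's implementation. The class TransferBuilder and its methods
copyOut and takeOut are hypothetical names invented for illustration; chunks
are plain Uint8Arrays and a "blob" is just a list of chunks.

```javascript
// Toy model only: NOT the real BlobBuilder. All names here are hypothetical.
class TransferBuilder {
  constructor() {
    this.chunks = [];
  }

  append(bytes) {
    this.chunks.push(Uint8Array.from(bytes));
  }

  // Non-clearing extraction: the builder keeps its buffers, so the
  // returned data has to be a copy.
  copyOut() {
    return this.chunks.map((chunk) => chunk.slice());
  }

  // Clearing extraction: the result takes over the existing buffers and
  // the builder starts again from empty, so nothing is copied.
  takeOut() {
    const owned = this.chunks;
    this.chunks = [];
    return owned;
  }
}

const tb = new TransferBuilder();
tb.append([1, 2, 3]);
const internal = tb.chunks[0];

const copied = tb.copyOut();
const taken = tb.takeOut();

console.log(copied[0] === internal); // false: a fresh buffer was allocated
console.log(taken[0] === internal);  // true: the same buffer changed owner
console.log(tb.chunks.length);       // 0: the builder is empty again
```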

Flip it around, what is the argument for keeping the memory owned by
the BlobBuilder? If it's just that the name looks read-only, I'd be
fine with renaming the extraction-function to something else.

/ Jonas



Re: [FileAPI] BlobBuilder.getBlob should clear the BlobBuilder

2011-04-13 Thread Glenn Maynard
On Wed, Apr 13, 2011 at 2:46 AM, Jonas Sicking jo...@sicking.cc wrote:

 Another advantage of dropping the memory automatically is that you
 don't need to copy any data into the Blob. Instead you can just make
 the Blob take ownership of whatever memory buffers you've built up
 during the various calls to .append. You could technically implement
 some sort of copy-on-write scheme, but that introduces complexity.


You don't actually need copy-on-write for that, so long as Blobs are
immutable.  You only need to refcount the underlying chunks comprising the
Blob, so the chunks can be shared between Blobs.
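
A rough sketch of that sharing scheme, as a toy model. Chunk, SharedBlob and
the explicit refs counter are hypothetical names for illustration; a real
implementation would live in native code and manage the counts itself.

```javascript
// Toy model only: all names here are hypothetical.
class Chunk {
  constructor(text) {
    this.text = text;
    this.refs = 0; // number of blobs currently using this chunk
  }
}

class SharedBlob {
  constructor(chunks) {
    // The blob never changes after construction, so it can safely point
    // at chunks that other blobs also point at.
    this.chunks = Object.freeze(chunks.slice());
    for (const chunk of this.chunks) chunk.refs += 1;
  }

  get size() {
    return this.chunks.reduce((total, chunk) => total + chunk.text.length, 0);
  }

  // Called when the blob goes away; a chunk with refs === 0 could be freed.
  release() {
    for (const chunk of this.chunks) chunk.refs -= 1;
  }
}

const shared = new Chunk("lots_of_data");
const extra = new Chunk("more");

const small = new SharedBlob([shared]);
const large = new SharedBlob([shared, extra]);

console.log(shared.refs, extra.refs); // 2 1
console.log(small.size, large.size);  // 12 16
```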

I think a complete, mature Blob implementation would be fairly complex,
anyway.  For example, in order to support very large blobs, Blob and
BlobBuilder might scratch to disk transparently.  (This wouldn't happen
during BlobBuilder.append, since that's synchronous.  Rather, it would
happen when other async APIs create Blobs, pushing the amount of memory used
by blobs over some threshold; that way the swapping of blobs to disk could
be done asynchronously.)

Since it appears Blob is becoming a major API building block for storing
larger blocks of data, I don't think this is unreasonable complexity to
expect in the longer term.

 The problem is that this optimizes for the rare case when you're
 creating several blobs which are prefixes of each other.

The above having been said, it's not necessary for BlobBuilder to keep its
data around in order to satisfy the uncommon "blobs which are prefixes of
each other" case.  You can do the following:

  var bb = new BlobBuilder();
  bb.append(lots_of_data);
  var blob1 = bb.getBlobAndReset(); // does what it says
  bb.append(blob1);
  bb.append(some_more_data);
  var blob2 = bb.getBlobAndReset(); // returns lots_of_data + some_more_data

This could be optimized by the browser as above: the second, larger blob
could share data with the first.
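
To make the sharing concrete, here is a small runnable toy model of the
snippet above. ResettingBuilder is a hypothetical stand-in, not the spec'd
BlobBuilder: a blob is modeled as a frozen array of immutable chunk objects,
and appending a blob reuses its chunks instead of copying them.

```javascript
// Toy model only: NOT the real BlobBuilder. All names here are hypothetical.
class ResettingBuilder {
  constructor() {
    this.chunks = [];
  }

  append(data) {
    if (Array.isArray(data)) {
      // Appending a blob: reuse its (immutable) chunks, no copy.
      this.chunks.push(...data);
    } else {
      this.chunks.push(Object.freeze({ text: String(data) }));
    }
  }

  // Hand the accumulated chunks to the blob and start over from empty.
  getBlobAndReset() {
    const blob = Object.freeze(this.chunks);
    this.chunks = [];
    return blob;
  }
}

const rb = new ResettingBuilder();
rb.append("lots_of_data");
const prefixBlob = rb.getBlobAndReset();
rb.append(prefixBlob);
rb.append("some_more_data");
const fullBlob = rb.getBlobAndReset();

console.log(fullBlob.length);                   // 2 chunks
console.log(fullBlob[0] === prefixBlob[0]);     // true: first chunk is shared
console.log(rb.chunks.length);                  // 0: builder holds nothing
```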

-- 
Glenn Maynard


[FileAPI] BlobBuilder.getBlob should clear the BlobBuilder

2011-04-12 Thread Kyle Huey
Hello All,

In the current FileAPI Writer spec a BlobBuilder can be used to build a
series of blobs like so:

  var bb = new BlobBuilder();
  bb.append("foo");
  var foo = bb.getBlob();
  bb.append("bar");
  var bar = bb.getBlob();
  foo.size; // == 3
  bar.size; // == 6

My concern with this pattern is that it seems that one of the primary use
cases is to keep a BlobBuilder around for a while to build up a blob over
time.  A BlobBuilder left around could potentially entrain large amounts of
memory.  I propose that BlobBuilder.getBlob() clears the BlobBuilder,
returning it to an empty state.  The current behavior also doesn't seem
terribly useful to me (though I'm happy to be convinced otherwise) and can be
easily replicated on top of the proposed behavior (immediately reappending
the Blob that was just retrieved.)
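
As a rough illustration of the proposed behavior, here is a toy model.
ClearingBuilder is a hypothetical stand-in, not the spec'd BlobBuilder, and
blobs are modeled as plain strings so that the sketch runs anywhere.

```javascript
// Toy model only: NOT the real BlobBuilder. All names here are hypothetical.
class ClearingBuilder {
  constructor() {
    this.parts = [];
  }

  append(data) {
    this.parts.push(String(data));
  }

  // Proposed behavior: return the accumulated data and reset to empty.
  getBlob() {
    const blob = this.parts.join("");
    this.parts = [];
    return blob;
  }
}

const builder = new ClearingBuilder();
builder.append("foo");
const first = builder.getBlob(); // the builder is now empty

// Replicating the current behavior: immediately re-append what was retrieved.
builder.append(first);
builder.append("bar");
const second = builder.getBlob();

console.log(first.length, second.length); // 3 6
```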

Thoughts/comments?

- Kyle


Re: [FileAPI] BlobBuilder.getBlob should clear the BlobBuilder

2011-04-12 Thread Eric Uhrhane
On Tue, Apr 12, 2011 at 3:38 PM, Kyle Huey m...@kylehuey.com wrote:
 Hello All,

 In the current FileAPI Writer spec a BlobBuilder can be used to build a
 series of blobs like so:

   var bb = new BlobBuilder();
   bb.append("foo");
   var foo = bb.getBlob();
   bb.append("bar");
   var bar = bb.getBlob();
   foo.size; // == 3
   bar.size; // == 6

 My concern with this pattern is that it seems that one of the primary use
 cases is to keep a BlobBuilder around for a while to build up a blob over
 time.  A BlobBuilder left around could potentially entrain large amounts of
 memory.  I propose that BlobBuilder.getBlob() clears the BlobBuilder,
 returning it to an empty state.  The current behavior also doesn't seem
 terribly useful to me (though I'm happy to be convinced otherwise) and can be
 easily replicated on top of the proposed behavior (immediately reappending
 the Blob that was just retrieved.)

 Thoughts/comments?

 - Kyle

If you don't have a use for the current behavior, you can always just
drop the BlobBuilder as soon as you're done with it, and it'll get
collected.  I think that's simpler and more intuitive than having it
clear itself, which is a surprise in an operation that looks
read-only.  In the other case, where you actually want the append
behavior, it's faster and simpler not to have to re-append a blob
you've just pulled out of it.
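
A small sketch of both cases, as a toy model. KeepingBuilder and buildBlob
are hypothetical names, not the spec'd API, and blobs are modeled as plain
strings.

```javascript
// Toy model only: NOT the real BlobBuilder. All names here are hypothetical.
class KeepingBuilder {
  constructor() {
    this.parts = [];
  }

  append(data) {
    this.parts.push(String(data));
  }

  // Current behavior: reading the blob leaves the builder's contents in place.
  getBlob() {
    return this.parts.join("");
  }
}

// Case 1: the builder lives only inside this function, so once the function
// returns nothing references it and the garbage collector may reclaim it.
function buildBlob(pieces) {
  const scratch = new KeepingBuilder();
  for (const piece of pieces) scratch.append(piece);
  return scratch.getBlob();
}

const oneShot = buildBlob(["foo", "bar"]);

// Case 2: the append behavior is wanted, and no re-append is needed.
const keeper = new KeepingBuilder();
keeper.append("foo");
const prefix = keeper.getBlob();
keeper.append("bar");
const whole = keeper.getBlob();

console.log(oneShot.length);              // 6
console.log(prefix.length, whole.length); // 3 6
```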

Eric



Re: [FileAPI] BlobBuilder.getBlob should clear the BlobBuilder

2011-04-12 Thread Olli Pettay

On 04/12/2011 05:33 PM, Eric Uhrhane wrote:
 On Tue, Apr 12, 2011 at 3:38 PM, Kyle Huey m...@kylehuey.com wrote:
 Hello All,

 In the current FileAPI Writer spec a BlobBuilder can be used to build a
 series of blobs like so:

   var bb = new BlobBuilder();
   bb.append("foo");
   var foo = bb.getBlob();
   bb.append("bar");
   var bar = bb.getBlob();
   foo.size; // == 3
   bar.size; // == 6

 My concern with this pattern is that it seems that one of the primary use
 cases is to keep a BlobBuilder around for a while to build up a blob over
 time.  A BlobBuilder left around could potentially entrain large amounts of
 memory.  I propose that BlobBuilder.getBlob() clears the BlobBuilder,
 returning it to an empty state.  The current behavior also doesn't seem
 terribly useful to me (though I'm happy to be convinced otherwise) and can be
 easily replicated on top of the proposed behavior (immediately reappending
 the Blob that was just retrieved.)

 Thoughts/comments?

 - Kyle

 If you don't have a use for the current behavior, you can always just
 drop the BlobBuilder as soon as you're done with it, and it'll get
 collected.  I think that's simpler and more intuitive than having it
 clear itself, which is a surprise in an operation that looks
 read-only.

I agree. getBlob() sounds very much like a read-only operation.
If there is a use case for clearing BlobBuilder, the method
should be called takeBlob() or some such.

-Olli



 In the other case, where you actually want the append
 behavior, it's faster and simpler not to have to re-append a blob
 you've just pulled out of it.

 Eric