Re: File API to separate reading from files

2009-09-22 Thread Arun Ranganathan

Arun wrote:
There is lots that is attractive about InputStream, and I think that 
it can be used in other specifications, especially when discussing 
Camera APIs, streaming from web apps (conferencing) etc.  I also like 
the idea of DataHandler.  When we define a byte primitive, it can be 
used in conjunction with the stream interface.  For additional read 
features (fseek) this is also useful.  I also appreciate that you 
have pointed out in a subsequent email [1] that it is possible to 
sidestep the issue of dealing with bytes directly.  Managing bytes 
properly, with the right primitives, is one reason why, despite 
having looked at the Java I/O APIs[2], I went with something 
simpler.  I think that we should have streams at some point, and I'm 
amenable to looking at them in a subsequent iteration of the File 
API.  It's worth saying here that the appeal of streams is for 
*multiple use cases* for both File API and other APIs, and *not* 
because the Java I/O model is one we should emulate.  Programmer 
taste and choice about coining APIs is subjective.



Nikunj wrote in response:
I respect your point on taste, however, I am more interested in 
composability than the maturity of Java I/O. 

Firstly, what Jonas proposed as the Alternative File API [1] uses an 
event model to address use cases such as progress feedback and 
separating reading from file objects.  I expressed reservations about 
complexity, but saw more posts in favor of it than against it.  This 
model has advantages that come with an event model (separate 
notifications like onprogress, onerror, allowing specific 'isolated' 
code, etc) along with a signature similarity to XHR (which developers 
are familiar with).  My caveats about the model were mainly about 
understanding trade-offs.  I'm reconciled to having a v1 of the File API 
specification based on Jonas' proposal (hopefully in good shape by the 
upcoming TPAC), and I believe we can iterate from there.

It would be useful to see how you meet the following requirements:

1. incremental reading of a file's data
The proposal [1] reuses the FileData interface, which will still support 
a slice(offset, length) method that returns another FileData object 
within stipulated byte ranges.  I hope to flesh out what happens under 
range mathematics errors a bit more clearly (e.g. whether an exception 
is raised).  Along with progress events, I think this use case is addressed.
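
The open question about range mathematics errors can be made concrete with a small sketch. The helper below is hypothetical (resolveRange is not part of any proposal) and only contrasts the two behaviors under discussion: raising an exception for an out-of-range request, or clamping the request to the file.

```javascript
// Hypothetical helper, not part of the proposal: resolve a requested
// (offset, length) pair against a file of `size` bytes.
//   strict = true  -> out-of-range requests raise RangeError
//   strict = false -> out-of-range requests are clamped to the file
function resolveRange(size, offset, length, strict) {
  var outOfRange = offset < 0 || length < 0 || offset + length > size;
  if (outOfRange && strict) {
    throw new RangeError("requested range falls outside the file");
  }
  function clamp(value, lo, hi) {
    return Math.min(Math.max(value, lo), hi);
  }
  var start = clamp(offset, 0, size);
  var end = clamp(offset + length, start, size);
  return { offset: start, length: end - start };
}
```

With clamping, a request for 20 bytes at offset 90 of a 100 byte file resolves to offset 90, length 10; with strict set, the same request raises RangeError.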


2. concurrent access to file data
(Note that FileRequest and FileReader are used interchangeably in 
[1]; I personally prefer FileReader as a name).  Nothing precludes 
multiple FileReader objects from accessing the same file, but not all 
implementations need fire notifications (events) concurrently.  Do you 
have a specific use case in mind?

3. access to all file metadata without needing to read the file
(Note that in FileRequest, which I think should be named FileReader, the 
read* methods take File objects as parameters, although the email 
proposal [1] says that they take FileData objects.  Jonas means File 
objects).


The answer to your question depends on what you mean by *all* file 
metadata. 

File objects (which inherit from FileData objects) expose name and 
mediaType properties, along with size (from FileData).  But, suppose you 
wanted ID3 information from an MP3 file.  In this case (assuming ID3v1 
usage), you would *have* to read the file, and look for the 128 byte 
chunk beginning with TAG.  This can be done in two ways:


i. Using slice() and range mathematics based on the file's size to get 
to the end of the file and look at the last 128 bytes of it as a separate 
FileData object (since ID3v1 puts stuff at the end).  Not ideal.
ii. Using read methods and working with the file format.  Again, not 
dripping with syntactic sugar, but certainly feasible.
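
Approach (i) can be sketched roughly as follows. This is only an illustration: a Uint8Array stands in for the file's bytes, subarray() stands in for taking the trailing range as a separate object, and the function name readId3v1Title is made up for the example.

```javascript
// Stand-in for file contents: a Uint8Array takes the place of FileData.
// Assumption: subarray(size - 128) plays the role of the range-based read.
function readId3v1Title(bytes) {
  var TAG_SIZE = 128;
  if (bytes.length < TAG_SIZE) return null;
  var tag = bytes.subarray(bytes.length - TAG_SIZE);
  // An ID3v1 block starts with the ASCII characters "TAG".
  if (tag[0] !== 0x54 || tag[1] !== 0x41 || tag[2] !== 0x47) return null;
  // Bytes 3..32 of the block hold the title, padded with NULs or spaces.
  var title = "";
  for (var i = 3; i < 33 && tag[i] !== 0; i++) {
    title += String.fromCharCode(tag[i]);
  }
  return title.replace(/\s+$/, "");
}
```

The function returns null when the file is shorter than 128 bytes or the trailing block does not begin with TAG, which is the case a caller would treat as "no ID3v1 metadata".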


I agree that metadata extraction could be made better, but I think that 
I'm happy with what the existing proposal has.  I also don't see how any 
other proposal improves on this, even if you read into a stream buffer.


I am happy with the existing metadata extraction for a v1, and believe 
that as we work out more audio and video issues on the platform, we can 
get to specific metadata issues.  Can you clear up what you mean by all 
metadata?

4. separation of error handling from file reading
In Jonas' proposal, this isn't done cleanly (for some definition of 
clean as separate from the reader object), but I think what *is* done 
is good for the majority of use cases.  In Jonas' proposal, the 
FileReader object (named FileRequest in the email [1]) allows separate 
onerror handling (along with onprogress being separate, etc.).  It's not 
done *within* a read method (unlike the existing proposal, which does 
this less well than Jonas' proposal), and the callback that handles the 
event can deal with the response.


This is as separate as is done with XHR.
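
A rough sketch of that separation, in the XHR style: the handlers are attributes on the reader object rather than arguments to the read method. SketchReader and its readAsText signature are hypothetical and do not come from [1]; events also fire synchronously here to keep the example short, which a real implementation would not do.

```javascript
// Hypothetical reader in the XHR style. Handlers live on the object,
// so error handling is not mixed into the success callback.
// Assumption: events fire synchronously; a real reader would dispatch
// them asynchronously.
function SketchReader() {
  this.onprogress = null;
  this.onload = null;
  this.onerror = null;
}

SketchReader.prototype.readAsText = function (source, chunkSize) {
  if (typeof source !== "string") {
    if (this.onerror) this.onerror({ message: "source is not readable" });
    return;
  }
  var step = chunkSize || 4;
  for (var loaded = 0; loaded < source.length; loaded += step) {
    if (this.onprogress) {
      this.onprogress({
        loaded: Math.min(loaded + step, source.length),
        total: source.length
      });
    }
  }
  if (this.onload) this.onload({ result: source });
};
```

Each handler does one job: onprogress sees only byte counts, onload sees only the result, and onerror sees only the failure.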


All things being equal, I would prefer a model that, in order of 
priority:


1. involves 

Re: File API to separate reading from files

2009-09-01 Thread Arun Ranganathan

Nikunj,

The File API is everyone's favorite API for feature requests as well as 
programming style discussions :)

interface InputStream {
  read(in DataHandler, [optional in] long long offset, [optional in] 
long long length);   
  abort();

  attribute Function onerror;
}

There is lots that is attractive about InputStream, and I think that it 
can be used in other specifications, especially when discussing Camera 
APIs, streaming from web apps (conferencing) etc.  I also like the idea 
of DataHandler.  When we define a byte primitive, it can be used in 
conjunction with the stream interface.  For additional read features 
(fseek) this is also useful.  I also appreciate that you have pointed 
out in a subsequent email [1] that it is possible to sidestep the issue 
of dealing with bytes directly.  Managing bytes properly, with the 
right primitives, is one reason why, despite having looked at the Java 
I/O APIs[2], I went with something simpler.  I think that we should have 
streams at some point, and I'm amenable to looking at them in a 
subsequent iteration of the File API.  It's worth saying here that the 
appeal of streams is for *multiple use cases* for both File API and 
other APIs, and *not* because the Java I/O model is one we should 
emulate.  Programmer taste and choice about coining APIs is subjective.


For a first version (which should replace 
http://www.w3.org/TR/file-upload/ , with a more meaningful name like 
File API), I think we should address use cases around reads.  Ian 
Fette has given us plenty of other use cases for consideration later 
on[3].  While my editor's draft strove to address the use cases for file 
access with different asynchronous data accessors, it was clear that it 
couldn't gracefully account for progress events.  Moreover, general 
feedback favored a model that used events with a separate reader object 
that allowed for progress events, and Jonas' alternative proposal does 
this as well as resembles XHR [4].   While I'm reluctant to sacrifice 
simplicity, I think moving in the direction of the Alternative File 
API[4] reconciles use cases such as progress events with calls for a 
reader/event model.  FWIW, I disagree that resemblance to XHR should be 
seen as unwanted baggage [5].  I think it's desirable to resemble an 
API that has such widespread usage!  While the web is inconsistent, 
event models are widely used, and similarity between XHR and File API, 
which will be used in conjunction anyway, is probably a good thing.


I'd say the following might be next steps:

1. Work on the File API but revise the editor's draft to reflect [4].

2. Collect use cases for further iterations of the File API, and 
determine which WG should carry them forward.  I'm in favor of 
continuing in this WG as opposed to the Device API WG, but that is a 
separate question.


3. In conjunction with 2., discuss security and platform issues around 
proposals such as [6].  I'm interested in airing more concrete proposals 
on this listserv.


-- A*

[1] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0757.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0729.html
[3] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0900.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0565.html
[5] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0749.html
[6] http://dev.w3.org/2006/webapi/fileio/fileIO.htm





Re: File API to separate reading from files

2009-09-01 Thread Nikunj R. Mehta


On Aug 31, 2009, at 11:28 PM, Arun Ranganathan wrote:


Nikunj,

The File API is everyone's favorite API for feature requests as well  
as programming style discussions :)

interface InputStream {
 read(in DataHandler, [optional in] long long offset, [optional in]  
long long length); abort();

 attribute Function onerror;
}

There is lots that is attractive about InputStream, and I think that  
it can be used in other specifications, especially when discussing  
Camera APIs, streaming from web apps (conferencing) etc.  I also  
like the idea of DataHandler.  When we define a byte primitive, it  
can be used in conjunction with the stream interface.  For  
additional read features (fseek) this is also useful.  I also  
appreciate that you have pointed out in a subsequent email [1] that  
it is possible to sidestep the issue of dealing with bytes  
directly.  Managing bytes properly, with the right primitives, is  
one reason why, despite having looked at the Java I/O APIs[2], I  
went with something simpler.  I think that we should have streams at  
some point, and I'm amenable to looking at them in a subsequent  
iteration of the File API.  It's worth saying here that the appeal  
of streams is for *multiple use cases* for both File API and other  
APIs, and *not* because the Java I/O model is one we should  
emulate.  Programmer taste and choice about coining APIs is  
subjective.


I respect your point on taste, however, I am more interested in  
composability than the maturity of Java I/O. It would be useful to see  
how you meet the following requirements:


1. incremental reading of a file's data
2. concurrent access to file data
3. access to all file metadata without needing to read the file
4. separation of error handling from file reading

All things being equal, I would prefer a model that, in order of  
priority:


1. involves fewer steps, and
2. evolves nicely with file write and binary access, which are both  
likely to be next evolution directions in this area.


Can you provide a comparison of your proposed approach with my  
proposal for the above so that the WG can develop an informed opinion  
about the proposals?




For a first version (which should replace http://www.w3.org/TR/file-upload/ 
 , with a more meaningful name like File API), I think we should  
address use cases around reads.  Ian Fette has given us plenty of  
other use cases for consideration later on[3].  While my editor's  
draft strove to address the use cases for file access with different  
asynchronous data accessors, it was clear that it couldn't  
gracefully account for progress events.  Moreover, general feedback  
favored a model that used events with a separate reader object that  
allowed for progress events, and Jonas' alternative proposal does  
this as well as resembles XHR [4].   While I'm reluctant to  
sacrifice simplicity, I think moving in the direction of the  
Alternative File API[4] reconciles use cases such as progress  
events with calls for a reader/event model.  FWIW, I disagree that  
resemblance to XHR should be seen as unwanted baggage [5].  I  
think it's desirable to resemble an API that has such widespread  
usage!


This is arguable at best, since it seems to be an opinion not shared  
by everyone, especially not the editor of XMLHttpRequest [1]. In fact,  
there is no similarity to XHR in the current editor's draft, and I  
wonder why those benefits were considered unimportant when drafting  
previously.


While the web is inconsistent, event models are widely used, and  
similarity between XHR and File API, which will be used in  
conjunction anyway, is probably a good thing.


Can you explain in light of the objections I raised in [2], why the  
Alternative File API is the right approach. I haven't seen any  
replies to my points.




I'd say the following might be next steps:

1. Work on the File API but revise the editor's draft to reflect [4].

2. Collect use cases for further iterations of the File API, and  
determine which WG should carry them forward.  I'm in favor of  
continuing in this WG as opposed to the Device API WG, but that is a  
separate question.


3. In conjunction with 2., discuss security and platform issues  
around proposals such as [6].  I'm interested in airing more  
concrete proposals on this listserv.


-- A*

[1] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0757.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0729.html
[3] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0900.html
[4] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0565.html
[5] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0749.html
[6] http://dev.w3.org/2006/webapi/fileio/fileIO.htm





Nikunj
http://o-micron.blogspot.com

[1] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0537.html
[2] http://lists.w3.org/Archives/Public/public-webapps/2009JulSep/0748.html




Re: File API to separate reading from files

2009-08-31 Thread Garrett Smith
On Wed, Aug 19, 2009 at 11:47 AM, Nikunj R.
Mehta <nikunj.me...@oracle.com> wrote:
 Here's an alternative, more easily extensible, proposal for reading files.
 It provides applications a way to read small amounts of data at a time. It
 also allows applications to concurrently read the same file.
I Agree.

[snip]


[snip example]


 Secondly, a list of files can be obtained using some UI.
 typedef sequence<File> FileList;

Agree.

 Thirdly, an abstract interface is an input stream that is not limited to
 files. It works at the level of bytes that files are made of. The read()
 operation can specify the extent that is required. If an application wishes
 to read small increments, it can thus specify those increments. Of course,
 the File interface identifies its size, so the application can suitably
 choose increments. Processing of blocks read from the file occurs in
 callbacks. XHR could also consider taking an InputStream parameter during
 the send() operation.

Would it be possible to have a reader handle creating the input stream
and making the decision based on what type of Reader it is, passing
byte offset lengths to the input stream -- essentially hiding those
details?

[snip example]

 Fifthly, a file can be used for reading an input stream by specifying the
 name of a file when constructing the stream
 [Constructor(in File toOpen)]
 interface FileInputStream : InputStream {
 }
 Sixthly, one can create various kinds of derived readers such as text
 reader, binary string reader, and data URL reader. By inheriting from
 InputStream, the basic mechanisms such as abort and onerror are inherited.
 Moreover, the base read behavior is altered by the subclass although it
 behaves in a similar manner, except that the data seen outside is different.
 [Constructor(in InputStream base)]
 interface BinaryStringInputStream : InputStream {
   read(in StringDataHandler, [optional in] long long offset, [optional in]
 long long length);
 }
 The callback is provided a DOMString. The String's length is expected to
 match the increment requested.
 [CallBack=FunctionOnly]
 interface StringDataHandler {
 handle(in DOMString data);
 }
 For text reading, encoding is optionally specified.
 [Constructor(in InputStream base, [optional in] DOMString encoding)]
 interface TextInputStream : InputStream {
   read(in StringDataHandler, [optional in] long long offset, [optional in]
 long long length);
 }

 A file can be alternatively read as a dataURL using a similar kind of
 handler as above.
 [Constructor(in InputStream base)]
 interface FileDataURL: InputStream {
   read(in StringDataHandler, [optional in] long long offset, [optional in]
 long long length);
 }
 This API has the advantage that it can cleanly be extended to deal with both
 writing use cases and binary data. Furthermore, it can also support
 extensions that perform cryptographic, compression, or coding on top of the
 basic interfaces.
 To compare with the editor's draft, here's a typical programming case in
 JavaScript:
 var fileList = ...
 // There is a mistake in the example provided in Section 3 where it does
 fileList.files[0]
 var myFile = fileList[0];

That's odd.

 // *According to my proposal*
 var stream = new TextInputStream(new FileInputStream(myFile), "UTF-16");
 stream.read(handleDataAsText);

// don't you need to add the onerror before read()?

 stream.onerror = errorHandler;
 function handleDataAsText(fileContent) {
 }
 function errorHandler(error) {
 }
 Note the two differences:
 1. Error handling is separated from file reading

Right. Method handleDataAsText does one thing only, as does the error handler.

You seem to have misplaced the onerror. Shouldn't that, as
commented, be assigned before - read - is called? Could read() raise
an exception immediately?
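
To illustrate the ordering concern, here is a small sketch under the assumption that read() may report a failure before it returns. FailingStream is a made-up stand-in, not part of either proposal; if errors were guaranteed to be dispatched asynchronously, the order of the two statements would matter much less.

```javascript
// Hypothetical stream whose read() reports failure immediately.
// It only illustrates why handler ordering can matter.
function FailingStream() {
  this.onerror = null;
}
FailingStream.prototype.read = function (handler) {
  // The error is delivered synchronously, before read() returns.
  if (this.onerror) this.onerror(new Error("cannot open file"));
};

var seen = [];

// Handler attached after read(): the error has already been dropped.
var late = new FailingStream();
late.read(function () {});
late.onerror = function (e) { seen.push("late: " + e.message); };

// Handler attached before read(): the error is observed.
var early = new FailingStream();
early.onerror = function (e) { seen.push("early: " + e.message); };
early.read(function () {});
```

After this runs, seen holds a single entry, from the stream whose handler was assigned first.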

Why put the callback as an argument? What is wrong with having a
success callback?

A generic read method puts the type of reading on the stream, as you
would have it. Read just sends a message: read, but does not specify
the details.

 2. Two extra objects are needed to read text data out of the file. However,
 the composability of input streams enables a far richer library to operate.

I don't see why this is important.

For the purpose of the goals of this specification, is the
complexity justified? I had the Reader idea and that was deemed too
complex, but what I see you proposing sounds, well, flexible, but more
involved. There's more busywork just to read a file.

 This API matches more closely the Java API for IO.

That is not necessarily ideal.

Design decisions a decade ago in a different language, for different
contexts might not be the best decisions for this context.

I feel a bit odd about giving an API critique to someone who seems to
be a lot more knowledgeable and experienced. But anyway, this proposal
is extensible. It does not paint itself into a corner like the other.

Is it possible to simplify the interface a little bit? I'm not married
to the Reader idea, but it was a simpler 

Re: File API to separate reading from files

2009-08-31 Thread イアンフェッティ
I would like to make another plug for
http://dev.w3.org/2006/webapi/fileio/fileIO.htm
This had the notion of writing files, file streams, directories, and
being able to integrate into the host filesystem. All of these are
important for reasons I outlined in
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-August/022388.html
and subsequent replies.

Quoting that email:
I would much rather have a well thought-out local filesystem
proposal, than continued creep of the existing File and Local Storage
proposal. These proposals are both designed from the perspective of "I
want to take some existing data and either put it into the cloud or
make it available offline". They don't really handle the use case of
"I want to create new data and save it to the local filesystem", or "I
want to modify existing data on the filesystem", or "I want to
maintain a virtual filesystem for my application, and potentially map
in the existing filesystem" (e.g. if I'm flickr and I want to be able
to read the user's My Photos folder, send those up, but also make
thumbnails that I want to save locally and don't care if they get
uploaded, maintain an index file with image metadata / thumbnails /
 locally, save off some intermediate files, ... For this, I would
really like to see us take another look at
http://dev.w3.org/2006/webapi/fileio/fileIO.htm (I don't think this
spec is exactly what we need, but I like the general approach of
origins get a virtual filesystem tucked away that they can use, they
can fread/fwrite/fseek, and optionally if they want to interact with
the host FS they can request that and then get some sub-set of that
(e.g. my documents or my photos) mapped in..


2009/8/31 Garrett Smith <dhtmlkitc...@gmail.com>

 On Wed, Aug 19, 2009 at 11:47 AM, Nikunj R.
 Mehta <nikunj.me...@oracle.com> wrote:
  Here's an alternative, more easily extensible, proposal for reading files.
  It provides applications a way to read small amounts of data at a time. It
  also allows applications to concurrently read the same file.
 I Agree.

 [snip]


 [snip example]


  Secondly, a list of files can be obtained using some UI.
  typedef sequence<File> FileList;

 Agree.

  Thirdly, an abstract interface is an input stream that is not limited to
  files. It works at the level of bytes that files are made of. The read()
  operation can specify the extent that is required. If an application wishes
  to read small increments, it can thus specify those increments. Of course,
  the File interface identifies its size, so the application can suitably
  choose increments. Processing of blocks read from the file occurs in
  callbacks. XHR could also consider taking an InputStream parameter during
  the send() operation.

 Would it be possible to have a reader handle creating the input stream
 and making the decision based on what type of Reader it is, passing
 byte offset lengths to the input stream -- essentially hiding those
 details?

 [snip example]

  Fifthly, a file can be used for reading an input stream by specifying the
  name of a file when constructing the stream
  [Constructor(in File toOpen)]
  interface FileInputStream : InputStream {
  }
  Sixthly, one can create various kinds of derived readers such as text
  reader, binary string reader, and data URL reader. By inheriting from
  InputStream, the basic mechanisms such as abort and onerror are inherited.
  Moreover, the base read behavior is altered by the subclass although it
  behaves in a similar manner, except that the data seen outside is different.
  [Constructor(in InputStream base)]
  interface BinaryStringInputStream : InputStream {
    read(in StringDataHandler, [optional in] long long offset, [optional in]
  long long length);
  }
  The callback is provided a DOMString. The String's length is expected to
  match the increment requested.
  [CallBack=FunctionOnly]
  interface StringDataHandler {
  handle(in DOMString data);
  }
  For text reading, encoding is optionally specified.
  [Constructor(in InputStream base, [optional in] DOMString encoding)]
  interface TextInputStream : InputStream {
    read(in StringDataHandler, [optional in] long long offset, [optional in]
  long long length);
  }
 
  A file can be alternatively read as a dataURL using a similar kind of
  handler as above.
  [Constructor(in InputStream base)]
  interface FileDataURL: InputStream {
    read(in StringDataHandler, [optional in] long long offset, [optional in]
  long long length);
  }
  This API has the advantage that it can cleanly be extended to deal with both
  writing use cases and binary data. Furthermore, it can also support
  extensions that perform cryptographic, compression, or coding on top of the
  basic interfaces.
  To compare with the editor's draft, here's a typical programming case in
  JavaScript:
  var fileList = ...
  // There is a mistake in the example provided in Section 3 where it does
  fileList.files[0]
  var myFile = fileList[0];

 That's odd.

  // *According to my proposal*
  

Re: File API to separate reading from files

2009-08-31 Thread Arun Ranganathan

Ian Fette (イアンフェッティ) wrote:

I would like to make another plug for
http://dev.w3.org/2006/webapi/fileio/fileIO.htm
This had the notion of writing files, file streams, directories, and
being able to integrate into the host filesystem. All of these are
important for reasons I outlined in
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-August/022388.html
and subsequent replies.
  

Ian:

Nothing in my draft precludes those from coming later, in a v2 (and 
subsequent iterations), modulo good proposals about platform differences 
and security issues.  Writing out data to the filesystem is a very 
different set of issues, with different security issues.  There are 
numerous issues that need resolution here.  It turns out the File API is 
a controversial one, even to get right basic read features, and I'm keen 
to finish a v1 draft soon.  It's clear that even within the spectrum of 
read, we may not quite satisfy all use cases (such as fseek) in v1, but 
I'd like something that we can iterate on later.  *Even* addressing the 
use cases of reading data, we've encountered discussions about 
programming style (Java I/O vs. a model closer to XHR), progress events, 
etc.  I'd like to resolve these in a v1 draft first.


I think v1 should address the most common use cases, which really do 
appear to be:

 I
 want to take some existing data and either put it into the cloud or
 make it available offline.

Right, because we can't even do this elegantly on the web today!  
Developers use Firefox's synchronous API for File access, or Gears, or 
Flash (at least to get data from the file system into web apps and then 
the cloud).  Furthermore:

 They don't really handle the use case of
 I want to create new data and save it to the local filesystem, or I
 want to modify existing data on the filesystem, or I want to
 maintain a virtual filesystem for my application, and potentially map
 in the existing filesystem

Not *yet* but I think that these features can be evolved over the course 
of time.  The proposal you cite:
http://dev.w3.org/2006/webapi/fileio/fileIO.htm 
  
doesn't adequately address security issues, and deals with use cases 
that I'm not so sure are critical for a first version.  I appreciate 
that Google wants these next, and so I'd like to see proposals that 
address the open issues, some of which have been mentioned on the whatwg 
thread you posted a link to.


I'm on vacation Sept. 1 - Sept. 4, so I'll respond to other email about 
this topic upon my return.


-- A*




File API to separate reading from files

2009-08-19 Thread Nikunj R. Mehta
Here's an alternative, more easily extensible, proposal for reading  
files. It provides applications a way to read small amounts of data at  
a time. It also allows applications to concurrently read the same file.


Firstly, there is a simple interface to access file metadata. This  
metadata is always accessed synchronously. A file object could be  
passed to XHR, in which case it can upload the file during the send()  
process.


interface File {
  readonly attribute DOMString name;
  readonly attribute DOMString mediaType;
  readonly attribute DOMString url;
  readonly attribute unsigned long long size;
}

Secondly, a list of files can be obtained using some UI.

typedef sequence<File> FileList;

Thirdly, an abstract interface is an input stream that is not limited  
to files. It works at the level of bytes that files are made of. The  
read() operation can specify the extent that is required. If an  
application wishes to read small increments, it can thus specify those  
increments. Of course, the File interface identifies its size, so the  
application can suitably choose increments. Processing of blocks read  
from the file occurs in callbacks. XHR could also consider taking an  
InputStream parameter during the send() operation.


interface InputStream {
  read(in DataHandler, [optional in] long long offset, [optional in]  
long long length);	

  abort();
  attribute Function onerror;
}

Fourthly, reading a block of bytes is supported through an interface  
that accepts an array of integers. This is similar to the Gears Blob  
interface.


[CallBack=FunctionOnly]
interface DataHandler {
  handle(in sequence<int> data);
}
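
As a rough illustration of how read() and a DataHandler callback might fit together, here is an in-memory stand-in. ArrayInputStream is hypothetical, its callbacks run synchronously for brevity (the proposal intends asynchronous delivery), and routing an out-of-range request to onerror is an assumption, since the interface above does not say what happens in that case.

```javascript
// Hypothetical in-memory stand-in for InputStream.
function ArrayInputStream(bytes) {
  this.bytes = bytes;
  this.onerror = null;
}

ArrayInputStream.prototype.read = function (handler, offset, length) {
  var start = offset === undefined ? 0 : offset;
  var count = length === undefined ? this.bytes.length - start : length;
  // Assumption: a request outside the stream is reported through onerror.
  if (start < 0 || count < 0 || start + count > this.bytes.length) {
    if (this.onerror) this.onerror(new RangeError("read outside stream"));
    return;
  }
  // The handler receives a plain array of integers, one per byte.
  handler(this.bytes.slice(start, start + count));
};
```

Calling read(handler, 1, 3) on a stream over [1, 2, 3, 4, 5] hands [2, 3, 4] to the handler; omitting offset and length delivers the whole array.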

Fifthly, a file can be used for reading an input stream by specifying  
the name of a file when constructing the stream

[Constructor(in File toOpen)]
interface FileInputStream : InputStream {
}

Sixthly, one can create various kinds of derived readers such as text  
reader, binary string reader, and data URL reader. By inheriting from  
InputStream, the basic mechanisms such as abort and onerror are  
inherited. Moreover, the base read behavior is altered by the subclass  
although it behaves in a similar manner, except that the data seen  
outside is different.


[Constructor(in InputStream base)]
interface BinaryStringInputStream : InputStream {
  read(in StringDataHandler, [optional in] long long offset,  
[optional in] long long length);	

}

The callback is provided a DOMString. The String's length is expected  
to match the increment requested.


[CallBack=FunctionOnly]
interface StringDataHandler {
handle(in DOMString data);
}

For text reading, encoding is optionally specified.

[Constructor(in InputStream base, [optional in] DOMString encoding)]
interface TextInputStream : InputStream {
  read(in StringDataHandler, [optional in] long long offset,  
[optional in] long long length);	

}


A file can be alternatively read as a dataURL using a similar kind of  
handler as above.


[Constructor(in InputStream base)]
interface FileDataURL: InputStream {
  read(in StringDataHandler, [optional in] long long offset,  
[optional in] long long length);	

}

This API has the advantage that it can cleanly be extended to deal  
with both writing use cases and binary data. Furthermore, it can also  
support extensions that perform cryptographic, compression, or coding  
on top of the basic interfaces.
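
The layering idea can be sketched with plain wrapper objects. All three constructors below are hypothetical stand-ins (ByteStream, MappedStream, AsciiTextStream are not names from the proposal), callbacks run synchronously for brevity, and the text layer assumes one byte per character rather than real encoding support.

```javascript
// Hypothetical base stream over an array of byte values.
function ByteStream(bytes) {
  this.bytes = bytes;
}
ByteStream.prototype.read = function (handler, offset, length) {
  var start = offset === undefined ? 0 : offset;
  var end = length === undefined ? this.bytes.length : start + length;
  handler(this.bytes.slice(start, end));
};

// Wrapper that transforms each byte, standing in for a coding layer.
function MappedStream(base, transform) {
  this.base = base;
  this.transform = transform;
}
MappedStream.prototype.read = function (handler, offset, length) {
  var transform = this.transform;
  this.base.read(function (data) {
    handler(data.map(function (b) { return transform(b); }));
  }, offset, length);
};

// Wrapper that hands the handler a string instead of an array.
// Assumption: ASCII only, one byte per character.
function AsciiTextStream(base) {
  this.base = base;
}
AsciiTextStream.prototype.read = function (handler, offset, length) {
  this.base.read(function (data) {
    handler(String.fromCharCode.apply(null, data));
  }, offset, length);
};

// Compose: bytes -> flip the ASCII case bit -> text.
var text = null;
var stream = new AsciiTextStream(
  new MappedStream(new ByteStream([104, 105]), function (b) { return b ^ 0x20; })
);
stream.read(function (s) { text = s; });   // text is now "HI"
```

Each layer only knows about the read() of the layer beneath it, which is what lets new layers be added without touching the existing ones.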


To compare with the editor's draft, here's a typical programming case  
in JavaScript:


var fileList = ...
// There is a mistake in the example provided in Section 3 where it  
does fileList.files[0]

var myFile = fileList[0];

// *According to editor's draft*
myFile.getAsText(handleDataAsText)
function handleDataAsText(fileContent, error) {
  if (error) {

  }
}

// *According to my proposal*
var stream = new TextInputStream(new FileInputStream(myFile), "UTF-16");
stream.read(handleDataAsText);
stream.onerror = errorHandler;
function handleDataAsText(fileContent) {

}

function errorHandler(error) {

}

Note the two differences:
1. Error handling is separated from file reading
2. Two extra objects are needed to read text data out of the file.  
However, the composability of input streams enables a far richer  
library to operate.


This API matches more closely the Java API for IO. It also benefits  
from the extensibility model used in Java, while retaining the  
asynchronous processing nature that is preferred in ECMAScript  
environments. It is also not too different from the editor's draft in  
that it does not introduce a completely different kind of data  
processing - we are still looking at callbacks. However, the  
improvement is in the composability of streams as well as supporting  
multiple concurrent file readers and processing blocks of data at a  
time.


Progress events can be built on top but I welcome suggestions to build  
them in to this proposal.
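
One way progress reporting might be layered on top of incremental reads is sketched below. The helper readWithProgress and the MemoryStream stand-in are hypothetical; the stand-in delivers data synchronously, whereas a real stream would call back asynchronously.

```javascript
// Hypothetical synchronous stand-in for an InputStream over byte values.
function MemoryStream(bytes) {
  this.bytes = bytes;
}
MemoryStream.prototype.read = function (handler, offset, length) {
  handler(this.bytes.slice(offset, offset + length));
};

// Read `size` bytes in increments of `chunkSize`, reporting progress
// after each block has been handed to onChunk.
function readWithProgress(stream, size, chunkSize, onChunk, onProgress) {
  if (!(chunkSize > 0)) {
    throw new RangeError("chunkSize must be positive");
  }
  var offset = 0;
  function next() {
    if (offset >= size) return;
    var length = Math.min(chunkSize, size - offset);
    stream.read(function (data) {
      offset += length;
      onChunk(data);
      onProgress({ loaded: offset, total: size });
      next();
    }, offset, length);
  }
  next();
}
```

Reading 10 bytes in increments of 4 produces three blocks and progress reports of 4, 8 and 10 bytes loaded.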


Nikunj
http://o-micron.blogspot.com





Re: File API to separate reading from files

2009-08-19 Thread Jonas Sicking
On Wed, Aug 19, 2009 at 11:47 AM, Nikunj R.
Mehta <nikunj.me...@oracle.com> wrote:
 Here's an alternative, more easily extensible, proposal for reading files.
 It provides applications a way to read small amounts of data at a time. It
 also allows applications to concurrently read the same file.
 Firstly, there is a simple interface to access file metadata. This metadata
 is always accessed synchronously. A file object could be passed to XHR, in
 which case it can upload the file during the send() process.
 interface File {
   readonly attribute DOMString name;
   readonly attribute DOMString mediaType;
   readonly attribute DOMString url;
   readonly attribute unsigned long long size;
 }
 Secondly, a list of files can be obtained using some UI.
 typedef sequence<File> FileList;
 Thirdly, an abstract interface is an input stream that is not limited to
 files. It works at the level of bytes that files are made of. The read()
 operation can specify the extent that is required. If an application wishes
 to read small increments, it can thus specify those increments. Of course,
 the File interface identifies its size, so the application can suitably
 choose increments. Processing of blocks read from the file occurs in
 callbacks. XHR could also consider taking an InputStream parameter during
 the send() operation.
 interface InputStream {
   read(in DataHandler, [optional in] long long offset, [optional in] long long length);
   abort();
   attribute Function onerror;
 }
 Fourthly, reading a block of bytes is supported through an interface that
 accepts an array of integers. This is similar to the Gears Blob interface.
 [CallBack=FunctionOnly]
 interface DataHandler {
   handle(in sequence<int> data);
 }
 Fifthly, a file can be used for reading an input stream by specifying the
 name of a file when constructing the stream
 [Constructor(in File toOpen)]
 interface FileInputStream : InputStream {
 }
 Sixthly, one can create various kinds of derived readers such as text
 reader, binary string reader, and data URL reader. By inheriting from
 InputStream, the basic mechanisms such as abort and onerror are inherited.
 Moreover, the base read behavior is altered by the subclass although it
 behaves in a similar manner, except that the data seen outside is different.
 [Constructor(in InputStream base)]
 interface BinaryStringInputStream : InputStream {
   read(in StringDataHandler, [optional in] long long offset, [optional in] long long length);
 }
 The callback is provided a DOMString. The String's length is expected to
 match the increment requested.
 [CallBack=FunctionOnly]
 interface StringDataHandler {
   handle(in DOMString data);
 }
 For text reading, encoding is optionally specified.
 [Constructor(in InputStream base, [optional in] DOMString encoding)]
 interface TextInputStream : InputStream {
   read(in StringDataHandler, [optional in] long long offset, [optional in] long long length);
 }

 A file can be alternatively read as a dataURL using a similar kind of
 handler as above.
 [Constructor(in InputStream base)]
 interface FileDataURL : InputStream {
   read(in StringDataHandler, [optional in] long long offset, [optional in] long long length);
 }
 This API has the advantage that it can cleanly be extended to deal with both
 writing use cases and binary data. Furthermore, it can also support
 extensions that perform cryptographic, compression, or coding on top of the
 basic interfaces.
 To compare with the editor's draft, here's a typical programming case in
 JavaScript:
 var fileList = ...
 // There is a mistake in the example provided in Section 3, where it does
 // fileList.files[0]
 var myFile = fileList[0];
 // *According to editor's draft*
 myFile.getAsText(handleDataAsText);
 function handleDataAsText(fileContent, error) {
   if (error) {
   }
 }
 // *According to my proposal*
 var stream = new TextInputStream(new FileInputStream(myFile), "UTF-16");
 stream.read(handleDataAsText);
 stream.onerror = errorHandler;
 function handleDataAsText(fileContent) {
 }
 function errorHandler(error) {
 }
 Note the two differences:
 1. Error handling is separated from file reading
 2. Two extra objects are needed to read text data out of the file. However,
 the composability of input streams enables a far richer library to operate.
 This API matches more closely the Java API for IO. It also benefits from the
 extensibility model used in Java, while retaining the asynchronous
 processing nature that is preferred in ECMAScript environments. It is also
 not too different from the editor's draft in that it does not introduce a
 completely different kind of data processing - we are still looking at
 callbacks. However, the improvement is in the composability of streams as
 well as supporting multiple concurrent file readers and processing blocks of
 data at a time.
 Progress events can be built on top, but I welcome suggestions to build them
 into this proposal.
 Nikunj
 http://o-micron.blogspot.com

A few comments on this:

I do like the idea 

Re: File API to separate reading from files

2009-08-19 Thread Nikunj R. Mehta


On Aug 19, 2009, at 12:21 PM, Jonas Sicking wrote:



A few 

Re: File API to separate reading from files

2009-08-19 Thread Arve Bersvendsen

On Wed, 19 Aug 2009 21:21:54 +0200, Jonas Sicking <jo...@sicking.cc> wrote:


I do like the idea of having a stream primitive. I think we'll need
that for other things in the future such as reading data from a
camera, or reading data from a microphone.


I quite concur, and for the widget space, actually writing data is a real  
use-case.



However I'm not convinced that we should force people to use streams
to deal with the simple use case of reading data from a file. In 95%
(if not more) of the cases the user simply wants to get the contents
of the file, and so forcing them to do that using:


Does the presence of an IOStream primitive exclude getAs*?
--
Arve Bersvendsen

Opera Software ASA, http://www.opera.com/



Re: File API to separate reading from files

2009-08-19 Thread Nikunj R. Mehta


On Aug 19, 2009, at 3:07 PM, Arve Bersvendsen wrote:

On Wed, 19 Aug 2009 21:21:54 +0200, Jonas Sicking <jo...@sicking.cc> wrote:



I do like the idea of having a stream primitive. I think we'll need
that for other things in the future such as reading data from a
camera, or reading data from a microphone.


I quite concur, and for the widget space, actually writing data is a
real use-case.


However I'm not convinced that we should force people to use streams
to deal with the simple use case of reading data from a file. In 95%
(if not more) of the cases the user simply wants to get the contents
of the file, and so forcing them to do that using:


Does the presence of an IOStream primitive exclude getAs*?


They are independent proposals. Of course, it is still possible that
the next WD will combine the two. I wouldn't like that too much, but
that's just me.



--
Arve Bersvendsen

Opera Software ASA, http://www.opera.com/



Nikunj
http://o-micron.blogspot.com






Re: File API to separate reading from files

2009-08-19 Thread Nikunj R. Mehta
I want to clarify that it is indeed possible to adjust my proposal to  
sidestep the issue of dealing with bytes directly. Here's how.


Remove the read() method in the InputStream interface.
Remove the DataHandler interface

Notice that composability doesn't depend on the availability of raw  
bytes to the JavaScript interface. In fact, the implementation should  
be able to dissect the bytes corresponding to the input stream being  
passed so that it can either:


1. produce a data URL
2. produce text blocks
3. produce binary string blocks

I hope that eliminates any expectation of a dependency on representing
bytes as integers from my proposal. I just threw it in since I felt
Gears thought it was OK. Perhaps a real byte handler can be added
once ECMAScript is ready with byte arrays, and File I/O can serve
as a rallying cause for them.
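
As a rough illustration of that adjustment, the sketch below keeps the bytes out of reach of script: the base stream has no read() method, and only derived readers produce output. All names here (OpaqueInputStream, DataURLReader, BinaryStringReader, the internalBytes map) are hypothetical, the WeakMap merely simulates implementation-internal state, and callbacks fire synchronously. A text reader would follow the same pattern with a decoding step added.

```javascript
// Simulated "implementation-internal" storage: script never gets the bytes.
var internalBytes = new WeakMap();

// Hypothetical opaque stream: no read() method and no DataHandler.
function OpaqueInputStream(bytes) {
  internalBytes.set(this, bytes);
  this.onerror = null;
}

// Internal helper: turns the hidden bytes into a binary string,
// one character per byte. Omitting length reads to the end.
function toBinaryString(stream, offset, length) {
  var start = offset || 0;
  var end = length === undefined ? undefined : start + length;
  var bytes = internalBytes.get(stream).slice(start, end);
  return String.fromCharCode.apply(null, bytes);
}

// Base64 helper that works in browsers and in Node.js.
function base64(binary) {
  return typeof btoa === "function"
    ? btoa(binary)
    : Buffer.from(binary, "binary").toString("base64");
}

// Derived reader that produces a data URL for the whole stream.
function DataURLReader(base, mediaType) {
  this.base = base;
  this.mediaType = mediaType;
}
DataURLReader.prototype.read = function (handler) {
  var all = toBinaryString(this.base);
  handler("data:" + this.mediaType + ";base64," + base64(all));
};

// Derived reader that produces binary string blocks.
function BinaryStringReader(base) {
  this.base = base;
}
BinaryStringReader.prototype.read = function (handler, offset, length) {
  handler(toBinaryString(this.base, offset, length));
};

var stream = new OpaqueInputStream([79, 75]); // the bytes of "OK"
var out = {};
new DataURLReader(stream, "text/plain").read(function (url) { out.url = url; });
new BinaryStringReader(stream).read(function (s) { out.binary = s; }, 1, 1);
```

Composition still works because each reader takes a base stream in its constructor; what changes is only that no script-visible callback ever receives an array of integers.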


Nikunj

On Aug 19, 2009, at 11:47 AM, Nikunj R. Mehta wrote:

Here's an alternative, more easily extensible, proposal for reading  
files. It provides applications a way to read small amounts of data  
at a time. It also allows applications to concurrently read the same  
file.


Firstly, there is a simple interface to access file metadata. This  
metadata is always accessed synchronously. A file object could be  
passed to XHR, in which case it can upload the file during the  
send() process.


interface File {
  readonly attribute DOMString name;
  readonly attribute DOMString mediaType;
  readonly attribute DOMString url;
  readonly attribute unsigned long long size;
}

Secondly, a list of files can be obtained using some UI.

typedef sequence<File> FileList;

Thirdly, an abstract interface is an input stream that is not  
limited to files. It works at the level of bytes that files are made  
of. The read() operation can specify the extent that is required. If  
an application wishes to read small increments, it can thus specify  
those increments. Of course, the File interface identifies its size,  
so the application can suitably choose increments. Processing of  
blocks read from the file occurs in callbacks. XHR could also  
consider taking an InputStream parameter during the send() operation.


interface InputStream {
  read(in DataHandler, [optional in] long long offset, [optional in] long long length);
  abort();
  attribute Function onerror;
}
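
For illustration, here is one way an application might use the offset and length arguments together with the file's size to read in increments. This is only a sketch: FakeStream and readInChunks are hypothetical names, the stream reads from an in-memory array, and the callback fires synchronously, whereas the proposal describes asynchronous callbacks.

```javascript
// Hypothetical stand-in for a stream over a file of known size.
// read(handler, offset, length) mirrors the proposed signature.
function FakeStream(bytes) {
  this.bytes = bytes;
}
FakeStream.prototype.read = function (handler, offset, length) {
  handler(this.bytes.slice(offset, offset + length));
};

// Reads `size` bytes in increments of at most `chunkSize`, calling
// onBlock for each block and onDone once everything has been delivered.
function readInChunks(stream, size, chunkSize, onBlock, onDone) {
  var offset = 0;
  function next() {
    if (offset >= size) {
      onDone();
      return;
    }
    var length = Math.min(chunkSize, size - offset);
    stream.read(function (block) {
      onBlock(block, offset);
      offset += block.length;
      next(); // request the following increment
    }, offset, length);
  }
  next();
}

var fileBytes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
var blocks = [];
var finished = false;
readInChunks(new FakeStream(fileBytes), fileBytes.length, 4,
  function (block, at) { blocks.push({ at: at, bytes: block }); },
  function () { finished = true; });
```

The last block is shorter than the others because the increment is clamped to the bytes remaining, which is why the size attribute on File matters to the caller.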

Fourthly, reading a block of bytes is supported through an interface  
that accepts an array of integers. This is similar to the Gears Blob  
interface.


[CallBack=FunctionOnly]
interface DataHandler {
  handle(in sequence<int> data);
}

Fifthly, a file can be used for reading an input stream by  
specifying the name of a file when constructing the stream

[Constructor(in File toOpen)]
interface FileInputStream : InputStream {
}

Sixthly, one can create various kinds of derived readers such as  
text reader, binary string reader, and data URL reader. By  
inheriting from InputStream, the basic mechanisms such as abort and  
onerror are inherited. Moreover, the base read behavior is altered  
by the subclass although it behaves in a similar manner, except that  
the data seen outside is different.


[Constructor(in InputStream base)]
interface BinaryStringInputStream : InputStream {
  read(in StringDataHandler, [optional in] long long offset, [optional in] long long length);
}

The callback is provided a DOMString. The String's length is  
expected to match the increment requested.


[CallBack=FunctionOnly]
interface StringDataHandler {
  handle(in DOMString data);
}

For text reading, encoding is optionally specified.

[Constructor(in InputStream base, [optional in] DOMString encoding)]
interface TextInputStream : InputStream {
  read(in StringDataHandler, [optional in] long long offset, [optional in] long long length);
}


A file can be alternatively read as a dataURL using a similar kind  
of handler as above.


[Constructor(in InputStream base)]
interface FileDataURL: InputStream {
  read(in StringDataHandler, [optional in] long long offset, [optional in] long long length);
}

This API has the advantage that it can cleanly be extended to deal  
with both writing use cases and binary data. Furthermore, it can  
also support extensions that perform cryptographic, compression, or  
coding on top of the basic interfaces.


To compare with the editor's draft, here's a typical programming  
case in JavaScript:


var fileList = ...
// There is a mistake in the example provided in Section 3, where it
// does fileList.files[0]

var myFile = fileList[0];

// *According to editor's draft*
myFile.getAsText(handleDataAsText);
function handleDataAsText(fileContent, error) {
  if (error) {
    // handle the error here, inside the read callback
  }
}

// *According to my proposal*
var stream = new TextInputStream(new FileInputStream(myFile), "UTF-16");

stream.read(handleDataAsText);
stream.onerror = errorHandler;
function handleDataAsText(fileContent) {
  // process the text
}

function errorHandler(error) {
  // handle the error separately from reading
}

Note the two differences:
1. Error handling is separated from file