Re: [whatwg] Application deployment

2008-08-03 Thread Christoph Päper

Robert O'Callahan:

http://www.example.com/site.jar#/path/inside/foo.html#heading1


URL parsing doesn't support multiple fragment identifiers


I'm surprised that RFC 3986 (like 2396) makes '#' reserved in  
fragment identifiers (only '[]', too). The fragment ID is terminated  
only by the end of the URI after all. The one reason for disallowing  
'#' I can think of is tokenization starting from the end of the  
string, but as far as I know that may fail for other parts.
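
A quick way to see how one common parser treats the URL above is Python's standard urllib.parse. This is only an illustration of one implementation's behaviour, not a statement of what RFC 3986 requires:

```python
from urllib.parse import urlsplit

url = "http://www.example.com/site.jar#/path/inside/foo.html#heading1"
parts = urlsplit(url)

# The split happens at the first '#'; the second '#' stays inside the fragment.
print(parts.path)      # /site.jar
print(parts.fragment)  # /path/inside/foo.html#heading1
```

Splitting from the front like this keeps the whole inner path, including the second '#', in a single fragment string.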


  fragment    = *( pchar / "/" / "?" )
  pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
  pct-encoded = "%" HEXDIG HEXDIG
  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" /
                "," / ";" / "="


  
should work fine, though.
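
The fragment rule quoted above can be transcribed into a regular expression to test candidate fragments. This is a minimal sketch; the helper name is_valid_fragment is made up for illustration:

```python
import re

# Character classes transcribed from the RFC 3986 rules quoted above.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# fragment = *( pchar / "/" / "?" ), where pchar also admits ":" and "@".
FRAGMENT_RE = re.compile(rf"(?:[{UNRESERVED}{SUB_DELIMS}:@/?]|{PCT_ENCODED})*")


def is_valid_fragment(fragment: str) -> bool:
    """Return True if the string matches the fragment grammar in full."""
    return FRAGMENT_RE.fullmatch(fragment) is not None


print(is_valid_fragment("/path/inside/foo.html"))           # True
print(is_valid_fragment("/path/inside/foo.html#heading1"))  # False
```

An unescaped '#' is the only thing that makes the second string fail; written as %23 it would pass.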


-8<8<8<8<8<8<8<-

I'm also surprised that RFC 3986 (unlike 2396) misses a section on US- 
ASCII characters deliberately excluded, i.e.  and '"<>{}|\`^ ',  
previously also '[]'. I think


  reserved    = gen-delims / sub-delims
  gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
  ...

should be something like

  reserved    = delims / enclosing / unwise / controls
  delims      = gen-delims / sub-delims
  enclosing   = DQUOTE / "<" / ">" / SP
  unwise      = "{" / "}" / "|" / "\" / "`" / "^"
  controls    = %x00-1F / %x7F
  ...
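
To make the suggested grouping concrete, here is a small sketch that sorts single characters into the categories of the proposed grammar. The function name classify and the "other" label are illustrative assumptions, not part of any RFC:

```python
# Character sets taken from the proposed grammar above.
GEN_DELIMS = set(":/?#[]@")
SUB_DELIMS = set("!$&'()*+,;=")
ENCLOSING = set('"<> ')
UNWISE = set("{}|\\`^")
CONTROLS = {chr(code) for code in range(0x00, 0x20)} | {"\x7f"}

CATEGORIES = [
    ("gen-delims", GEN_DELIMS),
    ("sub-delims", SUB_DELIMS),
    ("enclosing", ENCLOSING),
    ("unwise", UNWISE),
    ("controls", CONTROLS),
]


def classify(ch: str) -> str:
    """Return the proposed category for one character, or 'other'."""
    for name, members in CATEGORIES:
        if ch in members:
            return name
    return "other"


print(classify("#"))  # gen-delims
print(classify("{"))  # unwise
```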


Re: [whatwg] Application deployment

2008-07-31 Thread Russell Leggett
>
> Please explain why you consider concatenating JavaScript sources dirty.
>

I don't necessarily think it's dirty, but any choices that game the system
for purely performance reasons seem hackish to me. Concatenating js files
for performance reasons is certainly less offensive than css sprites, but it
still raises the question: is this always the right choice? For example, let's
say I'm using jQuery plus a few plugins. The resources are really separate
entities from third parties. Should I have to concatenate them?

That said, there is clearly not much interest for this proposal here.
I graciously concede :)

-Russ

On Wed, Jul 30, 2008 at 2:43 PM, Kristof Zelechovski
<[EMAIL PROTECTED]> wrote:

>  Please explain why you consider concatenating JavaScript sources dirty.
>  You can have a library of all JavaScript definitions relevant to your site
> in one source file and I am not sure what is wrong with it, except that a
> library should consist of books, but that concept was already broken long
> ago.
>
> Chris
>
>
>  --
>
> *From:* [EMAIL PROTECTED] [mailto:
> [EMAIL PROTECTED] *On Behalf Of *Russell Leggett
> *Sent:* Wednesday, July 30, 2008 4:25 PM
> *To:* Peter Kasting
> *Cc:* [EMAIL PROTECTED]
> *Subject:* Re: [whatwg] Application deployment
>
>
>
> It seems to me that many of the additions to the HTML spec are there
> because they provide a standard way to do something we are already doing
> with a hack or more complicated means. CSS sprites are clearly a hack.
> Concatenating js files is clearly a hack. Serving from
> multiple sub-domains to beat the connection limit is also a workaround. My
> proposal is intended to approach the deployment issue directly, because I
> think it is a limitation in the html spec itself and therefore, I think the
> html spec should provide its own solution. My proposal may not be the best
> way, but assuming the issue will be dealt with eventually by some other
> party through some other means does not seem right either.
>
>
>


Re: [whatwg] Application deployment

2008-07-30 Thread Kristof Zelechovski
Please explain why you consider concatenating JavaScript sources dirty.  You
can have a library of all JavaScript definitions relevant to your site in
one source file and I am not sure what is wrong with it, except that a
library should consist of books, but that concept was already broken long
ago.

Chris

 


From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Russell Leggett
Sent: Wednesday, July 30, 2008 4:25 PM
To: Peter Kasting
Cc: [EMAIL PROTECTED]
Subject: Re: [whatwg] Application deployment

 

It seems to me that many of the additions to the HTML spec are there because
they provide a standard way to do something we are already doing with a hack
or more complicated means. CSS sprites are clearly a hack. Concatenating js
files is clearly a hack. Serving from multiple sub-domains to beat the
connection limit is also a workaround. My proposal is intended to approach
the deployment issue directly, because I think it is a limitation in the
html spec itself and therefore, I think the html spec should provide its own
solution. My proposal may not be the best way, but assuming the issue will
be dealt with eventually by some other party through some other means does
not seem right either.

 



Re: [whatwg] Application deployment

2008-07-30 Thread Dave Singer

At 21:45  -0700 29/07/08, Robert O'Callahan wrote:
On Tue, Jul 29, 2008 at 11:20 AM, Dave Singer 
<[EMAIL PROTECTED]> wrote:


Caching is on a full URL basis, of course.  Once that is decided, 
then yes, I think that pre-cached items for a given URL are in the 
general cache for that site.



A site that uses this feature is likely to be fragile. It will have 
to have z.html both in the archive and available directly from the 
server, in case z.html is requested before the load of the archive 
has finished.


No.  The definition *for MPEG-21 files* (which is all I have 
specified so far) is that accesses to the matching absolute URL (or 
relative URL) from within the archive MUST find the resource within 
the archive.  Since, as I say, this format starts with a directory, 
you know whether you have it or not.  If ZIP or JAR files don't have 
a directory, then yes, they have a different trade-off and must load 
the whole thing before they know.
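
The benefit of a directory at the front can be sketched as follows: once the directory has arrived, the client can decide per URL whether a load made from within the archive is answered by the archive. The function resolve_source and its inputs are hypothetical and do not come from the MPEG-21 specification:

```python
from urllib.parse import urljoin


def resolve_source(archive_url, directory, requested_url):
    """Return 'archive' if requested_url names an entry listed in the directory."""
    # Each entry is resolved relative to the archive's own URL.
    shadowed = {urljoin(archive_url, name) for name in directory}
    return "archive" if requested_url in shadowed else "server"


directory = ["y.html", "z.html"]
archive_url = "http://www.example.com/x.m21"

print(resolve_source(archive_url, directory, "http://www.example.com/z.html"))
print(resolve_source(archive_url, directory, "http://www.example.com/other.html"))
```

The first call resolves to the archive and the second to the server. A format without a leading directory cannot answer this until the whole file has arrived.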


You only need a resource *outside* the archive if it is requested 
'nakedly' from outside the archive.  If you do that, it might indeed 
hurt, but that's your choice as a site.


The performance trade-off is very simple;  if you have many small 
resources it may be much more efficient to fetch them as a package 
than individually.  The downside is that this is a single connection 
in a pre-defined order whereas multiple resources could be fetched on 
parallel connections, and as needed.  I doubt more connections to the 
same server gets you more bandwidth, however, and the mpeg-21 format 
also allows extent-based interleaving so that e.g. a large HTML page 
and a large JPEG can be loaded progressively together.


And if those copies ever get out of sync you're in very big trouble, 
because depending on the context, either the archive version or the 
direct version is likely to consistently win the load race, so just 
occasionally some clients will get the wrong version. This seems 
like a highly error-prone design.


Rob

--
"He was pierced for our transgressions, he was crushed for our 
iniquities; the punishment that brought us peace was upon him, and 
by his wounds we are healed. We all, like sheep, have gone astray, 
each of us has turned to his own way; and the LORD has laid on him 
the iniquity of us all." [Isaiah 53:5-6]



--
David Singer
Apple/QuickTime

Re: [whatwg] Application deployment

2008-07-30 Thread Russell Leggett
>
> The only thing archives get you IMO is difficulty with caching algorithms,
> annoyances rewriting URLs, potentially blocked parsing, and possibly
> inefficient use of network bandwidth due to reduced parallelization.
>

I don't see any reason that parsing would need to be blocked any more than
it already is. No rewriting of URLs would be necessary at all, and I have
already provided suggestions for simple solutions that would prevent
unnecessary blocking.

Server sharding and higher connection limits solve the problem of
> artificially low connection limits.  JS script references block further
> parsing in most browsers; the correct solution to this, as Ian said, seems
> like some variant of Safari's optimistic parser.  Referencing large numbers
> of tiny images causes excessive image header bytes + TCP connection overhead
> that can be reduced or eliminated with CSS spriting.


Server sharding and CSS sprites are both artificial solutions that are used
to deal with limitations of the existing deployment model. If you are
worried about fragility, look no further than css sprites. They have to be
background images, and require precise measurement of size and location.
This creates extremely tight coupling between the css code and the file
itself. Not to mention the maintenance of the sprite images themselves.
Clearly we are already dealing with the problems of resource loading and how
to make it most efficient. Our existing solutions are widely varied and
complex, but all of them result in changes to our html/css/js code that
would not already be there if we did not have that limitation.

It seems to me that many of the additions to the HTML spec are there because
they provide a standard way to do something we are already doing with a hack
or more complicated means. CSS sprites are clearly a hack. Concatenating js
files is clearly a hack. Serving from multiple sub-domains to beat the
connection limit is also a workaround. My proposal is intended to approach
the deployment issue directly, because I think it is a limitation in the
html spec itself and therefore, I think the html spec should provide its own
solution. My proposal may not be the best way, but assuming the issue will
be dealt with eventually by some other party through some other means does
not seem right either.

-Russ


On Wed, Jul 30, 2008 at 4:27 AM, Peter Kasting <[EMAIL PROTECTED]> wrote:

> On Tue, Jul 29, 2008 at 5:10 PM, Russell Leggett <
> [EMAIL PROTECTED]> wrote:
>
>>  That is a performance killer.
>>
>>
>> I don't think it is as much of a performance killer as you say it is.
>> Correct me if I'm wrong, but the standard connection limit is two.
>>
>
> The standard connection limit is 6, not 2, as of IE 8 and Fx 3.  I would be
> very surprised if this came back down or was not adopted by all other
> browser makers over the next year or two.
>
> Furthermore, the connection limit applies only to resources off one host.
>  Sites have for years gotten around this by sharding across hosts (
> img1.foo.com, img2.foo.com, ...).
>
> There are many reasons resources can cause slowdown on the web, but I don't
> view this "archive" proposal as useful in solving them compared to existing
> tactics.  Server sharding and higher connection limits solve the problem of
> artificially low connection limits.  JS script references block further
> parsing in most browsers; the correct solution to this, as Ian said, seems
> like some variant of Safari's optimistic parser.  Referencing large numbers
> of tiny images causes excessive image header bytes + TCP connection overhead
> that can be reduced or eliminated with CSS spriting.
>
> The only thing archives get you IMO is difficulty with caching algorithms,
> annoyances rewriting URLs, potentially blocked parsing, and possibly
> inefficient use of network bandwidth due to reduced parallelization.
>  Archives remove the flexibility of a network stack to optimize
> parallelization levels for the user's current connection type (not that I
> think today's browsers actually do such a thing, at least not well; but it
> is an area with potential gains).
>
> PK
>


Re: [whatwg] Application deployment

2008-07-30 Thread Peter Kasting
On Tue, Jul 29, 2008 at 5:10 PM, Russell Leggett
<[EMAIL PROTECTED]> wrote:

> That is a performance killer.
>
>
> I don't think it is as much of a performance killer as you say it is.
> Correct me if I'm wrong, but the standard connection limit is two.
>

The standard connection limit is 6, not 2, as of IE 8 and Fx 3.  I would be
very surprised if this came back down or was not adopted by all other
browser makers over the next year or two.

Furthermore, the connection limit applies only to resources off one host.
 Sites have for years gotten around this by sharding across hosts (
img1.foo.com, img2.foo.com, ...).
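
For readers unfamiliar with sharding, a minimal sketch of the idea: each resource path is mapped deterministically to one of several host names, so the per-host connection limit applies to each shard separately. The shard count, the host naming, and the function shard_host are assumptions for illustration:

```python
import zlib


def shard_host(path: str, shards: int = 4, domain: str = "foo.com") -> str:
    """Pick a shard host for a resource path using a stable checksum."""
    # crc32 is stable across runs, so a given path always maps to the same host.
    index = zlib.crc32(path.encode("utf-8")) % shards + 1
    return f"img{index}.{domain}"


for path in ["/images/logo.png", "/images/banner.png", "/images/icon.png"]:
    print(path, "->", shard_host(path))
```

A stable mapping matters because caching is per URL: if the same image moved between hosts from one page view to the next, it would be fetched again.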

There are many reasons resources can cause slowdown on the web, but I don't
view this "archive" proposal as useful in solving them compared to existing
tactics.  Server sharding and higher connection limits solve the problem of
artificially low connection limits.  JS script references block further
parsing in most browsers; the correct solution to this, as Ian said, seems
like some variant of Safari's optimistic parser.  Referencing large numbers
of tiny images causes excessive image header bytes + TCP connection overhead
that can be reduced or eliminated with CSS spriting.

The only thing archives get you IMO is difficulty with caching algorithms,
annoyances rewriting URLs, potentially blocked parsing, and possibly
inefficient use of network bandwidth due to reduced parallelization.
 Archives remove the flexibility of a network stack to optimize
parallelization levels for the user's current connection type (not that I
think today's browsers actually do such a thing, at least not well; but it
is an area with potential gains).

PK


Re: [whatwg] Application deployment

2008-07-30 Thread Kristof Zelechovski
The documents belonging to the container should not be available directly
from the server, except when they are served via a server extension that
goes to the container to get them.  This effect should be easy to achieve on
the server side.
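
A server extension of this kind could look roughly like the sketch below, which answers a request for a single document by reading it out of the container. It assumes a ZIP container held in memory; the function name and the sample content are hypothetical:

```python
import io
import zipfile
from typing import Optional


def read_from_container(container: bytes, member: str) -> Optional[bytes]:
    """Return the bytes of one member of a ZIP container, or None if absent."""
    with zipfile.ZipFile(io.BytesIO(container)) as archive:
        try:
            return archive.read(member)
        except KeyError:
            # ZipFile.read raises KeyError for names not in the archive.
            return None


# Build a small container in memory so the sketch is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("z.html", "<p>from the container</p>")
container = buffer.getvalue()

print(read_from_container(container, "z.html"))
print(read_from_container(container, "missing.html"))
```

With this arrangement there is only one copy of each document, so the archive and the direct URL cannot drift apart.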

Chris

 

  _  

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Robert O'Callahan
Sent: Wednesday, July 30, 2008 6:45 AM
To: Dave Singer
Cc: [EMAIL PROTECTED]
Subject: Re: [whatwg] Application deployment

 

On Tue, Jul 29, 2008 at 11:20 AM, Dave Singer <[EMAIL PROTECTED]> wrote:

Caching is on a full URL basis, of course.  Once that is decided, then yes,
I think that pre-cached items for a given URL are in the general cache for
that site.


A site that uses this feature is likely to be fragile. It will have to have
z.html both in the archive and available directly from the server, in case
z.html is requested before the load of the archive has finished. And if
those copies ever get out of sync you're in very big trouble, because
depending on the context, either the archive version or the direct version
is likely to consistently win the load race, so just occasionally some
clients will get the wrong version. This seems like a highly error-prone
design.

Rob

 



Re: [whatwg] Application deployment

2008-07-29 Thread Robert O'Callahan
On Tue, Jul 29, 2008 at 11:20 AM, Dave Singer <[EMAIL PROTECTED]> wrote:

> Caching is on a full URL basis, of course.  Once that is decided, then yes,
> I think that pre-cached items for a given URL are in the general cache for
> that site.
>

A site that uses this feature is likely to be fragile. It will have to have
z.html both in the archive and available directly from the server, in case
z.html is requested before the load of the archive has finished. And if
those copies ever get out of sync you're in very big trouble, because
depending on the context, either the archive version or the direct version
is likely to consistently win the load race, so just occasionally some
clients will get the wrong version. This seems like a highly error-prone
design.

Rob
-- 
"He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all." [Isaiah
53:5-6]


Re: [whatwg] Application deployment

2008-07-29 Thread Dave Singer
The situation is a lot better for archives (like MPEG-21 files) that 
have a directory at the front...



At 20:10  -0400 29/07/08, Russell Leggett wrote:

That is a performance killer.


I don't think it is as much of a performance killer as you say it 
is. Correct me if I'm wrong, but the standard connection limit is 
two. It is not as though every external file could be loaded at 
once. Additionally, as I said, you could split resources into 
multiple archives to take advantage of multiple connections, because 
they can be loaded asynchronously without issue.  Remember, the use 
case for this would be when there are likely dozens of different 
files that need to be loaded.


So you get nondeterministic load behaviour anyway. This is not good.


This is not so different than directly requesting a file from 
multiple tabs. Let's say page 1 and page 2 each use the same image. 
If I load page 2 first, it will go directly to the server. If I load 
page 1 first, page 2 will get the image from cache.



Clearly, there are other ways of performing this task, but I think 
this way is simple and I know that I would gladly accept it as a 
possibility. It falls within the same realm that any caching 
behavior does. It is meant purely for performance, but if you are 
relying on it for a given behavior then you are on a road to trouble.


The archived resources should be static, and also available as 
individual files. Pre-fetching them should only be used for 
performance gains, and if it would not be performant, it should not 
be used. However, I think there is a fairly wide range of sites or 
applications that could benefit from this feature. If there are 
other ways of improving it, or making it work better for certain 
edge cases, that would be great, but don't throw the baby out with 
the bath water.


Off the top of my head, I can think of a couple of ways to refine 
the feature and deal with the issues raised.


Only blocking the loading of files that could logically be inside 
the archive: if the archive is located at "/js/resources.zip", then 
the only files that should be blocked would have to be located under 
"/js".  would not be blocked.
To go a step further, there could even be some kind of "pattern" 
attribute that would block the loading of files that matched a url 
pattern. For example, if the archive were located at "/", but had a 
pattern of "**.js,**.css,/images/*", only js, css, and files under 
the "/images" directory would be blocked.


On Tue, Jul 29, 2008 at 2:13 PM, Robert O'Callahan 
<[EMAIL PROTECTED]> wrote:


On Tue, Jul 29, 2008 at 5:59 AM, Russell Leggett 
<[EMAIL PROTECTED]> wrote:


Yes, the one major hang up that I foresee is how a browser should 
handle asynchronous loading. How would it know the contents of the 
archive before it loaded the archive so it did not try to load the 
same files directly? The simple answer would be to load the 
archive(s) synchronously.



That is a performance killer.

As for references in a different tab, if the separate tab/document 
did not reference the zip archive first, it would operate as normal. 
It would check the cache and then attempt to load. If the zip had 
been loaded from the first page already, the file would be present 
in the cache, and if not, then the browser would attempt to retrieve 
it from the server.



So you get nondeterministic load behaviour anyway. This is not good.

Rob

--
"He was pierced for our transgressions, he was crushed for our 
iniquities; the punishment that brought us peace was upon him, and 
by his wounds we are healed. We all, like sheep, have gone astray, 
each of us has turned to his own way; and the LORD has laid on him 
the iniquity of us all." [Isaiah 53:5-6]



--
David Singer
Apple/QuickTime

Re: [whatwg] Application deployment

2008-07-29 Thread Russell Leggett
>
> That is a performance killer.


I don't think it is as much of a performance killer as you say it is.
Correct me if I'm wrong, but the standard connection limit is two. It is not
as though every external file could be loaded at once. Additionally, as I
said, you could split resources into multiple archives to take advantage of
multiple connections, because they can be loaded asynchronously without
issue.  Remember, the use case for this would be when there are likely
dozens of different files that need to be loaded.

So you get nondeterministic load behaviour anyway. This is not good.


This is not so different than directly requesting a file from multiple tabs.
Let's say page 1 and page 2 each use the same image. If I load page 2 first,
it will go directly to the server. If I load page 1 first, page 2 will get
the image from cache.

Clearly, there are other ways of performing this task, but I think this way
is simple and I know that I would gladly accept it as a possibility. It
falls within the same realm that any caching behavior does. It is meant
purely for performance, but if you are relying on it for a given behavior
then you are on a road to trouble.

The archived resources should be static, and also available as individual
files. Pre-fetching them should only be used for performance gains, and if
it would not be performant, it should not be used. However, I think there is
a fairly wide range of sites or applications that could benefit from this
feature. If there are other ways of improving it, or making it work better
for certain edge cases, that would be great, but don't throw the baby out
with the bath water.

Off the top of my head, I can think of a couple of ways to refine the
feature and deal with the issues raised.

   - Only blocking the loading of files that could logically be inside the
   archive: if the archive is located at "/js/resources.zip", then the only
   files that should be blocked would have to be located under "/js".  would not be blocked.
   - To go a step further, there could even be some kind of "pattern"
   attribute that would block the loading of files that matched a url pattern.
   For example, if the archive were located at "/", but had a pattern of
   "**.js,**.css,/images/*", only js, css, and files under the "/images"
   directory would be blocked.
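
The two refinements above can be sketched as a pair of checks. The pattern semantics here are an assumption: the proposal does not define them, so this sketch borrows shell-style matching from Python's fnmatch, where '*' also matches '/':

```python
import posixpath
from fnmatch import fnmatchcase


def in_archive_scope(archive_path: str, resource_path: str) -> bool:
    """True if the resource lives under the directory holding the archive."""
    base = posixpath.dirname(archive_path)
    prefix = base if base.endswith("/") else base + "/"
    return resource_path.startswith(prefix)


def matches_pattern(patterns: str, resource_path: str) -> bool:
    """True if the resource matches any entry of a comma-separated pattern list."""
    return any(fnmatchcase(resource_path, p.strip()) for p in patterns.split(","))


print(in_archive_scope("/js/resources.zip", "/js/app.js"))     # True
print(in_archive_scope("/js/resources.zip", "/css/site.css"))  # False

pattern = "**.js,**.css,/images/*"
print(matches_pattern(pattern, "/images/logo.png"))  # True
print(matches_pattern(pattern, "/index.html"))       # False
```

Only resources that pass the relevant check would wait for the archive; everything else could load immediately.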


On Tue, Jul 29, 2008 at 2:13 PM, Robert O'Callahan <[EMAIL PROTECTED]> wrote:

> On Tue, Jul 29, 2008 at 5:59 AM, Russell Leggett <
> [EMAIL PROTECTED]> wrote:
>
>> Yes, the one major hang up that I foresee is how a browser should handle
>> asynchronous loading. How would it know the contents of the archive before
>> it loaded the archive so it did not try to load the same files directly? The
>> simple answer would be to load the archive(s) synchronously.
>>
>
> That is a performance killer.
>
> As for references in a different tab, if the separate tab/document did not
>> reference the zip archive first, it would operate as normal. It would check
>> the cache and then attempt to load. If the zip had been loaded from the
>> first page already, the file would be present in the cache, and if not, then
>> the browser would attempt to retrieve it from the server.
>>
>
> So you get nondeterministic load behaviour anyway. This is not good.
>
> Rob
> --
> "He was pierced for our transgressions, he was crushed for our iniquities;
> the punishment that brought us peace was upon him, and by his wounds we are
> healed. We all, like sheep, have gone astray, each of us has turned to his
> own way; and the LORD has laid on him the iniquity of us all." [Isaiah
> 53:5-6]
>


Re: [whatwg] Application deployment

2008-07-29 Thread Robert O'Callahan
On Tue, Jul 29, 2008 at 2:52 AM, Kristof Zelechovski
<[EMAIL PROTECTED]> wrote:

>  Archive: is not generic enough but perhaps you could bend the URL
> notation to embrace something like inside:.  I still would not recommend it
> but it would not make me that sore.
>
> How about http://www.site.com/app.jar>?
>
> The user agent would be required to append a query string to local
> hyperlinks and that parameter would be reserved (or rename it to
> h809370dfwhbwa0r92347090).
>
>
That query string would have to be appended everywhere you do baseURI +
relativeURI -> absoluteURI conversion. So you're really just messing with
relative URI syntax for this particular scheme. That's not cleaner than the
URI extension for jar:/archive: (or whatever you want to call it), IMHO.
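
The point about relative resolution can be seen with a standard resolver. In the sketch below the query parameter name "inside" and the paths are hypothetical, since the exact syntax is not spelled out here:

```python
from urllib.parse import urljoin

# Hypothetical base URL: the archive plus a query string naming a document in it.
base = "http://www.site.com/app.jar?inside=/docs/index.html"

# Ordinary relative resolution drops the base's query string entirely.
resolved = urljoin(base, "page2.html")
print(resolved)  # http://www.site.com/page2.html
```

The archive reference is gone from the result, which is why the user agent would have to re-append the query string at every such conversion.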

OTOH, you can simulate several entry points by having all supported entry
> points on the start page (à la Microsoft Access) and have the user navigate
> to what she needs.  I do not think this would be prohibitive from the
> customer's point of view.  And I am sure there is no need to publish each
> local address.
>
That breaks bookmarks and similar navigation mechanisms such as intelligent
URLbar autocompletion. Also note that an entry point can be a particular
document hosted in a Web application, or even a particular email message, so
you can't always offer one-click navigation.

Rob
-- 
"He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all." [Isaiah
53:5-6]


Re: [whatwg] Application deployment

2008-07-29 Thread Dave Singer

At 19:51  +1200 29/07/08, Robert O'Callahan wrote:
On Tue, Jul 29, 2008 at 8:02 AM, Dave Singer 
<[EMAIL PROTECTED]> wrote:



c) that the contents of the container, once fetched and un-packed, 
logically 'shadow' the directory where the container came from.



It sounds like that affects all loads, which leads to issues:

So if I load 
http://www.example.com/x.m21#y.html and (in the same document, or in 
another tab?) load 
http://www.example.com/z.html, and 
x.m21 contains a z.html but the server also responds to 
http://example.com/z.html, does the 
second load (z.html) come from the server or the container? Does it 
depend on whether the second load starts before the first load 
finishes?


Caching is on a full URL basis, of course.  Once that is decided, 
then yes, I think that pre-cached items for a given URL are in the 
general cache for that site.  If that site doesn't want that effect, 
then don't have z.html inside a ZIP archive in a directory, and a 
different z.html in the directory by itself.


Nor should you refer to z.html as a simple file, outside the archive 
in which it is packaged, unless it is also available separately, 
since there is no assurance that the archive has been fetched and 
pre-cached.


I don't see any of these restrictions as particularly un-obvious or 
unreasonable.
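
A toy model of "caching is on a full URL basis" shows where the ordering concern comes from. The class UrlCache and the stand-in server function are hypothetical and model no real browser cache:

```python
from urllib.parse import urljoin


class UrlCache:
    """Toy cache keyed by absolute URL (a hypothetical model, not a browser API)."""

    def __init__(self):
        self._entries = {}

    def precache_archive(self, archive_url, members):
        # Unpacked members shadow the directory the archive came from.
        for name, body in members.items():
            self._entries[urljoin(archive_url, name)] = body

    def load(self, url, fetch_from_server):
        if url not in self._entries:
            self._entries[url] = fetch_from_server(url)
        return self._entries[url]


def server(url):
    return "server copy"


ARCHIVE = "http://www.example.com/x.m21"
PAGE = "http://www.example.com/z.html"

# Case 1: the archive finishes loading before z.html is requested.
archive_first = UrlCache()
archive_first.precache_archive(ARCHIVE, {"z.html": "archive copy"})
result_archive_first = archive_first.load(PAGE, server)

# Case 2: z.html is requested before the archive has been fetched.
request_first = UrlCache()
result_request_first = request_first.load(PAGE, server)

print(result_archive_first)  # archive copy
print(result_request_first)  # server copy
```

The two cases differ only in timing, so the result is consistent only if both copies of z.html are identical.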






The same questions apply to Russell's proposal.

Rob

--
"He was pierced for our transgressions, he was crushed for our 
iniquities; the punishment that brought us peace was upon him, and 
by his wounds we are healed. We all, like sheep, have gone astray, 
each of us has turned to his own way; and the LORD has laid on him 
the iniquity of us all." [Isaiah 53:5-6]



--
David Singer
Apple/QuickTime

Re: [whatwg] Application deployment

2008-07-29 Thread Robert O'Callahan
On Tue, Jul 29, 2008 at 5:59 AM, Russell Leggett
<[EMAIL PROTECTED]> wrote:

> Yes, the one major hang up that I foresee is how a browser should handle
> asynchronous loading. How would it know the contents of the archive before
> it loaded the archive so it did not try to load the same files directly? The
> simple answer would be to load the archive(s) synchronously.
>

That is a performance killer.

As for references in a different tab, if the separate tab/document did not
> reference the zip archive first, it would operate as normal. It would check
> the cache and then attempt to load. If the zip had been loaded from the
> first page already, the file would be present in the cache, and if not, then
> the browser would attempt to retrieve it from the server.
>

So you get nondeterministic load behaviour anyway. This is not good.

Rob
-- 
"He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all." [Isaiah
53:5-6]


Re: [whatwg] Application deployment

2008-07-29 Thread Russell Leggett
>
> So if I load 
> http://www.example.com/x.m21#y.html and
> (in the same document, or in another tab?) load
> http://www.example.com/z.html, and x.m21 contains a z.html but the server
> also responds to http://example.com/z.html, does the second load (z.html)
> come from the server or the container? Does it depend on whether the second
> load starts before the first load finishes?
>
> The same questions apply to Russell's proposal.


Yes, the one major hang up that I foresee is how a browser should handle
asynchronous loading. How would it know the contents of the archive before
it loaded the archive so it did not try to load the same files directly? The
simple answer would be to load the archive(s) synchronously. In my previous
example:









The browser could begin loading the zip, and during the load wait before
loading any other files. In an effort to take advantage of multiple
connections, multiple archives could be used. Multiple archives could be
loaded asynchronously without issue.
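
The loading order described above can be sketched with asyncio: several archives are fetched concurrently, and other loads wait until all of them have arrived. The archive names, delays, and member lists are invented for illustration, and the sleep stands in for a network transfer:

```python
import asyncio


async def load_archive(name, delay, members):
    """Simulate fetching one archive and return a member -> archive mapping."""
    await asyncio.sleep(delay)  # stand-in for the network transfer
    return {member: name for member in members}


async def load_page():
    # Both archives are requested at once and awaited together.
    archives = await asyncio.gather(
        load_archive("scripts.zip", 0.02, ["a.js", "b.js"]),
        load_archive("styles.zip", 0.01, ["site.css"]),
    )
    available = {}
    for contents in archives:
        available.update(contents)
    # Only now would the page go on to resolve its remaining resources.
    return available


print(asyncio.run(load_page()))
```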

As for references in a different tab, if the separate tab/document did not
reference the zip archive first, it would operate as normal. It would check
the cache and then attempt to load. If the zip had been loaded from the
first page already, the file would be present in the cache, and if not, then
the browser would attempt to retrieve it from the server.

My proposal is only intended as a way to make HTML work the way it was
intended and remain efficient.  CSS sprites and concatenated scripts are
assumed for any high performance site, but they add an unnecessary level of
complexity. Other suggestions such as HTTP pipelining and the jar protocol
are more complex and out of scope of the HTML5 specification. I think my
proposal degrades gracefully, and while I am not a browser manufacturer, it
seems relatively simple to implement.

Russ

On Tue, Jul 29, 2008 at 3:51 AM, Robert O'Callahan <[EMAIL PROTECTED]> wrote:

> On Tue, Jul 29, 2008 at 8:02 AM, Dave Singer <[EMAIL PROTECTED]> wrote:
>
>>
>> c) that the contents of the container, once fetched and un-packed,
>> logically 'shadow' the directory where the container came from.
>>
>
> It sounds like that affects all loads, which leads to issues:
>
> So if I load 
> http://www.example.com/x.m21#y.html and
> (in the same document, or in another tab?) load
> http://www.example.com/z.html, and x.m21 contains a z.html but the server
> also responds to http://example.com/z.html, does the second load (z.html)
> come from the server or the container? Does it depend on whether the second
> load starts before the first load finishes?
>
> The same questions apply to Russell's proposal.
>
> Rob
> --
> "He was pierced for our transgressions, he was crushed for our iniquities;
> the punishment that brought us peace was upon him, and by his wounds we are
> healed. We all, like sheep, have gone astray, each of us has turned to his
> own way; and the LORD has laid on him the iniquity of us all." [Isaiah
> 53:5-6]
>


Re: [whatwg] Application deployment

2008-07-29 Thread Kristof Zelechovski
I think that just puts some restrictions on the arrangement on the server.
My guess is that once a resource is shadowed, it becomes invisible, and the
server should not serve resources that might be shadowed unless the
publisher knows what she is doing.  It is not the only way to make a site
inconsistent.

Chris

 

  _  

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Robert O'Callahan
Sent: Tuesday, July 29, 2008 9:51 AM
To: Dave Singer
Cc: [EMAIL PROTECTED]
Subject: Re: [whatwg] Application deployment

 

On Tue, Jul 29, 2008 at 8:02 AM, Dave Singer <[EMAIL PROTECTED]> wrote:


c) that the contents of the container, once fetched and un-packed, logically
'shadow' the directory where the container came from.


It sounds like that affects all loads, which leads to issues:

So if I load
http://www.example.com/x.m21#y.html and (in the same document, or in another
tab?) load http://www.example.com/z.html, and x.m21 contains a z.html but
the server also responds to http://example.com/z.html, does the second load
(z.html) come from the server or the container? Does it depend on whether
the second load starts before the first load finishes?

The same questions apply to Russell's proposal.

Rob

 



Re: [whatwg] Application deployment

2008-07-29 Thread Kristof Zelechovski
Archive: is not generic enough but perhaps you could bend the URL notation
to embrace something like inside:.  I still would not recommend it but it
would not make me that sore.

How about http://www.site.com/app.jar>?

The user agent would be required to append a query string to local
hyperlinks and that parameter would be reserved (or rename it to
h809370dfwhbwa0r92347090).

Of course this URL scheme would never leak to HTTP.

OTOH, you can simulate several entry points by having all supported entry
points on the start page (à la Microsoft Access) and have the user navigate
to what she needs.  I do not think this would be prohibitive from the
customer’s point of view.  And I am sure there is no need to publish each
local address.

Chris

 

  _  

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Robert
O'Callahan
Sent: Tuesday, July 29, 2008 9:55 AM
To: Kristof Zelechovski
Cc: Adrian Sutton; Adam Barth; [EMAIL PROTECTED]; Russell Leggett; Philipp
Serafin
Subject: Re: [whatwg] Application deployment

 

On Tue, Jul 29, 2008 at 6:21 AM, Kristof Zelechovski <[EMAIL PROTECTED]>
wrote:

My complaint was about how the jar URL scheme wannabe conceptually differs
from the schemes we already officially have, not about how ugly it is to
have two consecutive colons.  It is ugly but it does not matter.  What
matters is that a scheme is being promoted that is specific to one content
type, just as the APPLET element is discouraged for the same reason.  


Suppose it was called "archive:" instead of "jar:" and the spec was made
open-ended so that other archive types other than ZIP files were permitted.
Would your objection still apply?

 

Anyway, it is not obvious at all that linking inside a packaged HTML
application should be supported.


Multiple entry points to a single application are common.

Rob





Re: [whatwg] Application deployment

2008-07-29 Thread Robert O'Callahan
On Tue, Jul 29, 2008 at 6:21 AM, Kristof Zelechovski
<[EMAIL PROTECTED]> wrote:

>  My complaint was about how the jar URL scheme wannabe conceptually
> differs from the schemes we already officially have, not about how ugly it
> is to have two consecutive colons.  It is ugly but it does not matter.  What
> matters is that a scheme is being promoted that is specific to one content
> type, just as the APPLET element is discouraged for the same reason.
>

Suppose it was called "archive:" instead of "jar:" and the spec was made
open-ended so that other archive types other than ZIP files were permitted.
Would your objection still apply?


> Anyway, it is not obvious at all that linking inside a packaged HTML
> application should be supported.
>

Multiple entry points to a single application are common.

Rob


Re: [whatwg] Application deployment

2008-07-29 Thread Robert O'Callahan
On Tue, Jul 29, 2008 at 8:02 AM, Dave Singer <[EMAIL PROTECTED]> wrote:

>
> c) that the contents of the container, once fetched and un-packed,
> logically 'shadow' the directory where the container came from.
>

It sounds like that affects all loads, which leads to issues:

So if I load http://www.example.com/x.m21#y.html and
(in the same document, or in another tab?) load
http://www.example.com/z.html, and x.m21 contains a z.html but the server
also responds to http://example.com/z.html, does the second load (z.html)
come from the server or the container? Does it depend on whether the second
load starts before the first load finishes?

The same questions apply to Russell's proposal.

Rob


Re: [whatwg] Application deployment

2008-07-28 Thread Dave Singer

FYI

When faced with this question in MPEG (MPEG-21 files are container 
files too), we consulted with folks at the W3C (in Cannes, if I 
recall correctly) and decided:


a) that a scheme type was wrong, and that 'picking a piece out of an 
archive' at the client-side was almost the definition of what a 
fragment was for;


b) to solve the 'stacked fragments' by using a * for the second one 
(a character not allowed in fragments, if I recall correctly)


c) that the contents of the container, once fetched and un-packed, 
logically 'shadow' the directory where the container came from.



So, imagine a container x.m21 containing y.html and z.jpg.  We want 
to see anchor-point q in y.html, with the jpeg in the page.


the 'external' pointer reads

http://www.example.com/x.m21#y.html*q

this causes the m21 file to be fetched and unpacked, and then 
interpreted as if its source URI was

http://www.example.com/y.html#q

y.html has been pre-cached as a result of the unpack operation, and 
the re-write of the URI has eliminated x.m21 and re-written the first 
* after the # (which has gone) as a #.  So we find y.html and go to 
anchor q.
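
A minimal sketch of the rewrite described above, assuming the fragment has the form "member" or "member*inner" and that only the first "*" is significant. The function name `rewrite_container_uri` is hypothetical and is not taken from any MPEG-21 specification.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def rewrite_container_uri(external: str) -> tuple[str, str]:
    """Return (container URL to fetch, effective URI after unpacking)."""
    parts = urlsplit(external)
    # The container itself is the URI with its fragment removed.
    container = urlunsplit((parts.scheme, parts.netloc, parts.path, parts.query, ""))
    # Split the fragment at the first "*": member file, then inner anchor.
    member, star, inner = parts.fragment.partition("*")
    # The unpacked contents shadow the container's directory, so the member
    # is resolved like a relative reference against the container URL.
    effective = urljoin(container, member)
    if star:
        effective += "#" + inner
    return container, effective


container, effective = rewrite_container_uri("http://www.example.com/x.m21#y.html*q")
print(container)  # http://www.example.com/x.m21
print(effective)  # http://www.example.com/y.html#q
```

Because the member is resolved as an ordinary relative reference, relative URLs inside y.html (such as a reference to z.jpg) would resolve against the same shadowed directory without any re-writing.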


In y.html,
I believe under these circumstances document analysis for schemes 
used works, relative URLs work, and documents do not need re-writing 
when they are packed, if they use relative URLs.

--
David Singer
Apple/QuickTime


Re: [whatwg] Application deployment

2008-07-28 Thread Ian Hickson
On Mon, 28 Jul 2008, Russell Leggett wrote:
> 
> Let's say I have a large javascript application that is broken into 
> several files for better organization.
>
> But let's say we could zip up all the files, and retrieve them at the 
> start of an html document:
> 
> 
> 
> 
> This zip might contain a directory "js" and inside would contain the js 
> files.

It seems like HTTP pipelining in conjunction with a mechanism like 
Safari's optimistic tokeniser is a better solution to this, in that it 
requires no server-side changes to work today.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Application deployment

2008-07-28 Thread Russell Leggett
Although jar, mhtml, and also the widget spec have some related ideas, I
think all of them are more complex than the solution I'm suggesting as well
as off target. I will give a full example.

Let's say I have a large javascript application that is broken into several
files for better organization.








This could easily be more, but you get the drift.

But let's say we could zip up all the files, and retrieve them at the start
of an html document:




This zip might contain a directory "js" and inside would contain the js
files. When the zip file was loaded with the link tag, it would immediately
be unzipped and the files would be put in the cache as though they were
loaded individually. None of the javascript or other resources would be
executed or processed, they would simply be added to the cache. Later in the
html document, these resources could be pulled from the cache.


My web app









Notice that the script tags stay the same with or without resources link. If
it was not supported, it could easily be ignored and the page would still
work. In addition to script tags, this could easily work with css and images
as well as any other resources.
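
The cache pre-population step described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the cache is modelled as a plain dict keyed by URL, each archive entry is mapped to the URL it would have had next to the archive, and the archive name `resources.zip` and the function names are hypothetical. A real browser would also need to validate entry names and origins.

```python
import io
import zipfile
from urllib.parse import urljoin


def preload_archive(archive_url: str, archive_bytes: bytes, cache: dict) -> None:
    """Unpack a ZIP and store each file in `cache` under the URL it would
    have had if it sat next to the archive on the server."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no content
            # Nothing is executed or parsed here; the bytes are only cached.
            cache[urljoin(archive_url, info.filename)] = zf.read(info)


def make_demo_archive() -> bytes:
    """Build a small in-memory ZIP with a hypothetical js/ directory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("js/app.js", "console.log('app');")
        zf.writestr("js/util.js", "console.log('util');")
    return buf.getvalue()


cache = {}
preload_archive("http://www.example.com/resources.zip", make_demo_archive(), cache)
print(sorted(cache))
```

A later request for http://www.example.com/js/app.js would then be a normal cache lookup, which is why the script tags would not need to change.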

On Mon, Jul 28, 2008 at 2:21 PM, Kristof Zelechovski
<[EMAIL PROTECTED]> wrote:

>  My complaint was about how the jar URL scheme wannabe conceptually
> differs from the schemes we already officially have, not about how ugly it
> is to have two consecutive colons.  It is ugly but it does not matter.  What
> matters is that a scheme is being promoted that is specific to one content
> type, just as the APPLET element is discouraged for the same reason.
> Content types and URL schemes should not be coupled because they live in
> different worlds.  The jar scheme is an exception in Java just as the
> javascript scheme is an exception in HTML because these are essential for
> the internal mechanisms of either language.  Java does not recognize the
> javascript scheme; why should HTML recognize jar?  Because Java programmers
> use it extensively?  Even if that is true, which I doubt (because I think
> there should be a more abstract API for getting application resources
> anyway, perhaps using jar in the implementation), it hardly matters for
> HTML.
>
> I think dealing with two fragment identifiers is a lesser evil than turning
> the URL upside down.
>
> The differences between a hierarchical file system and a flat file system
> are minute indeed and primarily related to search efficiency:
> traversing a directory tree in logical order is straightforward in HFS but
> requires a prior conversion in FFS; HFS directories are inaccessible
> (without server extensions) but FFS "directories" simply do not exist.
>
> If relative locators are allowed to go out of the jar (relative to the
> directory the jar is in) then all internal hyperlinks into the archive must
> be "#full/path#fragment" and all local links must be "##fragment".  That
> means the code base must be preprocessed before packaging.
>
> Anyway, it is not obvious at all that linking inside a packaged HTML
> application should be supported.  An alternative solution would be to
> indicate the start page in the manifest and let the code run under a fake
> root.
>
> IMHO,
>
> Chris
>  --
>
> *From:* Adrian Sutton [mailto:[EMAIL PROTECTED]
> *Sent:* Monday, July 28, 2008 10:56 AM
> *To:* Kristof Zelechovski; Adam Barth
> *Cc:* [EMAIL PROTECTED]; Russell Leggett; Philipp Serafin
> *Subject:* Re: [whatwg] Application deployment
>
>
>
> On 28/07/2008 09:22, "Kristof Zelechovski" <[EMAIL PROTECTED]> wrote:
>
> > Having this URL monster shipped does not preclude replacing it with a more
> > logical one and deprecating the original one.  People make mistakes all the
> > time and fortunately there are cases where the harm can be undone.
>
> It's not just Firefox that supports this URL scheme - the entire Java world
> uses it and supports it back as long as JAR files have existed as far as I
> know. While web pages are a different domain it seems silly to have two
> completely different notations for the same thing just because of aesthetic
> reasons.
>
> It's also worth noting that the jar: scheme will allow you to target anchors
> in an HTML document that's within the archive whereas the fragment
> identifier syntax would not, unless you used two fragment identifiers:
> http://www.example.com/site.jar#/path/inside/foo.html#heading1
>
>
> > Of course this means that the way relative locators inside an archived
> > document are handled must be changed (they should apply to the fragment
>

Re: [whatwg] Application deployment

2008-07-28 Thread Kristof Zelechovski
My complaint was about how the jar URL scheme wannabe conceptually differs
from the schemes we already officially have, not about how ugly it is to
have two consecutive colons.  It is ugly but it does not matter.  What
matters is that a scheme is being promoted that is specific to one content
type, just as the APPLET element is discouraged for the same reason.
Content types and URL schemes should not be coupled because they live in
different worlds.  The jar scheme is an exception in Java just as the
javascript scheme is an exception in HTML because these are essential for
the internal mechanisms of either language.  Java does not recognize the
javascript scheme; why should HTML recognize jar?  Because Java programmers
use it extensively?  Even if that is true, which I doubt (because I think
there should be a more abstract API for getting application resources
anyway, perhaps using jar in the implementation), it hardly matters for
HTML.

I think dealing with two fragment identifiers is a lesser evil than turning
the URL upside down.

The differences between a hierarchical file system and a flat file system are
minute indeed and primarily related to search efficiency: traversing a
directory tree in logical order is straightforward in HFS but requires a
prior conversion in FFS; HFS directories are inaccessible (without server
extensions) but FFS "directories" simply do not exist.

If relative locators are allowed to go out of the jar (relative to the
directory the jar is in) then all internal hyperlinks into the archive must
be "#full/path#fragment" and all local links must be "##fragment".  That
means the code base must be preprocessed before packaging.
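
To illustrate the point, standard relative-reference resolution discards the base URL's fragment, so an ordinary relative link inside an archived page resolves outside the archive. The sketch below uses Python's `urllib.parse.urljoin` as one example resolver; the file name `bar.html` is hypothetical.

```python
from urllib.parse import urljoin

# Base: a page inside the archive, addressed with the fragment notation.
base = "http://www.example.com/site.jar#path/inside/foo.html"

# A plain relative link resolves against the directory the jar is in.
sibling = urljoin(base, "bar.html")
print(sibling)  # http://www.example.com/bar.html

# A fragment-only link replaces the whole fragment, losing the inner path.
anchor = urljoin(base, "#heading1")
print(anchor)   # http://www.example.com/site.jar#heading1
```

Under these rules a link to another file in the archive has to spell out the full inner path in the fragment, which is why unmodified documents would need preprocessing.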

Anyway, it is not obvious at all that linking inside a packaged HTML
application should be supported.  An alternative solution would be to
indicate the start page in the manifest and let the code run under a fake
root.

IMHO,

Chris

  _  

From: Adrian Sutton [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 28, 2008 10:56 AM
To: Kristof Zelechovski; Adam Barth
Cc: [EMAIL PROTECTED]; Russell Leggett; Philipp Serafin
Subject: Re: [whatwg] Application deployment

 

On 28/07/2008 09:22, "Kristof Zelechovski" <[EMAIL PROTECTED]> wrote:
> Having this URL monster shipped does not preclude replacing it with a more
> logical one and deprecating the original one.  People make mistakes all the
> time and fortunately there are cases where the harm can be undone.

It's not just Firefox that supports this URL scheme - the entire Java world
uses it and supports it back as long as JAR files have existed as far as I
know. While web pages are a different domain it seems silly to have two
completely different notations for the same thing just because of aesthetic
reasons.

It's also worth noting that the jar: scheme will allow you to target anchors
in an HTML document that's within the archive whereas the fragment
identifier syntax would not, unless you used two fragment identifiers:
http://www.example.com/site.jar#/path/inside/foo.html#heading1


> Of course this means that the way relative locators inside an archived
> document are handled must be changed (they should apply to the fragment and
> not to the archive path); it should not be possible to escape an archive
> following relative hyperlinks.

Why not? It seems reasonable to have some things inside the JAR and some
dynamically created outside of it. For example were Gmail wanting to reduce
the initial download time for its JavaScript and UI resources it could put
them in a JAR file but the JavaScript would still want to send requests to
retrieve the user's actual mail data. It could use an absolute URL to do it
but why not support relative URLs?

> It should also be noted that such an archive has a flat file system (only
> one directory with files tagged with relative paths rather than plain names)
> whereas the HTTP path component addresses a hierarchical file system with
> true directories.  It can cause relative hyperlinks to break when archiving
> an existing directory.

The file system inside a JAR or ZIP is strictly speaking flat, but logically
hierarchical - ie: you unzip it and you get a hierarchy of directories. The
actual method of storage in bits and bytes doesn't seem to matter. Perhaps
I'm misunderstanding your point...

Regards,

Adrian Sutton.
__
Adrian Sutton, CTO
UK: +44 1 753 27 2229  US: +1 (650) 292 9659 x717
Ephox <http://www.ephox.com/>
Ephox Blogs <http://planet.ephox.com/>, Personal Blog
<http://www.symphonious.net/>



Re: [whatwg] Application deployment

2008-07-28 Thread Charles McCathieNevile
On Sun, 27 Jul 2008 23:05:44 +0200, Philipp Serafin <[EMAIL PROTECTED]> wrote:

> On Sun, Jul 27, 2008 at 8:44 PM, Russell Leggett
> <[EMAIL PROTECTED]> wrote:
>
>> ...
>>
>> This is a suggestion that is more helpful to larger single page web
>> applications, but could also be very helpful to other resource intensive web
>> pages. My thought is that it could be extremely helpful to create some kind
>> of web application deployment format.
>
> I think for HTML, this is already covered by MHTML. The problem here is
> probably to bring more people to implement this one.


That's one common approach. An alternative, which is used to get a bit  
more functionality by people who were thinking the same as you and  
building platforms to do it, is the widget packaging spec -  
http://www.w3.org/tr/widgets (but it isn't developed by WHAT-WG).


cheers

Chaals

--
Charles McCathieNevile  Opera Software, Standards Group
je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals   Try Opera 9.5: http://www.opera.com


Re: [whatwg] Application deployment

2008-07-28 Thread Robert O'Callahan
On Mon, Jul 28, 2008 at 8:56 PM, Adrian Sutton <[EMAIL PROTECTED]> wrote:

> It's also worth noting that the jar: scheme will allow you to target anchors
> in an HTML document that's within the archive whereas the fragment
> identifier syntax would not, unless you used two fragment identifiers:
> http://www.example.com/site.jar#/path/inside/foo.html#heading1
>

URL parsing doesn't support multiple fragment identifiers so that doesn't
work. Any way of referencing the contents of archives at arbitrary locations
is likely to require an extension to URL parsing.
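
As a small illustration, here is how one existing URL parser treats the two-fragment URL. Python's `urllib.parse.urlsplit` is used purely as an example; it splits at the first "#" and hands back everything after it as a single opaque fragment, so the inner path and the anchor are not separated.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.example.com/site.jar#/path/inside/foo.html#heading1")

print(parts.path)      # /site.jar
print(parts.fragment)  # /path/inside/foo.html#heading1
```

Any scheme that wants "foo.html" and "heading1" as separate components would have to add its own second-stage parsing on top of this.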

Rob


Re: [whatwg] Application deployment

2008-07-28 Thread Robert O'Callahan
On Mon, Jul 28, 2008 at 7:55 PM, Adam Barth <[EMAIL PROTECTED]> wrote:

> I suspect
> the reason the Firefox developers chose ! to separate the URL to the
> JAR from the path within the JAR is that ! is not a valid URL
> character.
>

I think Java invented the syntax, actually.

> The main value of using the packaged archive is
> that the content author can sign the archive.  For example, this is
> the mechanism used for Firefox extensions.
>

And signed Java applets. Just to clarify --- extensions don't usually use
jar: URLs.

Rob


Re: [whatwg] Application deployment

2008-07-28 Thread Robert O'Callahan
The offline app cache can be used to accelerate online apps this way.

The jar: protocol works pretty well too. However, any mechanism that lets
you reference the contents of a ZIP file requires care to avoid XSS attacks.

Rob


Re: [whatwg] Application deployment

2008-07-28 Thread Russell Leggett
Just to clarify, I wanted to point out that my suggestion is related to both
of the suggested alternatives (mhtml and the jar protocol), but is very
different in intention. I think there is a very real need in the area of
deployment for resource intensive web pages/applications. The developer has
to choose between bad performance (several http requests) or a complicated
build process (concatenating js and css and creating css sprites). And even
in the best case scenario, they still cannot be loaded together (js and css
still have to be loaded separately). The intention for my suggestion is not
that resources be accessible from inside a zip or jar, or that a whole web
site be zipped up and sent over email, I was just trying to think of an easy
way to relieve this pain point.
My thought for implementation would be something like:



Then the zip file could basically be unzipped and loaded into the browser
cache. When a link to retrieve a stylesheet, image, or script was reached,
it would just check the cache as it normally would. There would be no
special link urls or protocols.

I'm sure there are holes in the idea somewhere, but I really do think that
some solution can be found, and I think it is a large enough pain point that
it is worth addressing.

Thanks,
Russ

On Mon, Jul 28, 2008 at 5:16 AM, Ian Hickson <[EMAIL PROTECTED]> wrote:

> On Mon, 28 Jul 2008, Adam Barth wrote:
> >
> > My guess is this mechanism will not be included in HTML 5 because some
> > of the other browser vendors have expressed their distaste for nested
> > URL schemes.
>
> I've no intention of adding jar: to HTML5, but more because it seems
> completely orthogonal to the markup language than for any other reason.
>
> --
> Ian Hickson   U+1047E)\._.,--,'``.fL
> http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>


Re: [whatwg] Application deployment

2008-07-28 Thread Ian Hickson
On Mon, 28 Jul 2008, Adam Barth wrote:
> 
> My guess is this mechanism will not be included in HTML 5 because some 
> of the other browser vendors have expressed their distaste for nested 
> URL schemes.

I've no intention of adding jar: to HTML5, but more because it seems 
completely orthogonal to the markup language than for any other reason.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Application deployment

2008-07-28 Thread Adrian Sutton
On 28/07/2008 09:22, "Kristof Zelechovski" <[EMAIL PROTECTED]> wrote:
> Having this URL monster shipped does not preclude replacing it with a more
> logical one and deprecating the original one.  People make mistakes all the
> time and fortunately there are cases where the harm can be undone.

It's not just Firefox that supports this URL scheme - the entire Java world
uses it and supports it back as long as JAR files have existed as far as I
know. While web pages are a different domain it seems silly to have two
completely different notations for the same thing just because of aesthetic
reasons.

It's also worth noting that the jar: scheme will allow you to target anchors
in an HTML document that's within the archive whereas the fragment
identifier syntax would not, unless you used two fragment identifiers:
http://www.example.com/site.jar#/path/inside/foo.html#heading1


> Of course this means that the way relative locators inside an archived
> document are handled must be changed (they should apply to the fragment and
> not to the archive path); it should not be possible to escape an archive
> following relative hyperlinks.

Why not? It seems reasonable to have some things inside the JAR and some
dynamically created outside of it. For example were Gmail wanting to reduce
the initial download time for its JavaScript and UI resources it could put
them in a JAR file but the JavaScript would still want to send requests to
retrieve the user's actual mail data. It could use an absolute URL to do it
but why not support relative URLs?

> It should also be noted that such an archive has a flat file system (only
> one directory with files tagged with relative paths rather than plain names)
> whereas the HTTP path component addresses a hierarchical file system with
> true directories.  It can cause relative hyperlinks to break when archiving
> an existing directory.

The file system inside a JAR or ZIP is strictly speaking flat, but logically
hierarchical - ie: you unzip it and you get a hierarchy of directories. The
actual method of storage in bits and bytes doesn't seem to matter. Perhaps
I'm misunderstanding your point...
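
The "flat storage, logical hierarchy" distinction can be seen with Python's standard `zipfile` module, used here only as an illustration. The archive stores a flat list of slash-separated names, and a tree is something the reader reconstructs from them. The entry names and the helper `names_to_tree` are hypothetical.

```python
import io
import zipfile


def names_to_tree(names):
    """Turn flat, slash-separated entry names into nested dicts."""
    tree = {}
    for name in names:
        if name.endswith("/"):
            continue  # skip explicit directory entries
        *dirs, leaf = name.split("/")
        node = tree
        for d in dirs:
            node = node.setdefault(d, {})
        node[leaf] = None  # None marks a file
    return tree


# Build a small archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in ("index.html", "js/app.js", "js/lib/util.js"):
        zf.writestr(name, "")

# The stored view: one flat list of names.
with zipfile.ZipFile(buf) as zf:
    flat = zf.namelist()
print(flat)  # ['index.html', 'js/app.js', 'js/lib/util.js']

# The logical view: a directory hierarchy derived from those names.
tree = names_to_tree(flat)
print(tree)
```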

Regards,

Adrian Sutton.
__
Adrian Sutton, CTO
UK: +44 1 753 27 2229  US: +1 (650) 292 9659 x717
Ephox 
Ephox Blogs , Personal Blog




Re: [whatwg] Application deployment

2008-07-28 Thread Kristof Zelechovski
Having this URL monster shipped does not preclude replacing it with a more
logical one and deprecating the original one.  People make mistakes all the
time and fortunately there are cases where the harm can be undone.
(It is not about withdrawing the support for JAR archives but about changing
the URL notation for accessing their content).  Perhaps the new notation
could even make it into HTML?
Of course this means that the way relative locators inside an archived
document are handled must be changed (they should apply to the fragment and
not to the archive path); it should not be possible to escape an archive
following relative hyperlinks.  
It should also be noted that such an archive has a flat file system (only
one directory with files tagged with relative paths rather than plain names)
whereas the HTTP path component addresses a hierarchical file system with
true directories.  It can cause relative hyperlinks to break when archiving
an existing directory.
Chris

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam Barth
Sent: Monday, July 28, 2008 9:55 AM
To: Kristof Zelechovski
Cc: Philipp Serafin; [EMAIL PROTECTED]; Russell Leggett
Subject: Re: [whatwg] Application deployment

On Sun, Jul 27, 2008 at 11:55 PM, Kristof Zelechovski
<[EMAIL PROTECTED]> wrote:
> <jar:http://www.example.com/site.jar!/path/inside/foo.html>?
> What kind of a syntax is that??  JAR is not a protocol, it is a content
> type.

In Firefox, jar is a protocol that means retrieve the enclosed URL,
unzip the contents, and look for the path after the "!".  I suspect
the reason the Firefox developers chose ! to separate the URL to the
JAR from the path within the JAR is that ! is not a valid URL
character.

> It should rather be
> <http://www.example.com/site.jar#path/inside/foo.html>.  It reads: retrieve
> the resource "site.jar" using the HTTP protocol and look into it for the
> fragment "foo.html".  I do not know how to read the original notation and I
> think it should be withdrawn.

Withdrawn from what?  This feature has already shipped in a number of
versions of Firefox.  The main value of using the packaged archive is
that the content author can sign the archive.  For example, this is
the mechanism used for Firefox extensions.

My guess is this mechanism will not be included in HTML 5 because some
of the other browser vendors have expressed their distaste for nested
URL schemes.


> Chris
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Adam Barth
> Sent: Sunday, July 27, 2008 11:33 PM
> To: Philipp Serafin
> Cc: [EMAIL PROTECTED]; Russell Leggett
> Subject: Re: [whatwg] Application deployment
>
> Firefox already implements this today with the jar protocol.  Put your
> content into a zip archive and access it using this kind of URL:
>
> jar:http://www.example.com/site.jar!/path/inside/foo.html
>
> I'm not sure many sites use this feature, but it has been a source of
> several recent security issues.
>
> Adam
>
>
>
>



Re: [whatwg] Application deployment

2008-07-28 Thread Adam Barth
On Sun, Jul 27, 2008 at 11:55 PM, Kristof Zelechovski
<[EMAIL PROTECTED]> wrote:
> <jar:http://www.example.com/site.jar!/path/inside/foo.html>?
> What kind of a syntax is that??  JAR is not a protocol, it is a content
> type.

In Firefox, jar is a protocol that means retrieve the enclosed URL,
unzip the contents, and look for the path after the "!".  I suspect
the reason the Firefox developers chose ! to separate the URL to the
JAR from the path within the JAR is that ! is not a valid URL
character.
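
A rough sketch of the first step of that interpretation, splitting a jar: URL into the enclosed URL and the path inside the archive. The function name `split_jar_url` is hypothetical, and splitting at the first "!/" is a simplifying assumption; this is not Firefox's or Java's actual implementation.

```python
def split_jar_url(url: str) -> tuple[str, str]:
    """Split 'jar:<inner-url>!/<entry>' into (inner URL, entry path)."""
    prefix = "jar:"
    if not url.startswith(prefix):
        raise ValueError("not a jar: URL")
    # Assumption: the first "!/" separates the archive URL from the entry.
    inner, sep, entry = url[len(prefix):].partition("!/")
    if not sep:
        raise ValueError("missing '!/' separator")
    return inner, entry


inner, entry = split_jar_url("jar:http://www.example.com/site.jar!/path/inside/foo.html")
print(inner)  # http://www.example.com/site.jar
print(entry)  # path/inside/foo.html
```

The remaining steps (fetching the inner URL and reading the entry out of the ZIP) are omitted here.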

> It should rather be
> <http://www.example.com/site.jar#path/inside/foo.html>.  It reads: retrieve
> the resource "site.jar" using the HTTP protocol and look into it for the
> fragment "foo.html".  I do not know how to read the original notation and I
> think it should be withdrawn.

Withdrawn from what?  This feature has already shipped in a number of
versions of Firefox.  The main value of using the packaged archive is
that the content author can sign the archive.  For example, this is
the mechanism used for Firefox extensions.

My guess is this mechanism will not be included in HTML 5 because some
of the other browser vendors have expressed their distaste for nested
URL schemes.


> Chris
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Adam Barth
> Sent: Sunday, July 27, 2008 11:33 PM
> To: Philipp Serafin
> Cc: [EMAIL PROTECTED]; Russell Leggett
> Subject: Re: [whatwg] Application deployment
>
> Firefox already implements this today with the jar protocol.  Put your
> content into a zip archive and access it using this kind of URL:
>
> jar:http://www.example.com/site.jar!/path/inside/foo.html
>
> I'm not sure many sites use this feature, but it has been a source of
> several recent security issues.
>
> Adam
>
>
>
>


Re: [whatwg] Application deployment

2008-07-27 Thread Kristof Zelechovski
<jar:http://www.example.com/site.jar!/path/inside/foo.html>?
What kind of a syntax is that??  JAR is not a protocol, it is a content
type.  It should rather be
<http://www.example.com/site.jar#path/inside/foo.html>.  It reads: retrieve
the resource "site.jar" using the HTTP protocol and look into it for the
fragment "foo.html".  I do not know how to read the original notation and I
think it should be withdrawn.
Chris
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Adam Barth
Sent: Sunday, July 27, 2008 11:33 PM
To: Philipp Serafin
Cc: [EMAIL PROTECTED]; Russell Leggett
Subject: Re: [whatwg] Application deployment

Firefox already implements this today with the jar protocol.  Put your
content into a zip archive and access it using this kind of URL:

jar:http://www.example.com/site.jar!/path/inside/foo.html

I'm not sure many sites use this feature, but it has been a source of
several recent security issues.

Adam





Re: [whatwg] Application deployment

2008-07-27 Thread Ian Hickson
On Sun, 27 Jul 2008, Russell Leggett wrote:
> 
> This is a suggestion that is more helpful to larger single page web 
> applications, but could also be very helpful to other resource intensive 
> web pages. My thought is that it could be extremely helpful to create 
> some kind of web application deployment format. Basically, the same idea 
> as what java does with jars (Java ARchives). A jar is basically just a 
> zip file with a different extension. Inside, it contains all the 
> resources required for that application, including code and images.

As Philipp commented, the MHTML spec already provides for this.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Application deployment

2008-07-27 Thread Adam Barth
Firefox already implements this today with the jar protocol.  Put your
content into a zip archive and access it using this kind of URL:

jar:http://www.example.com/site.jar!/path/inside/foo.html

I'm not sure many sites use this feature, but it has been a source of
several recent security issues.

Adam


On Sun, Jul 27, 2008 at 2:05 PM, Philipp Serafin <[EMAIL PROTECTED]> wrote:
> On Sun, Jul 27, 2008 at 8:44 PM, Russell Leggett
> <[EMAIL PROTECTED]> wrote:
>> Hi all,
>> I checked through the archives, but did not see anything, so if this has
>> been addressed already, I apologize.
>> This is a suggestion that is more helpful to larger single page web
>> applications, but could also be very helpful to other resource intensive web
>> pages. My thought is that it could be extremely helpful to create some kind
>> of web application deployment format. Basically, the same idea as what java
>> does with jars (Java ARchives). A jar is basically just a zip file with a
>> different extension. Inside, it contains all the resources required for that
>> application, including code and images.
>> How hard would it be to support something similar in a browser? Instead of
>> worrying about concatenating javascript and css files to reduce HTTP
>> requests, if all js,css,and even images and other files could be zipped up
>> or tarred, that would only require a single HTTP request. This could
>> basically just add the files to the browser cache or other local storage
>> mechanism so that requests for the resources would not need to make an extra
>> trip.
>> Thanks,
>> Russ
>
> I think for HTML, this is already covered by MHTML. The problem here is
> probably getting more people to implement it.
>
> If a more generic approach is wished, how about this:
> A new "archive" URI scheme, that works as follows:
>
> Generic syntax: archive:(<URI of archive>)/<path inside archive>
>
> The action would be:
>  - recursively evaluate the URI in the first part and fetch the
> specified resource;
>  - if the received resource is in a supported archive format, search
> for the file specified in the second part and extract it;
>
> Possible examples:
> archive:(http://example.com/~joe/mywebpage.rar)/index.html
> archive:(ftp://example.com/applet.jar)/com.example.applet/core/App.class
> archive:(archive:(http://example.com/~joe/app/build.tar.gz)/build.tar)/main.cpp
>


Re: [whatwg] Application deployment

2008-07-27 Thread Philipp Serafin
On Sun, Jul 27, 2008 at 8:44 PM, Russell Leggett
<[EMAIL PROTECTED]> wrote:
> Hi all,
> I checked through the archives, but did not see anything, so if this has
> been addressed already, I apologize.
> This is a suggestion that is more helpful to larger single page web
> applications, but could also be very helpful to other resource intensive web
> pages. My thought is that it could be extremely helpful to create some kind
> of web application deployment format. Basically, the same idea as what java
> does with jars (Java ARchives). A jar is basically just a zip file with a
> different extension. Inside, it contains all the resources required for that
> application, including code and images.
> How hard would it be to support something similar in a browser? Instead of
> worrying about concatenating javascript and css files to reduce HTTP
> requests, if all js,css,and even images and other files could be zipped up
> or tarred, that would only require a single HTTP request. This could
> basically just add the files to the browser cache or other local storage
> mechanism so that requests for the resources would not need to make an extra
> trip.
> Thanks,
> Russ
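
The proposal quoted above (one archive, one HTTP request, entries added to the cache) can be sketched roughly as follows. This is only an illustration under stated assumptions: the "cache" is a plain dict keyed by URL, the archive is built in memory instead of being downloaded, and the names build_bundle and unpack_into_cache are hypothetical.

```python
import io
import zipfile


def build_bundle(files):
    """Pack {path: bytes} into an in-memory zip archive and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, data in files.items():
            zf.writestr(path, data)
    return buf.getvalue()


def unpack_into_cache(bundle_bytes, base_url, cache):
    """Store every archive entry in cache under base_url + entry name."""
    with zipfile.ZipFile(io.BytesIO(bundle_bytes)) as zf:
        for name in zf.namelist():
            cache[base_url + name] = zf.read(name)
    return cache


# Stand-in for the single response a browser would receive.
bundle = build_bundle({
    "js/app.js": b"console.log('hi');",
    "css/site.css": b"body { margin: 0; }",
})

# Later requests for these URLs could be answered from the cache
# without another round trip.
cache = unpack_into_cache(bundle, "http://www.example.com/", {})
print(sorted(cache))
```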

I think for HTML, this is already covered by MHTML. The problem here is
probably getting more people to implement it.

If a more generic approach is wished, how about this:
A new "archive" URI scheme, that works as follows:

Generic syntax: archive:(<URI of archive>)/<path inside archive>

The action would be:
 - recursively evaluate the URI in the first part and fetch the
specified resource;
 - if the received resource is in a supported archive format, search
for the file specified in the second part and extract it;

Possible examples:
archive:(http://example.com/~joe/mywebpage.rar)/index.html
archive:(ftp://example.com/applet.jar)/com.example.applet/core/App.class
archive:(archive:(http://example.com/~joe/app/build.tar.gz)/build.tar)/main.cpp
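
As a rough illustration of the recursive evaluation described above, the sketch below splits an "archive:" URI into the innermost URI to fetch and the list of paths to extract, in extraction order. The function name parse_archive_uri is hypothetical, and the parser assumes that any parentheses inside the embedded URI are balanced; fetching and unpacking are left out.

```python
def parse_archive_uri(uri):
    """Return (innermost_uri, [path, ...]) for a possibly nested archive: URI.

    Paths are listed in the order they would be extracted.
    Hypothetical helper; assumes parentheses in the embedded URI are balanced.
    """
    scheme = "archive:"
    if not uri.startswith(scheme + "("):
        return uri, []  # plain URI: nothing to extract
    depth = 0
    # Find the ")" that closes the "(" right after the scheme.
    for i, ch in enumerate(uri[len(scheme):], start=len(scheme)):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                break
    else:
        raise ValueError("unbalanced parentheses")
    inner = uri[len(scheme) + 1:i]
    tail = uri[i + 1:]
    if not tail.startswith("/"):
        raise ValueError("expected '/' after ')'")
    # Evaluate the first part recursively, then append this level's path.
    base, paths = parse_archive_uri(inner)
    return base, paths + [tail[1:]]


print(parse_archive_uri(
    "archive:(archive:(http://example.com/~joe/app/build.tar.gz)/build.tar)/main.cpp"
))
# ('http://example.com/~joe/app/build.tar.gz', ['build.tar', 'main.cpp'])
```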