Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-26 Thread Anne van Kesteren
On Thu, 26 Aug 2010 02:28:49 +0200, Chris Double  
chris.dou...@double.co.nz wrote:
On Thu, Aug 26, 2010 at 5:25 AM, Eric Carlson eric.carl...@apple.com  
wrote:

FWIW, I agree with Silvia that a new file extension and MIME type make
sense.


I also think that a new file extension and MIME type is the way to go.


Would Firefox / Safari support text/srt files in some undocumented fashion  
then or just simply not support those? The former would not really be an  
acceptable solution to me.



--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-26 Thread Henri Sivonen
Silvia Pfeiffer wrote:
 You misunderstand my intent. I am by no means suggesting that no
 WebSRT
 content is treated as SRT by any application. All I am asking for is a
 different file extension and a different mime type and possibly a
 magic
 identifier such that *authoring* applications (and authors) can
 clearly
 designate this to be a different format, in particular if they include
 new
 features. Then a *playback application* has the chance to identify
 them as a
 different format and provide a specific parser for it, instead of
 failing
 like Totem. They can also decide to extend their existing SRT parser
 to
 support both WebSRT and SRT. And I also have no issue with a user
 deciding
 to give a WebSRT file a go by renaming it to .srt.
 
 By keeping WebSRT and SRT as different formats we give the
 applications a
 choice to support either, or both in the same parser. If we don't, we
 force
 them to deal in a single parser with all the oddities of SRT formats
 as well
 as all the extra features and all the extensibility of WebSRT.

Why wouldn't it always be a superior solution for all parties to do the 
following:
 1) Make sure WebSRT never requires processing that'd require rendering a 
substantial body of legacy .srt content in a broken way. (This would require 
supporting non-UTF-8 encodings by sniffing as well as supporting <font> and 
<u>, which would happen for free if my innerHTML proposal were adopted.)
 2) Make playback software that supports WebSRT only have a WebSRT code path 
and use that code path for legacy .srt content as well.
?

Specifically, if #1 is done, why would any pragmatic developer not want to do 
#2 if they are supporting WebSRT in their software? Why would anyone want to 
have a code path that turns off new WebSRT features if they have a code path 
that supports WebSRT features?

Or is #1 *impossible* due to the craziness of the legacy? (I thought any given 
.srt consumer only has a single code path and implementation-wise there aren't 
already multiple .srt formats even though doom9 spec-wise there are at least 
two.)

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-26 Thread Philip Jägenstedt

On Thu, 26 Aug 2010 09:58:29 +0200, Henri Sivonen hsivo...@iki.fi wrote:


Silvia Pfeiffer wrote:

You misunderstand my intent. I am by no means suggesting that no
WebSRT
content is treated as SRT by any application. All I am asking for is a
different file extension and a different mime type and possibly a
magic
identifier such that *authoring* applications (and authors) can
clearly
designate this to be a different format, in particular if they include
new
features. Then a *playback application* has the chance to identify
them as a
different format and provide a specific parser for it, instead of
failing
like Totem. They can also decide to extend their existing SRT parser
to
support both WebSRT and SRT. And I also have no issue with a user
deciding
to give a WebSRT file a go by renaming it to .srt.

By keeping WebSRT and SRT as different formats we give the
applications a
choice to support either, or both in the same parser. If we don't, we
force
them to deal in a single parser with all the oddities of SRT formats
as well
as all the extra features and all the extensibility of WebSRT.


Why wouldn't it always be a superior solution for all parties to do the  
following:
 1) Make sure WebSRT never requires processing that'd require rendering  
a substantial body of legacy .srt content in a broken way. (This would  
require supporting non-UTF-8 encodings by sniffing as well as supporting  
<font> and <u>, which would happen for free if my innerHTML proposal  
were adopted.)
 2) Make playback software that supports WebSRT only have a WebSRT code  
path and use that code path for legacy .srt content as well.

?

Specifically, if #1 is done, why would any pragmatic developer not want  
to do #2 if they are supporting WebSRT in their software? Why would  
anyone want to have a code path that turns off new WebSRT features if  
they have a code path that supports WebSRT features?


I think many media player developers would be hesitant to include a full  
HTML parser just for parsing (Web)SRT, especially since they'd also need a  
layout engine to get anything more than they would get from a simpler  
parser.


I do think it's a good idea to make the WebSRT handle existing SRT content  
as well as possible. The encoding issue is easy to side-step by just  
saying that that's a preprocessing step.
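
As a rough illustration of what such a preprocessing step could look like, here is a minimal Python sketch. The function name decode_subtitle_bytes and the windows-1252 fallback are assumptions made for the example; nothing in the WebSRT draft or in this thread prescribes them, and real encoding sniffing would likely need locale hints or byte-frequency heuristics.

```python
def decode_subtitle_bytes(data: bytes, fallback: str = "windows-1252") -> str:
    """Decode subtitle bytes, preferring UTF-8 and falling back to a legacy encoding.

    The fallback encoding is a hypothetical default chosen for illustration.
    """
    # Strip a UTF-8 byte order mark if one is present.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    try:
        # Strict decoding: any invalid UTF-8 sequence raises UnicodeDecodeError.
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so assume a legacy single-byte encoding instead.
        return data.decode(fallback, errors="replace")
```

The parser proper would then only ever see decoded text, which is what keeps the encoding question out of the format definition itself.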


Or is #1 *impossible* due to the craziness of the legacy? (I thought any  
given .srt consumer only has a single code path and implementation-wise  
there aren't already multiple .srt formats even though doom9 spec-wise  
there are at least two.)


There are some issues with the current WebSRT parser that I've been  
meaning to send mail about, but my impression is that it's not  
impossible to define a parser which works well enough to replace existing  
ones.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-26 Thread Henri Sivonen
  Why wouldn't it always be a superior solution for all parties to do
  the
  following:
   1) Make sure WebSRT never requires processing that'd require rendering
  a substantial body of legacy .srt content in a broken way. (This would
  require supporting non-UTF-8 encodings by sniffing as well as supporting
  <font> and <u>, which would happen for free if my innerHTML proposal
  were adopted.)
   2) Make playback software that supports WebSRT only have a WebSRT
   code
  path and use that code path for legacy .srt content as well.
  ?
 
  Specifically, if #1 is done, why would any pragmatic developer not
  want
  to do #2 if they are supporting WebSRT in their software? Why would
  anyone want to have a code path that turns off new WebSRT features
  if
  they have a code path that supports WebSRT features?
 
 I think many media player developers would be hesitant to include a
 full
 HTML parser just for parsing (Web)SRT, especially since they'd also
 need a
 layout engine to get anything more than they would get from a simpler
 parser.

If their app can ingest both WebSRT and legacy SRT (with WebSRT ingested by 
whatever potentially spec-incompliant means), why would they not use the same 
ingest code path for both?

If the app isn't capable of supporting any feature that's permitted in WebSRT 
but not part of legacy SRT, how does failing at the point of finding out that 
this file claims to be WebSRT rather than SRT make things much better than 
failing at "I found stuff that I can't handle/skip over in this SRT file"?

In particular, it seems like the wrong optimization to make it possible for apps 
that don't support any WebSRT features over legacy features to fail early, rather 
than to let apps that support at least one WebSRT-introduced feature unify their 
processing by treating WebSRT and SRT as one format in which legacy SRT files 
just don't happen to use the new features.

To me, having different code paths for WebSRT and SRT is like IE adding a new 
Trident snapshot with every release whereas supporting SRT by treating it as 
WebSRT with no new features (if the app is supporting even one 
WebSRT-introduced feature!) is like what the other browsers are doing with 
HTML/CSS/DOM.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-26 Thread Philip Jägenstedt
On Wed, 25 Aug 2010 17:40:08 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


 At this point, what is your recommendation? The following ideas have  
been

on the table:

* Change the file extension to something other than .srt.

I don't have an opinion, browsers ignore the file extension anyway.


 Yes, I think we should definitely have a new file extension.



I'll leave this to others to decide, but since browsers have no  
concept

of
file extensions, just using .srt will work. If the format is SRT-like
it's
likely at least some files will use .srt in practice.




All SRT files in practice use the .srt extension - it is typically how
these
formats are identified by applications. Just because *nix ignores file
extensions mostly for identifying file types doesn't mean that
applications
do. Again, I believe strongly that re-using the same file extension is  
the

one biggest pain we can inflict on the community.



As shown above, several popular (?) media players ignore or give little
weight to the file extension.



I don't think that's a fair sample - as I said, on Linux and on the
command-line things are different. I have a GUI mplayer here and it  
reacts

like VLC - doesn't let me open .wsrt files. The vast majority of
applications on Windows and the Mac make their decision on whether they
support files based on the file extension.


That the file selection dialogs are filtered by file extensions doesn't  
mean that applications don't sniff the content. In fact, MPlayer, VLC and  
Totem will happily load and use an SRT file even if it is called foo.smi,  
even though SAMI is a completely incompatible format. In other words, they  
sniff the content as being SRT. The reason that they rely on sniffing is  
likely that many files use the wrong file extension (the files in my  
OpenSubtitles batch have no extensions, so I have no statistics on this).
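
To make the idea of content sniffing concrete, here is a minimal Python sketch that decides whether a file "looks like" SRT purely from its content. It is not how MPlayer, VLC or Totem actually implement their detection; the function name looks_like_srt, the 50-line window and the exact regular expression are assumptions made for the example.

```python
import re

# A timing line such as "00:00:01,000 --> 00:00:04,000".
# Accepting "." as well as "," before the milliseconds is an assumption.
TIMING_RE = re.compile(
    r"^\s*\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}\s*-->\s*\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}"
)

def looks_like_srt(text: str, max_lines: int = 50) -> bool:
    """Return True if a timing line appears within the first max_lines lines.

    The file name is never consulted, so foo.smi or a file without any
    extension is judged only by what it contains.
    """
    for line in text.splitlines()[:max_lines]:
        if TIMING_RE.match(line):
            return True
    return False
```

A sniffer of this kind would also accept a file that carries an extra header line before the first cue, which is why a new extension or header alone does not keep such software away from the content.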


Again, if we want to avoid exposing existing SRT parsers to WebSRT syntax,  
then the format needs to be more incompatible. File extensions will be  
changed, popular players rely on sniffing, some ignore leading garbage and  
also headers can simply be removed by naive conversion tools.


Assuming we pick the same file extension and we now have a new application
that only supports WebSRT parsing, we will make a large bunch of existing
valid SRT files invalid - not only those that are not in UTF-8, but also
those with <font>...</font> and <u>...</u>. I do wonder if the text between
the <font> start and end element and inside the <u>...</u> may even get
removed because of lack of support for these.


I've seen no application that removes everything between tags it doesn't  
recognize, the only things that I've seen happen is treating it as plain  
text or ignoring the tags much like a browser does with HTML.
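
The second behaviour, ignoring unrecognized tags while keeping the text between them, can be sketched in a few lines of Python. This is only an illustration of the technique, not code from any player; the function name strip_unknown_tags and the default set of known tags are assumptions.

```python
import re

# Matches things that look like start or end tags, e.g. <i>, </font>, <1>.
TAG_RE = re.compile(r"</?[A-Za-z0-9][^<>]*>")
NAME_RE = re.compile(r"</?([A-Za-z0-9]+)")

def strip_unknown_tags(cue_text: str, known=("i", "b", "u", "font")) -> str:
    """Drop tags whose name is not in `known`, keeping the text between them."""
    def repl(match):
        name = NAME_RE.match(match.group(0)).group(1).lower()
        # Known tags are passed through untouched; unknown ones vanish.
        return match.group(0) if name in known else ""
    return TAG_RE.sub(repl, cue_text)
```

With the default set, strip_unknown_tags("<i>Hi</i> <ruby>漢<rt>kan</rt></ruby>") keeps the <i> markup and the text, and drops only the <ruby> and <rt> tags themselves.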



  * Add a header to WebSRT to make it uniquely identifiable.




The header would have to be mandatory and browsers would have to  
reject

files that don't have it. Such files would be compatible with some
existing
software and break some, depending on how they sniff. We could also  
put

metadata in such a header.


 Yes, I think we need to introduce a header. Maybe we can hide all the
structure in what SRT recognizes as comments (i.e. start the lines with ";").
But I believe we need some hints like the @profile to identify the type of
the cues and the <link> to link to a style sheet, and we need metadata like
the <meta> element of HTML headers.


I had no idea that semicolon was used for comments in SRT, is this  
usage

widespread? Does it work in most players?




I thought it was, but maybe it was just introduced for WebSRT. It is  
not
tested in Hixie's SRT research[2]. Can you take a quick look through  
your
SRT file collection if there are any? I'm probably wrong about this  
seeing

as it's not mentioned in the wiki page for SRT [3].

[2] http://wiki.whatwg.org/wiki/SRT_research
[3] http://en.wikipedia.org/wiki/SubRip



OK, I grepped the 1 files. Only 15 had any lines beginning with a
semicolon, and by manual inspection it doesn't look like any of them are
clearly intended as comments (it's hard to tell, all are in foreign
languages). None of them were at the very beginning of the file.



Ah, that actually makes for another incompatibility of WebSRT and SRT:  
such
lines are regarded as comments in WebSRT when they probably aren't in  
SRT.


I can't find anything about this when searching for comment and  
semicolon in the spec, are you sure you're not thinking of some other  
format than WebSRT?


It seems increasingly that the only thing that WebSRT and SRT still have in
common is the --> character sequence. As a friend of mine in a11y recently
said: "I was hoping to never have to stare at --> ever again..." We could
indeed go all the way and define a much more different format, though I
don't think it will create implementations as quickly as an SRT-based but
changed format.


I would prefer if we follow one of two paths:

1. Let 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-26 Thread Philip Jägenstedt

On Thu, 26 Aug 2010 11:52:26 +0200, Henri Sivonen hsivo...@iki.fi wrote:


 Why wouldn't it always be a superior solution for all parties to do
 the
 following:
  1) Make sure WebSRT never requires processing that'd require rendering
 a substantial body of legacy .srt content in a broken way. (This would
 require supporting non-UTF-8 encodings by sniffing as well as supporting
 <font> and <u>, which would happen for free if my innerHTML proposal
 were adopted.)
  2) Make playback software that supports WebSRT only have a WebSRT
  code
 path and use that code path for legacy .srt content as well.
 ?

 Specifically, if #1 is done, why would any pragmatic developer not
 want
 to do #2 if they are supporting WebSRT in their software? Why would
 anyone want to have a code path that turns off new WebSRT features
 if
 they have a code path that supports WebSRT features?

I think many media player developers would be hesitant to include a
full
HTML parser just for parsing (Web)SRT, especially since they'd also
need a
layout engine to get anything more than they would get from a simpler
parser.


If their app can ingest both WebSRT and legacy SRT (with WebSRT ingested  
by whatever potentially spec-incompliant means), why would they not use  
the same ingest code path for both?


I don't think they should or would, I'm just saying that they'd probably be  
hesitant to use an HTML parser in that single code path, as there's very  
little benefit for them.


If the app isn't capable of supporting any feature that's permitted in  
WebSRT but not part of legacy SRT, how does failing at the point of  
finding out that this file claims to be WebSRT rather than SRT make  
things much better than failing at "I found stuff that I can't  
handle/skip over in this SRT file"?


In particular, it seems like the wrong optimization to make it possible  
for apps that don't support any WebSRT features over legacy features to  
fail early, rather than to let apps that support at least one  
WebSRT-introduced feature unify their processing by treating WebSRT and  
SRT as one format in which legacy SRT files just don't happen to use the  
new features.


To me, having different code paths for WebSRT and SRT is like IE adding  
a new Trident snapshot with every release whereas supporting SRT by  
treating it as WebSRT with no new features (if the app is supporting  
even one WebSRT-introduced feature!) is like what the other browsers are  
doing with HTML/CSS/DOM.


Is this in reply to something other than what you quoted? In any case, I  
agree.


--
Philip Jägenstedt
Core Developer
Opera Software


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Silvia Pfeiffer
On Tue, Aug 24, 2010 at 8:49 PM, Philip Jägenstedt phil...@opera.com wrote:

 On Tue, 24 Aug 2010 04:32:21 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Mon, Aug 23, 2010 at 6:55 PM, Philip Jägenstedt phil...@opera.com
 wrote:

   Aside: WebSRT can't contain binary data, only UTF-8 encoded text.




 It sure can. Just base-64 encode it. I'm not saying it's a good thing,
 but
 if somebody really has an urge...


 Sure, this would be a metadata track. Sites have no reason to offer
 download links to it, and if anyone gets hold of such a file it would
 quickly be evident that it's useless.



 After a user has seen the crap on screen. I'm just saying: it's a legal
 WebSRT file and really not compatible with any existing infrastructure for
 SRT.


 A fair point. The alternatives I can see are (1) using an incompatible
 format so that the user sees nothing or (2) adding a header that indicates
 that the track is metadata.

 In order to tell the user to stop wasting their time with this file, I
 think (1) is clearly worse. (2) is absolutely an option, but it will only
 make a difference to software that understands this header and if the header
 is optional it will likely often be omitted. A dialog saying "this is a
 metadata track, you can't watch it" is slightly friendlier than a screen
 full of crap, but they are both pretty effective at getting the message
 across.



Yeah, I'm totally for adding a hint as to what format is in the cue. Then, a
WebSRT file can be identified as to what it contains.



   If we define WebSRT in a way that can handle 99% of existing content and

 degrade gracefully (enough) when using new features in old software, it
 seems reasonable to do. If lots of software developers cry foul, then
 perhaps we should reconsider. It seems to me, though, that actually
 researching and defining a good algorithm for parsing SRT would be of
 use
 to
 others than just browsers.


  How is that different from moving away from SRT. If everyone has to
 change
 their parsing of SRT to accommodate a new spec, then that is a new
 format.


 Not everyone has to change their parsers immediately, many will continue
 to
 work. However, if someone wants to support SRT in a compatible way, it's
 very helpful to have a spec, assuming that WebSRT is actually compatible
 enough with existing SRT content.

 This is quite similar to HTML4 vs HTML5. There are lots of mostly
 compatible HTML parsers, but HTML5 defines a single parsing algorithm,
 and
 slow convergence towards that is a good thing.


 No, no, no! It is not at all similar to HTML4 and HTML5. A Web browser
 cannot suddenly stop working for a Web page, just because it has some
 extra
 functionality in it. Thus, the HTML format has been developed such that it
 can be extended without breaking existing stuff. We can guarantee that no
 browser will break because that is the way in which the format has been
 specified.

 No such thing has happened for SRT and there is simply no way to guarantee
 that all new WebSRT files will work in all existing SRT software, because
 SRT has not been specified as an extensible format and because there is no
 agreement between all parties that have implemented SRT support as to how
 extensions should be made.

 We can introduce such a thing for WebSRT, but we cannot claim it for SRT.


 You are right, existing SRT parsers are probably far less interoperable
 than HTML parsers were before HTML5.

 Existing content demands that SRT parsers handle at least <i>, <b>, <font>
 and <u> in some manner, even if it is by ignoring it. Any parsers that treat
 SRT as plain text don't even work with today's content, so I don't think they
 should be considered at all.


You've just defined what SRT is. I would actually define SRT as the plain
text format and the <i>, <b>, <font> and <u> markup as extensions.



 The question, then, is if parsers that handle the mentioned markup also
 ignore <1>, <ruby> and <rt>. I haven't tested it, but I assume that some
 will ignore it and some won't. How many percent of the media player market
 would have to handle this correctly for these extensions to be OK, in your
 opinion?


If a single one breaks, it would be bad IMO because the expectations of the
users of that software will be broken even if it may just be a small
percentage of users and we have no influence on the upgrade path of that
software - in particular if it is proprietary.




  If the SRT ecosystem is so fragile that it cannot tolerate any extension
 whatsoever, then we should stay far away from it. It just seems that's
 not
 the case.



 How do we know that everyone that uses SRT now really wants to use WebSRT
 instead and wants to take part in the new ecosystem that we are
 introducing?
 We make some pretty big assumptions about what everyone who is not a Web
 browser vendor wants to do with SRT. That doesn't make the existing SRT
 ecosystem fragile - but it makes it an existing environment that needs to
 be
 respected.


 At this point, 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Philip Jägenstedt
On Wed, 25 Aug 2010 09:16:56 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Tue, Aug 24, 2010 at 8:49 PM, Philip Jägenstedt  
phil...@opera.com wrote:



On Tue, 24 Aug 2010 04:32:21 +0200, Silvia Pfeiffer 
silviapfeiff...@gmail.com wrote:

 On Mon, Aug 23, 2010 at 6:55 PM, Philip Jägenstedt phil...@opera.com

wrote:

  Aside: WebSRT can't contain binary data, only UTF-8 encoded text.








It sure can. Just base-64 encode it. I'm not saying it's a good  
thing,

but
if somebody really has an urge...



Sure, this would be a metadata track. Sites have no reason to offer
download links to it, and if anyone gets hold of such a file it would
quickly be evident that it's useless.




After a user has seen the crap on screen. I'm just saying: it's a legal
WebSRT file and really not compatible with any existing infrastructure  
for

SRT.



A fair point. The alternatives I can see are (1) using an incompatible
format so that the user sees nothing or (2) adding a header that  
indicates

that the track is metadata.

In order to tell the user to stop wasting their time with this file, I
think (1) is clearly worse. (2) is absolutely an option, but it will only
make a difference to software that understands this header and if the header
is optional it will likely often be omitted. A dialog saying "this is a
metadata track, you can't watch it" is slightly friendlier than a screen
full of crap, but they are both pretty effective at getting the message
across.




Yeah, I'm totally for adding a hint as to what format is in the cue.  
Then, a

WebSRT file can be identified as to what it contains.


OK, but note that a browser would ignore this and trust what track kind  
says. I wouldn't want the kind change after the external track is loaded,  
it would make the UI confusing if a captions track disappeared from the  
menu as soon as it was loaded because it internally claims to be metadata.


  If we define WebSRT in a way that can handle 99% of existing content  
and


degrade gracefully (enough) when using new features in old software,  
it
seems reasonable to do. If lots of software developers cry foul,  
then

perhaps we should reconsider. It seems to me, though, that actually
researching and defining a good algorithm for parsing SRT would be  
of

use
to
others than just browsers.


 How is that different from moving away from SRT. If everyone has to

change
their parsing of SRT to accommodate a new spec, then that is a new
format.


Not everyone has to change their parsers immediately, many will  
continue

to
work. However, if someone wants to support SRT in a compatible way,  
it's
very helpful to have a spec, assuming that WebSRT is actually  
compatible

enough with existing SRT content.

This is quite similar to HTML4 vs HTML5. There are lots of mostly
compatible HTML parsers, but HTML5 defines a single parsing algorithm,
and
slow convergence towards that is a good thing.



No, no, no! It is not at all similar to HTML4 and HTML5. A Web browser
cannot suddenly stop working for a Web page, just because it has some
extra
functionality in it. Thus, the HTML format has been developed such  
that it
can be extended without breaking existing stuff. We can guarantee that  
no

browser will break because that is the way in which the format has been
specified.

No such thing has happened for SRT and there is simply no way to guarantee
that all new WebSRT files will work in all existing SRT software, because
SRT has not been specified as an extensible format and because there is no
agreement between all parties that have implemented SRT support as to how
extensions should be made.

We can introduce such a thing for WebSRT, but we cannot claim it for  
SRT.




You are right, existing SRT parsers are probably far less interoperable
than HTML parsers were before HTML5.

Existing content demands that SRT parsers handle at least <i>, <b>, <font>
and <u> in some manner, even if it is by ignoring it. Any parsers that treat
SRT as plain text don't even work with today's content, so I don't think they
should be considered at all.



You've just defined what SRT is. I would actually define SRT as the plain
text format and the <i>, <b>, <font> and <u> markup as extensions.


Perhaps SRT was originally plain text, but for a very long time now, files  
with the .srt extension have contained markup; more than 50% do in the  
OpenSubtitles sample data. With nothing to differentiate the plain text  
and markup formats, there is effectively only one format, no matter what  
we choose to call it.



The question, then, is if parsers that handle the mentioned markup also
ignore <1>, <ruby> and <rt>. I haven't tested it, but I assume that some
will ignore it and some won't. How many percent of the media player market
would have to handle this correctly for these extensions to be OK, in your
opinion?



If a single one breaks, it would be bad IMO because the expectations of  
the

users of that software will be broken even if 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Silvia Pfeiffer
On Wed, Aug 25, 2010 at 7:20 PM, Philip Jägenstedt phil...@opera.com wrote:

 On Wed, 25 Aug 2010 09:16:56 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Tue, Aug 24, 2010 at 8:49 PM, Philip Jägenstedt phil...@opera.com
 wrote:

  On Tue, 24 Aug 2010 04:32:21 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Mon, Aug 23, 2010 at 6:55 PM, Philip Jägenstedt phil...@opera.com

 wrote:

  Aside: WebSRT can't contain binary data, only UTF-8 encoded text.





 It sure can. Just base-64 encode it. I'm not saying it's a good thing,
 but
 if somebody really has an urge...


  Sure, this would be a metadata track. Sites have no reason to offer
 download links to it, and if anyone gets hold of such a file it would
 quickly be evident that it's useless.



 After a user has seen the crap on screen. I'm just saying: it's a legal
 WebSRT file and really not compatible with any existing infrastructure
 for
 SRT.


 A fair point. The alternatives I can see are (1) using an incompatible
 format so that the user sees nothing or (2) adding a header that
 indicates
 that the track is metadata.

 In order to tell the user to stop wasting their time with this file, I
 think (1) is clearly worse. (2) is absolutely an option, but it will only
 make a difference to software that understands this header and if the
 header is optional it will likely often be omitted. A dialog saying "this
 is a metadata track, you can't watch it" is slightly friendlier than a
 screen full of crap, but they are both pretty effective at getting the
 message across.




 Yeah, I'm totally for adding a hint as to what format is in the cue. Then,
 a
 WebSRT file can be identified as to what it contains.


 OK, but note that a browser would ignore this and trust what track kind
 says. I wouldn't want the kind change after the external track is loaded, it
 would make the UI confusing if a captions track disappeared from the menu as
 soon as it was loaded because it internally claims to be metadata.


Yes, I have no problem with that. Though I believe we have overloaded @kind
with too much meaning as I already mentioned earlier [1]. I think it would
make more sense to pull the different dimensions into different attributes:
- @type or @format for the format of the cue
- @kind for the semantic meaning of it (subtitle, caption, karaoke etc) -
one track could even satisfy several needs, so this would be a list of kinds
- and finally the visual rendering problem, which could possibly be solved
by providing a link to a <div> or <p> where the data should be rendered
as an alternative to the default. Right now, audio and metadata tracks get no
rendering at all and I see that as a problem.


[1]
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-July/027356.html




  The question, then, is if parsers that handle the mentioned markup also
 ignore <1>, <ruby> and <rt>. I haven't tested it, but I assume that some
 will ignore it and some won't. How many percent of the media player market
 would have to handle this correctly for these extensions to be OK, in your
 opinion?



 If a single one breaks, it would be bad IMO because the expectations of
 the
 users of that software will be broken even if it may just be a small
 percentage of users and we have no influence on the upgrade path of that
 software - in particular if it is proprietary.


 Neither a new file extension, MIME type nor header is enough to stop some
 implementations from treating it as SRT and breaking. The only remaining
 option, AFAICT, is making the format fundamentally incompatible with SRT. Is
 it worth it?


If it has a different file extension and a different mime type and even a
different header, I don't think any existing software will open it as SRT.
Why would it think that a random file is a SRT file? It would need to be an
application that accepts absolutely anything that you give it as SRT and
then that software has more fundamental problems.



 At this point, what is your recommendation? The following ideas have been
 on the table:

 * Change the file extension to something other than .srt.

 I don't have an opinion, browsers ignore the file extension anyway.


 Yes, I think we should definitely have a new file extension.


 I'll leave this to others to decide, but since browsers have no concept of
 file extensions, just using .srt will work. If the format is SRT-like it's
 likely at least some files will use .srt in practice.


All SRT files in practice use the .srt extension - it is typically how these
formats are identified by applications. Just because *nix ignores file
extensions mostly for identifying file types doesn't mean that applications
do. Again, I believe strongly that re-using the same file extension is the
one biggest pain we can inflict on the community.




  * Change the MIME type to something other than text/srt.

 I doubt it makes any difference, as most software that deal with SRT
 today
 have no concept of MIME types. No matter what I'd want 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Philip Jägenstedt
On Wed, 25 Aug 2010 14:39:00 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Wed, Aug 25, 2010 at 7:20 PM, Philip Jägenstedt  
phil...@opera.com wrote:




The question, then, is if parsers that handle the mentioned markup also
ignore <1>, <ruby> and <rt>. I haven't tested it, but I assume that some
will ignore it and some won't. How many percent of the media player market
would have to handle this correctly for these extensions to be OK, in your
opinion?




If a single one breaks, it would be bad IMO because the expectations of the
users of that software will be broken even if it may just be a small
percentage of users and we have no influence on the upgrade path of that
software - in particular if it is proprietary.



Neither a new file extension, MIME type or header is enough to stop some
implementations from treating it as SRT and break. The only remaining
option, AFAICT, is making the format fundamentally incompatible with SRT.
Is it worth it?



If it has a different file extension and a different mime type and even a
different header, I don't think any existing software will open it as SRT.
Why would it think that a random file is a SRT file? It would need to be an
application that accepts absolutely anything that you give it as SRT and
then that software has more fundamental problems.


I renamed a SRT file to .wsrt and added WEBSRT on a line before the cues  
and it still plays just fine in MPlayer, using `mplayer video.ogv -sub  
subs.wsrt`. VLC won't open a subtitle file with .wsrt extension, but the  
same file (with a WEBSRT header) works with the extension srt or txt.  
Totem is the other way around, the file extension doesn't matter, but it  
rejects files with a header.


The results are hardly consistent, but at least one player exists for which
it's not enough to change the file extension and add a header. If we want
to make sure that no content is treated as SRT by any application, the
format must be more incompatible.


At this point, what is your recommendation? The following ideas have  
been

on the table:

* Change the file extension to something other than .srt.

I don't have an opinion, browsers ignore the file extension anyway.



Yes, I think we should definitely have a new file extension.



I'll leave this to others to decide, but since browsers have no concept  
of
file extensions, just using .srt will work. If the format is SRT-like  
it's

likely at least some files will use .srt in practice.



All SRT files in practice use the .srt extension - it is typically how  
these

formats are identified by applications. Just because *nix ignores file
extensions mostly for identifying file types doesn't mean that  
applications
do. Again, I believe strongly that re-using the same file extension is  
the

one biggest pain we can inflict on the community.


As shown above, several popular (?) media players ignore or give little  
weight to the file extension.



 * Change the MIME type to something other than text/srt.


I doubt it makes any difference, as most software that deal with SRT
today
have no concept of MIME types. No matter what I'd want exactly 1 MIME
type
or alternatively make browsers ignore the MIME type completely.


You're right in that existing SRT software probably doesn't deal much with
a SRT mime type. Right now text/x-srt or text/srt is sometimes used for SRT
files, but often text/plain is also in use and more likely from a Web
server. Since this is the space where Web browsers play, I am not overly
fussed, though I think logically text/websrt makes more sense with a .wsrt
extension. Then, also SRT files can be served as text/websrt to allow them
to take part in the WebSRT infrastructure if indeed they will continue to be
valid WebSRT files.



Is there anything you expect would break if WebSRT files were served as
text/srt?



I'm asking because I don't know how anal Web browsers are about mime types.
I would think a Web browser should accept WebSRT and SRT files in text/plain
format as well as WebSRT files in text/websrt format and SRT files in
text/srt format. Would something break if they even came as text/html? I
would expect that it makes a difference when these are loaded directly as a
resource for display (e.g. when you directly go to
http://example.com/mycaptions.wsrt), but not when used through a track
element, where WebSRT is the baseline format and thus is expected.


It's actually easier for a browser to ignore the MIME type than it is to  
be strict about it, at least when the format is easily identified by  
sniffing (sniffing code is needed anyway for local files). WebSRT isn't  
very easy to sniff, so that would be an argument in favor of a mandatory  
magic header.
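To make the sniffing argument concrete, here is a minimal Python sketch, not taken from any browser. It assumes the header is a first line containing exactly `WEBSRT` (the string used in the player experiment above, not something any spec defines here); the function name and the fallback heuristic for headerless SRT are hypothetical.

```python
import re

# Assumption: a WebSRT file starts with a line containing exactly "WEBSRT".
MAGIC = "WEBSRT"

# Fallback heuristic for headerless SRT: a cue timing line such as
# "00:00:01,000 --> 00:00:04,000" within the first few lines.
TIMING = re.compile(
    r"^\d{2,}:\d{2}:\d{2}[,.]\d{3}\s+-->\s+\d{2,}:\d{2}:\d{2}[,.]\d{3}"
)


def sniff_subtitle_format(data: bytes) -> str:
    """Return "websrt", "srt" or "unknown" for the leading bytes of a file."""
    # Only look at the start of the file and tolerate a UTF-8 byte order mark.
    text = data[:1024].decode("utf-8", errors="replace").lstrip("\ufeff")
    lines = text.splitlines()
    if lines and lines[0].strip() == MAGIC:
        return "websrt"  # cheap and unambiguous when the header is mandatory
    if any(TIMING.match(line.strip()) for line in lines[:10]):
        return "srt"  # guesswork: any text file with a timing line matches
    return "unknown"
```

With a mandatory header the first check is a single string comparison; without one, the sniffer has to fall back on the timing-line guess, which is the harder case described above.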


The main reason to care about the MIME type is some kind of doing the  
right thing by not letting people get away with misconfigured servers.  
Sometimes I feel it's just a waste of everyone's time though, it would  
generally 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Silvia Pfeiffer
On Thu, Aug 26, 2010 at 12:39 AM, Philip Jägenstedt phil...@opera.com wrote:

 On Wed, 25 Aug 2010 14:39:00 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Wed, Aug 25, 2010 at 7:20 PM, Philip Jägenstedt phil...@opera.com
 wrote:


  The question, then, is if parsers that handle the mentioned markup also
 ignore <1>, <ruby> and <rt>. I haven't tested it, but I assume that
 some
 will ignore it and some won't. How many percent of the media player
 market
 would have to handle this correctly for these extensions to be OK, in
 your
 opinion?



 If a single one breaks, it would be bad IMO because the expectations of
 the
 users of that software will be broken even if it may just be a small
 percentage of users and we have no influence on the upgrade path of that
 software - in particular if it is proprietary.


 Neither a new file extension, MIME type or header is enough to stop some
 implementations from treating it as SRT and break. The only remaining
 option, AFAICT, is making the format fundamentally incompatible with SRT.
 Is
 it worth it?



 If it has a different file extension and a different mime type and even a
 different header, I don't think any existing software will open it as SRT.
 Why would it think that a random file is a SRT file? It would need to be
 an
 application that accepts absolutely anything that you give it as SRT and
 then that software has more fundamental problems.


 I renamed a SRT file to .wsrt and added WEBSRT on a line before the cues
 and it still plays just fine in MPlayer, using `mplayer video.ogv -sub
 subs.wsrt`.



I wouldn't count command-line applications for this - you can always throw
just about anything at a command-line application and that is good and an
advantage, because it may just work, as it did here. But it is a controlled
environment by somebody who knows what they are doing - it is unlikely to
cause problems and confusion.



 VLC won't open a subtitle file with .wsrt extension, but the same file
 (with a WEBSRT header) works with the extension srt or txt.


Again - that's a good thing and exactly what I would prefer. If you know
what you are doing and you know your file is probably just going to work,
you can consciously decide to fall back to SRT.



 Totem is the other way around, the file extension doesn't matter, but it
 rejects files with a header.


That's just proof that it's a different file format.



 The results are hardly consistent, but at least one player exists for which
 it's not enough to change the file extension and add a header. If we want to
 make sure that no content is treated as SRT by any application, the format
 must be more incompatible.


You misunderstand my intent. I am by no means suggesting that no WebSRT
content is treated as SRT by any application. All I am asking for is a
different file extension and a different mime type and possibly a magic
identifier such that *authoring* applications (and authors) can clearly
designate this to be a different format, in particular if they include new
features. Then a *playback application* has the chance to identify them as a
different format and provide a specific parser for it, instead of failing
like Totem. They can also decide to extend their existing SRT parser to
support both WebSRT and SRT. And I also have no issue with a user deciding
to give a WebSRT file a go by renaming it to .srt.

By keeping WebSRT and SRT as different formats we give the applications a
choice to support either, or both in the same parser. If we don't, we force
them to deal in a single parser with all the oddities of SRT formats as well
as all the extra features and all the extensibility of WebSRT.




  At this point, what is your recommendation? The following ideas have been
 on the table:

 * Change the file extension to something other than .srt.

 I don't have an opinion, browsers ignore the file extension anyway.


  Yes, I think we should definitely have a new file extension.


 I'll leave this to others to decide, but since browsers have no concept
 of
 file extensions, just using .srt will work. If the format is SRT-like
 it's
 likely at least some files will use .srt in practice.



 All SRT files in practice use the .srt extension - it is typically how
 these
 formats are identified by applications. Just because *nix ignores file
 extensions mostly for identifying file types doesn't mean that
 applications
 do. Again, I believe strongly that re-using the same file extension is the
 one biggest pain we can inflict on the community.


 As shown above, several popular (?) media players ignore or give little
 weight to the file extension.


I don't think that's a fair sample - as I said, on Linux and on the
command-line things are different. I have a GUI mplayer here and it reacts
like VLC - doesn't let me open .wsrt files. The vast majority of
applications on Windows and the Mac make their decision on whether they
support files based on the file extension.

Assuming we pick the same 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Eric Carlson



On Aug 25, 2010, at 8:40 AM, Silvia Pfeiffer wrote:
 On Thu, Aug 26, 2010 at 12:39 AM, Philip Jägenstedt phil...@opera.com wrote:
  
 The results are hardly consistent, but at least one player exists for which
 it's not enough to change the file extension and add a header. If we want to 
 make sure that no content is treated as SRT by any application, the format 
 must be more incompatible.
 
 You misunderstand my intent. I am by no means suggesting that no WebSRT 
 content is treated as SRT by any application. All I am asking for is a 
 different file extension and a different mime type and possibly a magic 
 identifier such that *authoring* applications (and authors) can clearly 
 designate this to be a different format, in particular if they include new 
 features. Then a *playback application* has the chance to identify them as a 
 different format and provide a specific parser for it, instead of failing 
 like Totem. They can also decide to extend their existing SRT parser to 
 support both WebSRT and SRT. And I also have no issue with a user deciding to 
 give a WebSRT file a go by renaming it to .srt.
 
 By keeping WebSRT and SRT as different formats we give the applications a 
 choice to support either, or both in the same parser. If we don't, we force 
 them to deal in a single parser with all the oddities of SRT formats as well 
 as all the extra features and all the extensibility of WebSRT. 
 

 
 I think we've made some interesting finds in this thread, but we're starting 
 to go in circles by now. Perhaps we should give it a rest until we get input 
 from a third party. A medal to anyone who has followed it this far :)


  FWIW, I agree with Silvia that a new file extension and MIME type make sense. 

  Keeping them the same won't help applications that don't know about WebSRT, 
they will try to play the files and aren't likely to deal with the differences 
gracefully. Keeping them the same also won't help new applications that know 
about WebSRT, it won't make any difference if there is one MIME type or two.

eric



Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Chris Double
On Thu, Aug 26, 2010 at 2:39 AM, Philip Jägenstedt phil...@opera.com wrote:
 It's actually easier for a browser to ignore the MIME type than it is to be
 strict about it, at least when the format is easily identified by sniffing
 (sniffing code is needed anyway for local files).

Firefox (in the case of video) uses file extensions to identify video
files. We have an internal mapping of file extensions to mime types. We
don't sniff the content. I imagine we'd do the same with whatever file
extension is used for WebSRT.
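
An extension-to-type mapping of this kind can be sketched in a few lines of Python. This is only an illustration, not Firefox code: the table, the function name, and the `.wsrt` / `text/websrt` pairing (both merely proposed in this thread) are assumptions.

```python
import os
from typing import Optional

# Hypothetical table: ".wsrt" and "text/websrt" are proposals from this
# thread, not registered values.
EXTENSION_TO_MIME = {
    ".srt": "text/srt",
    ".wsrt": "text/websrt",
}


def mime_type_for(path: str) -> Optional[str]:
    """Look up a MIME type from the file extension alone, without sniffing."""
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_TO_MIME.get(extension)
```

Under such a scheme the file content is never inspected, so a shared `.srt` extension would leave the two formats indistinguishable.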

Chris.
-- 
http://www.bluishcoder.co.nz


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Chris Double
On Thu, Aug 26, 2010 at 2:39 AM, Philip Jägenstedt phil...@opera.com wrote:

 The main reason to care about the MIME type is some kind of doing the right
 thing by not letting people get away with misconfigured servers. Sometimes
 I feel it's just a waste of everyone's time though, it would generally be
 less work for both browsers and authors to not bother.

I disagree that this is the main reason. I was a web developer before
being a browser developer and I can say it was highly annoying dealing
with browsers that sniff content types. There were times where we
wanted to send a file as plain text or binary data but the browser
would sniff it and attempt to handle it.

Chris.
-- 
http://www.bluishcoder.co.nz


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-25 Thread Chris Double
On Thu, Aug 26, 2010 at 5:25 AM, Eric Carlson eric.carl...@apple.com wrote:

   FWIW, I agree with Silvia that a new file extension and MIME type make
 sense.

I also think that a new file extension and MIME type is the way to go.

Chris.
-- 
http://www.bluishcoder.co.nz


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-24 Thread Philip Jägenstedt
On Tue, 24 Aug 2010 04:32:21 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Mon, Aug 23, 2010 at 6:55 PM, Philip Jägenstedt  
phil...@opera.com wrote:



 Aside: WebSRT can't contain binary data, only UTF-8 encoded text.





It sure can. Just base-64 encode it. I'm not saying it's a good thing,  
but

if somebody really has an urge...



Sure, this would be a metadata track. Sites have no reason to offer
download links to it, and if anyone gets hold of such a file it would
quickly be evident that it's useless.



After a user has seen the crap on screen. I'm just saying: it's a legal
WebSRT file and really not compatible with any existing infrastructure  
for

SRT.


A fair point. The alternatives I can see are (1) using an incompatible  
format so that the user sees nothing or (2) adding a header that indicates  
that the track is metadata.


In order to tell the user to stop wasting their time with this file, I
think (1) is clearly worse. (2) is absolutely an option, but it will only
make a difference to software that understands this header and if the
header is optional it will likely often be omitted. A dialog saying "this
is a metadata track, you can't watch it" is slightly friendlier than a
screen full of crap, but they are both pretty effective at getting the
message across.


 If we define WebSRT in a way that can handle 99% of existing content and
degrade gracefully (enough) when using new features in old software, it
seems reasonable to do. If lots of software developers cry foul, then
perhaps we should reconsider. It seems to me, though, that actually
researching and defining a good algorithm for parsing SRT would be of use to
others than just browsers.


How is that different from moving away from SRT? If everyone has to change
their parsing of SRT to accommodate a new spec, then that is a new format.




Not everyone has to change their parsers immediately, many will continue to
work. However, if someone wants to support SRT in a compatible way, it's
very helpful to have a spec, assuming that WebSRT is actually compatible
enough with existing SRT content.

This is quite similar to HTML4 vs HTML5. There are lots of mostly
compatible HTML parsers, but HTML5 defines a single parsing algorithm, and
slow convergence towards that is a good thing.



No, no, no! It is not at all similar to HTML4 and HTML5. A Web browser
cannot suddenly stop working for a Web page, just because it has some extra
functionality in it. Thus, the HTML format has been developed such that it
can be extended without breaking existing stuff. We can guarantee that no
browser will break because that is the way in which the format has been
specified.

No such thing has happened for SRT and there is simply no way to guarantee
that all new WebSRT files will work in all existing SRT software, because
SRT has not been specified as an extensible format and because there is no
agreement between all parties that have implemented SRT support as to how
extensions should be made.

We can introduce such a thing for WebSRT, but we cannot claim it for SRT.


You are right, existing SRT parsers are probably far less interoperable  
than HTML parsers were before HTML5.


Existing content demands that SRT parsers handle at least <i>, <b>, <font>
and <u> in some manner, even if it is by ignoring it. Any parsers that
treat SRT as plain text don't even work with today's content, so I don't
think they should be considered at all. The question, then, is if parsers
that handle the mentioned markup also ignore <1>, <ruby> and <rt>. I
haven't tested it, but I assume that some will ignore it and some won't.
How many percent of the media player market would have to handle this
correctly for these extensions to be OK, in your opinion?
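
What "ignoring" unknown markup could mean in a player is sketched below in Python. This is an illustration under stated assumptions, not the behaviour of any existing player: the set of tags the player renders, the regular expression, and the function name are all hypothetical.

```python
import re

# Tags a hypothetical legacy SRT player knows how to render.
KNOWN_TAGS = {"i", "b", "u", "font"}

# Matches an opening or closing tag such as <i>, </ruby>, <1> or <font color="red">.
TAG = re.compile(r"</?([A-Za-z0-9]+)[^<>]*>")


def drop_unknown_tags(cue_text: str) -> str:
    """Keep known tags for the renderer and silently drop all others."""
    def keep_or_drop(match):
        return match.group(0) if match.group(1).lower() in KNOWN_TAGS else ""
    return TAG.sub(keep_or_drop, cue_text)
```

A player written this way shows the text of `<ruby>` and `<rt>` content without the annotation layout; a player that only strips a fixed list of tags would instead print the unknown ones literally.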



If the SRT ecosystem is so fragile that it cannot tolerate any extension
whatsoever, then we should stay far away from it. It just seems that's  
not

the case.



How do we know that everyone that uses SRT now really wants to use WebSRT
instead and wants to take part in the new ecosystem that we are  
introducing?

We make some pretty big assumptions about what everyone who is not a Web
browser vendor wants to do with SRT. That doesn't make the existing SRT
ecosystem fragile - but it makes it an existing environment that needs  
to be

respected.


At this point, what is your recommendation? The following ideas have been  
on the table:


* Change the file extension to something other than .srt.

I don't have an opinion, browsers ignore the file extension anyway.

* Change the MIME type to something other than text/srt.

I doubt it makes any difference, as most software that deal with SRT today  
have no concept of MIME types. No matter what I'd want exactly 1 MIME type  
or alternatively make browsers ignore the MIME type completely.


* Add a header to WebSRT to make it uniquely identifiable.

The header would have to be mandatory and browsers would have to reject  

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-23 Thread Philip Jägenstedt
On Sat, 21 Aug 2010 01:32:49 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Fri, Aug 20, 2010 at 10:53 PM, Philip Jägenstedt  
phil...@opera.com wrote:



On Wed, 18 Aug 2010 00:42:04 +0200, Silvia Pfeiffer 
silviapfeiff...@gmail.com wrote:

 On Thu, Aug 12, 2010 at 6:09 PM, Philip Jägenstedt phil...@opera.com

wrote:

Yeah, so the only conforming solution is probably to use CSS3
transition-delay property. That may not be the most elegant solution,  
but

it
works.



So, it seems clear that in order to use an HTML parser we have to  
sacrifice some features or make them more verbose.



That sounds like there are multiple problems, when in fact we are only
talking about the single use case of timestamps.


I was referring also to the voices markup which is made much more verbose.


All other requirements are
met by the existing innerHTML parser. Is it really necessary to throw out
all the advantages of re-using innerHTML just to avoid some extra markup  
for this single use case?


No, this isn't a critical use case in itself. I'm not fundamentally  
opposed to using an HTML parser, I just don't see any great benefits, but  
some complications.



The whole of the WebSRT parser isn't very big or complicated, so I don't
think implementation cost is a strong argument for reusing the HTML  
parser,

especially since at least the timing syntax needs a separate parser.




It's not just about implementation cost - it's also the problem of
maintaining another spec that can grow to have eventually all the features
that HTML5 has and more. Do you really eventually want to re-spec and
re-implement a whole innerHTML parser plus the extra <t> element when we
start putting <svg> and <canvas> and all sorts of other more complex HTML
features into captions? Just because the <t> element is making trouble now?

Is this really the time to re-invent HTML?


I don't expect that SVG, canvas, images, etc will ever natively be made  
part of captions. Rather, I would hope that the metadata state together  
with scripts is used. If we think that e.g. images in captions are an  
important use case, then WebSRT is not a good solution.


If we allow arbitrary HTML and expect browsers to handle it well, it adds  
some complexity. For example, any videos and images in the cue would have  
to be fully loaded and ready to be decoded by the time the cue is to be  
shown, which I really don't want to implement the logic for. Simply having  
an iframe-like container where the document is replaced for each cue  
wouldn't be enough, rather one would have to create one document per cue  
during parsing and wait for all of those to finish loading before  
beginning playback. I'm not sure, but I'm guessing that amounts to  
significant memory overhead.


As an aside, I personally see it as a good thing that <font> *doesn't*
work in WebSRT, whereas it would using an HTML parser.



It's a bit more than just annoying to users. If there are automated
processes involved that print that stuff on tape for example, you can  
burn
through a lot of material and money before realising that your input  
files

are broken and if you cannot get software support for the new files
implemented, you may need to implement costly manual checking of the
files.



SRT as it is today can and does contain broken timestamps, missing
linebreaks and at least <i>, <b>, <u> and <font ...> markup, some of which
is broken. If anyone is able to rely on their input as being well-formed
enough as to be put through automatic but costly processes, they'd have to
have very good control of where their input comes from. I can't see how
WebSRT would change that.



I would indeed expect a fairly trusted relationship with the supplier. But
assuming your supplier changes from SRT to WebSRT support in their captions.
If they have two different file extensions, you will notice immediately and
there is a trigger to actually start implementing WebSRT support. If they
are the same file extension, that will cause the trouble I explained. If at
least there was a version identifier in existing SRT, then we wouldn't have
that trouble at all. But we've had this discussion.



 The core problem is that WebSRT is far too compatible with existing SRT
usage. Regardless of the file extension and MIME type used, it's quite
improbable that anyone will have different parsers for the same format. Once
media players have been forced to handle the extra markup in WebSRT (e.g. by
ignoring it, as many already do) the two formats will be the same, and using
WebSRT markup in .srt files will just work, so that's what people will do.
We may avoid being seen as arrogant format-hijackers, but the end result is
two extensions and two different MIME types that mean exactly the same
thing.




It actually boils down to the question: do we want the simple SRT format to
survive as its own format and be something that people can rely upon as not
having weird stuff in it - or 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-23 Thread Silvia Pfeiffer
On Mon, Aug 23, 2010 at 6:55 PM, Philip Jägenstedt phil...@opera.com wrote:

 On Sat, 21 Aug 2010 01:32:49 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Fri, Aug 20, 2010 at 10:53 PM, Philip Jägenstedt phil...@opera.com
 wrote:

  On Wed, 18 Aug 2010 00:42:04 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Thu, Aug 12, 2010 at 6:09 PM, Philip Jägenstedt phil...@opera.com

 wrote:

 Yeah, so the only conforming solution is probably to use CSS3
 transition-delay property. That may not be the most elegant solution,
 but
 it
 works.


 So, it seems clear that in order to use an HTML parser we have to
 sacrifice some features or make them more verbose.



 That sounds like there are multiple problems, when in fact we are only
 talking about the single use case of timestamps.


 I was referring also to the voices markup which is made much more verbose.


Yeah, but that one actually makes sense to be integrated with existing ways
that class and CSS work.


I don't expect that SVG, canvas, images, etc will ever natively be made
 part of captions. Rather, I would hope that the metadata state together with
 scripts is used. If we think that e.g. images in captions are an important
 use case, then WebSRT is not a good solution.


I believe they will be. But since we are only looking at the ways in which
captions and subtitles are used currently, we haven't accepted this as an
important use case, which is fair enough. I am considering likely future use
though, which is always hard to argue.



 If we allow arbitrary HTML and expect browsers to handle it well, it adds
 some complexity. For example, any videos and images in the cue would have to
 be fully loaded and ready to be decoded by the time the cue is to be shown,
 which I really don't want to implement the logic for. Simply having an
 iframe-like container where the document is replaced for each cue wouldn't
 be enough, rather one would have to create one document per cue during
 parsing and wait for all of those to finish loading before beginning
 playback. I'm not sure, but I'm guessing that amounts to significant memory
 overhead.


I have to leave that discussion to others, since I don't know enough about
how the plumbing used in browsers works together. My expectation was that
most of the plumbing with innerHTML already exists and the loading/display
of the cue will be in parallel to the video playback, so it won't hold back
the main page, even if e.g. an <img> element cannot be loaded in time.




 Honestly, using the existing small mess around SRT as an excuse to turn it
 into a huge mess doesn't seem a good argument to me.


 I'm just saying that SRT isn't a plain text format today and anyone who's
 able to assume it is can only do so because they control the input.

 Deployed SRT uses <i>, <b>, <font> and <u>. WebSRT adds <ruby>, <rt> and
 <1>...<infinity>, extensions which are very much in line with the existing
 format and already work in many players (in the sense that they are
 ignored, not rendered). I wouldn't call that a huge mess.


And removes <font> and <u>. And adds a whole swag of other functionality,
which is not in line with the existing format. Just picking a part of the
WebSRT specification that may work if the software was written sanely isn't
really a fair argument for compatibility. But anyway, I think at this stage
we can only agree to disagree about whether SRT and WebSRT are compatible.
:-)




  Aside: WebSRT can't contain binary data, only UTF-8 encoded text.



 It sure can. Just base-64 encode it. I'm not saying it's a good thing, but
 if somebody really has an urge...


 Sure, this would be a metadata track. Sites have no reason to offer
 download links to it, and if anyone gets hold of such a file it would
 quickly be evident that it's useless.


After a user has seen the crap on screen. I'm just saying: it's a legal
WebSRT file and really not compatible with any existing infrastructure for
SRT.



  If we define WebSRT in a way that can handle 99% of existing content and
 degrade gracefully (enough) when using new features in old software, it
 seems reasonable to do. If lots of software developers cry foul, then
 perhaps we should reconsider. It seems to me, though, that actually
 researching and defining a good algorithm for parsing SRT would be of use
 to
 others than just browsers.


 How is that different from moving away from SRT. If everyone has to change
 their parsing of SRT to accommodate a new spec, then that is a new format.


 Not everyone has to change their parsers immediately, many will continue to
 work. However, if someone wants to support SRT in a compatible way, it's
 very helpful to have a spec, assuming that WebSRT is actually compatible
 enough with existing SRT content.

 This is quite similar to HTML4 vs HTML5. There are lots of mostly
 compatible HTML parsers, but HTML5 defines a single parsing algorithm, and
 slow convergence towards that is a good thing.


No, no, no! It is not at all 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-23 Thread Julian Reschke

On 24.08.2010 04:32, Silvia Pfeiffer wrote:

...
P.S. I do wonder if anyone other than us is still following this thread. ;-)

 ...

I do. It seems that embrace & extend is somewhat unfriendly unless the
original SRT community is ok with it. If it's not, then make sure that
the formats can be distinguished, and that there are distinct media types.


Best regards, Julian


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-20 Thread Philip Jägenstedt
On Wed, 18 Aug 2010 00:42:04 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Thu, Aug 12, 2010 at 6:09 PM, Philip Jägenstedt  
phil...@opera.com wrote:



On Thu, 12 Aug 2010 02:11:55 +0200, Silvia Pfeiffer 
silviapfeiff...@gmail.com wrote:

 On Thu, Aug 12, 2010 at 1:26 AM, Philip Jägenstedt phil...@opera.com

wrote:

 On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer 

silviapfeiff...@gmail.com wrote:

 On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt  
phil...@opera.com



wrote:

 On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer 


silviapfeiff...@gmail.com wrote:


  Going with HTML in the cues, we either have to drop voices and  
inner


timestamps or invent new markup, as HTML can't express either. I  
don't

think
either of those are really good solutions, so right now I'm not
convinced
that reusing the innerHTML parser is a good way forward.




I don't see a need for the voices - they already have markup in HTML, see
above. But I do wonder about the timestamps. I'd much rather keep the
innerHTML parser if we can, but I don't know enough about how the timestamps
could be introduced in a non-breakable manner. Maybe with a data- attribute?
Maybe <span data-t="00:00:02.100">...</span>?


data- attributes are reserved for use by scripts on the same page,  
but we
*could* of course introduce new elements or attributes for this  
purpose.
However, adding features to HTML only for use in WebSRT seems a bit  
odd.





I'd rather avoid adding features to HTML only for WebSRT. Ian turned the
timestamps into ProcessingInstructions
http://www.whatwg.org/specs/web-apps/current-work/websrt.html#websrt-cue-text-dom-construction-rules
.
Could we introduce something like <?t at=00:00:02.100?> without breaking
the innerHTML parser?



It appears that the innerHTML parser in at least Opera and Firefox  
handles

PIs in some manner, see test at 
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/587



Chrome and Safari don't though.




However, it isn't valid HTML, validator.nu says "Saw <?. Probable cause:
Attempt to use an XML processing instruction in HTML. (XML processing
instructions are not supported in HTML.)"




Yeah, so the only conforming solution is probably to use CSS3
transition-delay property. That may not be the most elegant solution,  
but it

works.


So, it seems clear that in order to use an HTML parser we have to  
sacrifice some features or make them more verbose. The whole of the WebSRT  
parser isn't very big or complicated, so I don't think implementation cost  
is a strong argument for reusing the HTML parser, especially since at  
least the timing syntax needs a separate parser.



 OTOH, if you say that it will take a short time for popular software to

start ignoring the extra WebSRT stuff, well, in this case they have
implemented WebSRT support in its most basic form and then there is no
problem any more anyway. They will then accept the new files and their
extensions and mime types and there is explicit support rather than the
dodgy question of whether these SRT files will provide crap or not.  
During

a
transition period, we will make all software that currently supports  
SRT
become unstable and unreliable. I don't think that's the right way to  
deal
with an existing ecosystem. Coming in as the big brother, claiming  
their
underspecified format, throwing in incompatible features, and saying:  
just

deal with it. It's just not the cavalier thing to do.



I agree that it seems (and is) quite selfish, but am not sure the
alternatives are any better, see below. About unstable and  
unreliable, I

think there are really only two kinds of errors we will see:

1. Some cues being ignored due to trailing settings after the timestamp.



Some files may decide at this point that the files are not conformant and
fail.




2. Markup being interpreted as plain text.

Both already can and do happen with existing use of SRT, which is  
annoying

but better than no subtitles at all.



It's a bit more than just annoying to users. If there are automated
processes involved that print that stuff on tape for example, you can  
burn
through a lot of material and money before realising that your input  
files

are broken and if you cannot get software support for the new files
implemented, you may need to implement costly manual checking of the  
files.


SRT as it is today can and does contain broken timestamps, missing  
linebreaks and at least <i>, <b>, <u> and <font ...> markup, some of which  
is broken. If anyone is able to rely on their input being  
well-formed enough to be put through automatic but costly processes,  
they'd have to have very good control of where their input comes from. I  
can't see how WebSRT would change that.


The core problem is that WebSRT is far too compatible with existing  
SRT

usage. Regardless of the file extension and MIME type used, it's quite
improbable that anyone will have different parsers for the same format.  
Once
media 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-18 Thread Julian Reschke

On 18.08.2010 00:43, Silvia Pfeiffer wrote:

On Wed, Aug 18, 2010 at 5:12 AM, Julian Reschke julian.resc...@gmx.de
mailto:julian.resc...@gmx.de wrote:

On 12.08.2010 10:09, Philip Jägenstedt wrote:

...

The core problem is that WebSRT is far too compatible with
existing SRT usage. Regardless of the file extension and MIME
type used, it's quite improbable that anyone will have different
parsers for the same format. Once media players have been forced
to handle the extra markup in WebSRT (e.g. by ignoring it, as
many already do) the two formats will be the same, and using
WebSRT markup in .srt files will just work, so that's what
people will do. We may avoid being seen as arrogant
format-hijackers, but the end result is two extensions and two
different MIME types that mean exactly the same thing.

  ...

(just observing...)

So when something that used to be plain text now carries markup,
what's the compatibility story for plain text that happens to
contain markup characters, such as <, > or &?

Best regards, Julian


I assume you mean: what happens to text that contains such characters?
In most SRT systems, such stuff will just be displayed verbatim.


Yes, in SRT. But in WebSRT? Isn't there a compatibility problem when the 
format just switches from plain text to possibly escaped text?
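
To make the concern concrete, here is a minimal sketch in Python. It assumes HTML-style character references for escaping, which is only an assumption for illustration; the thread does not settle how WebSRT escapes cue text. The function names `render_as_plain_srt` and `render_as_markup` are hypothetical.

```python
import html


def render_as_plain_srt(cue_text: str) -> str:
    """Model of a legacy SRT player: cue text is shown verbatim."""
    return cue_text


def render_as_markup(cue_text: str) -> str:
    """Hypothetical markup-aware renderer: character references are decoded."""
    return html.unescape(cue_text)


# A legacy cue that happens to contain markup characters.
legacy_cue = "if x < y && y > z"

# The same cue written for a markup-aware format (assumed HTML-style escaping).
escaped_cue = html.escape(legacy_cue, quote=False)

print(render_as_markup(escaped_cue))     # reads as the author intended
print(render_as_plain_srt(escaped_cue))  # a legacy player shows the escapes
```

The same bytes give two different renderings depending on which rules the player applies, which is the kind of ambiguity RSS titles ran into.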


(I recall the problems with title handling in RSS, and I want to make 
sure that people have considered this issue)


Best regards, Julian


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-17 Thread Julian Reschke

On 12.08.2010 10:09, Philip Jägenstedt wrote:

...
The core problem is that WebSRT is far too compatible with existing SRT 
usage. Regardless of the file extension and MIME type used, it's quite improbable that 
anyone will have different parsers for the same format. Once media players have been 
forced to handle the extra markup in WebSRT (e.g. by ignoring it, as many already do) the 
two formats will be the same, and using WebSRT markup in .srt files will just work, so 
that's what people will do. We may avoid being seen as arrogant format-hijackers, but the 
end result is two extensions and two different MIME types that mean exactly the same 
thing.

 ...

(just observing...)

So when something that used to be plain text now carries markup, what's 
the compatibility story for plain text that happens to contain markup 
characters, such as <, > or &?


Best regards, Julian


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-17 Thread Silvia Pfeiffer
On Thu, Aug 12, 2010 at 6:09 PM, Philip Jägenstedt phil...@opera.com wrote:

 On Thu, 12 Aug 2010 02:11:55 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Thu, Aug 12, 2010 at 1:26 AM, Philip Jägenstedt phil...@opera.com
 wrote:

  On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.com

 wrote:

  On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer 

 silviapfeiff...@gmail.com wrote:


   Going with HTML in the cues, we either have to drop voices and inner

 timestamps or invent new markup, as HTML can't express either. I don't
 think
 either of those are really good solutions, so right now I'm not
 convinced
 that reusing the innerHTML parser is a good way forward.



 I don't see a need for the voices - they already have markup in HTML,
 see
 above. But I do wonder about the timestamps. I'd much rather keep the
 innerHTML parser if we can, but I don't know enough about how the
 timestamps
 could be introduced in a non-breakable manner. Maybe with a data-
 attribute?
 Maybe <span data-t="00:00:02.100">...</span>?


 data- attributes are reserved for use by scripts on the same page, but we
 *could* of course introduce new elements or attributes for this purpose.
 However, adding features to HTML only for use in WebSRT seems a bit odd.



 I'd rather avoid adding features to HTML only for WebSRT. Ian turned the
 timestamps into ProcessingInstructions

 http://www.whatwg.org/specs/web-apps/current-work/websrt.html#websrt-cue-text-dom-construction-rules
 .
 Could we introduce something like <?t at="00:00:02.100"?> without
 breaking
 the innerHTML parser?


 It appears that the innerHTML parser in at least Opera and Firefox handles
 PIs in some manner, see test at 
 http://software.hixie.ch/utilities/js/live-dom-viewer/saved/587


Chrome and Safari don't though.



 However, it isn't valid HTML, validator.nu says "Saw <?. Probable cause:
 Attempt to use an XML processing instruction in HTML. (XML processing
 instructions are not supported in HTML.)"



Yeah, so the only conforming solution is probably to use CSS3
transition-delay property. That may not be the most elegant solution, but it
works.



  OTOH, if you say that it will take a short time for popular software to
 start ignoring the extra WebSRT stuff, well, in this case they have
 implemented WebSRT support in its most basic form and then there is no
 problem any more anyway. They will then accept the new files and their
 extensions and mime types and there is explicit support rather than the
 dodgy question of whether these SRT files will provide crap or not. During
 a
 transition period, we will make all software that currently supports SRT
 become unstable and unreliable. I don't think that's the right way to deal
 with an existing ecosystem. Coming in as the big brother, claiming their
 underspecified format, throwing in incompatible features, and saying: just
 deal with it. It's just not the cavalier thing to do.


 I agree that it seems (and is) quite selfish, but am not sure the
 alternatives are any better, see below. About unstable and unreliable, I
 think there are really only two kinds of errors we will see:

 1. Some cues being ignored due to trailing settings after the timestamp.


Some files may decide at this point that the files are not conformant and
fail.



 2. Markup being interpreted as plain text.

 Both already can and do happen with existing use of SRT, which is annoying
 but better than no subtitles at all.


It's a bit more than just annoying to users. If there are automated
processes involved that print that stuff on tape for example, you can burn
through a lot of material and money before realising that your input files
are broken and if you cannot get software support for the new files
implemented, you may need to implement costly manual checking of the files.




 The core problem is that WebSRT is far too compatible with existing SRT
 usage. Regardless of the file extension and MIME type used, it's quite
 improbable that anyone will have different parsers for the same format. Once
 media players have been forced to handle the extra markup in WebSRT (e.g. by
 ignoring it, as many already do) the two formats will be the same, and using
 WebSRT markup in .srt files will just work, so that's what people will do.
 We may avoid being seen as arrogant format-hijackers, but the end result is
 two extensions and two different MIME types that mean exactly the same
 thing.


It actually boils down to the question: do we want the simple SRT format to
survive as its own format and be something that people can rely upon as not
having weird stuff in it - or do we not. I believe that it's important
that it survives. WebSRT can have absolutely anything in it, including code
and binary data, even if that stuff would not be interpreted in a browser,
but handed on to the JavaScript API for a JavaScript routine to do something
with it. 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-17 Thread Silvia Pfeiffer
On Wed, Aug 18, 2010 at 5:12 AM, Julian Reschke julian.resc...@gmx.de wrote:

 On 12.08.2010 10:09, Philip Jägenstedt wrote:

 ...

 The core problem is that WebSRT is far too compatible with existing SRT
 usage. Regardless of the file extension and MIME type used, it's quite
 improbable that anyone will have different parsers for the same format. Once
 media players have been forced to handle the extra markup in WebSRT (e.g. by
 ignoring it, as many already do) the two formats will be the same, and using
 WebSRT markup in .srt files will just work, so that's what people will do.
 We may avoid being seen as arrogant format-hijackers, but the end result is
 two extensions and two different MIME types that mean exactly the same
 thing.

  ...

 (just observing...)

 So when something that used to be plain text now carries markup, what's the
 compatibility story for plain text that happens to contain markup
 characters, such as <, > or &?

 Best regards, Julian


I assume you mean: what happens to text that contains such characters? In
most SRT systems, such stuff will just be displayed verbatim.

Cheers,
Silvia.


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-12 Thread Philip Jägenstedt
On Thu, 12 Aug 2010 02:11:55 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Thu, Aug 12, 2010 at 1:26 AM, Philip Jägenstedt  
phil...@opera.com wrote:



On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer 
silviapfeiff...@gmail.com wrote:

 On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.com

wrote:

 On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer 

silviapfeiff...@gmail.com wrote:



 Going with HTML in the cues, we either have to drop voices and inner

timestamps or invent new markup, as HTML can't express either. I don't
think
either of those are really good solutions, so right now I'm not  
convinced

that reusing the innerHTML parser is a good way forward.




I don't see a need for the voices - they already have markup in HTML,  
see

above. But I do wonder about the timestamps. I'd much rather keep the
innerHTML parser if we can, but I don't know enough about how the
timestamps
could be introduced in a non-breakable manner. Maybe with a data-
attribute?
Maybe <span data-t="00:00:02.100">...</span>?



data- attributes are reserved for use by scripts on the same page, but  
we

*could* of course introduce new elements or attributes for this purpose.
However, adding features to HTML only for use in WebSRT seems a bit odd.



I'd rather avoid adding features to HTML only for WebSRT. Ian turned the
timestamps into ProcessingInstructions
http://www.whatwg.org/specs/web-apps/current-work/websrt.html#websrt-cue-text-dom-construction-rules.
Could we introduce something like <?t at="00:00:02.100"?> without
breaking
the innerHTML parser?


It appears that the innerHTML parser in at least Opera and Firefox handles  
PIs in some manner, see test at  
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/587


However, it isn't valid HTML, validator.nu says "Saw <?. Probable cause:  
Attempt to use an XML processing instruction in HTML. (XML processing  
instructions are not supported in HTML.)"



  That would make text/srt and text/websrt synonymous, which is kind of



pointless.





No, it's only pointless if you are a browser vendor. For everyone  
else

it
is
a huge advantage to be able to choose between a guaranteed simple  
format

and
a complex format with all the bells and whistles.



 The advantages of taking text/srt is that all existing software to
create


SRT can be used to create WebSRT




That's not strictly true. If they load a WebSRT file that was  
created by

some other software for further editing and that WebSRT file uses
advanced
WebSRT functionality, the authoring software will break.


Right, especially settings appended after the timestamps are quite  
likely

to be stripped when saving the file.




Or may even break the software if it's badly implemented, or may end up
inside the cue text - just like the other control instructions which  
will

end up as plain text inside the cue. You won't believe how many people
have
pointed out to me that my SRT test parser exposed an <i> tag markup in  
the

cue text rather than interpreting it, when I was experimenting with
applying
SRT cues in a HTML div without touching the cue text content.  
Extraneous

markup is really annoying.



Indeed, but given the option of seeing no subtitles at all and seeing  
some
markup from time to time, which do you prefer? For a long time I was  
using a

media player that didn't handle HTML in SRT and wasn't very amused at
seeing <i> and similar, but it was sure better than no subtitles at  
all. I

doubt it will take long for popular software to start ignoring things
trailing the timestamp and things in square brackets, which is all you  
need

for basic compatibility. Some of the tested software already does so.
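
A minimal sketch of that kind of lenient handling, in Python. It is an illustration under assumptions, not any player's actual code: the regular expression, the function names `parse_timing_line` and `strip_bracketed`, and the sample trailing settings are all hypothetical.

```python
import re

# Timing line: start --> end, optionally followed by anything else.
# The trailing part is captured but deliberately ignored.
TIMING_LINE = re.compile(
    r"^\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
    r"\s*-->\s*"
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})"
    r"(?P<trailing>.*)$"
)


def parse_timing_line(line):
    """Return (start_ms, end_ms), ignoring whatever trails the end timestamp."""
    match = TIMING_LINE.match(line)
    if match is None:
        return None
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups()[:8])
    start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
    end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
    return start, end


def strip_bracketed(cue_text):
    """Drop anything in square brackets from the cue text."""
    return re.sub(r"\[[^\]]*\]", "", cue_text)


# The trailing "A:start L:10%" stands in for cue settings; its exact
# syntax is an assumption made for this example.
print(parse_timing_line("00:00:02,100 --> 00:00:04,500 A:start L:10%"))
print(strip_bracketed("Hello [00:00:03.000]world"))
```

A parser written this way accepts plain SRT timing lines unchanged, so the leniency costs existing files nothing.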



Hmm... not sure if I'd prefer to see the crap or rather be forced to run  
it
through a stripping tool first. I think what would happen is that I'd  
start

watching the movie, then notice the crap, get annoyed, stop it, run a
stripping tool, restart the movie. I'd probably prefer noticing that  
before

I start the movie, which would happen if the file was a different format.
But it does take a bit of expert knowledge to know that websrt can be
easily converted to srt and to have such a stripping tool installed, I  
give

you that.


Indeed, it never struck me to take the time to strip away the extra  
markup, even though I would have known how. Instead I waited until my  
media player could do the job for me.



OTOH, if you say that it will take a short time for popular software to
start ignoring the extra WebSRT stuff, well, in this case they have
implemented WebSRT support in its most basic form and then there is no
problem any more anyway. They will then accept the new files and their
extensions and mime types and there is explicit support rather than the
dodgy question of whether these SRT files will provide crap or not.  
During a

transition period, we will make all software that currently supports SRT
become unstable and unreliable. I don't think that's 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Anne van Kesteren
On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:

That's a good approach and will reduce the need for breaking
backwards-compatibility. In an xml-based format that need is 0, while  
with a text format where the structure is ad-hoc, that need can never be  
reduced to 0. That's what I am concerned about and that's why I think we  
need a version identifier. If we end up never using/changing the version  
identifier, the

better so. But I'd much rather we have it now and can identify what
specification a file adheres to than not being able to do so later.


XML is also text-based. ;-) But more seriously, if we ever need to make  
changes that would completely break backwards compatibility we should just  
use a new format rather than fit it into an existing one. That is the  
approach we have for most formats (and APIs) on the web (CSS, HTML,  
XMLHttpRequest) and so far a version identifier need (or need for a  
replacement) has not yet arisen.


Might be worth reading through some of:  
http://www.w3.org/2002/09/wbs/40318/issues-4-84-objection-poll/results



On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt  
phil...@opera.com wrote:

That would make text/srt and text/websrt synonymous, which is kind of
pointless.


No, it's only pointless if you are a browser vendor. For everyone else  
it is a huge advantage to be able to choose between a guaranteed simple  
format and a complex format with all the bells and whistles.


But it is not complex at all and everyone else supports most of the  
extensions the WebSRT format has.



--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Silvia Pfeiffer
On Wed, Aug 11, 2010 at 5:04 PM, Anne van Kesteren ann...@opera.com wrote:

 On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

 That's a good approach and will reduce the need for breaking
 backwards-compatibility. In an xml-based format that need is 0, while with
 a text format where the structure is ad-hoc, that need can never be reduced
 to 0. That's what I am concerned about and that's why I think we need a
 version identifier. If we end up never using/changing the version
 identifier, the
 better so. But I'd much rather we have it now and can identify what
 specification a file adheres to than not being able to do so later.


 XML is also text-based. ;-)


I mean unstructured text. ;-)



 But more seriously, if we ever need to make changes that would completely
 break backwards compatibility we should just use a new format rather than
 fit it into an existing one.



That's exactly the argument I am using for why WebSRT should be a new format
and not take over the SRT space. They are different enough to not just be
versions of each other. That's actually what I care about a lot more than a
version field.



 That is the approach we have for most formats (and APIs) on the web (CSS,
 HTML, XMLHttpRequest) and so far a version identifier need (or need for a
 replacement) has not yet arisen.


There are Web formats with a version attribute, such as Atom, RSS and even
HTTP has a version number. Also, I can see that structured formats with a
clear path for how extensions would be included may not need such a version
attribute. WebSRT is not such a structured format, which is what makes all
the difference. For example, you simply cannot put a new element outside the
root element in XML, but you can easily put a new element anywhere in WebSRT
- which might actually make a lot of sense if you think e.g. about adding
SVG and CSS inline in future.



 Might be worth reading through some of:
 http://www.w3.org/2002/09/wbs/40318/issues-4-84-objection-poll/results


I guess you mostly wanted me to read
http://berjon.com/blog/2009/12/xmlbp-naive-versioning.html . :-)
It's a nice discussion with some good experiences. Interesting that we need
quirks mode to deal with versioning issues.

It doesn't take into account good practice in software development, though,
where there is a minor version number and a major version number. A change
of the minor version number is ignored by apps that need to display
something - it just gives a hint that new features were introduced that
shouldn't break anything. It's basically metadata to give a hint to
applications where it really matters, e.g. if an application relies on new
features to be available. A change of major version number, however,
essentially means it's a new format and thus breaks existing stuff to allow
the world to move forwards within the same namespace and experience
framework.
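
As a rough sketch of that convention in Python: a reader compares only the major number and treats the minor number as a hint. The `can_process` function and the "major.minor" string layout are hypothetical; neither SRT nor WebSRT defines a version field.

```python
def can_process(file_version: str, supported_major: int) -> bool:
    """Accept any minor version as long as the major version matches."""
    major_text, _, _minor_text = file_version.partition(".")
    # The minor number is informational only and is not compared.
    return int(major_text) == supported_major


print(can_process("1.3", supported_major=1))  # newer minor: still readable
print(can_process("2.0", supported_major=1))  # new major: treated as a new format
```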

But let's get this resolved. I don't care enough about this to make a fuss.

So ... if we do everything possible to make WebSRT flexible for future
changes (which is what Philip proposed) and agree that if we cannot extend
WebSRT in a backwards compatible manner, we will create a new format, I can
live without a version attribute.

I am only a little wary of this, because already we are trying to make SRT
and WebSRT the same format when there is no compatibility (see below).



  On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com
 wrote:

 That would make text/srt and text/websrt synonymous, which is kind of
 pointless.


 No, it's only pointless if you are a browser vendor. For everyone else it
 is a huge advantage to be able to choose between a guaranteed simple format
 and a complex format with all the bells and whistles.


 But it is not complex at all and everyone else supports most of the
 extensions the WebSRT format has.



All of the WebSRT extensions that do not exist in {basic SRT, <b>, <i>}
 are not supported by anyone yet.
Existing SRT authoring tools, media players, transcoding tools, etc. do not
support the cue settings (see
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue-settings),
or parsing of random text in the cues, or the voice markers. So, I disagree
with "everyone else supports most of the extensions of the WebSRT format".

Also, what I mean by the word "complex" is actually a good thing: a format
that supports lots of requirements that go beyond the basic ones. Thus, it's
actually a good thing to have a simple format (i.e. SRT) and a complex
(maybe rather: rich? capable?) format (i.e. WebSRT).

Cheers,
Silvia.


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Anne van Kesteren
On Wed, 11 Aug 2010 10:30:23 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:
On Wed, Aug 11, 2010 at 5:04 PM, Anne van Kesteren ann...@opera.com  
wrote:
That is the approach we have for most formats (and APIs) on the web  
(CSS, HTML, XMLHttpRequest) and so far a version identifier need (or  
need for a replacement) has not yet arisen.


There are Web formats with a version attribute, such as Atom, RSS and  
even HTTP has a version number.


None of these have really executed a successful version strategy though.  
Syndication in particular is quite bad, we should learn from that.


See e.g. http://diveintomark.org/archives/2004/02/04/incompatible-rss



Also, I can see that structured formats with a
clear path for how extensions would be included may not need such a  
version
attribute. WebSRT is not such a structured format, which is what makes  
all the difference. For example, you simply cannot put a new element  
outside the root element in XML, but you can easily put a new element  
anywhere in WebSRT - which might actually make a lot of sense if you  
think e.g. about adding

SVG and CSS inline in future.


There are all kinds of ways we could address this. For instance, we could  
add a feature that makes a line ignored and use that in the future for new  
features. While players are transitioning to WebSRT they will ensure that  
they do not break with future versions of the format. There might be  
enough extensibility in the current WebSRT parsing rules for this, I have  
not checked.



It doesn't take into account good practice in software development,  
though, where there is a minor version number and a major version  
number. A change of the minor version number is ignored by apps that  
need to display

something - it just gives a hint that new features were introduced that
shouldn't break anything. It's basically metadata to give a hint to
applications where it really matters, e.g. if an application relies on  
new features to be available. A change of major version number, however,
essentially means it's a new format and thus breaks existing stuff to  
allow the world to move forwards within the same namespace and  
experience

framework.


What works for software products does not work for formats with universal  
deployment on which we want to get interoperability between various  
vendors. They are very distinct.




But it is not complex at all and everyone else supports most of the
extensions the WebSRT format has.


All of the WebSRT extensions that do not exist in {basic SRT, <b>, <i>}
are not supported by anyone yet.
Existing SRT authoring tools, media players, transcoding tools, etc. do  
not

support the cue settings (see
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue-settings),
or parsing of random text in the cues, or the voice markers. So, I  
disagree with "everyone else supports most of the extensions of the  
WebSRT format".


Do they throw an error or do they just ignore the settings? If the latter  
it does not seem like a problem. If the former authors will probably not  
use these features for a while until they are better supported.



Also, what I mean by the word "complex" is actually a good thing: a  
format that supports lots of requirements that go beyond the basic ones.  
Thus, it's actually a good thing to have a simple format (i.e. SRT) and  
a complex

(maybe rather: rich? capable?) format (i.e. WebSRT).


I don't think so. It just makes things more complex for authors (learn two  
formats, have to convert formats (i.e. change mime) in order to use new  
features (which could be as simple as a ruby fragment for some Japanese  
track), more complex for implementors (need two separate implementations  
as to not encourage authors to use features of the more complex one in the  
less complex one), more complex for conformance checkers (need more code),  
etc. Seems highly suboptimal to me.



--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Anne van Kesteren
On Wed, 11 Aug 2010 13:35:30 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:
On Wed, Aug 11, 2010 at 7:31 PM, Anne van Kesteren ann...@opera.com  
wrote:
While players are transitioning to WebSRT they will ensure that they do  
not break with future versions of the format.


That's impossible, since we do not know what future versions will look  
like and what features we may need.


If that is impossible it would be impossible for HTML and CSS too. And  
clearly it is not.




I'm pretty sure that several will break. We cannot just test a handful of
available applications and if they don't break assume none will. In fact,
all existing applications that get loaded with a WebSRT file with  
extended features will display text with stuff that is not expected - in  
particular if the metadata case is used. And wrong rendering is bad,  
e.g. if it's

part of a production process, burnt onto the video, and shipped to
hearing-impaired customers. Or stored in an archive.


Sure, that's why the tools should be updated to support the standard  
format instead rather than each having their own variant of SRT.


(And if they really just take in text like that they should at least run  
some kind of validation so not all kinds of garbage can get in.)



I don't think so. It just makes things more complex for authors (learn  
two formats,


I see that as an advantage: I can learn the simple format and be off to a
running start immediately. Then, when I find out that I need more  
features, I can build on top of already existing knowledge for the  
richer format and can convert my old files through a simple renaming of  
the resources.


Or you could learn the simple format from a tutorial that only teaches  
that and when you see someone else using more complex features you can  
just copy and paste them and use them directly. This is pretty much how  
the web works.




have to convert formats (i.e. change mime) in order to use new features
(which could be as simple as a ruby fragment for some Japanese track)


If I know from the start that I need these features, I will immediately
learn WebSRT.


But you don't.



, more complex for implementors (need two separate implementations as to
not encourage authors to use features of the more complex one in the  
less
complex one), more complex for conformance checkers (need more code),  
etc.

Seems highly suboptimal to me.


That's already part of Ian's proposal: it already supports multiple
different approaches of parsing cues. No extra complexity here.


Actually that is not true. There is only one approach to parsing in Ian's  
proposal.



My theory is: we only implement support for WebSRT in the browser - that  
it happens to also support SRT is a positive side effect. It works for  
the Web - and it works for the existing SRT communities and platforms.  
They know
they have to move to WebSRT in the long run, but right now they can get  
away with simple SRT support and still deliver for the Web. And they  
have a

growth path into a new file format that provides richer features.


This is the proposal. That they are the same format should not matter.


--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Philip Jägenstedt
On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt  
phil...@opera.com wrote:



On Tue, 10 Aug 2010 01:34:02 +0200, Silvia Pfeiffer 
silviapfeiff...@gmail.com wrote:

 On Tue, Aug 10, 2010 at 12:04 AM, Philip Jägenstedt phil...@opera.com

wrote:

 On Sat, 07 Aug 2010 09:57:39 +0200, Silvia Pfeiffer 

silviapfeiff...@gmail.com wrote:



I guess this is in support of Henri's proposal of parsing the cue  
using

the
HTML fragment parser (same as innerHTML)? That would be easy to
implement,
but how do we then mark up speakers? Using <span class="narrator"></span>

around each cue is very verbose. HTML isn't very good for marking up
dialog,
which is quite a limitation when dealing with subtitles...



I actually think that the span @class mechanism is much more flexible
than
what we have in WebSRT right now. If we want multiple speakers to be  
able

to
speak in the same subtitle, then that's not possible in WebSRT. It's a
little more verbose in HTML, but not massively.

We might be able to add a special markup similar to the [timestamp]
markup
that Hixie introduced for Karaoke. This is beyond the innerHTML parser  
and

I
am not sure if it breaks it. But if it doesn't, then maybe we can also
introduce a [voice] marker to be used similarly?



An HTML parser parsing <1> or <00:01:30> will produce text nodes "<1>" and
"<00:01:30>". Without having read the HTML parsing algorithm I guess that
elements need to begin with a letter or similar. So, it's not possible to
(ab)use the HTML parser to handle inner timestamps or numerical voices, we'd
have to replace those with something else, probably more verbose.




I have checked the parse spec and
http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state indeed
implies that a tag starting with a number is a parse error. Both the
timestamps and the voice markers thus seem to be problems when going with an
innerHTML parser. Is there a way to resolve this? I mean: I'd quite happily
drop the voice markers for a span @class but I am not sure what to do
about the timestamps. We could do what I did in WMML and introduce a <t>
element with the timestamp as an @at attribute, but that is again more
verbose. We could also introduce an @at attribute in <span> which would then
at least end up in the DOM and can be dealt with specially.


What should numerical voices be replaced with? Personally I'd much rather
write <philip> and <silvia> to mark up a conversation between us two, as I
think it'd be quite hard to keep track of the numbers if editing subtitles
with many different speakers. However, going with that and using an HTML
parser is quite a hack. Names like <mark> and <li> may already have
special parsing rules or default CSS.


Going with HTML in the cues, we either have to drop voices and inner  
timestamps or invent new markup, as HTML can't express either. I don't  
think either of those are really good solutions, so right now I'm not  
convinced that reusing the innerHTML parser is a good way forward.


 Think for example about the case where we had a requirement that a double
newline starts a new cue, but now we want to introduce a means where the
double newline is escaped and can be made part of a cue.

Other formats keep track of their version, such as MS Word files. It is to
be hoped that most new features can be introduced without breaking backwards
compatibility and we can write the parsing requirements such that certain
things will be ignored, but in and of itself, WebSRT doesn't provide for
this extensibility. Right now, there is for example extensibility with the
WebSRT settings parsing (that's the stuff behind the timestamps) where
further setting:value settings can be introduced. But for example the
introduction of new cue identifiers (that's the  marker at the start of
a cue) would be difficult without a version string, since anything that
doesn't match the given list will just be parsed as cue-internal tag and
thus end up as part of the cue text where plain text parsing is used.



The bug I filed suggested allowing arbitrary voices, to simplify the parser
and to make future extensions possible. For a web format I think this is a
better approach than versioning. I haven't done a full review of the
parser, but there are probably more places where it could be more forgiving
so as to allow future tweaking.




That's a good approach and will reduce the need for breaking
backwards-compatibility. In an XML-based format that need is 0, while with a
text format where the structure is ad-hoc, that need can never be reduced to
0. That's what I am concerned about and that's why I think we need a version
identifier. If we end up never using/changing the version identifier, the
better so. But I'd much rather we have it now and can identify what
specification a file adheres to than not being able to do so later.


Perhaps I'm too 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Silvia Pfeiffer
On Wed, Aug 11, 2010 at 9:49 PM, Anne van Kesteren ann...@opera.com wrote:

 On Wed, 11 Aug 2010 13:35:30 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

 On Wed, Aug 11, 2010 at 7:31 PM, Anne van Kesteren ann...@opera.com
 wrote:

 While players are transitioning to WebSRT they will ensure that they do
 not break with future versions of the format.


 That's impossible, since we do not know what future versions will look
 like and what features we may need.


 If that is impossible it would be impossible for HTML and CSS too. And
 clearly it is not.


HTML and CSS have predefined structures within which their languages grow
and are able to grow. WebSRT has newlines to structure the format, which is
clearly not very useful for extensibility. No matter how we turn this, the
XML background of HTML and the name-value background of CSS provide them
with in-built extensibility, which WebSRT does not have.




  I'm pretty sure that several will break. We cannot just test a handful of
 available applications and if they don't break assume none will. In fact,
 all existing applications that get loaded with a WebSRT file with extended
 features will display text with stuff that is not expected - in particular
 if the metadata case is used. And wrong rendering is bad, e.g. if it's
 part of a production process, burnt onto the video, and shipped to
 hearing-impaired customers. Or stored in an archive.


 Sure, that's why the tools should be updated to support the standard format
 instead rather than each having their own variant of SRT.


They don't have their own variant of SRT - they only have their own parsers.
Some will tolerate crap at the end of the --> line. Others won't. That's
no break of conformance to the basic spec as given in
http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all
interoperate on the basic SRT format. But they don't interoperate on the
WebSRT format. That's why WebSRT has to be a new format.




 (And if they really just take in text like that they should at least run
 some kind of validation so not all kinds of garbage can get in.)


That's not a requirement of the spec. Its requirement is to render
whatever characters are given in cues. That's why it is so simple.




  I don't think so. It just makes things more complex for authors (learn two
 formats,


 I see that as an advantage: I can learn the simple format and be off to a
 running start immediately. Then, when I find out that I need more
 features, I can build on top of already existing knowledge for the richer
 format and can convert my old files through a simple renaming of the
 resources.


 Or could you learn the simple format from a tutorial that only teaches that
 and when you see someone else using more complex features you can just copy
 and paste them and use them directly. This is pretty much how the web works.


Sure. All I need to do is rename the file. Not much trouble at all. Better
than believing I can just copy stuff from others since it's apparently the
same format and then it breaks the SRT environment that I already have and
that works.




  have to convert formats (i.e. change mime) in order to use new features
 (which could be as simple as a ruby fragment for some Japanese track)


 If I know from the start that I need these features, I will immediately
 learn WebSRT.


 But you don't.


Why? If I write Japanese subtitles and my tutorial tells me they are not
supported in SRT, but only in WebSRT, then I go for WebSRT. Done.




  , more complex for implementors (need two separate implementations as to
 not encourage authors to use features of the more complex one in the less
 complex one), more complex for conformance checkers (need more code),
 etc.
 Seems highly suboptimal to me.


 That's already part of Ian's proposal: it already supports multiple
 different approaches of parsing cues. No extra complexity here.


 Actually that is not true. There is only one approach to parsing in Ian's
 proposal.



At the moment, cues can have one of two different types of content
(see
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0 ):

6. The cue payload: either WebSRT cue text
(http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-cue-text)
or WebSRT metadata text
(http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#websrt-metadata-text).

So that means in essence two different parsers.




  My theory is: we only implement support for WebSRT in the browser - that
 it happens to also support SRT is a positive side effect. It works for the
 Web - and it works for the existing SRT communities and platforms. They know
 they have to move to WebSRT in the long run, but right now they can get
 away with simple SRT support and still deliver for the Web. And they have a
 growth path into a new file format that provides richer features.


 This is the proposal. That they are the same format should not matter.


It matters to other 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Silvia Pfeiffer
On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.com wrote:

 On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com
 wrote:

 I have checked the parse spec and
 http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state indeed
 implies that a tag starting with a number is a parse error. Both, the
 timestamps and the voice markers thus seem problems when going with an
 innerHTML parser. Is there a way to resolve this? I mean: I'd quite
 happily
 drop the voice markers for a span @class but I am not sure what to do
 about the timestamps. We could do what I did in WMML and introduce a t
 element with the timestamp as a @at attribute, but that is again more
 verbose. We could also introduce an @at attribute in span which would
 then
 at least end up in the DOM and can be dealt with specially.


 What should numerical voices be replaced with? Personally I'd much rather
 write philip and silvia to mark up a conversation between us two, as I
 think it'd be quite hard to keep track of the numbers if editing subtitles
 with many different speakers. However, going with that and using an HTML
 parser is quite a hack. Names like mark and li may already have special
 parsing rules or default CSS.


In HTML it is <span class=philip>..</span> and <span
class=silvia>...</span>. I don't see anything wrong with that. And it's
only marginally longer than <philip> ... </philip> and <silvia>...</silvia>.



 Going with HTML in the cues, we either have to drop voices and inner
 timestamps or invent new markup, as HTML can't express either. I don't think
 either of those are really good solutions, so right now I'm not convinced
 that reusing the innerHTML parser is a good way forward.


I don't see a need for the voices - they already have markup in HTML, see
above. But I do wonder about the timestamps. I'd much rather keep the
innerHTML parser if we can, but I don't know enough about how the timestamps
could be introduced in a non-breakable manner. Maybe with a data- attribute?
Maybe <span data-t=00:00:02.100>...</span>?
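
To make the data- attribute idea concrete, here is a minimal Python sketch
of how a consumer could read per-span timestamps out of a cue. It assumes
the hypothetical data-t attribute suggested above and an HH:MM:SS.mmm
timestamp; TimedSpans and to_seconds are names made up for illustration,
and Python's html.parser stands in for a real innerHTML parser.

```python
from html.parser import HTMLParser


def to_seconds(stamp):
    """Convert an 'HH:MM:SS.mmm' timestamp to seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


class TimedSpans(HTMLParser):
    """Collects (time, text) pairs; time is None outside a timed <span>."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._stack = []  # one entry per open <span>: its data-t value or None

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            # data-t is the hypothetical attribute from the mail above.
            self._stack.append(dict(attrs).get("data-t"))

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        stamp = self._stack[-1] if self._stack else None
        self.events.append((to_seconds(stamp) if stamp else None, data))
```

Feeding it 'Twinkle <span data-t="00:00:02.100">twinkle</span>' gives one
untimed text run followed by one run timed at 2.1 seconds, so the timing
does survive an HTML-style parse and can be handled specially afterwards.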




   Think for example about the case where we had a requirement that a double

 newline starts a new cue, but now we want to introduce a means where the
 double newline is escaped and can be made part of a cue.

 Other formats keep track of their version, such as MS Word files. It is
 to
 be hoped that most new features can be introduced without breaking
 backwards
 compatibility and we can write the parsing requirements such that
 certain
 things will be ignored, but in and of itself, WebSRT doesn't provide for
 this extensibility. Right now, there is for example extensibility with
 the
 WebSRT settings parsing (that's the stuff behind the timestamps) where
 further setting:value settings can be introduced. But for example the
 introduction of new cue identifiers (that's the  marker at the start
 of
 a cue) would be difficult without a version string, since anything that
 doesn't match the given list will just be parsed as cue-internal tag and
 thus end up as part of the cue text where plain text parsing is used.


 The bug I filed suggested allowing arbitrary voices, to simplify the
 parser
 and to make future extensions possible. For a web format I think this is
 a
 better approach format than versioning. I haven't done a full review of
 the
 parser, but there are probably more places where it could be more
 forgiving
 so as to allow future tweaking.


 That's a good approach and will reduce the need for breaking
 backwards-compatibility. In an xml-based format that need is 0, while with
 a
 text format where the structure is ad-hoc, that need can never be reduced
 to
 0. That's what I am concerned about and that's why I think we need a
 version
 identifier. If we end up never using/changing the version identifier, the
 better so. But I'd much rather we have it now and can identify what
 specification a file adheres to than not being able to do so later.


 Perhaps I'm too influenced by HTML and its failed attempts at versioning,
 but I think that if you want to know which version of a spec a document is
 written against, you can run it through a parser for each version. This
 doesn't tell you the author intent, but I'm not sure that's very interesting
 to know. If the author thinks it's important, perhaps it can be put in a
 comment in the header.


I was most concerned about non-backwards-compatible changes here, but let's
not repeat the discussion I had with Anne. Let's rather focus on making sure
we have some means of extending WebSRT in future, should the need arise.




   On the other hand, keeping the same extension and (unregistered) MIME
 type

 as SRT has plenty of benefits, such as immediately being able to use
 existing SRT files in browsers without changing their file extension or
 MIME
 type.



 There is no harm for browsers to accept both MIME types if they are sure
 they can parse 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Anne van Kesteren
On Wed, 11 Aug 2010 15:09:34 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:

HTML and CSS have predefined structures within which their languages grow
and are able to grow. WebSRT has newlines to structure the format, which
is clearly not very useful for extensibility. No matter how we turn
this, the XML background of HTML and the name-value background of CSS
provide them with in-built extensibility, which WebSRT does not have.


The parser has the bad cue loop concept for ignoring supposedly bogus  
lines. Seems extensible to me.
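
As a rough illustration of that idea (not the parser algorithm from the
spec), a minimal Python sketch of a reader that simply skips any block it
does not recognise as a cue could look like this. The regular expression,
the function name parse_cues and the blank-line block splitting are
assumptions of the sketch; CRLF line endings are not handled.

```python
import re

# Timing line of a cue. SRT writes a comma before the milliseconds, so both
# "," and "." are accepted here (an assumption of this sketch).
TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}[,.]\d{3})"
)


def parse_cues(text):
    """Return (start, end, payload) tuples, skipping blocks that are not cues."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        # The timing line is either the first line or follows an identifier line.
        index = next(
            (i for i, line in enumerate(lines[:2]) if TIMING.match(line)), None
        )
        if index is None:
            continue  # unknown block: ignore it rather than fail
        start, end = TIMING.match(lines[index]).groups()
        cues.append((start, end, "\n".join(lines[index + 1:])))
    return cues
```

A block added by some later revision of the format is dropped by such a
reader, while the cues around it are still returned.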



Sure, that's why the tools should be updated to support the standard  
format instead rather than each having their own variant of SRT.


They don't have their own variant of SRT - they only have their own  
parsers.


That comes down to the same thing in my opinion. This is like saying  
browsers did not all have their own variant of HTML4.



Some will tolerate crap at the end of the --> line. Others won't.
That's no break of conformance to the basic spec as given in
http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all
interoperate on the basic SRT format. But they don't interoperate on the
WebSRT format. That's why WebSRT has to be a new format.


By that reasoning HTML5 would have had to be a new format too. And CSS 2.1  
as opposed to CSS 2, etc.




(And if they really just take in text like that they should at least run
some kind of validation so not all kinds of garbage can get in.)


That's not a requirement of the spec. Its requirement is to render
whatever characters are given in cues. That's why it is so simple.


But it is not so simple because various extensions are out there in the  
wild and are used so the concerns you have with respect to WebSRT already  
apply.



Sure. All I need to do is rename the file. Not much trouble at all.  
Better than believing I can just copy stuff from others since it's  
apparently the same format and then it breaks the SRT environment that I  
already have and that works.


At least with the copy approach you would still see something in your SRT  
environment. The ruby bits would just be ignored or some such.




That's already part of Ian's proposal: it already supports multiple
different approaches of parsing cues. No extra complexity here.


Actually that is not true. There is only one approach to parsing in  
Ian's proposal.


At the moment, cues can have one of two different types of content
(see
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0 ):


[...]

So that means in essence two different parsers.


Per the parser section there is only one. See the end of

http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#parsing-0


--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Silvia Pfeiffer
On Wed, Aug 11, 2010 at 11:45 PM, Anne van Kesteren ann...@opera.com wrote:

 On Wed, 11 Aug 2010 15:09:34 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

 HTML and CSS have predefined structures within which their languages grow
 and are able to grow. WebSRT has newlines to structure the format, which
 is clearly not very useful for extensibility. No matter how we turn this,
 the xml background or HTML and the name-value background of CSS provide them
 with in-built extensibility, which WebSRT does not have.


 The parser has the bad cue loop concept for ignoring supposedly bogus
 lines. Seems extensible to me.



Hmm, that's for ignoring lines that don't match the --> pattern. It could
work: ignore anything that's inside a WebSRT file and not a cue.

I tend to think of caption files as composed of the following broad
components:
* header-data that is information that applies to the complete file, which
tends to be setup data (such as language, charset, stylesheet link etc) and
metadata (name-value pairs)
* a list of cues, which have their own structure:
  ** start and end time
  ** per-cue header-type data such as more setup data, positioning, text
size etc
  ** the cue text itself (in various structured formats, potentially with
time markers for roll-on presentation)
* comments that can be made at any location

As long as we can make sure we're extensible within these broader areas, I
*think* we should be ok.
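
The component breakdown above can be written down as a small data model.
This is only a Python sketch of the list, with class and field names
(CaptionFile, Cue, settings and so on) chosen for illustration rather than
taken from any spec.

```python
from dataclasses import dataclass, field


@dataclass
class Cue:
    start: float                                  # start time in seconds
    end: float                                    # end time in seconds
    settings: dict = field(default_factory=dict)  # per-cue header-type data
    text: str = ""                                # the cue text itself


@dataclass
class CaptionFile:
    header: dict = field(default_factory=dict)    # file-wide setup data and metadata
    cues: list = field(default_factory=list)      # list of Cue objects
    comments: list = field(default_factory=list)  # comments found anywhere in the file
```

Each container is open-ended (a dict or a list), which is the sense in
which every one of the broad areas stays extensible: new header entries or
new per-cue settings fit without changing the overall shape.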




  Sure, that's why the tools should be updated to support the standard
 format instead rather than each having their own variant of SRT.


 They don't have their own variant of SRT - they only have their own
 parsers.


 That comes down to the same thing in my opinion. This is like saying
 browsers did not all have their own variant of HTML4.


From an author's point of view, they were not writing multiple different Web
pages, but only trying to accommodate the quirks of each browser in one
page. So, no, I wouldn't regard them as having different versions of HTML4.




  Some will tolerate crap at the end of the --> line. Others won't. That's
 no break of conformance to the basic spec as given in
 http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all
 interoperate on the basic SRT format. But they don't interoperate on the
 WebSRT format. That's why WebSRT has to be a new format.


 By that reasoning HTML5 would have had to be a new format too. And CSS 2.1
 as opposed to CSS 2, etc.


They interoperate by their sheer structure. It has been made sure that old
browsers will ignore the new additions because there is a structured means
to grow them. So, no, I believe they are different cases.




  (And if they really just take in text like that they should at least run
 some kind of validation so not all kinds of garbage can get in.)


 That's not a requirement of the spec. Its requirement is to render
 whatever characters are given in cues. That's why it is so simple.


 But it is not so simple because various extensions are out there in the
 wild and are used so the concerns you have with respect to WebSRT already
 apply.


There are two versions out there: the plain ones without markup and the ones
with <i>, <b>, <u> and <font>. Nothing else exists. Those could be called
quirks of the same format. I would prefer if SRT meant only the stuff
without any markup at all, which is supported by everyone who supports SRT.
The thing is, WebSRT isn't even backwards compatible with the quirky SRT
extension: it doesn't support <u> and <font>. So, it's neither backwards nor
forwards compatible.



  Sure. All I need to do is rename the file. Not much trouble at all. Better
 than believing I can just copy stuff from others since it's apparently the
 same format and then it breaks the SRT environment that I already have and
 that works.


 At least with the copy approach you would still see something in your SRT
 environment. The ruby bits would just be ignored or some such.


Preferably, I would be using a captioning application which will make me
aware that I am just now adding features that the format that I used for
saving doesn't support. So it gives me the choice of either losing those
features or upgrading to the better format. It's what all text processors
do, too, so people are used to it. And they know to stick to the more
capable formats.




  That's already part of Ian's proposal: it already supports multiple
 different approaches of parsing cues. No extra complexity here.


 Actually that is not true. There is only one approach to parsing in Ian's
 proposal.


 At the moment, cues can have one of two different types of content:
 (see
 http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0

 [...]


 So that means in essence two different parsers.


 Per the parser section there is only one. See the end of


 http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#parsing-0


Yeah, I think there's something missing in the spec.

Cheers,
Silvia.


Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Philip Jägenstedt
On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.com wrote:



On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer 
silviapfeiff...@gmail.com wrote:

 On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com

wrote:

I have checked the parse spec and
http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state indeed
implies that a tag starting with a number is a parse error. Both the
timestamps and the voice markers thus seem to be problems when going with an
innerHTML parser. Is there a way to resolve this? I mean: I'd quite happily
drop the voice markers for a span @class but I am not sure what to do
about the timestamps. We could do what I did in WMML and introduce a <t>
element with the timestamp as an @at attribute, but that is again more
verbose. We could also introduce an @at attribute in <span> which would then
at least end up in the DOM and can be dealt with specially.



What should numerical voices be replaced with? Personally I'd much rather
write <philip> and <silvia> to mark up a conversation between us two, as I
think it'd be quite hard to keep track of the numbers if editing subtitles
with many different speakers. However, going with that and using an HTML
parser is quite a hack. Names like <mark> and <li> may already have special
parsing rules or default CSS.



In HTML it is <span class=philip>..</span> and <span
class=silvia>...</span>. I don't see anything wrong with that. And it's
only marginally longer than <philip> ... </philip> and <silvia>...</silvia>.



Going with HTML in the cues, we either have to drop voices and inner
timestamps or invent new markup, as HTML can't express either. I don't think
either of those are really good solutions, so right now I'm not convinced
that reusing the innerHTML parser is a good way forward.



I don't see a need for the voices - they already have markup in HTML, see
above. But I do wonder about the timestamps. I'd much rather keep the
innerHTML parser if we can, but I don't know enough about how the timestamps
could be introduced in a non-breakable manner. Maybe with a data- attribute?
Maybe <span data-t=00:00:02.100>...</span>?


data- attributes are reserved for use by scripts on the same page, but we  
*could* of course introduce new elements or attributes for this purpose.  
However, adding features to HTML only for use in WebSRT seems a bit odd.



 That would make text/srt and text/websrt synonymous, which is kind of

pointless.




No, it's only pointless if you are a browser vendor. For everyone else it is
a huge advantage to be able to choose between a guaranteed simple format and
a complex format with all the bells and whistles.



 The advantages of taking text/srt is that all existing software to  
create

SRT can be used to create WebSRT




That's not strictly true. If they load a WebSRT file that was created by
some other software for further editing and that WebSRT file uses advanced
WebSRT functionality, the authoring software will break.



Right, especially settings appended after the timestamps are quite likely
to be stripped when saving the file.



Or may even break the software if it's badly implemented, or may end up
inside the cue text - just like the other control instructions which will
end up as plain text inside the cue. You won't believe how many people have
pointed out to me that my SRT test parser exposed an <i> tag markup in the
cue text rather than interpreting it, when I was experimenting with applying
SRT cues in a HTML div without touching the cue text content. Extraneous
markup is really annoying.


Indeed, but given the option of seeing no subtitles at all and seeing some
markup from time to time, which do you prefer? For a long time I was using
a media player that didn't handle HTML in SRT and wasn't very amused at
seeing <i> and similar, but it was sure better than no subtitles at all. I
doubt it will take long for popular software to start ignoring things
trailing the timestamp and things in square brackets, which is all you
need for basic compatibility. Some of the tested software already does
so.
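
For the "things trailing the timestamp" part, a minimal Python sketch of a
tolerant timing-line reader is below. It assumes the trailing settings are
whitespace-separated name:value tokens; the setting names in the usage note
and the function name split_timing_line are made up for illustration.

```python
import re

# start, "-->", end, then whatever trails the end timestamp.
TIMING = re.compile(r"^(\S+)\s+-->\s+(\S+)(.*)$")


def split_timing_line(line):
    """Return (start, end, settings) or None if this is not a timing line."""
    match = TIMING.match(line.strip())
    if match is None:
        return None
    start, end, rest = match.groups()
    # Keep only tokens shaped like name:value; anything else is ignored.
    settings = dict(tok.split(":", 1) for tok in rest.split() if ":" in tok)
    return start, end, settings
```

A player that only wants plain SRT behaviour can discard the third element,
so "00:00:01,000 --> 00:00:02,000 A:start L:10%" plays exactly like the
same line without the trailing tokens.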


 and servers that already send text/srt don't need to be updated. In either
case I think we should support only one mime type.




What's the harm in supporting two mime types but using the same parser to
parse them?



Most content will most likely be plain old SRT without voices, ruby or
similar. People will create them using existing software with the .srt
extension and serve them using the text/srt MIME type. When they later
decide to add some ruby or similar, it will just work without changing the
extension or MIME type. The net result is that text/srt and text/websrt mean
exactly the same thing, making it a wasted effort.



From a Web browser perspective, yes. But not from a caption authoring
perspective. At first, I would author a SRT file. 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-11 Thread Silvia Pfeiffer
On Thu, Aug 12, 2010 at 1:26 AM, Philip Jägenstedt phil...@opera.com wrote:

 On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt phil...@opera.com
 wrote:

  On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:


  Going with HTML in the cues, we either have to drop voices and inner
 timestamps or invent new markup, as HTML can't express either. I don't
 think
 either of those are really good solutions, so right now I'm not convinced
 that reusing the innerHTML parser is a good way forward.



 I don't see a need for the voices - they already have markup in HTML, see
 above. But I do wonder about the timestamps. I'd much rather keep the
 innerHTML parser if we can, but I don't know enough about how the
 timestamps
 could be introduced in a non-breakable manner. Maybe with a data-
 attribute?
 Maybe span data-t=00:00:02.100.../span?


 data- attributes are reserved for use by scripts on the same page, but we
 *could* of course introduce new elements or attributes for this purpose.
 However, adding features to HTML only for use in WebSRT seems a bit odd.


I'd rather avoid adding features to HTML only for WebSRT. Ian turned the
timestamps into ProcessingInstructions (see
http://www.whatwg.org/specs/web-apps/current-work/websrt.html#websrt-cue-text-dom-construction-rules ).
Could we introduce something like <?t at=00:00:02.100?> without breaking
the innerHTML parser?




   That would make text/srt and text/websrt synonymous, which is kind of

 pointless.



 No, it's only pointless if you are a browser vendor. For everyone else
 it
 is
 a huge advantage to be able to choose between a guaranteed simple format
 and
 a complex format with all the bells and whistles.



  The advantages of taking text/srt is that all existing software to
 create

 SRT can be used to create WebSRT



 That's not strictly true. If they load a WebSRT file that was created by
 some other software for further editing and that WebSRT file uses
 advanced
 WebSRT functionality, the authoring software will break.


 Right, especially settings appended after the timestamps are quite likely
 to be stripped when saving the file.



 Or may even break the software if it's badly implemented, or may end up
 inside the cue text - just like the other control instructions which will
 end up as plain text inside the cue. You won't believe how many people
 have
 pointed out to me that my SRT test parser exposed an <i> tag markup in the
 cue text rather than interpreting it, when I was experimenting with
 applying
 SRT cues in a HTML div without touching the cue text content. Extraneous
 markup is really annoying.


 Indeed, but given the option of seeing no subtitles at all and seeing some
 markup from time to time, which do you prefer? For a long time I was using a
 media player that didn't handle HTML in SRT and wasn't very amused at
 seeing <i> and similar, but it was sure better than no subtitles at all. I
 doubt it will take long for popular software to start ignoring things
 trailing the timestamp and things in square brackets, which is all you need
 for basic compatibility. Some of the tested software already does so.


Hmm... not sure if I'd prefer to see the crap or rather be forced to run it
through a stripping tool first. I think what would happen is that I'd start
watching the movie, then notice the crap, get annoyed, stop it, run a
stripping tool, restart the movie. I'd probably prefer noticing that before
I start the movie, which would happen if the file was a different format.
But it does take a bit of expert knowledge to know that WebSRT can be
easily converted to SRT and to have such a stripping tool installed, I'll
give you that.

OTOH, if you say that it will take a short time for popular software to
start ignoring the extra WebSRT stuff, well, in this case they have
implemented WebSRT support in its most basic form and then there is no
problem any more anyway. They will then accept the new files and their
extensions and mime types and there is explicit support rather than the
dodgy question of whether these SRT files will provide crap or not. During a
transition period, we will make all software that currently supports SRT
become unstable and unreliable. I don't think that's the right way to deal
with an existing ecosystem. Coming in as the big brother, claiming their
underspecified format, throwing in incompatible features, and saying: just
deal with it. It's just not the cavalier thing to do.




   and servers that already send text/srt don't need to be updated. In
 either

 case I think we should support only one mime type.



 What's the harm in supporting two mime types but using the same parser
 to
 parse them?


 Most content will most likely be plain old SRT without voices, ruby or
 similar. People will create them using existing software with the .srt
 extension and serve them using the text/srt MIME type. When 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-10 Thread Philip Jägenstedt
On Tue, 10 Aug 2010 01:34:02 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:


On Tue, Aug 10, 2010 at 12:04 AM, Philip Jägenstedt  
phil...@opera.com wrote:



On Sat, 07 Aug 2010 09:57:39 +0200, Silvia Pfeiffer 
silviapfeiff...@gmail.com wrote:

 Hi Philip,


On Sat, Aug 7, 2010 at 1:50 AM, Philip Jägenstedt phil...@opera.com
wrote:


 I'm not sure of the best solution. I'd quite like the ability to use
arbitrary voices, e.g. to use the names/initials of the speaker rather than
a number, or to use e.g. shouting in combination with CSS :before {
content: 'Shouting: ' } or similar to adapt the display for different
audiences (accessibility, basically).





I agree. I think we can go back to using <span> and @class and @id, and that
would solve it all.



I guess this is in support of Henri's proposal of parsing the cue using the
HTML fragment parser (same as innerHTML)? That would be easy to implement,
but how do we then mark up speakers? Using <span class=narrator></span>
around each cue is very verbose. HTML isn't very good for marking up dialog,
which is quite a limitation when dealing with subtitles...




I actually think that the span @class mechanism is much more flexible than
what we have in WebSRT right now. If we want multiple speakers to be able to
speak in the same subtitle, then that's not possible in WebSRT. It's a
little more verbose in HTML, but not massively.

We might be able to add a special markup similar to the [timestamp] markup
that Hixie introduced for Karaoke. This is beyond the innerHTML parser and I
am not sure if it breaks it. But if it doesn't, then maybe we can also
introduce a [voice] marker to be used similarly?


An HTML parser parsing <1> or <00:01:30> will produce text nodes "<1>" and
"<00:01:30>". Without having read the HTML parsing algorithm I guess that
elements need to begin with a letter or similar. So, it's not possible to
(ab)use the HTML parser to handle inner timestamps or numerical voices;
we'd have to replace those with something else, probably more verbose.


  * there is no version number on the format, thus it will be difficult to
introduce future changes.


I think we shouldn't have a version number, for the same reason that CSS
and HTML don't really have versions. If we evolve the WebSRT spec, it should
be in a backwards-compatible way.




CSS and HTML are structured formats where you ignore things that you cannot
interpret. But the parsing is fixed and extensions play within this parsing
framework. I have my doubts that this is possible with WebSRT. Already one
extension that we are discussing here will break parsing: the introduction
of structured headers. Because there is no structured way of extending
WebSRT, I believe the best way to communicate whether it is backwards
compatible is through a version number. We can change the minor version if
compatibility is not broken - it communicates, though, what features are
being used - and we can change the major version if compatibility is
broken.



Similarly, I think that the WebSRT parser should be designed to ignore
things that it doesn't recognize, in particular unknown voices (if we keep
those). Requiring parsers to fail when the version number is increased



oh, you misunderstood me: I am not saying that parsers have to fail - it's
good if they don't. But I am saying that if we make a change to the
specification that is not backwards compatible with the previous one and
will thus invariably break parsers, we have to notify parsers somehow such
that if they get parse errors they can e.g. notify the user that this is a
new version of the WebSRT format which their software doesn't support yet.


A browser won't bother their users by saying "hey, there was something in
this page I didn't understand", as users won't know what to do to fix it.



Think for example about the case where we had a requirement that a double
newline starts a new cue, but now we want to introduce a means where the
double newline is escaped and can be made part of a cue.
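To make the double-newline case concrete, here is a rough Python sketch of a splitter that treats any run of blank lines as a cue boundary. The function name and the sample text are made up for illustration and the "-->" arrow is assumed; this is not the algorithm from the spec, only the rule as described in the paragraph above.

```python
import re

def split_cues(text):
    """Split a track into blocks at runs of blank lines (the rule discussed above)."""
    return [block for block in re.split(r"\n{2,}", text.strip()) if block]

# One cue whose text contains an intentional blank line, followed by a second cue.
doc = (
    "00:00:01.000 --> 00:00:02.000\n"
    "first line\n"
    "\n"
    "second paragraph\n"
    "\n"
    "00:00:03.000 --> 00:00:04.000\n"
    "next cue\n"
)

for block in split_cues(doc):
    print(repr(block))
```

The author meant two cues, but the splitter reports three blocks, with "second paragraph" orphaned from its timing line. A parser written against that rule cannot tell that a later revision of the format has started escaping blank lines unless something in the file says so.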

Other formats keep track of their version, such as MS Word files. It is to
be hoped that most new features can be introduced without breaking backwards
compatibility and we can write the parsing requirements such that certain
things will be ignored, but in and of itself, WebSRT doesn't provide for
this extensibility. Right now, there is for example extensibility with the
WebSRT settings parsing (that's the stuff behind the timestamps) where
further setting:value settings can be introduced. But for example the
introduction of new cue identifiers (that's the  marker at the start of
a cue) would be difficult without a version string, since anything that
doesn't match the given list will just be parsed as cue-internal tag and
thus end up as part of the cue text where plain text parsing is used.
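As an illustration of the kind of extensibility the settings already allow, here is a small Python sketch that reads the setting:value pairs behind the timestamps and skips anything it does not recognise. The setting names "A", "L" and "X" and the function name are placeholders invented for this example, and the "-->" arrow is assumed; none of this is taken from the spec text.

```python
def parse_settings(timing_line, known=("A", "L")):
    """Return (applied, ignored) for the setting:value tokens after the end timestamp."""
    _, _, rest = timing_line.partition("-->")
    tokens = rest.split()[1:]  # drop the end timestamp itself
    applied, ignored = {}, []
    for token in tokens:
        name, sep, value = token.partition(":")
        if sep and value and name in known:
            applied[name] = value
        else:
            ignored.append(token)  # unknown or malformed: skipped, not an error
    return applied, ignored

print(parse_settings("00:00:01.000 --> 00:00:04.000 A:start L:90% X:future"))
```

An old parser keeps working when a new setting shows up because the unknown token has a well-defined place to be dropped. The point made above is that cue-internal markers have no such place: anything unrecognised falls through into the cue text.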


The bug I filed suggested allowing arbitrary voices, to simplify the  
parser and to make future extensions possible. 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-10 Thread Silvia Pfeiffer
On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt phil...@opera.com wrote:

 On Tue, 10 Aug 2010 01:34:02 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  On Tue, Aug 10, 2010 at 12:04 AM, Philip Jägenstedt phil...@opera.com
 wrote:

  On Sat, 07 Aug 2010 09:57:39 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:


 I guess this is in support of Henri's proposal of parsing the cue using
 the
 HTML fragment parser (same as innerHTML)? That would be easy to
 implement,
 but how do we then mark up speakers? Using <span class=narrator></span>
 around each cue is very verbose. HTML isn't very good for marking up
 dialog,
 which is quite a limitation when dealing with subtitles...


 I actually think that the span @class mechanism is much more flexible
 than
 what we have in WebSRT right now. If we want multiple speakers to be able
 to
 speak in the same subtitle, then that's not possible in WebSRT. It's a
 little more verbose in HTML, but not massively.

 We might be able to add a special markup similar to the [timestamp]
 markup
 that Hixie introduced for Karaoke. This is beyond the innerHTML parser and
 I
 am not sure if it breaks it. But if it doesn't, then maybe we can also
 introduce a [voice] marker to be used similarly?


 An HTML parser parsing <1> or <00:01:30> will produce text nodes "<1>" and
 "<00:01:30>". Without having read the HTML parsing algorithm I guess that
 elements need to begin with a letter or similar. So, it's not possible to
 (ab)use the HTML parser to handle inner timestamps or numerical voices; we'd
 have to replace those with something else, probably more verbose.



I have checked the parsing spec and
http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state indeed
implies that a tag starting with a number is a parse error. Both the
timestamps and the voice markers thus seem to be problems when going with an
innerHTML parser. Is there a way to resolve this? I mean: I'd quite happily
drop the voice markers for a span @class but I am not sure what to do
about the timestamps. We could do what I did in WMML and introduce a <t>
element with the timestamp as an @at attribute, but that is again more
verbose. We could also introduce an @at attribute in <span> which would then
at least end up in the DOM and can be dealt with specially.
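For anyone who wants to see the tag open state rule without reading the tokenizer section, here is a simplified Python sketch of the decision it makes on the character after "<". The function name and return labels are mine, and the sample cue strings (including the <t at=...> form) are only illustrations of the options discussed here, assuming the angle-bracket syntax.

```python
import string

def tag_open(next_char):
    """Roughly what the HTML tokenizer does with the character that follows '<'."""
    if next_char == "!":
        return "markup declaration"
    if next_char == "/":
        return "end tag"
    if next_char in string.ascii_letters:
        return "start tag"
    if next_char == "?":
        return "bogus comment"
    # Anything else (digits included) is a parse error and "<" becomes text.
    return "text"

for cue in ["<b>bold", "<1> Hello", "<00:01:30> la la", '<t at="00:01:30">la la']:
    print(repr(cue), "->", tag_open(cue[1]))
```

The numeric voice and the bare timestamp both come out as text, while the letter-initial forms are tokenized as start tags, which is why a <t> element or an @at attribute would survive the innerHTML parser and the current markers would not.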

Just for those who think it's a fancy karaoke feature and isn't really
required: it's actually also a useful feature for captions, in particular
when recording live captions that are usually paint-on. Requirement CC-14
on http://www.w3.org/WAI/PF/HTML/wiki/Media_Accessibility_Requirements also
refers to this need and 608/708 captions provide this functionality, too.




 Similarly, I think that the WebSRT parser should be designed to ignore
 things that it doesn't recognize, in particular unknown voices (if we
 keep
 those). Requiring parsers to fail when the version number is increased



 oh, you misunderstood me: I am not saying that parsers have to fail - it's
 good if they don't. But I am saying that if we make a change to the
 specification that is not backwards compatible with the previous one and
 will thus invariably break parsers, we have to notify parsers somehow such
 that if they get parse errors they can e.g. notify the user that this is a
 new version of the WebSRT format which their software doesn't support yet.


 A browser won't bother their users by saying "hey, there was something in
 this page I didn't understand", as users won't know what to do to fix it.



I'm not overly worried about browsers. They will just display the wrong
text. They are not normally an authoring or transcoding application. I am
more worried about non-browser applications here, in particular those where
interpreting the text the wrong way will lead to disaster, such as the wrong
data in an archive etc.




  Think for example about the case where we had a requirement that a double
 newline starts a new cue, but now we want to introduce a means where the
 double newline is escaped and can be made part of a cue.

 Other formats keep track of their version, such as MS Word files. It is to
 be hoped that most new features can be introduced without breaking
 backwards
 compatibility and we can write the parsing requirements such that certain
 things will be ignored, but in and of itself, WebSRT doesn't provide for
 this extensibility. Right now, there is for example extensibility with the
 WebSRT settings parsing (that's the stuff behind the timestamps) where
 further setting:value settings can be introduced. But for example the
 introduction of new cue identifiers (that's the  marker at the start
 of
 a cue) would be difficult without a version string, since anything that
 doesn't match the given list will just be parsed as cue-internal tag and
 thus end up as part of the cue text where plain text parsing is used.


 The bug I filed suggested allowing arbitrary voices, to simplify the parser
 and to make future extensions possible. For a web format I think this is a
 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-09 Thread Philip Jägenstedt
On Sat, 07 Aug 2010 09:57:39 +0200, Silvia Pfeiffer  
silviapfeiff...@gmail.com wrote:



Hi Philip,

On Sat, Aug 7, 2010 at 1:50 AM, Philip Jägenstedt phil...@opera.com  
wrote:



* there is a possibility to provide script that just affects the
time-synchronized text resource



I agree that some metadata would be useful, more on that below. I'm not
sure why we would want to run scripts inside the text document, though, when
that can be accomplished by using the TimedTrack API from the containing
page.




Scripts inside a timed text document would only be useful for applications
that use the track not in conjunction with a Web page.


Do you mean that media players could include a JavaScript engine just for  
supporting scripts in WebSRT? Not to say that it can't happen, but it  
seems a bit unlikely.



2. There is a natural mapping of WebSRT into in-band text tracks.
Each cue naturally maps into an encoding page (just like a WMML cue does,
too). But in WebSRT, because the setup information is not brought in a
hierarchical element surrounding all cues, it is easier to just chuck
anything that comes before the first cue into an encoding header page. For
WMML, this problem can be solved, but it is less natural.



I really like the idea of letting everything before the first timestamp in
WebSRT be interpreted as the header. I'd want to use it like this:

# author: Fan Subber
# voices: <1> Boy
# <2> Girl

01:23:45.678 --> 01:23:46.789
<1> Hello

01:23:48.910 --> 01:23:49.101
<2> Hello

It's not critical that the format of the header be machine-readable, but we
could of course make up a key-value syntax, use JSON, or something else.




I disagree. I think it's absolutely necessary that the format of the header
be machine-readable. Just like EXIF in images is machine readable or ID3 in
MP3 is machine-readable. It would be counter-productive not to have it
machine-readable, in particular useless to archiving and media management
solutions.


OK, so maybe key-values?

Author: Fan Subber
Voice: <1> Boy
Voice: <2> Girl

01:23:45.678 --> 01:23:46.789
<1> Hello

This looks a bit like HTTP headers. (I'm not sure I'd actually want to  
allow multiple occurrences of the same key, in practice that seems to  
result in inconsistencies in how people mark up multiple authors.)
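To show that such a header would be trivially machine-readable, here is a rough Python sketch that collects "Key: value" lines up to the first timing line. The timing pattern with the "-->" arrow, the <1>/<2> voice markers in the sample, the case-folding of keys and the function name are all assumptions for the example, not anything the spec defines.

```python
import re

# Assumed shape of a timing line: "hh:mm:ss.mmm --> hh:mm:ss.mmm" (settings may follow).
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3}\s+-->\s+\d{2}:\d{2}:\d{2}\.\d{3}")

def parse_header(text):
    """Collect "Key: value" lines that appear before the first timing line."""
    header = {}
    for line in text.splitlines():
        if TIMING.match(line):
            break  # the first cue starts here, so the header is over
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # blank or free-form line: ignored in this sketch
        # Keys are case-folded; repeated keys accumulate in document order.
        header.setdefault(key.strip().lower(), []).append(value.strip())
    return header

sample = """Author: Fan Subber
Voice: <1> Boy
Voice: <2> Girl

01:23:45.678 --> 01:23:46.789
<1> Hello
"""
print(parse_header(sample))
```

Keeping a list per key sidesteps the question of whether repeated keys are allowed: a consumer that wants a single author can take the first entry, and one that wants all voices gets them in order.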



I'm not sure of the best solution. I'd quite like the ability to use
arbitrary voices, e.g. to use the names/initials of the speaker rather than
a number, or to use e.g. shouting in combination with CSS :before {
content: 'Shouting: ' } or similar to adapt the display for different
audiences (accessibility, basically).




I agree. I think we can go back to using <span> and @class and @id, and that
would solve it all.


I guess this is in support of Henri's proposal of parsing the cue using  
the HTML fragment parser (same as innerHTML)? That would be easy to  
implement, but how do we then mark up speakers? Using  
<span class=narrator></span> around each cue is very verbose. HTML isn't  
very good for marking up dialog, which is quite a limitation when dealing  
with subtitles...



* there is no language specification for a WebSRT resource; while this will
not be a problem when used in conjunction with a <track> element, it still
is a problem when the resource is used just by itself, in particular as a
hint for font selection and speech synthesis.



The language inside the WebSRT file wouldn't end up being used for anything
by a browser, as it needs to know the language before downloading it to know
whether or not to download it at all. Still, I'd like a header section in
WebSRT. I think the parser is already defined so that it would ignore
garbage before the first cue, so this is more a matter of making it legal
syntax.



Not quite. Some metadata in the header can make sense to also expose to the
Web page.

I agree that we need a structured header section in WebSRT.


Fair enough, we should revisit this when deciding on how to expose  
metadata in media resources in general.


 * there is no means to identify which parser is required in the cues (is it
plain text, minimal markup, or anything?) and therefore it is not
possible for an application to know how it should parse the cues.



All the types that are actually for visual rendering are parsed in the same
way, aren't they? Of course there's no way for non-browsers to know that
metadata tracks aren't interesting to look at as subtitles, but I think
showing the user the garbage is a quicker way to communicate that the file
isn't for direct viewing than hiding the text or similar.




The spec says that files of kind descriptions and metadata are not
displayed. It seems though that the parsing section will try two interfaces:
HTML and plain. I think there is a disconnect there. If we already know that
it's not parsable in HTML, why even try?


I was confused. The parsing algorithm does the same thing regardless of  
what kind of text track it is 

Re: [whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-09 Thread Silvia Pfeiffer
On Tue, Aug 10, 2010 at 12:04 AM, Philip Jägenstedt phil...@opera.com wrote:

 On Sat, 07 Aug 2010 09:57:39 +0200, Silvia Pfeiffer 
 silviapfeiff...@gmail.com wrote:

  Hi Philip,

 On Sat, Aug 7, 2010 at 1:50 AM, Philip Jägenstedt phil...@opera.com
 wrote:

  * there is a possibility to provide script that just affects the

 time-synchronized text resource


 I agree that some metadata would be useful, more on that below. I'm not
 sure why we would want to run scripts inside the text document, though,
 when
 that can be accomplished by using the TimedTrack API from the containing
 page.




 Scripts inside a timed text document would only be useful for applications
 that use the track not in conjunction with a Web page.


 Do you mean that media players could include a JavaScript engine just for
 supporting scripts in WebSRT? Not to say that it can't happen, but it seems
 a bit unlikely.



Yes, it's indeed an out-there feature and I am not worried about having it
now. I just mentioned it as a simple possibility for extension.




  2. There is a natural mapping of WebSRT into in-band text tracks.

 Each cue naturally maps into an encoding page (just like a WMML cue does,
 too). But in WebSRT, because the setup information is not brought in a
 hierarchical element surrounding all cues, it is easier to just chuck
 anything that comes before the first cue into an encoding header page.
 For
 WMML, this problem can be solved, but it is less natural.


 I really like the idea of letting everything before the first timestamp
 in
 WebSRT be interpreted as the header. I'd want to use it like this:

 # author: Fan Subber
 # voices: <1> Boy
 # <2> Girl

 01:23:45.678 --> 01:23:46.789
 <1> Hello

 01:23:48.910 --> 01:23:49.101
 <2> Hello

 It's not critical that the format of the header be machine-readable, but
 we
 could of course make up a key-value syntax, use JSON, or something else.




 I disagree. I think it's absolutely necessary that the format of the
 header
 be machine-readable. Just like EXIF in images is machine readable or ID3
 in
 MP3 is machine-readable. It would be counter-productive not to have it
 machine-readable, in particular useless to archiving and media management
 solutions.


 OK, so maybe key-values?

 Author: Fan Subber
 Voice: <1> Boy
 Voice: <2> Girl

 01:23:45.678 --> 01:23:46.789
 <1> Hello

 This looks a bit like HTTP headers. (I'm not sure I'd actually want to
 allow multiple occurrences of the same key, in practice that seems to result
 in inconsistencies in how people mark up multiple authors.)



Yes, anything that can replicate the name-value possibilities of the <meta>
element should be fine.
Multiple occurrences make sense for some fields and not for others.
I wonder if we would need to make a defined list of what should go in here
or just define a general mechanism. HTML has a general mechanism (with
<meta>) while most subtitle formats have a defined set of fields, e.g.
http://en.wikipedia.org/wiki/LRC_%28file_format%29 (ID3 tags) or
http://www.matroska.org/technical/specs/subtitles/ssa.html (SSA headers).




  I'm not sure of the best solution. I'd quite like the ability to use
 arbitrary voices, e.g. to use the names/initials of the speaker rather than
 a number, or to use e.g. shouting in combination with CSS :before {
 content: 'Shouting: ' } or similar to adapt the display for different
 audiences (accessibility, basically).




 I agree. I think we can go back to using <span> and @class and @id and that
 would solve it all.


 I guess this is in support of Henri's proposal of parsing the cue using the
 HTML fragment parser (same as innerHTML)? That would be easy to implement,
 but how do we then mark up speakers? Using <span class=narrator></span>
 around each cue is very verbose. HTML isn't very good for marking up dialog,
 which is quite a limitation when dealing with subtitles...



I actually think that the span @class mechanism is much more flexible than
what we have in WebSRT right now. If we want multiple speakers to be able to
speak in the same subtitle, then that's not possible in WebSRT. It's a
little more verbose in HTML, but not massively.

We might be able to add a special markup similar to the [timestamp] markup
that Hixie introduced for Karaoke. This is beyond the innerHTML parser and I
am not sure if it breaks it. But if it doesn't, then maybe we can also
introduce a [voice] marker to be used similarly?




   * there is no means to identify which parser is required in the cues (is

 it
 plain text, minimal markup, or anything?) and therefore it is not
 possible for an application to know how it should parse the cues.


 All the types that are actually for visual rendering are parsed in the
 same
 way, aren't they? Of course there's no way for non-browsers to know that
 metadata tracks aren't interesting to look at as subtitles, but I think
 showing the user the garbage is a quicker way to communicate that the file
 isn't
 for direct viewing than hiding the text or similar.



[whatwg] Fwd: Discussing WebSRT and alternatives/improvements

2010-08-07 Thread Silvia Pfeiffer
Hi Philip,

On Sat, Aug 7, 2010 at 1:50 AM, Philip Jägenstedt phil...@opera.com wrote:

 If @profile should have any influence on the parser it sounds like this
 isn't actually XML at all. In particular, the HTML would have to be
 well-formed XML, but would still end up in the null namespace.


Yeah, you are right  - I suppose I was trying to imitate the flexibility of
WebSRT there with an anything option.


 I guess simply cloning the child nodes of cue and changing their
 namespace to  before inserting them into an iframe-like document might work,
 but would be quite odd, I think you'll agree.



Yes, it's no different to WebSRT in that respect.



 * there is a possibility to provide script that just affects the
 time-synchronized text resource


 I agree that some metadata would be useful, more on that below. I'm not
 sure why we would want to run scripts inside the text document, though, when
 that can be accomplished by using the TimedTrack API from the containing
 page.



Scripts inside a timed text document would only be useful for applications
that use the track not in conjunction with a Web page.




  The cue elements have a start and end time attribute and contain
 innerHTML, thus there is already parsing code available in Web browsers to
 deal with this content. Any Web content can be introduced into a cue and
 the Web browsers will already be able to render it.


 Yes, but if the HTML parser can't be used for all of WMML, it makes the
 parser quite odd, being neither XML or HTML. I think that realistically the
 best way to make an XML-like format is to simply use XML.



OK. Then everything that's not supposed to be parsed inside a cue would be
escaped. I guess that works, too.




 2. There is a natural mapping of WebSRT into in-band text tracks.
 Each cue naturally maps into an encoding page (just like a WMML cue does,
 too). But in WebSRT, because the setup information is not brought in a
 hierarchical element surrounding all cues, it is easier to just chuck
 anything that comes before the first cue into an encoding header page. For
 WMML, this problem can be solved, but it is less natural.


 I really like the idea of letting everything before the first timestamp in
 WebSRT be interpreted as the header. I'd want to use it like this:

 # author: Fan Subber
 # voices: <1> Boy
 # <2> Girl

 01:23:45.678 --> 01:23:46.789
 <1> Hello

 01:23:48.910 --> 01:23:49.101
 <2> Hello

 It's not critical that the format of the header be machine-readable, but we
 could of course make up a key-value syntax, use JSON, or something else.



I disagree. I think it's absolutely necessary that the format of the header
be machine-readable. Just like EXIF in images is machine readable or ID3 in
MP3 is machine-readable. It would be counter-productive not to have it
machine-readable, in particular useless to archiving and media management
solutions.





 I'm not sure of the best solution. I'd quite like the ability to use
 arbitrary voices, e.g. to use the names/initials of the speaker rather than
 a number, or to use e.g. shouting in combination with CSS :before {
 content: 'Shouting: ' } or similar to adapt the display for different
 audiences (accessibility, basically).



I agree. I think we can go back to using <span> and @class and @id and that
would solve it all.




  4. It's a light-weight format in that it is not very verbose.
 It is nice for hand-authoring if you don't have to write so much. This is
 particularly true for the simple case, e.g. if newlines that you author are
 automatically kept as newlines when interpreted. The drawback here is that
 as soon as you include more complicated markup into the cues (e.g. HTML
 markup or an SVG image), you're not allowed to put empty lines into it
 because they have a special meaning. So, while it is true that the number of
 characters for WebSRT will always be less than for any markup-based format,
 this may be really annoying in any of the cases that need more than plain
 text.


 It would be easy to just let the parser consume all lines until the next
 timestamp, but do you really want to separate two lines with a blank line?
 If the two lines aren't really related, one could instead have two cues with
 different vertical positioning.


In marked-up content for readability I would at least not want every newline
to impose a new display line. But I suppose since it's of kind metadata
anyway, that wouldn't happen. So, I see - it's not such a big issue.





  Point 2 is possible in WMML through encoding all outer markup in a
 header
 and the cues in the data packets.


 To be clear, this would be a new codec type for the container, since I'm
 not aware of any that allow stating that the cue text is HTML. The same is
 true of WebSRT, muxing it into e.g. WebM would require the ability to
 express the kind from <track kind=captions> (although in practice such
 metadata in binary files ends up almost always being incorrect).



All text tracks that are encoded