On Tue, 24 Aug 2010 04:32:21 +0200, Silvia Pfeiffer <silviapfeiff...@gmail.com> wrote:

On Mon, Aug 23, 2010 at 6:55 PM, Philip Jägenstedt <phil...@opera.com> wrote:

 Aside: WebSRT can't contain binary data, only UTF-8 encoded text.



It sure can. Just base-64 encode it. I'm not saying it's a good thing, but
if somebody really has an urge...


Sure, this would be a metadata track. Sites have no reason to offer
download links to it, and if anyone gets hold of such a file it would
quickly be evident that it's useless.


After a user has seen the crap on screen. I'm just saying: it's a legal
WebSRT file and really not compatible with any existing infrastructure for
SRT.

A fair point. The alternatives I can see are (1) using an incompatible format so that the user sees nothing or (2) adding a header that indicates that the track is metadata.

As a way of telling the user to stop wasting their time with this file, I think (1) is clearly worse. (2) is absolutely an option, but it will only make a difference to software that understands the header, and if the header is optional it will likely often be omitted. A dialog saying "this is a metadata track, you can't watch it" is slightly friendlier than a screen full of crap, but they are both pretty effective at getting the message across.

If we define WebSRT in a way that can handle >99% of existing content and
degrade gracefully (enough) when using new features in old software, it
seems reasonable to do. If lots of software developers cry foul, then
perhaps we should reconsider. It seems to me, though, that actually
researching and defining a good algorithm for parsing SRT would be of use
to others than just browsers.


How is that different from moving away from SRT? If everyone has to change their parsing of SRT to accommodate a new spec, then that is a new format.


Not everyone has to change their parsers immediately; many will continue to
work. However, if someone wants to support SRT in a compatible way, it's
very helpful to have a spec, assuming that WebSRT is actually compatible
enough with existing SRT content.

This is quite similar to HTML4 vs HTML5. There are lots of mostly
compatible HTML parsers, but HTML5 defines a single parsing algorithm, and
slow convergence towards that is a good thing.


No, no, no! It is not at all similar to HTML4 and HTML5. A Web browser
cannot suddenly stop working for a Web page, just because it has some extra
functionality in it. Thus, the HTML format has been developed such that it
can be extended without breaking existing stuff. We can guarantee that no
browser will break because that is the way in which the format has been
specified.

No such thing has happened for SRT and there is simply no way to guarantee
that all new WebSRT files will work in all existing SRT software, because
SRT has not been specified as an extensible format and because there is no
agreement between all parties that have implemented SRT support as to how
extensions should be made.

We can introduce such a thing for WebSRT, but we cannot claim it for SRT.

You are right, existing SRT parsers are probably far less interoperable than HTML parsers were before HTML5.

Existing content demands that SRT parsers handle at least <i>, <b>, <font> and <u> in some manner, even if it is by ignoring them. Any parsers that treat SRT as plain text don't even work with today's content, so I don't think they should be considered at all. The question, then, is whether parsers that handle the mentioned markup also ignore <1>, <ruby> and <rt>. I haven't tested it, but I assume that some will ignore them and some won't. What percentage of the media player market would have to handle this correctly for these extensions to be OK, in your opinion?
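
For illustration only, here is a minimal sketch of what "ignoring" unknown markup could look like in a player that already handles the four tags found in existing content. The function name, the regular expression and the choice to keep known tags untouched are all assumptions made for this example; this is not the WebSRT parsing algorithm and not how any particular player works.

```python
import re

# Tags that existing SRT content already relies on (listed in the paragraph above).
KNOWN_TAGS = {"i", "b", "u", "font"}

# Anything that looks like a tag: <name ...>, </name>, or a numeric tag such as <1>.
# This pattern is an assumption for the sketch, not a specified grammar.
TAG_RE = re.compile(r"</?([A-Za-z0-9]+)[^<>]*>")

def strip_unknown_tags(cue_text):
    """Remove tags the player does not know; keep known tags and all text."""
    def replace(match):
        name = match.group(1).lower()
        # Known tags pass through unchanged, unknown ones are dropped.
        return match.group(0) if name in KNOWN_TAGS else ""
    return TAG_RE.sub(replace, cue_text)

print(strip_unknown_tags("<1>Hello <i>world</i> <ruby>漢<rt>kan</rt></ruby>"))
```

A player behaving like this would show the text of a cue that uses <1>, <ruby> or <rt> without the new semantics, which is the "degrade gracefully" case. A player that prints unrecognized tags literally is the case that would break.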

If the SRT ecosystem is so fragile that it cannot tolerate any extension
whatsoever, then we should stay far away from it. It just seems that's not
the case.


How do we know that everyone that uses SRT now really wants to use WebSRT
instead and wants to take part in the new ecosystem that we are introducing?
We make some pretty big assumptions about what everyone who is not a Web
browser vendor wants to do with SRT. That doesn't make the existing SRT
ecosystem fragile - but it makes it an existing environment that needs to be
respected.

At this point, what is your recommendation? The following ideas have been on the table:

* Change the file extension to something other than .srt.

I don't have an opinion; browsers ignore the file extension anyway.

* Change the MIME type to something other than text/srt.

I doubt it makes any difference, as most software that deals with SRT today has no concept of MIME types. No matter what, I'd want exactly one MIME type, or alternatively to make browsers ignore the MIME type completely.

* Add a header to WebSRT to make it uniquely identifiable.

The header would have to be mandatory and browsers would have to reject files that don't have it. Such files would work in some existing software and break in other software, depending on how it sniffs the format. We could also put metadata in such a header.

* Make something deliberately incompatible with SRT.

It doesn't make a big difference to browsers implementing the format. We'd be replacing something that mostly works in existing players with something that never works.
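
To make the mandatory-header option from the list above concrete, here is a rough sketch of the check a browser could apply before parsing. No header has been defined at this point, so the magic string "WEBSRT" on the first line, the function name and the BOM handling are purely hypothetical choices for this example.

```python
# Hypothetical signature line; nothing like this has been specified yet.
HYPOTHETICAL_MAGIC = "WEBSRT"

def is_websrt(data: bytes) -> bool:
    """Accept a file only if it is UTF-8 and its first line is the signature."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # WebSRT is UTF-8 text only, so anything else is rejected outright.
        return False
    text = text.lstrip("\ufeff")  # tolerate a leading byte order mark
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    return first_line.strip() == HYPOTHETICAL_MAGIC

# A file with the signature is accepted, a plain SRT file is rejected.
print(is_websrt(b"WEBSRT\n\n1\n00:00:01,000 --> 00:00:02,000\nHi\n"))
print(is_websrt(b"1\n00:00:01,000 --> 00:00:02,000\nHi\n"))
```

Existing players that sniff SRT by looking for a cue number or timestamp at the very start of the file would fail on the extra first line, while players that skip unrecognized leading lines would not, which is the split described under the header option.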



Here's the SRT research I promised: http://blog.foolip.org/2010/08/20/srt-research/

--
Philip Jägenstedt
Core Developer
Opera Software
