Re: [whatwg] Timed tracks: feedback compendium

Philip Jägenstedt Fri, 22 Oct 2010 04:09:32 -0700

On Fri, 22 Oct 2010 11:45:24 +0200, Simon Pieters <sim...@opera.com> wrote:

On Fri, 22 Oct 2010 11:21:44 +0200, Silvia Pfeiffer <silviapfeiff...@gmail.com> wrote:

Since the attributes in <track> are a hint, probably what is available
in the file should overrule what is in the <track> attributes. It is
the same for the @charset attribute, which is overruled to utf-8 for
WebSRT IIRC.


No, charset="" overrules the encoding for WebSRT per spec.


We should just remove charset="" from the spec.

* add a means to add comments

e.g.
// Lines starting with // are comments
So far the web has two comment syntaxes: <!-- HTML style --> and /* CSS style */,
so if we need comments I think we should pick one of these.


Actually there are three more in JavaScript:

// line comment
<!-- line comment
--> line comment

http://wiki.whatwg.org/wiki/Web_ECMAScript#HTML_comments

I'm not fussed. I thought your analysis pointed to //, which is also
nicer because it takes the full line into account without a need for
end tags. Also, it is common from C++ and other programming languages.
But I don't really mind - we just need a decision and reasons for why.

Using <!-- --> is a bad idea since the WebSRT syntax already uses -->. I don't see the need for multiline comments.

Right. If we must have comments I think I'd prefer /* ... */ since both CSS and JavaScript have it, and I can't see that single-line comments will be easier from a parser perspective.

Anyway, I agree that at least a magic header like "WebSRT" is needed because
of the horrors of legacy SRT parsing.
I don't see why we can't just consume the legacy and support it in WebSRT. Part of the point with WebSRT is to support the legacy. If we don't want to support the legacy, then the format can be made a lot cleaner.

Did you read <http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-October/028799.html> and look at <http://ale5000.altervista.org/subtitles.htm>?

Do you think it's a good idea to make WebSRT an extension of ale5000-SRT? My opinion is that it's not a very good idea, which of course means we can simplify some aspects of the format. For example, we don't need to allow both , and . as the millisecond separator, and the time parsing in general can be made more sane.
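
To illustrate the separator point, here is a minimal sketch that contrasts a strict timestamp parser with a lenient legacy-style one. The two patterns, the function name parse_timestamp and the exact field widths are assumptions made for this example; neither pattern is taken from the WebSRT spec or from any real SRT parser.

```python
import re

# Hypothetical strict pattern: two-digit fields and "." as the only
# millisecond separator. An assumption for illustration, not spec text.
STRICT = re.compile(r"(\d{2}):([0-5]\d):([0-5]\d)\.(\d{3})")

# Hypothetical lenient pattern in the spirit of legacy SRT handling:
# one- or two-digit minute/second fields and either "," or "." accepted.
LENIENT = re.compile(r"(\d+):(\d{1,2}):(\d{1,2})[,.](\d{3})")


def parse_timestamp(text, strict=True):
    """Return the timestamp in milliseconds, or None if it does not match."""
    pattern = STRICT if strict else LENIENT
    match = pattern.fullmatch(text.strip())
    if match is None:
        return None
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
```

With these assumed patterns, "00:01:02,500" is rejected in strict mode and accepted in lenient mode, which is the kind of simplification that dropping legacy compatibility would allow.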

Breaking SRT compat means that we can
go back to requiring UTF-8 as the encoding. However, UTF-8 does complicate
the magic header a bit due to the possibility of a BOM [1]. While it would
be nice to forbid the use of a BOM, I expect we'd then see lots of
frustration from authors whose editors automatically insert it...

[1] http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
I'm happy to enforce UTF-8 on WebSRT. The @charset can work for other
formats. I didn't know about the BOM problem - but having read it, I
would think it makes sense to forbid it. What tools do and how they
deal with erroneous files is a different matter.
Forbidding it would be the frustration. Consider editing a WebSRT file in Notepad, and then suddenly it doesn't work anymore. Instead we should allow the BOM. (WebSRT already allows the BOM.)

This means that it's trickier to use "WebSRT" as the magic bytes, but I agree it's probably the better trade-off.
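
As a rough sketch of why an optional BOM makes the magic-bytes check slightly trickier, the check below skips a leading UTF-8 BOM before comparing. The magic string "WebSRT" is the one floated in this thread, and has_magic_header is a hypothetical helper name; what must follow the magic string (for example a line terminator) is left unspecified here.

```python
import codecs

# Magic string as floated in this thread; an assumption, not a settled value.
MAGIC = b"WebSRT"


def has_magic_header(data: bytes) -> bool:
    """Return True if data starts with the magic string, after an optional UTF-8 BOM."""
    # codecs.BOM_UTF8 is the three-byte sequence EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.startswith(MAGIC)
```

A sniffer that compared only the first six bytes against the magic string would reject a file saved by an editor that inserts the BOM, which is the Notepad scenario described above.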


--
Philip Jägenstedt
Core Developer
Opera Software
