Re: [Gen-art] Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-18 Thread Elwyn Davies
Hi, Jean-Marc and co-authors

On Thu, 2012-05-17 at 12:49 -0400, Jean-Marc Valin wrote:
 Hi Elwyn,
 
 We've submitted a -14 draft to address your comments. Again, see the
 response to each of these issues in the attached document.

I think we are done.  Thanks once again for the rapid response.

Regards,
Elwyn

PS
I still prefer octets.

/E
 
 Thanks again,
 
   Jean-Marc
 
 
 On 12-05-16 05:26 PM, Elwyn Davies wrote:
  Hi, Jean-Marc.
  
  ... and thanks for the super-quick response!  You have been quite busy.
  
  I have had a look through the new draft and I think the additions help
  considerably with comprehension for the naive (and give new
  implementers a way in).
 



Re: [Gen-art] Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-17 Thread Cullen Jennings

I think the authors just about have a -14 draft ready but I wanted to comment 
on one topic inline 


On May 16, 2012, at 3:26 PM, Elwyn Davies wrote:

 Hi, Jean-Marc.
 
 ... and thanks for the super-quick response!  You have been quite busy.
 
 I have had a look through the new draft and I think the additions help
 considerably with comprehension for the naive (and give new
 implementers a way in).
 
 I'll leave you to negotiate with the RFC Editor over the Wikipedia
 references.  To quote the RFC Style guide
 http://www.rfc-editor.org/rfc-style-guide/rfc-style
 Section 4.8, item (x) References, last section:
 URLs and DNS names in RFCs
 
  The use of URLs in RFCs is discouraged, because many URLs are not
  stable references.  Exceptions may be made for normative
  references in those cases where the URL is demonstrably the most
  stable reference available.  References to long-lived files on
  ietf.org and rfc-editor.org are generally acceptable.
 They are certainly convenient *as long as they remain in place and
 aren't corrupted*.
 
 I found a couple of trivial editorial nits in the changes:
 s4.3.3 (in the added text):
 The CELT layer, however, can adapt over a very wide range of rates,
 and thus has a large number of codebooks sizes
 s/codebooks/codebook/
 
 s4.3.3, para after Table 57: s?the maximums in bit/sample are
 precomputed?the maximums in bits/sample are precomputed?
 
 Also suggest:
 s4.3: Add reference for Bark scale: Zwicker, E. (1961), Subdivision of
 the audible frequency range into critical bands, The Journal of the
 Acoustical Society of America, 33, Feb., 1961.
 
 A few responses in line below (agreed pieces elided): 
 
 Regards,
 Elwyn  Davies
 
 
 On Tue, 2012-05-15 at 20:33 -0400, Jean-Marc Valin wrote:
 Hi Elwyn,
 
 Thanks for the very thorough review. We've addressed your issues and
 submitted draft version -13. See our response to each of the issues you
 raised (aggregated from all the authors) in the attached document.
 
 Thanks very much to all the authors.
 
 Cheers,
 
  Jean-Marc
 
 
 
 Elwyn Davies wrote:
 Major issues: 
 Can we accept the code as normative?  If not how do we proceed?
 
 The issue with code being normative was specifically addressed in 
 the guidelines document for this WG (RFC 6569).
 
 Yes.  I failed to read this although Robert Sparks did point out to me
 that the code was normative - but I don't think he said this was
 agreed in advance (or maybe I didn't understand what he was saying).
 
 To be honest I would like to have had the time to tie the text to the
 code but realistically that is several weeks or even months work to do
 it properly - without that I feel that I have only done half a job. I
 may decide that it interests me enough to have a superficial go next
 week but no promises!
 
 
 Minor issues:
  Contrast between decoder descriptions of LP part and CELT part:  The
  SILK descriptions go into gory detail on the values used in lots of
  tables, etc., whereas the CELT part has a very limited treatment of the
  numeric values used (assuming reliance on finding the values in the
  reference implementation, either explicitly or implicitly).  There are
  things to be said for both techniques.  I was wondering (while reading
  the SILK description) if the authors have any means of automatically
  generating the tables from the code in the SILK part (or vice versa) to
  avoid double maintenance. On the other hand, there are pieces of the
  CELT decoder description (especially in s4.3.3 where knowing numbers of
  bands, etc.) where some actual numbers would help comprehension.
 
 
 We have made many changes to section 4.3 (and 4.3.3 specifically) to
 address
 the specific issues below. As for the tables, they are not generated
 automatically.
 
 I think this is addressed satisfactorily now.  There is still some
 difference but it is much reduced and not so glaring now. The addition
 of the band tables really helps.
 
 
 
  s2 (and more generally):  The splitting of the signal in the frequency
  domain into signal (components?) below and above 8kHz presumably
  requires that the signal is subjected to a Discrete Fourier Transform
  to allow the signal to be split.  I think something is needed in s2 to
  explain how this is managed (or if I don't understand, to explain why
  it isn't necessary).
 
 No DFT is used. The lower band is obtained through resampling (which
 is already described) and the higher band is obtained by not coding
 the lower band with CELT (the text says that CELT starts at band 17 in
 hybrid mode). The explanation was reworded to make this as clear as
 possible at this point in the text.
 
 [I thought I had reworded this comment in the 2nd version to talk about
 MDCT but no matter]. 
 Yes, para 5 of s2 does say that the bands are discarded.  I think it
 would be useful to have a concrete statement in the new text added to 

Re: [Gen-art] Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-17 Thread Elwyn Davies
Hi, Jean-Marc.

... and thanks for the super-quick response!  You have been quite busy.

I have had a look through the new draft and I think the additions help
considerably with comprehension for the naive (and give new
implementers a way in).

I'll leave you to negotiate with the RFC Editor over the Wikipedia
references.  To quote the RFC Style guide
http://www.rfc-editor.org/rfc-style-guide/rfc-style
Section 4.8, item (x) References, last section:
 URLs and DNS names in RFCs
 
   The use of URLs in RFCs is discouraged, because many URLs are not
   stable references.  Exceptions may be made for normative
   references in those cases where the URL is demonstrably the most
   stable reference available.  References to long-lived files on
   ietf.org and rfc-editor.org are generally acceptable.
They are certainly convenient *as long as they remain in place and
aren't corrupted*.

I found a couple of trivial editorial nits in the changes:
s4.3.3 (in the added text):
 The CELT layer, however, can adapt over a very wide range of rates,
 and thus has a large number of codebooks sizes
s/codebooks/codebook/

s4.3.3, para after Table 57: s?the maximums in bit/sample are
precomputed?the maximums in bits/sample are precomputed?

Also suggest:
s4.3: Add reference for Bark scale: Zwicker, E. (1961), Subdivision of
the audible frequency range into critical bands, The Journal of the
Acoustical Society of America, 33, Feb., 1961.
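
(For context, the Bark scale maps frequency onto the ear's critical
bands.  As a rough illustration only - this is Traunmueller's well-known
closed-form approximation, not anything taken from the draft or the
reference code:)

    #include <stdio.h>

    /* Traunmueller's approximation of the Bark (critical band) scale.
     * Illustrative only; the draft itself just cites the Zwicker paper. */
    static double hz_to_bark(double f_hz)
    {
        return 26.81 * f_hz / (1960.0 + f_hz) - 0.53;
    }

    int main(void)
    {
        /* 8 kHz, the SILK/CELT crossover in hybrid mode, is about Bark 21. */
        printf("8000 Hz = %.2f Bark\n", hz_to_bark(8000.0));
        return 0;
    }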

A few responses in line below (agreed pieces elided): 

Regards,
Elwyn  Davies


On Tue, 2012-05-15 at 20:33 -0400, Jean-Marc Valin wrote:
 Hi Elwyn,
 
 Thanks for the very thorough review. We've addressed your issues and
 submitted draft version -13. See our response to each of the issues you
 raised (aggregated from all the authors) in the attached document.

Thanks very much to all the authors.
 
 Cheers,
 
   Jean-Marc
 


 Elwyn Davies wrote:
  Major issues: 
  Can we accept the code as normative?  If not how do we proceed?
 
 The issue with code being normative was specifically addressed in 
 the guidelines document for this WG (RFC 6569).
 
Yes.  I failed to read this although Robert Sparks did point out to me
that the code was normative - but I don't think he said this was
agreed in advance (or maybe I didn't understand what he was saying).

To be honest I would like to have had the time to tie the text to the
code but realistically that is several weeks or even months work to do
it properly - without that I feel that I have only done half a job. I
may decide that it interests me enough to have a superficial go next
week but no promises!

 
  Minor issues:
  Contrast between decoder descriptions of LP part and CELT part:  The
  SILK descriptions go into gory detail on the values used in lots of
  tables, etc., whereas the CELT part has a very limited treatment of the
  numeric values used (assuming reliance on finding the values in the
  reference implementation, either explicitly or implicitly).  There are
  things to be said for both techniques.  I was wondering (while reading
  the SILK description) if the authors have any means of automatically
  generating the tables from the code in the SILK part (or vice versa) to
  avoid double maintenance. On the other hand, there are pieces of the
  CELT decoder description (especially in s4.3.3 where knowing numbers of
  bands, etc.) where some actual numbers would help comprehension.
  
 
 We have made many changes to section 4.3 (and 4.3.3 specifically) to
 address
 the specific issues below. As for the tables, they are not generated
 automatically.

I think this is addressed satisfactorily now.  There is still some
difference but it is much reduced and not so glaring now. The addition
of the band tables really helps.

 
 
  s2 (and more generally):  The splitting of the signal in the frequency
  domain into signal (components?) below and above 8kHz presumably
  requires that the signal is subjected to a Discrete Fourier Transform
  to allow the signal to be split.  I think something is needed in s2 to
  explain how this is managed (or if I don't understand, to explain why
  it isn't necessary).
 
 No DFT is used. The lower band is obtained through resampling (which
 is already described) and the higher band is obtained by not coding
 the lower band with CELT (the text says that CELT starts at band 17 in
 hybrid mode). The explanation was reworded to make this as clear as
 possible at this point in the text.

[I thought I had reworded this comment in the 2nd version to talk about
MDCT but no matter]. 
Yes, para 5 of s2 does say that the bands are discarded.  I think it
would be useful to have a concrete statement in the new text added to s4.3
that bands 0 to 16 are discarded in hybrid mode (thereby making the 17
in the band boost section more obvious) [There is a comment below that
you have added some text about band 17 in 

Re: [Gen-art] Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-17 Thread Jean-Marc Valin
Hi Elwyn,

We've submitted a -14 draft to address your comments. Again, see the
response to each of these issues in the attached document.

Thanks again,

Jean-Marc


On 12-05-16 05:26 PM, Elwyn Davies wrote:
 Hi, Jean-Marc.
 
 ... and thanks for the super-quick response!  You have been quite busy.
 
 I have had a look through the new draft and I think the additions help
 considerably with comprehension for the naive (and give new
 implementers a way in).

On 12-05-16 05:26 PM, Elwyn Davies wrote:
 The CELT layer, however, can adapt over a very wide range of rates,
 and thus has a large number of codebooks sizes
 s/codebooks/codebook/

Fixed.

 s4.3.3, para after Table 57: s?the maximums in bit/sample are
 precomputed?the maximums in bits/sample are precomputed?

Fixed.

 Also suggest:
 s4.3: Add reference for Bark scale: Zwicker, E. (1961), Subdivision of
 the audible frequency range into critical bands, The Journal of the
 Acoustical Society of America, 33, Feb., 1961.

Done.

 No DFT is used. The lower band is obtained through resampling (which
 is already described) and the higher band is obtained by not coding
 the lower band with CELT (the text says that CELT starts at band 17 in
 hybrid mode). The explanation was reworded to make this as clear as
 possible at this point in the text.
 
 [I thought I had reworded this comment in the 2nd version to talk about
 MDCT but no matter]. 
 Yes, para 5 of s2 does say that the bands are discarded.  I think it
 would be useful to have a concrete statement in the new text added to s4.3
 that bands 0 to 16 are discarded in hybrid mode (thereby making the 17
 in the band boost section more obvious) [There is a comment below that
 you have added some text about band 17 in section 4.3 but I can't see
 it].

Sorry, we started working on the revision as soon as you sent the first part
of the review, and then we just copied the new parts (didn't notice some of
the review changed).

Also, the reason you didn't see the new explanations about band 17 in s4.3 is 
that we moved them to s2 para 5, to help make the explanation of the signal 
splitting clearer, but forgot to update the response to indicate that (sorry 
about that). However, it probably does make sense to explain it in both places, 
so the sentence,
  In hybrid mode, the first 17 bands (up to 8 kHz) are not coded.
has been added to s4.3 as well.
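
To make the split concrete, a minimal sketch of the idea (the function
name and the mode flag are illustrative, not from the reference
implementation):

    /* In hybrid mode, SILK covers 0-8 kHz, so the CELT layer simply
     * skips bands 0..16 and starts decoding at band 17; no
     * transform-domain filtering is needed to split the signal. */
    static int celt_start_band(int hybrid_mode)
    {
        return hybrid_mode ? 17 : 0;
    }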

 As explained in the LBRR text, a 10 ms frame will only contain 10 ms
 LBRR data even if the previous frame was 20 ms, so there's 10 ms
 missing.
 Indeed - that there would be a hole was clear.  The 'How' referred to
 how it would be concealed.  Having read further by now this may be down
 to Packet Loss Concealment - so maybe all it needs is a forward ref to
 s4.4. 

Reference added.

 We believe that in the field of audio codecs, the mention of byte without
 further context is well understood to mean 8 bits.
 
 True. But this is a matter of IETF style.  The style is to use octets
 where we mean 8-bit bytes. I think you now have a mixture!
 


Indeed, there's a bit of inconsistency here. Considering Cullen's email,
the document now uses byte consistently.

 s4.2.7.5.1, para 1: s/This indexes an element in a coarse codebook,
 selects the PDFs for the second stage of the VQ/This indexes an
 element in a coarse codebook that selects the PDFs for the second
 stage of the VQ/

 The text as written is correct. The index I1 is what selects the PDFs
 for the second stage, not the vector from the coarse codebook in
 Tables 23 and 24. I.e., it's saying, "This does A, B, and C."
 
 OK.  I think it might be clearer if the three things were separated out
 as a list.  Now you point it out I can read it correctly but it
 triggered minor confusion - worth turning the three things into bullet
 points.

This is not a bad idea. I agree it helps make things clearer. Done.
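
A tiny sketch of the structure being described (the type and field names
are hypothetical, not from the reference code):

    /* The stage-1 index I1 does several jobs at once: it selects the
     * vector from the coarse codebook AND the PDFs used to decode the
     * second stage of the VQ. */
    typedef struct {
        const short         *coarse_vector; /* coarse codebook entry     */
        const unsigned char *stage2_pdfs;   /* PDFs for the second stage */
    } stage1_entry;

    static stage1_entry lookup_stage1(const stage1_entry *tab, int I1)
    {
        return tab[I1];
    }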

 NEW:  s4.3: Add reference for Bark scale: Zwicker, E. (1961),
 Subdivision of the audible frequency range into critical bands, The
 Journal of the Acoustical Society of America, 33, Feb., 1961.

Done (as stated above).

 s4.3.3: (was specified as s4.3.2.3, which was wrong) Paragraph on
 decoding band boosts:  Might be improved by using
 equations rather than the wordy descriptions used at present.
 
 Any thoughts on this one?

Oops, that one slipped through. While most of the text is actually
describing an algorithm rather than an equation, it was possible to simplify
the part about the quanta with an equation. The text now reads:
  For each band from the coding start (0 normally, but 17 in Hybrid mode)
  to the coding end (which changes depending on the signaled bandwidth),
  the boost quanta in units of 1/8 bit is calculated as:
  quanta = min(8*N, max(48, N)).
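
A direct transcription of that formula (hypothetical helper; N is the
band size in MDCT samples and the result is in units of 1/8 bit):

    /* quanta = min(8*N, max(48, N)): small bands are capped at one bit
     * per sample, and larger bands never fall below 48 eighth-bits. */
    static int boost_quanta(int N)
    {
        int cap = 8 * N;             /* one bit per sample */
        int fl  = N > 48 ? N : 48;   /* max(48, N)         */
        return cap < fl ? cap : fl;  /* min of the two     */
    }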

 s4.3.3: snip.

 Added an explanation of band 17 
 
 I don't think this happened.

See above -- operator error. It's fixed now.

 The tests in s6.1 are measuring the quality of the decoded signal
 against the 
 reference decoded signal, not against 

Re: Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-16 Thread Jean-Marc Valin
Hi Elwyn,

Thanks for the very thorough review. We've addressed your issues and
submitted draft version -13. See our response to each of the issues you
raised (aggregated from all the authors) in the attached document.

Cheers,

Jean-Marc

On 12-05-14 09:13 AM, Elwyn Davies wrote:
 I am the assigned Gen-ART reviewer for this draft. For background on 
 Gen-ART, please see the FAQ at 
 http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq. 
 
 Please resolve these comments along with any other Last Call comments 
 you may receive. 
 
 Document: draft-ietf-codec-opus-12.txt
 Reviewer:  Elwyn Davies
 Review Date:  14 May 2012 (completed)
 IETF LC End Date: 10 May 2012
 IESG Telechat date: (if known) -
 
Elwyn Davies wrote:
 Major issues: 
 Can we accept the code as normative?  If not how do we proceed?

The issue with code being normative was specifically addressed in 
the guidelines document for this WG (RFC 6569).

 Minor issues:
 Contrast between decoder descriptions of LP part and CELT part:  The
 SILK descriptions go into gory detail on the values used in lots of
 tables, etc., whereas the CELT part has a very limited treatment of the
 numeric values used (assuming reliance on finding the values in the
 reference implementation, either explicitly or implicitly).  There are
 things to be said for both techniques.  I was wondering (while reading
 the SILK description) if the authors have any means of automatically
 generating the tables from the code in the SILK part (or vice versa) to
 avoid double maintenance. On the other hand, there are pieces of the
 CELT decoder description (especially in s4.3.3 where knowing numbers of
 bands, etc.) where some actual numbers would help comprehension.
 

We have made many changes to section 4.3 (and 4.3.3 specifically) to address
the specific issues below. As for the tables, they are not generated
automatically.


 s2 (and more generally):  The splitting of the signal in the frequency
 domain into signal (components?) below and above 8kHz presumably
 requires that the signal is subjected to a Discrete Fourier Transform to
 allow the signal to be split.  I think something is needed in s2 to
 explain how this is managed (or if I don't understand, to explain why it
 isn't necessary).

No DFT is used. The lower band is obtained through resampling (which is already 
described) and the higher band is obtained by not coding the lower band with 
CELT (the text says that CELT starts at band 17 in hybrid mode). The 
explanation was reworded to make this as clear as possible at this point in the 
text.

 s4.1: Despite having found a copy of the range coding paper and tried to
 read it, I found the whole business of range coding opaque.  Given that
 this is (I think) fairly novel, some additional less opaque description
 could be very helpful to people trying to understand what is going on.
 In particular knowledge of how entropy coding works is pretty much a
 prerequisite.

Added a link to the Wikipedia article on range coding.
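
For intuition: a range coder is arithmetic coding with the interval kept
in an integer and renormalized a byte at a time.  A toy decoding step
might look like this (pedagogical sketch only; the actual Opus range
coder in the reference code differs in detail):

    #include <stdint.h>

    typedef struct {
        uint32_t val;  /* offset of the coded value within the range */
        uint32_t rng;  /* size of the current interval               */
    } toy_rc;

    /* Decode one symbol given cumulative frequencies cumfreq[0..nsym],
     * where cumfreq[0] == 0 and cumfreq[nsym] is the total.  The
     * byte-at-a-time renormalization is omitted for brevity. */
    static int toy_rc_decode(toy_rc *rc, const uint16_t *cumfreq, int nsym)
    {
        uint32_t scale = rc->rng / cumfreq[nsym];
        uint32_t f = rc->val / scale;
        int k = 0;

        if (f > (uint32_t)(cumfreq[nsym] - 1))
            f = cumfreq[nsym] - 1;
        while (cumfreq[k + 1] <= f)  /* find the bracketing interval */
            k++;
        rc->val -= scale * cumfreq[k];
        rc->rng  = scale * (cumfreq[k + 1] - cumfreq[k]);
        return k;
    }

More probable symbols get wider sub-intervals and therefore cost fewer
bits - which is exactly the entropy-coding prerequisite mentioned above.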

 s4.2.5, para 3:
 When switching from 20 ms to 10 ms, the 10 ms LBRR data can only cover
 part of the previous 20 ms Opus frame, potentially leaving a hole that
 needs to be concealed from even a single packet loss.
 How?

As explained in the LBRR text, a 10 ms frame will only contain 10 ms LBRR data 
even if the previous frame was 20 ms, so there's 10 ms missing.
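
The arithmetic behind that is simple enough to sketch (hypothetical
helper, not from the reference implementation):

    /* A 10 ms frame carries at most 10 ms of LBRR for the previous
     * packet; if that packet was 20 ms, the remaining 10 ms has no
     * redundancy and must be covered by loss concealment (s4.4). */
    static int lbrr_hole_ms(int prev_frame_ms, int cur_frame_ms)
    {
        int hole = prev_frame_ms - cur_frame_ms;
        return hole > 0 ? hole : 0;  /* 20 -> 10 leaves a 10 ms hole */
    }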

 s4.2.5, para 4:
 In order to properly produce LBRR frames under all conditions, an
 encoder might need to buffer up to 60 ms of audio and re-encode it
 during these transitions.  However, the reference implementation opts
 to disable LBRR frames at the transition point for simplicity.

 Should this be phrased in RFC 2119 requirements language?  The first
 part sounds like a SHOULD with the second part being the get-out, but
 it's not entirely clear what the consequences are other than simplicity.

There really is no SHOULD there. We're just describing possible strategies for 
encoding. In general, we have avoided 2119 language as much as possible for the 
encoder. However, the sentence, "Since transitions are relatively infrequent in
normal usage, this does not have a significant impact on packet loss
robustness," was added to alleviate any concerns about the consequences of this
decision.

 ==
  MINOR ISSUES ABOVE submitted as part 1 of review
 ==
 general/s11.2:  Several references [Hadamard], [Viterbi], etc., are to
 Wikipedia pages.  Whilst these are convenient (and only illustrative)
 they are not guaranteed to be very stable.  Better (i.e., more stable)
 references are desirable.

We believe that the links themselves should be quite stable. We could 
also link specific revisions in Wikipedia, but the quality of the content
is more likely to get better than worse. Considering
that these references are informational, we believe that Wikipedia is a
good 

Re: Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-15 Thread Elwyn Davies
On Mon, 2012-05-14 at 17:08 -0600, Cullen Jennings wrote:
 Thank you kindly for the detailed review. More inline ...
 
 On May 14, 2012, at 9:06 AM, Elwyn Davies wrote:
 
  Summary: 
  Before offering some views on the document, let me say that this piece
  of work seems to be a tour de force on the part of its developers.  It is
  certainly one of the (if not the) most technically complex pieces of
  work that has been presented to the IETF,
 
 And I would like to say thank you for the detailed review - having read the 
 whole draft several times, I know that one's eyes can start to glaze over on 
 what seems like something almost as long as the NFS spec ... so thanks for 
 the review. 
 
Not sure I would say 'it was my pleasure' but it certainly taught me a
thing or two!

 I did want to comment on the one major issue 
 
  
  
  Major issues: 
  Can we accept the code as normative?  If not how do we proceed?
 
 RFC 6569 says that the code needs to be the normative specification so
 I think this issue is pretty much resolved by that RFC. 
 
 As a bit of irrelevant history - this topic was discussed at various
 stages. It was discussed in the chartering process -
 draft-valin-codec-guidelines referred to by the charter said code
 would be normative. I don't mean by this that the charter said the
 code had to be normative but just that this conversation goes back to
 before the WG was formed. It was later discussed in the WG and
 resulted in WG consensus on having the code as normative. Just as
 background on a few of the reasons that this was decided this way,
 many people felt that the spec would be much longer, and much harder
 to implement or review if the text was normative. Code is a very
 compact representation of what happens on boundary edge conditions and
 eliminates many types of misinterpretations. I do understand normative
 code introduces other types of issues but it is a fairly common
 practice in specifying codecs to use normative code.
 
 Cullen Codec Co-Chair

I don't have a problem with this in principle; however, as a reviewer I
get this nagging feeling that on a few days' acquaintance it's impossible
to demonstrate that the text accurately, albeit non-normatively,
describes the code.  And that makes me feel I have done half a job.
However, I also know that starting from scratch it would take me several
months to get anywhere near an accurate view (I know from considerable
experience with the DTN2 reference implementation and, years ago, the
source code of yacc - now that ages me!).

If the WG, doc shepherd, and the IESG are happy that the mapping is near
enough, so be it.  And, sorry, I didn't look at RFC 6569 during my
review ... probably should have done.

Regards,
Elwyn

Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-14 Thread Elwyn Davies
I am the assigned Gen-ART reviewer for this draft. For background on 
Gen-ART, please see the FAQ at 
http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq. 

Please resolve these comments along with any other Last Call comments 
you may receive. 

Document: draft-ietf-codec-opus-12.txt
Reviewer:  Elwyn Davies
Review Date:  14 May 2012 (completed)
IETF LC End Date: 10 May 2012
IESG Telechat date: (if known) -

Summary: 
Before offering some views on the document, let me say that this piece
of work seems to be a tour de force on the part of its developers.  It is
certainly one of the (if not the) most technically complex pieces of
work that has been presented to the IETF, and is far more mathematically
complex and rigorous than our usual run of standards.  It is also
perhaps a vindication of the rough consensus and running code
philosophy. So congratulations to the authors and the companies that
have supported this work.  But back to the review

I came to this review with a very outdated view of what goes on inside
codecs, so it was rather a shock to the system.  Taking that into
account, I think that some additional high level, up front description
of the two parts (SILK and CELT) and the range encoding system would
make it easier for future (naive) readers and potential implementers to
understand what is going on.  For example, after struggling with the
description of CELT in the decoder (s4.3) I found that going off and
reading the slides on CELT presented at IETF 79 helped considerably.
Such additions might also help future dissemination of the technique by
giving potential users a quick overview.  It would also make it easier
to justify the decision to postpone the high level (and highly
mathematical/technical) view to section 5 after the detailed description
of the decoder.

I also found that the descriptions of the SILK and CELT decoders
appeared to be written with a different perspective (see detailed
comments) doubtless by different authors.  This is not ideal and,
despite the expected further increase in page count, some additional
text in s4.3 seems desirable.

By far the most difficult to parse section is 4.3.3 talking about bit
allocation in CELT.  Despite considerable study, I didn't feel I had
really got to grips with how this worked.  An example might help.

Finally, the most contentious aspect of this document is the requirement
to treat the code as normative.  There are just over 30,000 physical
lines of code and I haven't checked them hardly at all.  Slocount
reckons this represents 7.14 man years of effort.  As with the document,
there is a discrepancy between CELT and SILK with the latter code being
more heavily commented, especially as regards routine parameters.  A
proper validation of the claim that the code implements the description
would take several weeks of time: I guess that we have to take the
document shepherd's assurances on this.  One issue is that both code and
document are pretty much saturated with what we used to call 'magic
numbers'.  Although these are explained in the document, it does not
appear that this is always the case in the code.  I would also be
happier if the code contained Doxygen (or similar) documentation
statements for all routines (rather than just the API).

So overall, the work is something that ought to be standardized but the
document needs further work - not quite ready for the IESG. 

Major issues: 
Can we accept the code as normative?  If not how do we proceed?

Minor issues:
Contrast between decoder descriptions of LP part and CELT part:  The
SILK descriptions go into gory detail on the values used in lots of
tables, etc., whereas the CELT part has a very limited treatment of the
numeric values used (assuming reliance on finding the values in the
reference implementation, either explicitly or implicitly).  There are
things to be said for both techniques.  I was wondering (while reading
the SILK description) if the authors have any means of automatically
generating the tables from the code in the SILK part (or vice versa) to
avoid double maintenance. On the other hand, there are pieces of the
CELT decoder description (especially in s4.3.3 where knowing numbers of
bands, etc.) where some actual numbers would help comprehension.

s2 (and more generally):  The splitting of the signal in the frequency
domain into signal (components?) below and above 8kHz in the hybrid case
presumably requires that the output of the CELT encoder is windowed so
that only one of the encoders generates output below 8 kHz.  I think something
is needed in s2 to explain how this is managed (presumably to do with
energy bands?).  I didn't find anything in s5 about what has to be done
for the encoder when running in hybrid mode which surprised me somewhat.

s4.1: Despite having found a copy of the range coding paper and tried to
read it, I found the description of range coding opaque (there are some
more words later in s5 but this is a bit late for understanding the
decoder).  

Re: Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-14 Thread SM

Hi Elwyn,
At 08:06 14-05-2012, Elwyn Davies wrote:

I am the assigned Gen-ART reviewer for this draft. For background on


[snip]


Summary:
Before offering some views on the document, let me say that this piece
of work seems to be a tour de force on the part of its developers.  It is
certainly one of the (if not the) most technically complex pieces of
work that has been presented to the IETF, and is far more mathematically
complex and rigorous than our usual run of standards.  It is also


Let me say that this is one of the best Gen-ART reviews I have 
read.  It is not easy to review draft-ietf-codec-opus-12.  I agree 
that it is the most technically complex piece of work within all IETF 
areas.  The normative choice made by the WG is highly unusual.  The 
alternative would be much more work; it might even be an unworkable option.


Regards,
-sm 



Re: Gen-art last call review of draft-ietf-codec-opus-12.txt (completed)

2012-05-14 Thread Cullen Jennings

Thank you kindly for the detailed review. More inline ...

On May 14, 2012, at 9:06 AM, Elwyn Davies wrote:

 Summary: 
 Before offering some views on the document, let me say that this piece
 of work seems to be a tour de force on the part of its developers.  It is
 certainly one of the (if not the) most technically complex pieces of
 work that has been presented to the IETF,

And I would like to say thank you for the detailed review - having read the 
whole draft several times, I know that one's eyes can start to glaze over on 
what seems like something almost as long as the NFS spec ... so thanks for the 
review. 

I did want to comment on the one major issue 

 
 
 Major issues: 
 Can we accept the code as normative?  If not how do we proceed?

RFC 6569 says that the code needs to be the normative specification so I think 
this issue is pretty much resolved by that RFC. 

As a bit of irrelevant history - this topic was discussed at various stages. It 
was discussed in the chartering process - draft-valin-codec-guidelines referred 
to by the charter said code would be normative. I don't mean by this that the 
charter said the code had to be normative but just that this conversation goes 
back to before the WG was formed. It was later discussed in the WG and resulted 
in WG consensus on having the code as normative. Just as background on a few of 
the reasons that this was decided this way, many people felt that the spec 
would be much longer, and much harder to implement or review if the text was 
normative. Code is a very compact representation of what happens on boundary 
edge conditions and eliminates many types of misinterpretations. I do 
understand normative code introduces other types of issues but it is a fairly 
common practice in specifying codecs to use normative code.

Cullen Codec Co-Chair