Re: [FFmpeg-devel] [PATCH 4/6] avformat/mov: parse ISO-14496-12 ChannelLayout

2023-10-30 Thread Zhao Zhili


> On Feb 24, 2023, at 21:49, Jan Ekström  wrote:
> 
> On Fri, Feb 24, 2023 at 6:25 AM Zhao Zhili  > wrote:
>> 
>> From: Zhao Zhili mailto:zhiliz...@tencent.com>>
>> 
>> Signed-off-by: Zhao Zhili > >
> 
> Hah, I actually happened to recently start coding uncompressed audio
> support in mp4 myself, but what this commit is handling is what
> basically killed my version off since the channel layout box is
> required.
> 
> If you're interested you can check my take over at
> https://github.com/jeeb/ffmpeg/commits/pcmc_parsing_improvements .
> 
> Will comment on some things.

I only have an old copy of the spec, and I may have missed some comments
and made some mistakes. Please notify me in mailing list or personal email
(this one) if I didn’t something wrong.

I have network issue with IRC, can only read the archives if I get the time.
I don’t work on open source for daily jobs.

> 
>> ---
>> libavformat/mov.c  |  79 +++-
>> libavformat/mov_chan.c | 265 +
>> libavformat/mov_chan.h |  26 
>> 3 files changed, 369 insertions(+), 1 deletion(-)
>> 
>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>> index b125343f84..1db869aa2e 100644
>> --- a/libavformat/mov.c
>> +++ b/libavformat/mov.c
>> @@ -940,6 +940,82 @@ static int mov_read_chan(MOVContext *c, AVIOContext 
>> *pb, MOVAtom atom)
>> return 0;
>> }
>> 
>> +static int mov_read_chnl(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>> +{
>> +int64_t end = av_sat_add64(avio_tell(pb), atom.size);
>> +int stream_structure;
>> +int ret = 0;
>> +AVStream *st;
>> +
>> +if (c->fc->nb_streams < 1)
>> +return 0;
>> +st = c->fc->streams[c->fc->nb_streams-1];
>> +
>> +/* skip version and flags */
>> +avio_skip(pb, 4);
> 
> We should really not do this any more. Various FullBoxes have multiple
> versions or depend on the flags. See how I have added FullBox things
> recently, although I would prefer us to have a generic macro/function
> setup for this where you then get the version and flags as arguments
> or whatever in the future.
> 
> For this specific box, there are now versions 0 and 1 defined since
> circa 2018-2019 or so (visible at least in 14496-12 2022)
> 
> Since ISO/IEC has changed the rules for free specifications (against
> the wishes of various spec authors) and all that jazz, this is how
> it's defined in what I have on hand:
> 
> 12.2.4  Channel layout
> 
> 12.2.4.1  Definition
> 
> Box Types:  'chnl'
> Container: Audio sample entry
> Mandatory: No
> Quantity: Zero or one
> 
> This box may appear in an audio sample entry to document the
> assignment of channels in the audio
> stream. It is recommended to use this box to convey the base channel
> count for the DownMixInstructions
> box and other DRC-related boxes specified in ISO/IEC 23003-4.
> The channel layout can be all or part of a standard layout (from an
> enumerated list), or a custom layout
> (which also allows a track to contribute part of an overall layout).
> A stream may contain channels, objects, neither, or both. A stream
> that is neither channel nor object
> structured can implicitly be rendered in a variety of ways.
> 
> 12.2.4.2  Syntax
> 
> aligned(8) class ChannelLayout extends FullBox('chnl', version, flags=0) {
>   if (version==0) {
>  unsigned int(8) stream_structure;
>  if (stream_structure & channelStructured) {
> unsigned int(8) definedLayout;
>  if (definedLayout==0) {
>for (i = 1 ; i <= layout_channel_count ; i++) {
>   //  layout_channel_count comes from the sample entry
>   unsigned int(8) speaker_position;
>   if (speaker_position == 126) {   // explicit position
>  signed int (16) azimuth;
>  signed int (8)  elevation;
>   }
>}
> } else {
>unsigned int(64)   omittedChannelsMap;
>  // a ‘1’ bit indicates ‘not in this track’
> }
>  }
>  if (stream_structure & objectStructured) {
> unsigned int(8) object_count;
>  }
>   } else {
>  unsigned int(4) stream_structure;
>  unsigned int(4) format_ordering;
>  unsigned int(8) baseChannelCount;
>  if (stream_structure & channelStructured) {
> unsigned int(8) definedLayout;
> if (definedLayout==0) {
>unsigned int(8) layout_channel_count;
>for (i = 1 ; i <= layout_channel_count ; i++) {
>   unsigned int(8) speaker_position;
>   if (speaker_position == 126) {   // explicit position
>  signed int (16) azimuth;
>  signed int (8)  elevation;
>   }
>}
> } else {
>int(4) reserved = 0;
>unsigned int(3) channel_order_definition;
>unsigned int(1) omitted_channels_present;
>if (omitted_channels_present == 1) {
>  

Re: [FFmpeg-devel] [PATCH 4/6] avformat/mov: parse ISO-14496-12 ChannelLayout

2023-10-30 Thread Zhao Zhili


> On Feb 24, 2023, at 21:49, Jan Ekström  wrote:
> 
> On Fri, Feb 24, 2023 at 6:25 AM Zhao Zhili  > wrote:
>> 
>> From: Zhao Zhili mailto:zhiliz...@tencent.com>>
>> 
>> Signed-off-by: Zhao Zhili > >
> 
> Hah, I actually happened to recently start coding uncompressed audio
> support in mp4 myself, but what this commit is handling is what
> basically killed my version off since the channel layout box is
> required.
> 
> If you're interested you can check my take over at
> https://github.com/jeeb/ffmpeg/commits/pcmc_parsing_improvements .
> 
> Will comment on some things.

I only have an old copy of the spec, and I may have missed some comments
and made some mistakes. Please notify me in mailing list or personal email
(this one) if I didn’t something wrong.

I have network issue with IRC, can only read the archives if I get the time.
I don’t work on open source for daily jobs.

> 
>> ---
>> libavformat/mov.c  |  79 +++-
>> libavformat/mov_chan.c | 265 +
>> libavformat/mov_chan.h |  26 
>> 3 files changed, 369 insertions(+), 1 deletion(-)
>> 
>> diff --git a/libavformat/mov.c b/libavformat/mov.c
>> index b125343f84..1db869aa2e 100644
>> --- a/libavformat/mov.c
>> +++ b/libavformat/mov.c
>> @@ -940,6 +940,82 @@ static int mov_read_chan(MOVContext *c, AVIOContext 
>> *pb, MOVAtom atom)
>> return 0;
>> }
>> 
>> +static int mov_read_chnl(MOVContext *c, AVIOContext *pb, MOVAtom atom)
>> +{
>> +int64_t end = av_sat_add64(avio_tell(pb), atom.size);
>> +int stream_structure;
>> +int ret = 0;
>> +AVStream *st;
>> +
>> +if (c->fc->nb_streams < 1)
>> +return 0;
>> +st = c->fc->streams[c->fc->nb_streams-1];
>> +
>> +/* skip version and flags */
>> +avio_skip(pb, 4);
> 
> We should really not do this any more. Various FullBoxes have multiple
> versions or depend on the flags. See how I have added FullBox things
> recently, although I would prefer us to have a generic macro/function
> setup for this where you then get the version and flags as arguments
> or whatever in the future.
> 
> For this specific box, there are now versions 0 and 1 defined since
> circa 2018-2019 or so (visible at least in 14496-12 2022)
> 
> Since ISO/IEC has changed the rules for free specifications (against
> the wishes of various spec authors) and all that jazz, this is how
> it's defined in what I have on hand:
> 
> 12.2.4  Channel layout
> 
> 12.2.4.1  Definition
> 
> Box Types:  'chnl'
> Container: Audio sample entry
> Mandatory: No
> Quantity: Zero or one
> 
> This box may appear in an audio sample entry to document the
> assignment of channels in the audio
> stream. It is recommended to use this box to convey the base channel
> count for the DownMixInstructions
> box and other DRC-related boxes specified in ISO/IEC 23003-4.
> The channel layout can be all or part of a standard layout (from an
> enumerated list), or a custom layout
> (which also allows a track to contribute part of an overall layout).
> A stream may contain channels, objects, neither, or both. A stream
> that is neither channel nor object
> structured can implicitly be rendered in a variety of ways.
> 
> 12.2.4.2  Syntax
> 
> aligned(8) class ChannelLayout extends FullBox('chnl', version, flags=0) {
>   if (version==0) {
>  unsigned int(8) stream_structure;
>  if (stream_structure & channelStructured) {
> unsigned int(8) definedLayout;
>  if (definedLayout==0) {
>for (i = 1 ; i <= layout_channel_count ; i++) {
>   //  layout_channel_count comes from the sample entry
>   unsigned int(8) speaker_position;
>   if (speaker_position == 126) {   // explicit position
>  signed int (16) azimuth;
>  signed int (8)  elevation;
>   }
>}
> } else {
>unsigned int(64)   omittedChannelsMap;
>  // a ‘1’ bit indicates ‘not in this track’
> }
>  }
>  if (stream_structure & objectStructured) {
> unsigned int(8) object_count;
>  }
>   } else {
>  unsigned int(4) stream_structure;
>  unsigned int(4) format_ordering;
>  unsigned int(8) baseChannelCount;
>  if (stream_structure & channelStructured) {
> unsigned int(8) definedLayout;
> if (definedLayout==0) {
>unsigned int(8) layout_channel_count;
>for (i = 1 ; i <= layout_channel_count ; i++) {
>   unsigned int(8) speaker_position;
>   if (speaker_position == 126) {   // explicit position
>  signed int (16) azimuth;
>  signed int (8)  elevation;
>   }
>}
> } else {
>int(4) reserved = 0;
>unsigned int(3) channel_order_definition;
>unsigned int(1) omitted_channels_present;
>if (omitted_channels_present == 1) {
>  

Re: [FFmpeg-devel] [PATCH 4/6] avformat/mov: parse ISO-14496-12 ChannelLayout

2023-02-24 Thread Zhao Zhili
On Fri, 2023-02-24 at 15:49 +0200, Jan Ekström wrote:
> On Fri, Feb 24, 2023 at 6:25 AM Zhao Zhili  wrote:
> > 
> > From: Zhao Zhili 
> > 
> > Signed-off-by: Zhao Zhili 
> 
> Hah, I actually happened to recently start coding uncompressed audio
> support in mp4 myself, but what this commit is handling is what
> basically killed my version off since the channel layout box is
> required.
> 
> If you're interested you can check my take over at
> https://github.com/jeeb/ffmpeg/commits/pcmc_parsing_improvements .

Sorry I didn't notice your work on this issue. I have cherry-picked
the first two patches from your branch in v2. Is it OK for you?

It's tediousFor the channel layout supports. Some of the layouts aren't
supported yet, and some of the details are unclear. Please help review
and improve this part.
 
> 
> Will comment on some things.
> 
> > ---
> >  libavformat/mov.c  |  79 +++-
> >  libavformat/mov_chan.c | 265 +
> >  libavformat/mov_chan.h |  26 
> >  3 files changed, 369 insertions(+), 1 deletion(-)
> > 
> > diff --git a/libavformat/mov.c b/libavformat/mov.c
> > index b125343f84..1db869aa2e 100644
> > --- a/libavformat/mov.c
> > +++ b/libavformat/mov.c
> > @@ -940,6 +940,82 @@ static int mov_read_chan(MOVContext *c, AVIOContext 
> > *pb, MOVAtom atom)
> >  return 0;
> >  }
> > 
> > +static int mov_read_chnl(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> > +{
> > +int64_t end = av_sat_add64(avio_tell(pb), atom.size);
> > +int stream_structure;
> > +int ret = 0;
> > +AVStream *st;
> > +
> > +if (c->fc->nb_streams < 1)
> > +return 0;
> > +st = c->fc->streams[c->fc->nb_streams-1];
> > +
> > +/* skip version and flags */
> > +avio_skip(pb, 4);
> 
> We should really not do this any more. Various FullBoxes have multiple
> versions or depend on the flags. See how I have added FullBox things
> recently, although I would prefer us to have a generic macro/function
> setup for this where you then get the version and flags as arguments
> or whatever in the future.

I have added version and flags check, and only supports version 0 with
patch v2. Welcome to add version 1 supports :)

I agree with the idea to cleanup the handling of version and flags for
future proof.

> 
> For this specific box, there are now versions 0 and 1 defined since
> circa 2018-2019 or so (visible at least in 14496-12 2022)
> 
> Since ISO/IEC has changed the rules for free specifications (against
> the wishes of various spec authors) and all that jazz, this is how
> it's defined in what I have on hand:
> 
> 12.2.4  Channel layout
> 
> 12.2.4.1  Definition
> 
> Box Types:  'chnl'
> Container: Audio sample entry
> Mandatory: No
> Quantity: Zero or one
> 
> This box may appear in an audio sample entry to document the
> assignment of channels in the audio
> stream. It is recommended to use this box to convey the base channel
> count for the DownMixInstructions
> box and other DRC-related boxes specified in ISO/IEC 23003-4.
> The channel layout can be all or part of a standard layout (from an
> enumerated list), or a custom layout
> (which also allows a track to contribute part of an overall layout).
> A stream may contain channels, objects, neither, or both. A stream
> that is neither channel nor object
> structured can implicitly be rendered in a variety of ways.
> 
> 12.2.4.2  Syntax
> 
> aligned(8) class ChannelLayout extends FullBox('chnl', version, flags=0) {
>if (version==0) {
>   unsigned int(8) stream_structure;
>   if (stream_structure & channelStructured) {
>  unsigned int(8) definedLayout;
>   if (definedLayout==0) {
> for (i = 1 ; i <= layout_channel_count ; i++) {
>//  layout_channel_count comes from the sample entry
>unsigned int(8) speaker_position;
>if (speaker_position == 126) {   // explicit position
>   signed int (16) azimuth;
>   signed int (8)  elevation;
>}
> }
>  } else {
> unsigned int(64)   omittedChannelsMap;
>   // a ‘1’ bit indicates ‘not in this track’
>  }
>   }
>   if (stream_structure & objectStructured) {
>  unsigned int(8) object_count;
>   }
>} else {
>   unsigned int(4) stream_structure;
>   unsigned int(4) format_ordering;
>   unsigned int(8) baseChannelCount;
>   if (stream_structure & channelStructured) {
>  unsigned int(8) definedLayout;
>  if (definedLayout==0) {
> unsigned int(8) layout_channel_count;
> for (i = 1 ; i <= layout_channel_count ; i++) {
>unsigned int(8) speaker_position;
>if (speaker_position == 126) {   // explicit position
>   signed int (16) azimuth;
>   signed int (8)  elevation;
>}
> }
>  } else {
> int(4) r

Re: [FFmpeg-devel] [PATCH 4/6] avformat/mov: parse ISO-14496-12 ChannelLayout

2023-02-24 Thread Zhao Zhili
On Fri, 2023-02-24 at 10:42 +0100, Tomas Härdin wrote:
> fre 2023-02-24 klockan 20:25 +0800 skrev Zhao Zhili:
> > +    if (!layout) {
> > +    uint8_t positions[64] = {};
> 
> Is there a maximum number of channels defined somewhere? stsd supports
> up to 65535.

AV_CHANNEL_ORDER_NATIVE supports up to 63 different channels.
Patchset v2 adds AVChannelCustom support.

> 
> > +    // stream carries objects
> > +    if (stream_structure & 2) {
> > +    int obj_count = avio_r8(pb);
> > +    av_log(c->fc, AV_LOG_TRACE, "'chnl' with object_count %d\n",
> > obj_count);
> > +    }
> > +
> > +    avio_seek(pb, end, SEEK_SET);
> 
> I feel we should complain loudly if there's bytes not accounted for, at
> least when (stream_structure & 2) == 0

Patchset v2 adds log message when skipping unknown bytes.
After a second thought, I made a mistake that unknown bytes
comes after ChannelLayout belonging to AudioSampleEntryV1,
not inside ChannelLayout. Will drop the check and seek in v3.

class AudioSampleEntryV1(codingname) extends SampleEntry (codingname){
...

ChannelLayout();
// we permit any number of DownMix or DRC boxes:
DownMixInstructions() [];
DRCCoefficientsBasic() [];
...

> 
> The rest I can't say much about
> 
> /Tomas
> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH 4/6] avformat/mov: parse ISO-14496-12 ChannelLayout

2023-02-24 Thread Jan Ekström
On Fri, Feb 24, 2023 at 6:25 AM Zhao Zhili  wrote:
>
> From: Zhao Zhili 
>
> Signed-off-by: Zhao Zhili 

Hah, I actually happened to recently start coding uncompressed audio
support in mp4 myself, but what this commit is handling is what
basically killed my version off since the channel layout box is
required.

If you're interested you can check my take over at
https://github.com/jeeb/ffmpeg/commits/pcmc_parsing_improvements .

Will comment on some things.

> ---
>  libavformat/mov.c  |  79 +++-
>  libavformat/mov_chan.c | 265 +
>  libavformat/mov_chan.h |  26 
>  3 files changed, 369 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/mov.c b/libavformat/mov.c
> index b125343f84..1db869aa2e 100644
> --- a/libavformat/mov.c
> +++ b/libavformat/mov.c
> @@ -940,6 +940,82 @@ static int mov_read_chan(MOVContext *c, AVIOContext *pb, 
> MOVAtom atom)
>  return 0;
>  }
>
> +static int mov_read_chnl(MOVContext *c, AVIOContext *pb, MOVAtom atom)
> +{
> +int64_t end = av_sat_add64(avio_tell(pb), atom.size);
> +int stream_structure;
> +int ret = 0;
> +AVStream *st;
> +
> +if (c->fc->nb_streams < 1)
> +return 0;
> +st = c->fc->streams[c->fc->nb_streams-1];
> +
> +/* skip version and flags */
> +avio_skip(pb, 4);

We should really not do this any more. Various FullBoxes have multiple
versions or depend on the flags. See how I have added FullBox things
recently, although I would prefer us to have a generic macro/function
setup for this where you then get the version and flags as arguments
or whatever in the future.

For this specific box, there are now versions 0 and 1 defined since
circa 2018-2019 or so (visible at least in 14496-12 2022)

Since ISO/IEC has changed the rules for free specifications (against
the wishes of various spec authors) and all that jazz, this is how
it's defined in what I have on hand:

12.2.4  Channel layout

12.2.4.1  Definition

Box Types:  'chnl'
Container: Audio sample entry
Mandatory: No
Quantity: Zero or one

This box may appear in an audio sample entry to document the
assignment of channels in the audio
stream. It is recommended to use this box to convey the base channel
count for the DownMixInstructions
box and other DRC-related boxes specified in ISO/IEC 23003-4.
The channel layout can be all or part of a standard layout (from an
enumerated list), or a custom layout
(which also allows a track to contribute part of an overall layout).
A stream may contain channels, objects, neither, or both. A stream
that is neither channel nor object
structured can implicitly be rendered in a variety of ways.

12.2.4.2  Syntax

aligned(8) class ChannelLayout extends FullBox('chnl', version, flags=0) {
   if (version==0) {
  unsigned int(8) stream_structure;
  if (stream_structure & channelStructured) {
 unsigned int(8) definedLayout;
  if (definedLayout==0) {
for (i = 1 ; i <= layout_channel_count ; i++) {
   //  layout_channel_count comes from the sample entry
   unsigned int(8) speaker_position;
   if (speaker_position == 126) {   // explicit position
  signed int (16) azimuth;
  signed int (8)  elevation;
   }
}
 } else {
unsigned int(64)   omittedChannelsMap;
  // a ‘1’ bit indicates ‘not in this track’
 }
  }
  if (stream_structure & objectStructured) {
 unsigned int(8) object_count;
  }
   } else {
  unsigned int(4) stream_structure;
  unsigned int(4) format_ordering;
  unsigned int(8) baseChannelCount;
  if (stream_structure & channelStructured) {
 unsigned int(8) definedLayout;
 if (definedLayout==0) {
unsigned int(8) layout_channel_count;
for (i = 1 ; i <= layout_channel_count ; i++) {
   unsigned int(8) speaker_position;
   if (speaker_position == 126) {   // explicit position
  signed int (16) azimuth;
  signed int (8)  elevation;
   }
}
 } else {
int(4) reserved = 0;
unsigned int(3) channel_order_definition;
unsigned int(1) omitted_channels_present;
if (omitted_channels_present == 1) {
   unsigned int(64)   omittedChannelsMap;
 // a ‘1’ bit indicates ‘not in this track’
}
 }
  }
  if (stream_structure & objectStructured) {
 // object_count is derived from baseChannelCount
  }
   }
}

12.2.4.3  Semantics

version is an integer that specifies the version of this box (0 or 1).
When authoring, version 1 should be
preferred over version 0. Version 1 conveys the channel
ordering, which is not always the case for
version 0. Version 1 should be used to convey the base channel
count for DRC.

stream_structure is a field 

Re: [FFmpeg-devel] [PATCH 4/6] avformat/mov: parse ISO-14496-12 ChannelLayout

2023-02-24 Thread Tomas Härdin
fre 2023-02-24 klockan 20:25 +0800 skrev Zhao Zhili:
> +    if (!layout) {
> +    uint8_t positions[64] = {};

Is there a maximum number of channels defined somewhere? stsd supports
up to 65535.

> +    // stream carries objects
> +    if (stream_structure & 2) {
> +    int obj_count = avio_r8(pb);
> +    av_log(c->fc, AV_LOG_TRACE, "'chnl' with object_count %d\n",
> obj_count);
> +    }
> +
> +    avio_seek(pb, end, SEEK_SET);

I feel we should complain loudly if there's bytes not accounted for, at
least when (stream_structure & 2) == 0

The rest I can't say much about

/Tomas

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".