Re: [FFmpeg-devel] [PATCH v3] avcodec/jpeg2000dec: support of 2 fields in 1 AVPacket

Jerome Martinez Sat, 24 Feb 2024 20:14:39 -0800

On 24/02/2024 13:26, Tomas Härdin wrote:

[...]

It should be possible to have ffmpeg set up the necessary plumbing for this.

But is it how it works elsewhere in FFmpeg? Would such complex and deep modifications be accepted by others?

Good question. I would propose something like the following:


1) detect the use of SEPARATE_FIELDS and set a flag in AVStream

In practice, in that case (2 jp2k codestreams per AVPacket), the flag is only a tip for the jpeg2000 decoder, because we can autodetect and there are many buggy files in the wild. I was planning to add it later in a separate patch, but attached is a version with the flag.

2) allocate AVFrame for the size of the resulting *frame*

So keeping what is already there.

3a) if the codec is inherently interlaced, call the decoder once
3b) if the codec is not inherently interlaced, call the decoder twice,
with appropriate stride, and keep track of the number of bytes decoded
so far so we know what offset to start the second decode from
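
To illustrate 3b), here is a minimal sketch of what "appropriate stride" could mean, assuming 8-bit single-plane data. write_field is a hypothetical helper, not FFmpeg API: each field is written to every other line of the frame, so the effective stride is twice the frame linesize, and the bottom field starts one line lower.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not FFmpeg API: copy one decoded field into an
 * interleaved frame plane. Field lines land on every other frame line,
 * so the effective stride is twice the frame linesize. */
static void write_field(uint8_t *frame, ptrdiff_t linesize,
                        const uint8_t *field, int width, int field_height,
                        int is_bottom_field)
{
    uint8_t *dst;

    assert(width > 0 && linesize >= width);
    /* The bottom field starts one frame line lower than the top field. */
    dst = frame + (is_bottom_field ? linesize : 0);
    for (int y = 0; y < field_height; y++) {
        memcpy(dst, field + (size_t)y * width, (size_t)width);
        dst += 2 * linesize; /* skip the line owned by the other field */
    }
}
```

Calling it once per field, with is_bottom_field set for the second call (or for the first one if the bottom field is coded first), fills a frame buffer allocated for the full frame height.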



The place I see for that is in decode_simple_internal().

But it is a very hot path that I don't like to modify, and it seems to me that this adds extra code for the 99.9999% (or even more 9s) of files which don't need such a feature, with a higher risk of forgetting this specific feature during future development, e.g. it is not obvious that ff_thread_decode_frame must also be changed when touching this part.

I also needed to add a dedicated AVStream field saying that the decoder is able to manage this functionality (and that it is needed there).

What is the added value of calling the decoder twice from decode.c rather than using a recursive call (or a master function in the decoder calling the current function twice, if preferred) inside the decoder only?

As far as I understand, it would not help for other formats (only the signaling propagation in AVStream helps, and it is done by another AVStream field). I personally highly prefer that such a feature lives as much as possible in a single place in each decoder rather than in pieces scattered a bit everywhere, and each decoder needs to be upgraded anyway.
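
As a rough sketch of the "master function in the decoder" variant: the names decode_one_codestream and decode_packet below are hypothetical, and the real per-codestream decoder is replaced by a naive marker scan. Only the SOC (0xFF4F) and EOC (0xFFD9) marker values come from JPEG 2000. The master function decodes the first codestream, then decodes a second one only if another SOC follows in the same packet, keeping track of the bytes consumed so far.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SOC 0xFF4F /* JPEG 2000 start of codestream marker */
#define EOC 0xFFD9 /* JPEG 2000 end of codestream marker */

static int peek_be16(const uint8_t *buf)
{
    return (buf[0] << 8) | buf[1];
}

/* Hypothetical stand-in for the real per-codestream decoder: it only checks
 * for SOC and scans naively for EOC. Returns bytes consumed, or -1. */
static int decode_one_codestream(const uint8_t *buf, int size)
{
    if (size < 4 || peek_be16(buf) != SOC)
        return -1;
    for (int i = 2; i + 1 < size; i++)
        if (peek_be16(buf + i) == EOC)
            return i + 2;
    return -1;
}

/* Hypothetical master function: decode the first codestream, then a second
 * one only if another SOC follows in the same packet.
 * Returns the total bytes consumed (or -1) and stores the field count. */
static int decode_packet(const uint8_t *buf, int size, int *nb_fields)
{
    int consumed, ret;

    assert(buf && nb_fields);
    *nb_fields = 0;
    consumed = decode_one_codestream(buf, size);
    if (consumed < 0)
        return consumed;
    *nb_fields = 1;

    /* Autodetection: a second SOC right after the first codestream. */
    if (size - consumed > 1 && peek_be16(buf + consumed) == SOC) {
        ret = decode_one_codestream(buf + consumed, size - consumed);
        if (ret < 0)
            return ret;
        consumed += ret;
        *nb_fields = 2;
    }
    return consumed;
}
```

With this shape, the caller in decode.c sees a single decode call per AVPacket, and the two-field logic stays inside the decoder.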

The codecs for which 3b) applies include at least:

* jpeg2000


Our use case.

* ffv1

FFV1 has its own flags internally for interlaced content (interleaved method only) and I expect no work for separated fields. The MXF/FFV1 spec does not plan separated fields for FFV1, and there is no byte in the essence label for that.

* rawvideo
* tiff

I didn't find specifications for the corresponding essence label UL and I have no file for that. As far as I understand it is highly theoretical, but if it appears it would only be a matter of mapping the MXF signaling to the new AVStream field and supporting the feature in the decoders (even if we implement the idea of calling the decoder twice, the decoder needs to be expanded for this feature).

So IMO there is no development to do there either for the moment.

Jérôme

From f4311b718012a92590ce6168355ec118e02052a8 Mon Sep 17 00:00:00 2001
From: Jerome Martinez <[email protected]>
Date: Tue, 20 Feb 2024 16:04:11 +0100
Subject: [PATCH] avcodec/jpeg2000dec: support of 2 fields in 1 AVPacket

---
 libavcodec/avcodec.h       | 14 +++++++++
 libavcodec/codec_par.c     |  3 ++
 libavcodec/codec_par.h     |  5 ++++
 libavcodec/decode.c        |  3 ++
 libavcodec/defs.h          |  7 +++++
 libavcodec/jpeg2000dec.c   | 73 +++++++++++++++++++++++++++++++++++++++-------
 libavcodec/jpeg2000dec.h   |  6 ++++
 libavcodec/pthread_frame.c |  3 ++
 libavformat/mxfdec.c       | 14 +++++++++
 9 files changed, 118 insertions(+), 10 deletions(-)

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 7fb44e28f4..38d63adc0f 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -2116,6 +2116,20 @@ typedef struct AVCodecContext {
      *   an error.
      */
     int64_t frame_num;
+
+    /**
+     * Video only. The way separate fields are wrapped in the container
+     * - decoding: tip from the demuxer
+     * - encoding: not (yet) used
+     */
+    enum AVFrameWrapping               frame_wrapping;
+
+    /**
+     * Video only. Indicate if running the decoder twice for a single AVFrame is supported
+     * - decoding: set by the decoder
+     * - encoding: not  used
+     */
+    int                                frame_wrapping_field_2_supported;
 } AVCodecContext;
 
 /**
diff --git a/libavcodec/codec_par.c b/libavcodec/codec_par.c
index abaac63841..3f26f9d4d6 100644
--- a/libavcodec/codec_par.c
+++ b/libavcodec/codec_par.c
@@ -51,6 +51,7 @@ static void codec_parameters_reset(AVCodecParameters *par)
     par->framerate           = (AVRational){ 0, 1 };
     par->profile             = AV_PROFILE_UNKNOWN;
     par->level               = AV_LEVEL_UNKNOWN;
+    par->frame_wrapping      = AV_WRAPPING_UNKNOWN;
 }
 
 AVCodecParameters *avcodec_parameters_alloc(void)
@@ -165,6 +166,7 @@ int avcodec_parameters_from_context(AVCodecParameters *par,
         par->sample_aspect_ratio = codec->sample_aspect_ratio;
         par->video_delay         = codec->has_b_frames;
         par->framerate           = codec->framerate;
+        par->frame_wrapping      = codec->frame_wrapping;
         break;
     case AVMEDIA_TYPE_AUDIO:
         par->format           = codec->sample_fmt;
@@ -252,6 +254,7 @@ int avcodec_parameters_to_context(AVCodecContext *codec,
         codec->sample_aspect_ratio    = par->sample_aspect_ratio;
         codec->has_b_frames           = par->video_delay;
         codec->framerate              = par->framerate;
+        codec->frame_wrapping         = par->frame_wrapping;
         break;
     case AVMEDIA_TYPE_AUDIO:
         codec->sample_fmt       = par->format;
diff --git a/libavcodec/codec_par.h b/libavcodec/codec_par.h
index f42dd3b1d5..1e53292553 100644
--- a/libavcodec/codec_par.h
+++ b/libavcodec/codec_par.h
@@ -136,6 +136,11 @@ typedef struct AVCodecParameters {
     enum AVFieldOrder                  field_order;
 
     /**
+     * Video only. The way separate fields are wrapped in the container
+     */
+    enum AVFrameWrapping               frame_wrapping;
+
+    /**
      * Video only. Additional colorspace characteristics.
      */
     enum AVColorRange                  color_range;
diff --git a/libavcodec/decode.c b/libavcodec/decode.c
index 2cfb3fcf97..979759d84a 100644
--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -428,6 +428,9 @@ static inline int decode_simple_internal(AVCodecContext *avctx, AVFrame *frame,
         consumed = ff_thread_decode_frame(avctx, frame, &got_frame, pkt);
     } else {
         consumed = codec->cb.decode(avctx, frame, &got_frame, pkt);
+        if (consumed >= 0 && avctx->frame_wrapping_field_2_supported) {
+            consumed = codec->cb.decode(avctx, frame, &got_frame, pkt);
+        }
 
         if (!(codec->caps_internal & FF_CODEC_CAP_SETS_PKT_DTS))
             frame->pkt_dts = pkt->dts;
diff --git a/libavcodec/defs.h b/libavcodec/defs.h
index 00d840ec19..8f7ecf81c5 100644
--- a/libavcodec/defs.h
+++ b/libavcodec/defs.h
@@ -204,6 +204,13 @@ enum AVFieldOrder {
     AV_FIELD_BT,          ///< Bottom coded first, top displayed first
 };
 
+enum AVFrameWrapping {
+    AV_WRAPPING_UNKNOWN,
+    AV_WRAPPING_FRAME,    ///< if interlaced content: lines are interleaved
+    AV_WRAPPING_FIELD_1,  ///< each field is an independent encoded item, 1 field per AVPacket
+    AV_WRAPPING_FIELD_2,  ///< each field is an independent encoded item, 2 fields per AVPacket
+};
+
 /**
  * @ingroup lavc_decoding
  */
diff --git a/libavcodec/jpeg2000dec.c b/libavcodec/jpeg2000dec.c
index 691cfbd891..c9b935d97b 100644
--- a/libavcodec/jpeg2000dec.c
+++ b/libavcodec/jpeg2000dec.c
@@ -194,6 +194,8 @@ static int get_siz(Jpeg2000DecoderContext *s)
     int ret;
     int o_dimx, o_dimy; //original image dimensions.
     int dimx, dimy;
+    int previous_width = s->width;
+    int previous_height = s->height;
 
     if (bytestream2_get_bytes_left(&s->g) < 36) {
         av_log(s->avctx, AV_LOG_ERROR, "Insufficient space for SIZ\n");
@@ -211,7 +213,7 @@ static int get_siz(Jpeg2000DecoderContext *s)
     s->tile_offset_y  = bytestream2_get_be32u(&s->g); // YT0Siz
     ncomponents       = bytestream2_get_be16u(&s->g); // CSiz
 
-    if (av_image_check_size2(s->width, s->height, s->avctx->max_pixels, AV_PIX_FMT_NONE, 0, s->avctx)) {
+    if (av_image_check_size2(s->width, s->height << (s->has_2_fields && s->height >= 0), s->avctx->max_pixels, AV_PIX_FMT_NONE, 0, s->avctx)) {
         avpriv_request_sample(s->avctx, "Large Dimensions");
         return AVERROR_PATCHWELCOME;
     }
@@ -301,6 +303,20 @@ static int get_siz(Jpeg2000DecoderContext *s)
             return AVERROR(ENOMEM);
     }
 
+    /* management of frames having 2 separate codestreams */
+    if (s->has_2_fields) {
+        s->height <<= 1;
+        s->image_offset_y <<= 1;
+        s->tile_offset_y <<= 1;
+        if (s->is_second_field && (s->width != previous_width || s->height != previous_height)) {
+            avpriv_request_sample(s->avctx, "Support of 2 JPEG 2000 codestreams with different base characteristics");
+            return AVERROR_PATCHWELCOME;
+        }
+        if (s->image_offset_y || s->tile_offset_y || (s->tile_height << 1) != s->height) {
+            av_log(s->avctx, AV_LOG_WARNING, "Decoding of 2 fields having tiles in 1 AVPacket was not tested\n");
+        }
+    }
+
     /* compute image size with reduction factor */
     o_dimx = ff_jpeg2000_ceildivpow2(s->width  - s->image_offset_x,
                                                s->reduction_factor);
@@ -2001,7 +2017,7 @@ static inline void tile_codeblocks(const Jpeg2000DecoderContext *s, Jpeg2000Tile
                                                                                                   \
             y    = tile->comp[compno].coord[1][0] -                                               \
                    ff_jpeg2000_ceildiv(s->image_offset_y, s->cdy[compno]);                        \
-            line = (PIXEL *)picture->data[plane] + y * (picture->linesize[plane] / sizeof(PIXEL));\
+            line = (PIXEL *)picture->data[plane] + (y + (s->is_second_field ^ s->is_bottom_coded_first)) * (picture->linesize[plane] / sizeof(PIXEL));\
             for (; y < h; y++) {                                                                  \
                 PIXEL *dst;                                                                       \
                                                                                                   \
@@ -2028,7 +2044,7 @@ static inline void tile_codeblocks(const Jpeg2000DecoderContext *s, Jpeg2000Tile
                         dst += pixelsize;                                                         \
                     }                                                                             \
                 }                                                                                 \
-                line += picture->linesize[plane] / sizeof(PIXEL);                                 \
+                line += (picture->linesize[plane] << s->has_2_fields) / sizeof(PIXEL);            \
             }                                                                                     \
         }                                                                                         \
                                                                                                   \
@@ -2441,6 +2457,9 @@ static av_cold int jpeg2000_decode_init(AVCodecContext *avctx)
 
     ff_jpeg2000dsp_init(&s->dsp);
     ff_jpeg2000_init_tier1_luts();
+    
+    s->has_2_fields = avctx->frame_wrapping == AV_WRAPPING_FIELD_2;
+    avctx->frame_wrapping_field_2_supported = s->has_2_fields;
 
     return 0;
 }
@@ -2450,9 +2469,10 @@ static int jpeg2000_decode_frame(AVCodecContext *avctx, AVFrame *picture,
 {
     Jpeg2000DecoderContext *s = avctx->priv_data;
     int ret;
+    int codestream_size;
 
     s->avctx     = avctx;
-    bytestream2_init(&s->g, avpkt->data, avpkt->size);
+    bytestream2_init(&s->g, avpkt->data + s->consumed, avpkt->size - s->consumed);
     s->curtileno = -1;
     memset(s->cdef, -1, sizeof(s->cdef));
 
@@ -2484,20 +2504,50 @@ static int jpeg2000_decode_frame(AVCodecContext *avctx, AVFrame *picture,
         ret = AVERROR_INVALIDDATA;
         goto end;
     }
+
+    /* management of frames having 2 separate codestreams */
+    if (s->has_2_fields && !s->is_second_field) {
+        switch (avctx->field_order) {
+        case AV_FIELD_BB:
+        case AV_FIELD_BT:
+            s->is_bottom_coded_first = 1;
+            break;
+        default:
+            s->is_bottom_coded_first = 0;
+        }
+    }
+
     if (ret = jpeg2000_read_main_headers(s))
         goto end;
+    codestream_size = avpkt->size - bytestream2_get_bytes_left(&s->g);
+
+    /* management of frames having 2 separate codestreams */
+    if (bytestream2_get_bytes_left(&s->g) > 1 && bytestream2_peek_be16(&s->g) == JPEG2000_SOC) {
+        if (!s->has_2_fields) {
+            /* 2 codestreams newly detected, adapting output frame structure for handling 2 codestreams and parsing again the headers (fast and never done if wrapper has the right tip) */
+            s->has_2_fields = 1;
+            jpeg2000_dec_cleanup(s);
+            return jpeg2000_decode_frame(avctx, picture, got_frame, avpkt);
+        }
+    } else if (s->has_2_fields && !s->is_second_field) {
+        /* 1 codestream newly detected, adapting output frame structure for handling 1 codestream and parsing again the headers (fast and never done if wrapper has the right tip) */
+        s->has_2_fields = 0;
+        s->is_bottom_coded_first = 0;
+        jpeg2000_dec_cleanup(s);
+        return jpeg2000_decode_frame(avctx, picture, got_frame, avpkt);
+    }
 
     if (s->sar.num && s->sar.den)
         avctx->sample_aspect_ratio = s->sar;
     s->sar.num = s->sar.den = 0;
 
     if (avctx->skip_frame >= AVDISCARD_ALL) {
-        jpeg2000_dec_cleanup(s);
-        return avpkt->size;
+        ret = codestream_size;
+        goto end;
     }
 
     /* get picture buffer */
-    if ((ret = ff_thread_get_buffer(avctx, picture, 0)) < 0)
+    if ((!s->has_2_fields || !s->is_second_field) && (ret = ff_thread_get_buffer(avctx, picture, 0)) < 0)
         goto end;
     picture->pict_type = AV_PICTURE_TYPE_I;
     picture->flags |= AV_FRAME_FLAG_KEY;
@@ -2518,17 +2568,20 @@ static int jpeg2000_decode_frame(AVCodecContext *avctx, AVFrame *picture,
 
     avctx->execute2(avctx, jpeg2000_decode_tile, picture, NULL, s->numXtiles * s->numYtiles);
 
-    jpeg2000_dec_cleanup(s);
-
     *got_frame = 1;
 
     if (s->avctx->pix_fmt == AV_PIX_FMT_PAL8)
         memcpy(picture->data[1], s->palette, 256 * sizeof(uint32_t));
 
-    return bytestream2_tell(&s->g);
+    ret = codestream_size;
 
 end:
     jpeg2000_dec_cleanup(s);
+
+    /* management of frames having 2 separate codestreams */
+    s->is_second_field = s->has_2_fields && !s->is_second_field && ret < avpkt->size && ret >= 0; /* next call will handle the second field */
+    s->consumed = s->is_second_field ? ret : 0;
+
     return ret;
 }
 
diff --git a/libavcodec/jpeg2000dec.h b/libavcodec/jpeg2000dec.h
index d0ca6e7a79..5ae94aafd8 100644
--- a/libavcodec/jpeg2000dec.h
+++ b/libavcodec/jpeg2000dec.h
@@ -114,6 +114,12 @@ typedef struct Jpeg2000DecoderContext {
 
     /*options parameters*/
     int             reduction_factor;
+    
+    /* field info */
+    int8_t          has_2_fields;
+    int8_t          is_bottom_coded_first;
+    int8_t          is_second_field;
+    int             consumed;
 } Jpeg2000DecoderContext;
 
 #endif //AVCODEC_JPEG2000DEC_H
diff --git a/libavcodec/pthread_frame.c b/libavcodec/pthread_frame.c
index 71e99a5728..e3a8815653 100644
--- a/libavcodec/pthread_frame.c
+++ b/libavcodec/pthread_frame.c
@@ -221,6 +221,9 @@ static attribute_align_arg void *frame_worker_thread(void *arg)
         av_frame_unref(p->frame);
         p->got_frame = 0;
         p->result = codec->cb.decode(avctx, p->frame, &p->got_frame, p->avpkt);
+        if (p->result >= 0 && avctx->frame_wrapping_field_2_supported) {
+            p->result = codec->cb.decode(avctx, p->frame, &p->got_frame, p->avpkt);
+        }
 
         if ((p->result < 0 || !p->got_frame) && p->frame->buf[0])
             av_frame_unref(p->frame);
diff --git a/libavformat/mxfdec.c b/libavformat/mxfdec.c
index e42975e7fd..af33d8cad4 100644
--- a/libavformat/mxfdec.c
+++ b/libavformat/mxfdec.c
@@ -2948,6 +2948,20 @@ static int mxf_parse_structural_metadata(MXFContext *mxf)
                 default:
                     break;
                 }
+                switch ((*essence_container_ul)[14]) {
+                case 3: /* I1: Interlaced Frame, 1 field/KLV */
+                    st->codecpar->frame_wrapping = AV_WRAPPING_FIELD_1;
+                    break;
+                case 4: /* I2: Interlaced Frame, 2 fields/KLV */
+                    st->codecpar->frame_wrapping = AV_WRAPPING_FIELD_2;
+                    break;
+                case 2: /* Cn: Clip- wrapped Picture Element */
+                case 6: /* P1: Frame- wrapped Picture Element */
+                    st->codecpar->frame_wrapping = AV_WRAPPING_FRAME;
+                    break;
+                default:
+                    break;
+                }
             }
 
             if (st->codecpar->codec_id == AV_CODEC_ID_PRORES) {
-- 
2.13.3.windows.1
