Uwe Freese (2018-12-29): > The downside is that the term "power" (or "strength") would be more generic > and would probably be better reusable for other modes. > > What do you think? Is "exponent" now ok, or should I change it back to > "power", "strength" or something alike?
I would say: do not bother. This can be decided when a new mode is added, and options can have several names. By the way, you would not be interested in developing a logo-detection filter, per chance? Something that detects is a certain rectangle is likely to contain a logo and computes its bounding box. > Again, I would be glad about comments and reviews. - Let me know what should > be changed and I'll create a new patch in some days. Here is a more in-depth review from me: > >From 83e79abb3311eb2dd92c63e8e15e0476af2f2891 Mon Sep 17 00:00:00 2001 > From: breaker27 <m...@uwe-freese.de> > Date: Wed, 26 Dec 2018 18:16:48 +0100 > Subject: [PATCH] avfilter/vf_delogo: add uglarm interpolation mode > > --- > doc/filters.texi | 28 +++++++ > libavfilter/vf_delogo.c | 189 > +++++++++++++++++++++++++++++++++++++++++++++--- > 2 files changed, 208 insertions(+), 9 deletions(-) > > diff --git a/doc/filters.texi b/doc/filters.texi > index 65ce25bc18..792560ad79 100644 > --- a/doc/filters.texi > +++ b/doc/filters.texi > @@ -7952,6 +7952,34 @@ Specify the thickness of the fuzzy edge of the > rectangle (added to > deprecated, setting higher values should no longer be necessary and > is not recommended. > > +@item mode > +Specify the interpolation mode used to remove the logo. > +It accepts the following values: > +@table @option > +@item xy > +Only the border pixels in straight x and y direction are considered > +to replace any logo pixel. This mode tends to flicker and to show > +horizontal and vertical lines. > +@item uglarm > +Consider the whole border for every logo pixel to replace. This mode > +uses the distance raised to the power of the given @var{exponent} as > +weight that decides to which amount every border pixel is taken into > +account. This results in a more blurred area, which tends to be less > +distracting. The uglarm mode was first implemented in VirtualDub's > +LogoAway filter and is also known from ffdshow and is the > +abbreviation for "Uwe's Great LogoAway Remove Mode". > +@end table > +The default value is @code{xy}. > + > +@item exponent > +Specify the exponent used to calculate the weight out of the > +distance in @code{uglarm} mode (weight = distance ^ @var{exponent}). > +The value @code{0} results in a logo area which has > +the same average color everywhere. The higher the value, the more > +relevant a near border pixel will get, meaning that the borders of > +the logo area are more similar to the surrounding pixels. The default > +value is @code{3}. > + > @item show > When set to 1, a green rectangle is drawn on the screen to simplify > finding the right @var{x}, @var{y}, @var{w}, and @var{h} parameters. > diff --git a/libavfilter/vf_delogo.c b/libavfilter/vf_delogo.c > index 065d093641..dcb0366e63 100644 > --- a/libavfilter/vf_delogo.c > +++ b/libavfilter/vf_delogo.c > @@ -2,6 +2,7 @@ > * Copyright (c) 2002 Jindrich Makovicka <makov...@gmail.com> > * Copyright (c) 2011 Stefano Sabatini > * Copyright (c) 2013, 2015 Jean Delvare <jdelv...@suse.com> > + * Copyright (c) 2018 Uwe Freese <m...@uwe-freese.de> > * > * This file is part of FFmpeg. > * > @@ -25,12 +26,16 @@ > * A very simple tv station logo remover > * Originally imported from MPlayer libmpcodecs/vf_delogo.c, > * the algorithm was later improved. > + * The "UGLARM" mode was first implemented 2001 by Uwe Freese for Virtual > + * Dub's LogoAway filter (by Krzysztof Wojdon), taken over into ffdshow's > + * logoaway filter (by Milan Cutka), from where it was ported to ffmpeg. > */ > > #include "libavutil/common.h" > #include "libavutil/imgutils.h" > #include "libavutil/opt.h" > #include "libavutil/pixdesc.h" > +#include "libavutil/avassert.h" We try to keep the includes in alphabetical order. > #include "avfilter.h" > #include "formats.h" > #include "internal.h" > @@ -50,6 +55,10 @@ > * @param logo_w width of the logo > * @param logo_h height of the logo > * @param band the size of the band around the processed area > + * @param *uglarmtable pointer to weight table in UGLARM interpolation > mode, > + * zero when x-y mode is used > + * @param *uglarmweightsum pointer to weight sum table in UGLARM > interpolation mode, > + * zero when x-y mode is used > * @param show show a rectangle around the processed area, useful for > * parameters tweaking > * @param direct if non-zero perform in-place processing > @@ -58,7 +67,8 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, > uint8_t *src, int src_linesize, > int w, int h, AVRational sar, > int logo_x, int logo_y, int logo_w, int logo_h, > - unsigned int band, int show, int direct) > + unsigned int band, double *uglarmtable, > + double *uglarmweightsum, int show, int direct) > { > int x, y; > uint64_t interp, weightl, weightr, weightt, weightb, weight; > @@ -89,6 +99,7 @@ static void apply_delogo(uint8_t *dst, int dst_linesize, > dst += (logo_y1 + 1) * dst_linesize; > src += (logo_y1 + 1) * src_linesize; > > + if (!uglarmtable) { > for (y = logo_y1+1; y < logo_y2; y++) { > left_sample = topleft[src_linesize*(y-logo_y1)] + > topleft[src_linesize*(y-logo_y1-1)] + > @@ -151,12 +162,125 @@ static void apply_delogo(uint8_t *dst, int > dst_linesize, > dst += dst_linesize; > src += src_linesize; > } > + } else { > + int bx, by; > + double interpd; > + > + for (y = logo_y1 + 1; y < logo_y2; y++) { > + for (x = logo_x1 + 1, > + xdst = dst + logo_x1 + 1, > + xsrc = src + logo_x1 + 1; x < logo_x2; x++, xdst++, xsrc++) { I think a line break after the semicolons would make this easier to understand. Also, since x and y always used subtracted to logo_x1 or logo_y1, I think changing the bounds of the loop would make the rest of the code more readable: for (x = 1; x < logo_w_minus_one; x++) > + > + if (show && (y == logo_y1 + 1 || y == logo_y2 - 1 || > + x == logo_x1 + 1 || x == logo_x2 - 1)) { Nit: the alignment is off. > + *xdst = 0; > + continue; > + } > + > + interpd = 0; > + > + for (bx = 0; bx < logo_w; bx++) { > + interpd += topleft[bx] * > + uglarmtable[abs(bx - (x - logo_x1)) + (y - logo_y1) > * (logo_w - 1)]; > + interpd += botleft[bx] * > + uglarmtable[abs(bx - (x - logo_x1)) + (logo_h - (y - > logo_y1) - 1) * (logo_w - 1)]; > + } This can be optimized, and since it is the inner loop of the filter, I think it is worth it. You can declare a pointer that will stay the same for the whole y loop: double *xweight = uglarmtable + (y - logo_y1) * (logo_w - 1); and then use it in that loop: interpd += ... * xweight[abs(bx - (x - logo_x1))]; It avoids a lot of multiplications. > + > + for (by = 1; by < logo_h - 1; by++) { > + interpd += topleft[by * src_linesize] * > + uglarmtable[(x - logo_x1) + abs(by - (y - logo_y1)) > * (logo_w - 1)]; > + interpd += topleft[by * src_linesize + (logo_w - 1)] * > + uglarmtable[logo_w - (x - logo_x1) - 1 + abs(by - (y > - logo_y1)) * (logo_w - 1)]; > + } This one is more tricky to optimize, because of the abs() within the multiplication, but it can be done by splitting the loop in two: for (by = 1; by < y; by++) { interpd += ... * yweight[x - logo_x1]; yweight -= logo_w_minus_one; } for (; by < logo_h_minus_one; by++) { interpd += ... * yweight[x - logo_x1]; yweight += logo_w_minus_one; } av_assert2(yweight == the_correct_value); In fact, I think even the x loop would be a little more readable if it was split in two like that. > + > + interp = (uint64_t)(interpd / > + uglarmweightsum[(x - logo_x1) - 1 + (y - logo_y1 - 1) * > (logo_w - 2)]); The cast to uint64_t seems suspicious. Can you explain? > + *xdst = interp; > + } > + > + dst += dst_linesize; > + src += src_linesize; > + } > + } > +} > + > +/** > + * Calculate the lookup tables to be used in UGLARM interpolation mode. > + * > + * @param *uglarmtable Pointer to table containing weights for each > possible > + * diagonal distance between a border pixel and an > inner > + * logo pixel. > + * @param *uglarmweightsum Pointer to a table containing the weight sum to > divide > + * by for each pixel within the logo area. > + * @param sar The sar to take into account when calculating > lookup > + * tables. > + * @param logo_w width of the logo > + * @param logo_h height of the logo > + * @param exponent exponent used in uglarm interpolation > + */ > +static void calc_uglarm_tables(double *uglarmtable, double *uglarmweightsum, > + AVRational sar, int logo_w, int logo_h, > + float exponent) > +{ > + double aspect = (double)sar.num / sar.den; Tiny optimization: double aspect2 = aspect * aspect; for use later. > + int x, y; > + > + /* uglarmtable will contain a weight for each possible diagonal distance > + * between a border pixel and an inner logo pixel. The maximum distance > in > + * each direction between border and an inner pixel can be logo_w - 1. > The > + * weight of a border pixel which is x,y pixels away is stored at > position > + * x + y * (logo_w - 1). */ > + for (y = 0; y < logo_h - 1; y++) > + for (x = 0; x < logo_w - 1; x++) { > + if (x + y != 0) { > + double d = pow(sqrt(x * x * aspect * aspect + y * y), > exponent); pow(sqrt(x * x * aspect2 + y * y), exponent / 2); > + uglarmtable[x + y * (logo_w - 1)] = 1.0 / d; > + } else { > + uglarmtable[x + y * (logo_w - 1)] = 1.0; > + } > + } > + > + /* uglarmweightsum will contain the sum of all weights which is used when > + * an inner pixel of the logo at position x,y is calculated out of the > + * border pixels. The aggregated value has to be divided by that. The > value > + * to use for the inner 1-based logo position x,y is stored at > + * (x - 1) + (y - 1) * (logo_w - 2). */ > + for (y = 1; y < logo_h - 1; y++) > + for (x = 1; x < logo_w - 1; x++) { > + double weightsum = 0; > + > + for (int bx = 0; bx < logo_w; bx++) { Our coding style forbids declaring variables on the fly like that. Same below. > + /* top border */ > + weightsum += uglarmtable[abs(bx - x) + y * (logo_w - 1)]; > + /* bottom border */ > + weightsum += uglarmtable[abs(bx - x) + (logo_h - y - 1) * > (logo_w - 1)]; > + } > + > + for (int by = 1; by < logo_h - 1; by++) { > + /* left border */ > + weightsum += uglarmtable[x + abs(by - y) * (logo_w - 1)]; > + /* right border */ > + weightsum += uglarmtable[(logo_w - x - 1) + abs(by - y) * > (logo_w - 1)]; > + } > + > + uglarmweightsum[(x - 1) + (y - 1) * (logo_w - 2)] = weightsum; > + } > } > > +enum mode { > + MODE_XY, > + MODE_UGLARM, > + MODE_NB > +}; > + > +#define MAX_PLANES 10 You could use FF_ARRAY_ELEMS(AVPixFmtDescriptor.comp), and the consistency would be guaranteed. Well, that syntax is not valid, but it can be expressed: #define MAX_PLANES FF_ARRAY_ELEMS(((AVPixFmtDescriptor *)0)->comp) But I have a better suggestion: #define MAX_SUB 2 double uglarmtable[MAX_SUB + 1][MAX_SUB + 1] and use uglarmtable[hsub][vsub]. That way, for YUVA420P for example, the table will be computed only once for Y and A and once for U and V. The assert is still needed with that solution, though. > + > typedef struct DelogoContext { > const AVClass *class; > - int x, y, w, h, band, show; > -} DelogoContext; > + int x, y, w, h, band, mode, show; > + float exponent; > + double *uglarmtable[MAX_PLANES], *uglarmweightsum[MAX_PLANES]; > +} DelogoContext; > > #define OFFSET(x) offsetof(DelogoContext, x) > #define FLAGS AV_OPT_FLAG_FILTERING_PARAM|AV_OPT_FLAG_VIDEO_PARAM > @@ -171,6 +295,10 @@ static const AVOption delogo_options[]= { > { "band", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { > .i64 = 0 }, 0, INT_MAX, FLAGS }, > { "t", "set delogo area band size", OFFSET(band), AV_OPT_TYPE_INT, { > .i64 = 0 }, 0, INT_MAX, FLAGS }, > #endif > + { "mode", "set the interpolation mode",OFFSET(mode), AV_OPT_TYPE_INT, { > .i64 = MODE_XY }, MODE_XY, MODE_NB-1, FLAGS, "mode" }, > + { "xy", "use pixels in straight x and y direction", > OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_XY }, 0, 0, FLAGS, "mode" > }, > + { "uglarm", "UGLARM mode, use full border", > OFFSET(mode), AV_OPT_TYPE_CONST, { .i64 = MODE_UGLARM }, 0, 0, FLAGS, "mode" > }, > + { "exponent","exponent used for UGLARM interpolation", OFFSET(exponent), > AV_OPT_TYPE_FLOAT, { .dbl = 3.0 }, 0, 6, FLAGS }, > { "show", "show delogo area", OFFSET(show), AV_OPT_TYPE_BOOL,{ > .i64 = 0 }, 0, 1, FLAGS }, > { NULL } > }; > @@ -215,8 +343,8 @@ static av_cold int init(AVFilterContext *ctx) > #else > s->band = 1; > #endif > - av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d show:%d\n", > - s->x, s->y, s->w, s->h, s->band, s->show); > + av_log(ctx, AV_LOG_VERBOSE, "x:%d y:%d, w:%d h:%d band:%d mode:%d > exponent:%f show:%d\n", > + s->x, s->y, s->w, s->h, s->band, s->mode, s->exponent, s->show); > > s->w += s->band*2; > s->h += s->band*2; > @@ -226,6 +354,18 @@ static av_cold int init(AVFilterContext *ctx) > return 0; > } > > +static av_cold void uninit(AVFilterContext *ctx) > +{ > + DelogoContext *s = ctx->priv; > + > + if (s->mode == MODE_UGLARM) { No need to test, we can free anyway. > + for (int plane = 0; plane < MAX_PLANES; plane++) { Forbidden variable declaration. > + av_freep(&s->uglarmtable[plane]); > + av_freep(&s->uglarmweightsum[plane]); > + } > + } > +} > + > static int config_input(AVFilterLink *inlink) > { > DelogoContext *s = inlink->dst->priv; > @@ -270,20 +410,50 @@ static int filter_frame(AVFilterLink *inlink, AVFrame > *in) > if (!sar.num) > sar.num = sar.den = 1; > > + if (s->mode == MODE_UGLARM) > + av_assert0(desc->nb_components <= MAX_PLANES); > + > for (plane = 0; plane < desc->nb_components; plane++) { > int hsub = plane == 1 || plane == 2 ? hsub0 : 0; > int vsub = plane == 1 || plane == 2 ? vsub0 : 0; > > + /* Up and left borders were rounded down, inject lost bits > + * into width and height to avoid error accumulation */ Not: the alignment is off. > + int logo_w = AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub); > + int logo_h = AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub); > + > + /* Init lookup tables once */ > + if (s->mode == MODE_UGLARM) { > + if (!s->uglarmtable[plane]) { > + s->uglarmtable[plane] = > + (double*)av_malloc((logo_w - 1) * (logo_h - 1) * > sizeof(double)); No casting of malloc returns in C, this is a c++ thing. Also, use av_malloc_array(). > + > + if (!s->uglarmtable[plane]) { > + return AVERROR(ENOMEM); > + } The braces could be dropped. > + > + s->uglarmweightsum[plane] = > + (double*)av_malloc((logo_w - 2) * (logo_h - 2) * > sizeof(double)); > + > + if (!s->uglarmweightsum[plane]) { > + return AVERROR(ENOMEM); > + } Same as above, of course. > + > + calc_uglarm_tables(s->uglarmtable[plane], > + s->uglarmweightsum[plane], > + sar, logo_w, logo_h, s->exponent); > + } > + } > + I think this whole block could and should be moved into config_input(). > apply_delogo(out->data[plane], out->linesize[plane], > in ->data[plane], in ->linesize[plane], > AV_CEIL_RSHIFT(inlink->w, hsub), > AV_CEIL_RSHIFT(inlink->h, vsub), > sar, s->x>>hsub, s->y>>vsub, > - /* Up and left borders were rounded down, inject lost > bits > - * into width and height to avoid error accumulation */ > - AV_CEIL_RSHIFT(s->w + (s->x & ((1<<hsub)-1)), hsub), > - AV_CEIL_RSHIFT(s->h + (s->y & ((1<<vsub)-1)), vsub), > + logo_w, logo_h, > s->band>>FFMIN(hsub, vsub), > + s->uglarmtable[plane], > + s->uglarmweightsum[plane], > s->show, direct); > } > > @@ -317,6 +487,7 @@ AVFilter ff_vf_delogo = { > .priv_size = sizeof(DelogoContext), > .priv_class = &delogo_class, > .init = init, > + .uninit = uninit, > .query_formats = query_formats, > .inputs = avfilter_vf_delogo_inputs, > .outputs = avfilter_vf_delogo_outputs, Regards, -- Nicolas George
signature.asc
Description: PGP signature
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel